Simple linear regression in Python

Question

I'm trying to implement this algorithm to find the intercept and slope for a single variable:

Here is my Python code to update the intercept and slope, but it does not converge. The RSS increases with each iteration instead of decreasing, and after a certain number of iterations it becomes infinite. I can't find any error in my implementation of the algorithm. How can I solve this problem? I have also attached the CSV file. Here is the code.

    import pandas as pd
    import numpy as np

    # Defining gradient_decend
    # This function takes the X values, the Y values and a vector of w0 (intercept), w1 (slope)
    # INPUT FEATURES = X (sq. feet of house size)
    # TARGET VALUE = Y (price of house)
    # W = np.array([w0, w1]).reshape(2, 1)
    # W = [w0,
    #      w1]
    def gradient_decend(X, Y, W):
        intercept = W[0][0]
        slope = W[1][0]
        # Here I will get a list
        # The list is like this:
        # Gd = [sum(predicted_value - (intercept + slope*x)),
        #       sum(predicted_value - (intercept + slope*x)*x)]
        Gd = [sum(y - (intercept + slope*x) for x, y in zip(X, Y)),
              sum(((y - (intercept + slope*x))*x) for x, y in zip(X, Y))]
        return np.array(Gd).reshape(2, 1)

    # Defining residual sum of squares
    def RSS(X, Y, W):
        return sum((y - (W[0][0] + W[1][0]*x))**2 for x, y in zip(X, Y))

    # Reading training data
    training_data = pd.read_csv("kc_house_train_data.csv")

    # Defining fixed parameters
    # Learning rate
    n = 0.0001
    iteration = 1500
    # Intercept
    w0 = 0
    # Slope
    w1 = 0
    # Creating a (2, 1) vector of the w0, w1 parameters
    W = np.array([w0, w1]).reshape(2, 1)

    # Running gradient descent
    for i in range(iteration):
        W = W + ((2*n) * (gradient_decend(training_data["sqft_living"], training_data["price"], W)))
        print(RSS(training_data["sqft_living"], training_data["price"], W))

Here is the CSV file.

Kazi Nazmul Haque Shezan · Accepted Answer

I solved my own problem!

Here is the working solution.

    import numpy as np
    import pandas as pd
    import math
    from sys import stdout

    # Takes the pandas dataframe, the list of input features and the target column name
    def get_numpy_data(data, features, output):
        # Adding a constant column with value 1 to the dataframe.
        data['constant'] = 1
        # Adding the name of the constant column to the feature list.
        features = ['constant'] + features
        # Creating the feature matrix (selecting columns and converting to a matrix).
        # to_numpy() replaces as_matrix(), which was removed in pandas 1.0.
        features_matrix = data[features].to_numpy()
        # The target column is converted to a numpy array
        output_array = np.array(data[output])
        return (features_matrix, output_array)

    def predict_outcome(feature_matrix, weights):
        weights = np.array(weights)
        predictions = np.dot(feature_matrix, weights)
        return predictions

    def errors(output, predictions):
        errors = predictions - output
        return errors

    def feature_derivative(errors, feature):
        derivative = np.dot(2, np.dot(feature, errors))
        return derivative

    def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
        converged = False
        # Initial weights are converted to a float numpy array
        # (float so that in-place updates are not truncated to integers)
        weights = np.array(initial_weights, dtype=float)
        while not converged:
            # compute the predictions based on feature_matrix and weights:
            predictions = predict_outcome(feature_matrix, weights)
            # compute the errors as predictions - output:
            error = errors(output, predictions)
            gradient_sum_squares = 0  # initialize the gradient
            # while not converged, update each weight individually:
            for i in range(len(weights)):
                # Recall that feature_matrix[:, i] is the feature column associated with weights[i]
                feature = feature_matrix[:, i]
                # compute the derivative for weight[i]:
                # predict = predict_outcome(feature, weights[i])
                # err = errors(output, predict)
                deriv = feature_derivative(error, feature)
                # add the squared derivative to the gradient magnitude
                gradient_sum_squares = gradient_sum_squares + (deriv**2)
                # update the weight based on step size and derivative:
                weights[i] = weights[i] - np.dot(step_size, deriv)

            gradient_magnitude = math.sqrt(gradient_sum_squares)
            stdout.write("\r%d" % int(gradient_magnitude))
            stdout.flush()
            if gradient_magnitude < tolerance:
                converged = True
        return (weights)

    # Example of implementation
    # Importing training and testing data
    # train_data = pd.read_csv("kc_house_train_data.csv")
    # test_data = pd.read_csv("kc_house_test_data.csv")

    # simple_features = ['sqft_living', 'sqft_living15']
    # my_output = 'price'
    # (simple_feature_matrix, output) = get_numpy_data(train_data, simple_features, my_output)
    # initial_weights = np.array([-100000., 1., 1.])
    # step_size = 7e-12
    # tolerance = 2.5e7
    # simple_weights = regression_gradient_descent(simple_feature_matrix, output, initial_weights, step_size, tolerance)
    # print(simple_weights)
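
One plausible reason this version converges while the loop in the question blows up is the step size: the question used 0.0001, whereas the example above uses 7e-12. With unscaled features such as square footage, the gradient of the RSS is enormous, so a step that looks small can still overshoot. The sketch below illustrates this on synthetic data; the data, the function name `rss_history` and the step sizes are assumptions made for illustration, not values taken from the CSV.

```python
import numpy as np

def rss_history(feature_matrix, output, step_size, iterations):
    """Run plain batch gradient descent on the RSS and record the RSS after each step."""
    weights = np.zeros(feature_matrix.shape[1])
    history = []
    for _ in range(iterations):
        errors = feature_matrix @ weights - output   # predictions - output
        gradient = 2 * feature_matrix.T @ errors     # gradient of the RSS
        weights = weights - step_size * gradient     # move against the gradient
        residual = feature_matrix @ weights - output
        history.append(float(residual @ residual))
    return history

# Hypothetical data: 100 houses of 500 to 4000 sq ft, priced around 250 per sq ft.
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, size=100)
price = 250.0 * sqft + rng.normal(0, 10000, size=100)
feature_matrix = np.column_stack([np.ones_like(sqft), sqft])

too_large = rss_history(feature_matrix, price, step_size=1e-4, iterations=5)
small = rss_history(feature_matrix, price, step_size=1e-10, iterations=5)

print("step 1e-4 :", too_large)   # RSS grows at every step
print("step 1e-10:", small)       # RSS shrinks at every step
```

Rescaling the feature (for example dividing square footage by its standard deviation) is another common way to make a moderate step size workable; that is a general technique, not something the answer above does.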

alvas · Answer

First, I find that when writing machine learning code, it's best NOT to use complex list comprehensions, because anything you can iterate over

is easier to read when written as normal loops with indentation, and/or
can be done with numpy broadcasting.
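
To make the comparison concrete, here is a small sketch that computes the same residual sum of squares three ways: a dense comprehension, an explicit loop, and numpy broadcasting. The numbers and variable names are made up for illustration.

```python
import numpy as np

sqft = np.array([1000.0, 1500.0, 2000.0])   # hypothetical feature values
price = np.array([200.0, 310.0, 390.0])     # hypothetical target values
intercept, slope = 10.0, 0.2                # hypothetical weights

# 1. Dense comprehension: compact, but hard to scan.
rss_comprehension = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(sqft, price))

# 2. Explicit loop: every intermediate value has a name.
rss_loop = 0.0
for x, y in zip(sqft, price):
    prediction = intercept + slope * x
    residual = y - prediction
    rss_loop += residual ** 2

# 3. numpy broadcasting: the scalars are applied to the whole array at once.
residuals = price - (intercept + slope * sqft)
rss_broadcast = np.sum(residuals ** 2)

print(rss_comprehension, rss_loop, rss_broadcast)   # all three agree
```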

Using proper variable names also helps you understand the code better. Using Xs, Ys, Ws as shorthand is only nice if you're good at math. Personally, I don't use them in code, especially when writing Python. From import this: explicit is better than implicit.

My rule of thumb is that if I write code I can't read a week later, it's bad code.

First, let's decide on the input parameters for gradient descent. You'll need the following:

feature_matrix (the X matrix; type: numpy.array, a matrix of size N * D, where N is the number of rows/data points and D is the number of columns/features)
output (the Y vector; type: numpy.array, a vector of size N)
initial_weights (type: numpy.array, a vector of size D).
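
As a rough illustration of these shapes, the sketch below builds the three inputs for a toy problem with N = 4 rows and D = 2 columns (a constant column for the intercept plus one real feature). The values are invented for the example.

```python
import numpy as np

# Hypothetical toy data: 4 houses, one real feature (living area in sq ft).
sqft_living = np.array([1180.0, 2570.0, 770.0, 1960.0])
price = np.array([221900.0, 538000.0, 180000.0, 604000.0])

# A column of ones lets the first weight act as the intercept.
feature_matrix = np.column_stack([np.ones(len(sqft_living)), sqft_living])
output = price
initial_weights = np.zeros(feature_matrix.shape[1])   # one weight per column

print(feature_matrix.shape)    # (N, D)
print(output.shape)            # (N,)
print(initial_weights.shape)   # (D,)
```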

In addition, to check for convergence, you will need:

step_size (how much the weights change at each iteration; type: float, usually a small number)
tolerance (the criterion for stopping the iterations: when the magnitude of the gradient is less than the tolerance, assume your weights have converged; type: float, usually a small number but much larger than the step size).

Now let's move on to the code.
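
In symbols, the role of these two parameters can be sketched as follows. The notation is an assumption made for this note: $x_{ij}$ is an entry of feature_matrix, $y_i$ an entry of output, $w_j$ a weight, $\eta$ the step_size and $\varepsilon$ the tolerance.

```latex
\begin{aligned}
\mathrm{RSS}(w) &= \sum_{i=1}^{N} \Big( \sum_{j=1}^{D} x_{ij} w_j - y_i \Big)^2 \\
\frac{\partial\,\mathrm{RSS}}{\partial w_j} &= 2 \sum_{i=1}^{N} x_{ij} \Big( \sum_{k=1}^{D} x_{ik} w_k - y_i \Big) \\
w_j &\leftarrow w_j - \eta \, \frac{\partial\,\mathrm{RSS}}{\partial w_j} \\
\text{stop when}\quad & \lVert \nabla \mathrm{RSS}(w) \rVert_2 < \varepsilon
\end{aligned}
```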

    def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
        converged = False  # Set a boolean to check for convergence
        weights = np.array(initial_weights)  # make sure it's a numpy array
        while not converged:
            # compute the predictions based on feature_matrix and weights.
            # iterate through the rows and find the single scalar predicted
            # value for each weight * column.
            # hint: a dot product can solve this easily
            predictions = [??? for row in feature_matrix]
            # compute the errors as predictions - output
            errors = predictions - output
            gradient_sum_squares = 0  # initialize the gradient sum of squares
            # while we haven't reached the tolerance yet, update each feature's weight
            for i in range(len(weights)):  # loop over each weight
                # Recall that feature_matrix[:, i] is the feature column associated with weights[i]
                # compute the derivative for weight[i]:
                # Hint: the derivative is = 2 * dot product of feature_column and errors.
                derivative = 2 * ????
                # add the squared value of the derivative to the gradient magnitude (for assessing convergence)
                gradient_sum_squares += (derivative * derivative)
                # subtract the step size times the derivative from the current weight
                weights[i] -= (step_size * derivative)
            # compute the square root of the gradient sum of squares to get the gradient magnitude:
            gradient_magnitude = ???
            # Then check whether the magnitude is lower than the tolerance.
            if ???:
                converged = True
        # Once the while loop breaks, return the weights.
        return (weights)

I hope the extended pseudocode helps you understand gradient descent better. I won't fill in the ??? so as not to spoil your homework.

Note that your RSS code is also unreadable and unmaintainable. It's easier to do:

    >>> import numpy as np
    >>> prediction = np.array([1,2,3])
    >>> output = np.array([1,1,5])
    >>> residual = output - prediction
    >>> RSS = sum(residual * residual)
    >>> RSS
    5

Going through the numpy basics will go a long way toward doing machine learning and matrix/vector manipulation without going crazy with iterations: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.html
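
The same idea can be wrapped in a small helper that takes the feature matrix and weights directly. This is only a sketch; the function name `residual_sum_of_squares` and the toy inputs are assumptions, chosen so that the predictions match the session above.

```python
import numpy as np

def residual_sum_of_squares(feature_matrix, output, weights):
    """Return the RSS of a linear model, computed without explicit loops."""
    predictions = feature_matrix.dot(weights)
    residual = output - predictions
    return residual.dot(residual)   # dot product of a vector with itself = sum of squares

# Hypothetical inputs: a constant column plus one feature taking the values 1, 2, 3.
feature_matrix = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
output = np.array([1.0, 1.0, 5.0])
weights = np.array([0.0, 1.0])      # intercept 0, slope 1, so predictions are 1, 2, 3

rss = residual_sum_of_squares(feature_matrix, output, weights)
print(rss)
```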

Aslam Shaik · Answer

It's this simple:

    def mean(values):
        return sum(values) / float(len(values))

    # Sum of squared deviations from the mean (not divided by n;
    # the factor cancels against covariance() below)
    def variance(values, mean):
        return sum([(x - mean)**2 for x in values])

    # Sum of products of deviations from the means
    def covariance(x, mean_x, y, mean_y):
        covar = 0.0
        for i in range(len(x)):
            covar += (x[i] - mean_x) * (y[i] - mean_y)
        return covar

    def coefficients(dataset):
        x = []
        y = []

        # Each line of the file holds one "x,y" pair
        for line in dataset:
            xi, yi = map(float, line.split(','))
            x.append(xi)
            y.append(yi)

        dataset.close()

        x_mean, y_mean = mean(x), mean(y)

        b1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
        b0 = y_mean - b1 * x_mean

        return [b0, b1]

    dataset = open('trainingdata.txt')
    b0, b1 = coefficients(dataset)

    # Python 3; on Python 2 use raw_input() instead of input()
    n = float(input())
    print(b0 + b1 * n)

Reference: www.machinelearningmastery.com/implement-simple-linear-regear-scratch-python/
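
As a quick sanity check, the closed-form slope and intercept can be compared against `numpy.polyfit` on a few points. The sketch below uses a numpy version of the same formula; the function name `fit_line` and the data are assumptions made for illustration.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares intercept and slope via covariance / variance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    y_centered = y - y.mean()
    slope = x_centered.dot(y_centered) / x_centered.dot(x_centered)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

# Hypothetical data points.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 3.0, 5.0]

intercept, slope = fit_line(x, y)

# np.polyfit returns coefficients from the highest degree down: [slope, intercept].
reference_slope, reference_intercept = np.polyfit(x, y, 1)

print(intercept, slope)                       # about 0.4 and 0.8
print(reference_intercept, reference_slope)   # should match up to rounding
```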