Linear regression is a foundational statistical and machine learning method used to model the relationship between one or more independent variables (features) and a dependent variable (target). It assumes a linear relationship, meaning that changes in the dependent variable are proportional to changes in the independent variables.
For a simple linear regression with one independent variable, the model is:

y = wx + b

Where,

y is the predicted output, x is the input feature, w is the weight (slope), and b is the bias (intercept).
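The same idea extends to multiple features, which is what the implementation below handles. A sketch of the general form, using the weight/bias naming from this article:

$$y = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b$$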
Let’s take a look at the parameters required for the linear regression function.
| Parameter | Description |
| --- | --- |
| X | Input features |
| y | Actual output values |
| weights | Coefficients for each feature in the input data |
| bias | Intercept term in the linear equation |
| learning_rate | Step size to adjust weights and bias in each iteration |
| epochs | Number of iterations the model goes through the dataset |
| cost | Cost or loss function; measures the difference between predicted and actual values |
| m | Number of samples in the training dataset |
| n | Number of features in the input data |
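As a concrete illustration of how these quantities typically look in code (the shapes and values below are illustrative assumptions, not taken from the original):

import numpy as np

X = np.random.rand(100, 2)   # input features: m = 100 samples, n = 2 features
y = np.random.rand(100)      # actual output values, one per sample
weights = np.zeros(2)        # one coefficient per feature
bias = 0.0                   # intercept term of the linear equation
learning_rate = 0.01         # step size for each update of weights and bias
epochs = 1000                # number of passes over the dataset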
Let’s break down the steps to implement linear regression from scratch:
1. Initialize the weights and bias with random values.

weights = np.random.randn(X.shape[1], 1)
bias = np.random.randn()
2. Use the current weights and bias to make predictions.
y_pred = np.dot(X, weights) + bias
3. Calculate the error between the predictions and the actual values; any suitable loss function, such as the mean squared error, can then summarize this difference (a sketch follows the snippet below).
loss = y_pred - y
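The line above yields the raw residuals. To report a single loss value, the mean squared error is a common choice; a minimal sketch (the mse name is illustrative, not part of the original code):

mse = np.mean(loss ** 2)  # mean squared error over all samples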
4. Calculate the gradients of the loss with respect to the weights and bias.

dw = 2 / len(X) * np.dot(X.T, loss)
db = 2 / len(X) * np.sum(loss)
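These expressions follow from differentiating the mean squared error with respect to the weights and bias; a sketch of the derivation, using m for the number of samples:

$$
L(w, b) = \frac{1}{m}\sum_{i=1}^{m}\big(\hat{y}_i - y_i\big)^2
\quad\Rightarrow\quad
\frac{\partial L}{\partial w} = \frac{2}{m}\,X^{\top}(\hat{y} - y),
\qquad
\frac{\partial L}{\partial b} = \frac{2}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)
$$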
5. Update the weights and bias using the gradients and a learning rate.

weights = weights - learning_rate * dw
bias = bias - learning_rate * db
6. Repeat steps 2–5 for a predefined number of epochs.

Putting these steps together into a single function:
import numpy as np

def linear_regression(X, y, learning_rate=0.01, epochs=1000):
    n_samples, n_features = X.shape
    # Step 1: initialize the weights and bias
    weights = np.zeros(n_features)
    bias = 0
    for epoch in range(epochs):
        # Step 2: make predictions with the current parameters
        y_pred = np.dot(X, weights) + bias
        # Step 3: compute the error between predictions and targets
        loss = y_pred - y
        # Step 4: compute the gradients of the loss
        dw = 2 / len(X) * np.dot(X.T, loss)
        db = 2 / len(X) * np.sum(loss)
        # Step 5: update the parameters using the learning rate
        weights = weights - learning_rate * dw
        bias = bias - learning_rate * db
    return weights, bias
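As a quick sanity check, here is a minimal sketch of calling the linear_regression function defined above on a tiny hand-made dataset (the X_demo/y_demo data and hyperparameter values are illustrative assumptions, not from the original):

import numpy as np

# Tiny illustrative dataset following y = 2x + 1 exactly
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([3.0, 5.0, 7.0, 9.0])

w, b = linear_regression(X_demo, y_demo, learning_rate=0.05, epochs=5000)
print(w, b)  # the fitted values should be close to [2.0] and 1.0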
Here’s an example of how to implement linear regression in Python from scratch.
import numpy as np

def linear_regression(X, y, learning_rate=0.01, epochs=1000):
    # Initialize weights and bias
    weights = np.random.randn(X.shape[1], 1)
    bias = np.random.randn()

    # Perform gradient descent
    for epoch in range(epochs):
        # Calculate predictions
        y_pred = np.dot(X, weights) + bias

        # Calculate errors
        loss = y_pred - y

        # Calculate gradients
        dw = 2 / len(X) * np.dot(X.T, loss)
        db = 2 / len(X) * np.sum(loss)

        # Update weights and bias
        weights = weights - learning_rate * dw
        bias = bias - learning_rate * db

    return weights, bias

# X_bias = np.c_[np.ones((X.shape[0], 1)), X]
np.random.seed(42)
X_train = 2 * np.random.rand(100, 1)
y_train = 4 + 3 * X_train + np.random.randn(100, 1)

# Add a column of ones to X_train for the bias term
X_train_bias = np.c_[np.ones((100, 1)), X_train]

# Train the model
trained_weights, trained_bias = linear_regression(X_train_bias, y_train)

# Now, let's perform predictions using the trained model
def predict(X, weights, bias):
    """Predict output values using the trained linear regression model.

    Parameters:
    - X: Input features (matrix)
    - weights: Trained model coefficients
    - bias: Trained model intercept

    Returns:
    - y_pred: Predicted output values
    """
    return np.dot(X, weights) + bias


# Generate test data for prediction
X_test = np.array([[1.5], [2.5]])

# Add a column of ones to X_test for the bias term
X_test_bias = np.c_[np.ones((len(X_test), 1)), X_test]

# Perform predictions
predictions = predict(X_test_bias, trained_weights, trained_bias)

# Display the predictions
print("Predictions:", predictions)
Line 1: Import the necessary library, numpy.
Line 3: Define the linear_regression function with parameters X (input features), y (output), learning_rate, and epochs.
Lines 5–6: Initialize weights and bias with random values.
Line 9: Perform gradient descent for the specified number of epochs.
Lines 11–18: Calculate predictions, errors, and gradients.
Lines 21–22: Update weights and bias using the learning rate.
Line 27: Set a random seed for reproducibility.
Lines 28–29: Generate synthetic training data (X_train, y_train).
Line 32: Add a column of ones to X_train for the bias term.
Line 35: Train the model using the linear_regression function.
Lines 38–50: Define a function (predict) for making predictions using the trained model.
Line 53: Generate test data (X_test) for prediction.
Line 56: Add a column of ones to X_test for the bias term.
Line 59: Perform predictions using the trained model.
Line 62: Display the predictions.
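To gauge how well the fit matches the training data, here is a minimal follow-up sketch that reports the training mean squared error (it assumes the X_train_bias, y_train, trained_weights, trained_bias, and predict names from the example above are still in scope):

# Evaluate the trained model on the training data
train_predictions = predict(X_train_bias, trained_weights, trained_bias)
mse = np.mean((train_predictions - y_train) ** 2)
print("Training MSE:", mse)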
Here’s a quiz to test your knowledge.
In the given linear regression code, what does the variable dw represent?
Model weights
Learning rate
Gradients of weights
Bias term