Implementation of polynomial regression
The following steps demonstrate the process of training and visualizing linear and polynomial regression models using the provided dataset.
Step 1 - Importing the libraries
In the first step, we import the necessary libraries.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Step 2 - Importing the dataset
After importing libraries, we load the dataset from a CSV file.
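dataset = pd.read_csv('Data.csv')
x = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
Here, we use the pandas iloc indexer to select columns by position. The variable x receives the values of the feature variable, and y receives the values of the target variable. The slice 1:2 (rather than the single index 1) keeps x two-dimensional, which is the shape scikit-learn expects for its input features, while y is a one-dimensional array.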
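To make the difference between the two iloc selections concrete, here is a minimal sketch. The DataFrame is a hypothetical stand-in for Data.csv: its column names and values are made up for illustration and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for Data.csv (names and values are made up).
dataset = pd.DataFrame({
    "Position": ["Analyst", "Consultant", "Manager", "Director"],
    "Level": [1, 2, 3, 4],
    "Salary": [45000, 50000, 60000, 80000],
})

x = dataset.iloc[:, 1:2].values  # slice 1:2 keeps the column axis: 2-D array
y = dataset.iloc[:, 2].values    # single integer index drops it: 1-D array

print(x.shape)  # (4, 1)
print(y.shape)  # (4,)
```

Writing dataset.iloc[:, 1] instead would give x the shape (4,), and the later call to fit() would reject it as a one-dimensional feature array.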
Step 3 - Training the linear regression model
In this step, we train the linear regression model on the entire dataset.
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(x, y)
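After fitting, the learned line can be inspected through the coef_ and intercept_ attributes. The sketch below uses made-up points that lie exactly on y = 2x + 1 (an assumption chosen so the result is easy to check by eye), not the values from Data.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on the line y = 2*x + 1.
x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

reg = LinearRegression()
reg.fit(x, y)

print(reg.coef_)       # slope, one entry per feature
print(reg.intercept_)  # constant term
```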
Step 4 - Training the polynomial regression model
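Here, we train the polynomial regression model on the entire dataset. PolynomialFeatures expands x into its powers up to degree 4, and an ordinary linear regression is then fitted on the expanded features.
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=4)
x_poly = poly.fit_transform(x)
reg2 = LinearRegression()
reg2.fit(x_poly, y)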
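It can help to look at what PolynomialFeatures actually produces. In this small sketch (the two input values are arbitrary and chosen only for illustration), each row becomes the powers of x from 0 to 4, and the leading column of ones is the bias term.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two arbitrary sample values, one feature each.
x = np.array([[2.0], [3.0]])

poly = PolynomialFeatures(degree=4)
x_poly = poly.fit_transform(x)

# Each row is [1, x, x^2, x^3, x^4]:
# [[ 1.  2.  4.  8. 16.]
#  [ 1.  3.  9. 27. 81.]]
print(x_poly)
```

Because the expanded columns are just new features, the model fitted on them is still linear in its coefficients, which is why LinearRegression can be reused unchanged.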
Step 5 - The visualization of linear regression results
After training the model, we visualize the linear regression results by creating a scatter plot of the actual data points and then plotting the regression line using reg.predict(x) to predict y based on x.
plt.scatter(x, y, color='cadetblue')
plt.plot(x, reg.predict(x), color='gray')
plt.title('Linear regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
Step 6 - The visualization of polynomial regression results
Here, we plot the polynomial regression results on a finer grid. Instead of evaluating the model only at the original data points, we generate values from the minimum to the maximum of x in steps of 0.1, which gives a higher-resolution, smoother curve. Because poly has already been fitted, we call transform() rather than fit_transform() on the new values.
visual = np.arange(x.min(), x.max(), 0.1)
visual = visual.reshape((len(visual), 1))
plt.scatter(x, y, color='cadetblue')
plt.plot(visual, reg2.predict(poly.transform(visual)), color='gray')
plt.title('Polynomial regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
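One detail worth knowing: np.arange treats its stop value as exclusive, so a grid built this way ends just short of the largest x. If the curve should reach the last data point, np.linspace is one possible alternative, since it includes both endpoints. The sketch below assumes position levels running from 1 to 10, which is a made-up range for illustration.

```python
import numpy as np

# Made-up feature column: levels 1 through 10 as a 2-D array.
x = np.arange(1, 11, dtype=float).reshape(-1, 1)

# 100 evenly spaced points from min(x) to max(x), both endpoints included,
# reshaped to one column so it can be passed to transform() and predict().
visual = np.linspace(x.min(), x.max(), 100).reshape(-1, 1)

print(visual.shape)                # (100, 1)
print(visual[0, 0], visual[-1, 0])  # 1.0 10.0
```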
Step 7 - A new result prediction with linear regression
In this step, we call reg.predict() on the trained linear regression model to predict the salary for a new position level of 6.5. The input is written as [[6.5]] because predict() expects a two-dimensional array with one row per sample.
reg.predict([[6.5]])
Step 8 - A new result prediction with polynomial regression
Similarly, we predict the salary for a position level of 6.5 with the trained polynomial regression model. The new value must first be expanded into the same polynomial features that were used during training.
reg2.predict(poly.transform([[6.5]]))
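Since every prediction has to repeat the feature expansion, one optional convenience is to bundle both steps with scikit-learn's make_pipeline. This is an alternative arrangement and not the approach used in this walkthrough; the data below is made up, and the sketch only checks that both routes give the same prediction.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Made-up data for illustration only.
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = 1000.0 * x.ravel() ** 2

# Manual route, as in the steps above: expand, fit, expand again, predict.
poly = PolynomialFeatures(degree=4)
reg2 = LinearRegression().fit(poly.fit_transform(x), y)
manual = reg2.predict(poly.transform([[6.5]]))

# Pipeline route: the expansion is applied automatically in fit and predict.
model = make_pipeline(PolynomialFeatures(degree=4), LinearRegression())
model.fit(x, y)
piped = model.predict([[6.5]])

print(manual, piped)
```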
Code
# Importing the libraries
import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Data.csv')
x = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

# Fitting linear regression to the dataset
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(x, y)

# Fitting polynomial regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=4)
x_poly = poly.fit_transform(x)
reg2 = LinearRegression()
reg2.fit(x_poly, y)

# Making sure the folder for the saved plots exists
os.makedirs('output', exist_ok=True)

# Visualising the linear regression results
plt.scatter(x, y, color='cadetblue')
plt.plot(x, reg.predict(x), color='gray')
plt.title('Linear regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.savefig('output/linear.png')
plt.show()
plt.clf()

# Visualising the polynomial regression results
visual = np.arange(x.min(), x.max(), 0.1)
visual = visual.reshape((len(visual), 1))
plt.scatter(x, y, color='cadetblue')
plt.plot(visual, reg2.predict(poly.transform(visual)), color='gray')
plt.title('Polynomial regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.savefig('output/polynomial.png')
plt.show()

# Predicting a new result with linear regression
print(reg.predict([[6.5]]))

# Predicting a new result with polynomial regression
print(reg2.predict(poly.transform([[6.5]])))
Polynomial regression with varying degrees
We create four different polynomial regression models with increasing complexity by modifying the degree parameter in the PolynomialFeatures constructor to 2, 3, 4, and 5. By comparing the results of these models, we can evaluate how different degrees of polynomials capture the patterns in the data.
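A minimal sketch of such a comparison is shown here. It loops over the four degrees and records the R² score of each model on the training data. The x and y values are made up for illustration (they are not the contents of Data.csv), and using R² as the comparison metric is an assumption of this sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Made-up data: ten levels and a sharply rising target (in thousands).
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.array([45, 50, 60, 80, 110, 150, 200, 300, 500, 1000], dtype=float)

scores = {}
for degree in (2, 3, 4, 5):
    poly = PolynomialFeatures(degree=degree)
    x_poly = poly.fit_transform(x)
    reg = LinearRegression().fit(x_poly, y)
    # R^2 on the same data the model was trained on.
    scores[degree] = reg.score(x_poly, y)
    print(f"degree {degree}: training R^2 = {scores[degree]:.4f}")
```

The training score cannot decrease as the degree grows, because each higher-degree model contains the lower-degree ones as special cases. A rising training score alone therefore does not show that a higher degree is the better choice.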
As the degree of the polynomial increases, the model becomes more flexible and can capture more complex relationships between the independent and dependent variables. However, increasing the degree can also lead to overfitting, where the model fits the noise in the training data and performs poorly on new, unseen data.
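One common way to check for overfitting is to hold out part of the data and compare the error on the training points with the error on the held-out points. The sketch below does this under several assumptions of its own: the data is synthetic (a noisy sine curve with a fixed random seed), every second point is held out, and the degrees 2, 5, and 9 are chosen only to span a wide range of flexibility.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: a sine curve plus noise (fixed seed for repeatability).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30).reshape(-1, 1)
y = np.sin(2 * np.pi * x.ravel()) + rng.normal(scale=0.2, size=30)

# Even-indexed points are used for training, odd-indexed points are held out.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

errors = {}
for degree in (2, 5, 9):
    poly = PolynomialFeatures(degree=degree)
    reg = LinearRegression().fit(poly.fit_transform(x_train), y_train)
    train_mse = mean_squared_error(y_train, reg.predict(poly.transform(x_train)))
    test_mse = mean_squared_error(y_test, reg.predict(poly.transform(x_test)))
    errors[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The training error can only fall or stay flat as the degree rises, so the held-out error is the number to watch: a degree whose training error keeps shrinking while its held-out error grows is a sign of overfitting.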