Search⌘ K
AI Features

Univariate Linear Regression

Explore univariate linear regression to predict an outcome using one independent variable. Understand the regression equation, cost function minimization, and how gradient descent optimizes parameters in machine learning.

Univariate linear regression

In univariate linear regression, we have one independent variable xx, which we use to predict a dependent variable yy.

We’ll use the tips dataset from seaborn’s datasets to illustrate theoretical concepts.



We’ll use the following columns from the dataset for univariate analysis:

  • Total_bill: It is the total bill of food served.

  • Tip: It is the tip given on the cost of food.


Goal of univariate linear regression: The goal is to predict the “tip” given on a “total_bill”. The regression model constructs an equation to do so.

If we plot the scatter plot between the independent variable (total_bill) and the dependent variable (tip), we’ll get the plot below:



  • We can see that the points in the scatter plot are mostly scattered along the diagonal.

  • This indicates that there may be a positive correlation between the total_bill and tip. This will be useful for modeling.


Working

The univariate linear regression model comes up with the following equation of the straight line:

y^=w0+w1x\hat{y} = w_0 + w_1 * x

Or

tip_predicted=w0+w1total_billtip\_predicted = w_0 + w_1 * total\_bill

Goal: Find the values of w0w_0 and w1w_1, where w0w_0 and w1w_1 are the parameters, so that the predicted tip (y^\hat{y}) is as close to the actual tip (yy) as possible. Mathematically, we can model the problem as seen below.

J(w0,w1)J(w_0, w_1) = 12mi=1m(y^iyi)2\frac{1}{2m}\sum_{i=1}^{m}(\hat{y}^i-y^i)^2

  • J(w0,w1)J(w_0, w_1) is the cost function, which an algorithm tries to minimize by finding the values of w0w_0 and w1w_1. These values give us the minimum value of the above function.

  • yiy^i is the actual output value of a training instance ii, where i=1,2,3..i =1,2,3 ..

  • y^i\hat{y}^i ...