
Multi-target Linear Regression

Explore the concept of multi-target linear regression where multiple output variables are predicted simultaneously from input features. Understand matrix formulations, the Frobenius norm, and how to implement and evaluate this model using Python and sklearn. Gain practical skills in handling multi-output data sets for real-world data science problems.

Multi-target data sets

It’s common in real applications to predict more than one target from the given features. The UCI dataset we’ve discussed in the previous lesson comes from such a scenario. We’ve already approximated a single target given its multiple features. In this lesson, we’ll predict both Y1 (heating load) and Y2 (cooling load), given the features X1 to X8. We can do so using multi-target linear regression. Formally, if $\bold{x_1},\bold{x_2},...,\bold{x_d}$ are the feature columns and $\bold{y_1},\bold{y_2},...,\bold{y_c}$ are the target columns, we can define the data matrix as $A_{n \times d} = \begin{bmatrix}\bold{x_1}&\bold{x_2}&...&\bold{x_d}\end{bmatrix}$ and the target matrix as $Y_{n \times c} = \begin{bmatrix}\bold{y_1}&\bold{y_2}&...&\bold{y_c}\end{bmatrix}$, where $n$ is the number of data points.

The linear system can then be modeled as:

$$AW = Y$$

where $W=\begin{bmatrix}\bold{w_1}&\bold{w_2}&...&\bold{w_c}\end{bmatrix}$ is a $d \times c$ matrix of parameters.
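To make the shapes concrete, here is a minimal sketch with hypothetical sizes ($n=5$ points, $d=3$ features, $c=2$ targets) that builds a random data matrix and parameter matrix and confirms that their product has the shape of the target matrix:

Python
import numpy as np

n, d, c = 5, 3, 2                  # hypothetical sizes: points, features, targets
A = np.random.randn(n, d)          # data matrix, one row per data point
W = np.random.randn(d, c)          # parameter matrix, one column per target
Y = A @ W                          # predictions form an n x c matrix
print(A.shape, W.shape, Y.shape)   # (5, 3) (3, 2) (5, 2)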

Objective

The goal is to find $W$ that minimizes the objective $\|AW-Y\|_F^2$, where the subscript $F$ denotes the Frobenius norm.

Frobenius norm

The Frobenius norm of a matrix is defined as the square root of the sum of squares of all its elements. Formally, for any matrix $A_{m \times n}$, the Frobenius norm is $\|A\|_F=\sqrt{\sum_{i=1}^m\sum_{j=1}^n a_{ij}^2}$.
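For example, for the $2\times 2$ matrix $A=\begin{bmatrix}1&2\\3&4\end{bmatrix}$, we get $\|A\|_F=\sqrt{1^2+2^2+3^2+4^2}=\sqrt{30}\approx 5.48$.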

Frobenius norm in Python

We can use np.linalg.norm to compute different types of norms of vectors and matrices. For a matrix input, the default is the Frobenius norm; we can also request it explicitly with the parameter ord='fro', as shown in the code below:

Python
nrm1 = np.linalg.norm(A)               # default for a matrix input: Frobenius norm
nrm2 = np.linalg.norm(A, ord='fro')    # Frobenius norm requested explicitly
nrm3 = np.sum(A.flatten()**2)**0.5     # same value, computed directly from the definition

An executable and extendable example of computing the Frobenius norm is given below.

Python 3.8
import numpy as np
A = np.round(10*np.random.randn(2,3))
print(f'Matrix A = \n {A}')
print(f'Frobenius Norm of A = {np.linalg.norm(A)}')
print(f'Frobenius Norm of A = {np.linalg.norm(A,ord="fro")}')
print(f'Frobenius Norm of A = {np.sum(A.flatten()**2)**0.5}')

Note: $AW$ and $Y$ are matrices, and so is $AW-Y$. Ideally, we want $AW-Y$ to be a zero matrix!
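Because the Frobenius objective decomposes column by column, the minimizer takes the familiar least-squares form. Below is a minimal sketch, on random data, of solving it with np.linalg.lstsq and checking it against the normal-equations solution $W = (A^TA)^{-1}A^TY$ (valid when $A^TA$ is invertible):

Python
import numpy as np

# Hypothetical sizes: 100 points, 4 features, 2 targets
A = np.random.randn(100, 4)
Y = np.random.randn(100, 2)

# lstsq minimizes ||AW - Y||_F^2 and returns a 4 x 2 parameter matrix
W_lstsq = np.linalg.lstsq(A, Y, rcond=None)[0]

# Normal-equations solution agrees with lstsq when A^T A is invertible
W_normal = np.linalg.solve(A.T @ A, A.T @ Y)
print(np.allclose(W_lstsq, W_normal))  # True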

Implementation

We’ve already worked with a real dataset in the single-target setting. In this lesson, we’ll predict both Y1 and Y2 based on the rest of the features.

Changes in the code

Single target
def getAy(data):
    # Remove the target column and reshape it into an n x 1 vector
    y = data.pop('Y2')
    y = np.array(y)[:, np.newaxis]
    # Remaining columns are the features; prepend a column of ones for the bias term
    A = np.array(data)
    A = np.hstack((np.ones((A.shape[0], 1)), A))
    return A, y
Multiple targets
def getAY(data):
    # Remove both target columns and reshape each into an n x 1 vector
    y1 = data.pop('Y1')
    y1 = np.array(y1)[:, np.newaxis]
    y2 = data.pop('Y2')
    y2 = np.array(y2)[:, np.newaxis]
    # Stack the targets side by side into an n x 2 target matrix Y
    Y = np.hstack((y1, y2))
    # Remaining columns are the features; prepend a column of ones for the bias term
    A = np.array(data)
    A = np.hstack((np.ones((A.shape[0], 1)), A))
    return A, Y
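As a quick check of the multi-target version, here is a minimal sketch on a hypothetical four-row frame with made-up columns X1, X2, Y1, and Y2; it assumes the getAY function defined above is in scope (note that pop removes the target columns from data in place):

Python
import numpy as np
import pandas as pd

# Hypothetical miniature frame: two features and two targets
data = pd.DataFrame({'X1': [1.0, 2.0, 3.0, 4.0],
                     'X2': [0.5, 1.5, 2.5, 3.5],
                     'Y1': [10., 20., 30., 40.],
                     'Y2': [11., 21., 31., 41.]})

A, Y = getAY(data)        # assumes getAY as defined above
print(A.shape, Y.shape)   # (4, 3) and (4, 2): bias column + 2 features, 2 targets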

Computing mean-squared error

For multiple targets, we define the mean-squared error as $\frac{1}{n}\|Y-\hat{Y}\|^2_F$, where $\hat{Y}$ is the matrix of predictions.

mse = (np.linalg.norm(test_Y_pred-test_Y)**2)/test_Y.shape[0]
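As a quick sanity check on this definition, the Frobenius-norm expression equals the average squared length of the per-sample difference vectors. A minimal sketch on small, hypothetical arrays:

Python
import numpy as np

# Hypothetical true and predicted target matrices (n = 4 points, c = 2 targets)
Y      = np.array([[10.0, 20.0], [12.0, 18.0], [ 9.0, 21.0], [11.0, 19.0]])
Y_pred = np.array([[11.0, 19.0], [12.5, 18.5], [ 8.0, 22.0], [11.0, 20.0]])

mse_frobenius = np.linalg.norm(Y - Y_pred)**2 / Y.shape[0]
mse_per_point = np.mean(np.sum((Y - Y_pred)**2, axis=1))
print(mse_frobenius, mse_per_point)  # identical values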

The code below is a complete example of computing the mean-squared error.

Python 3.8
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
# Function to separate features and outputs
def getAY(data):
    y1 = data.pop('Y1')
    y1 = np.array(y1)[:, np.newaxis]
    y2 = data.pop('Y2')
    y2 = np.array(y2)[:, np.newaxis]
    Y = np.hstack((y1, y2))
    A = np.array(data)
    A = np.hstack((np.ones((A.shape[0], 1)), A))
    return A, Y
# Prepare data
df = pd.read_csv('reg_data.csv').drop('Unnamed: 0', axis=1)
train, test = train_test_split(df, test_size=0.2)
train_A, train_Y = getAY(train)
test_A, test_Y = getAY(test)
# Estimate parameters using least squares and test the model
W = np.linalg.lstsq(train_A, train_Y, rcond=None)[0]
test_Y_pred = test_A.dot(W)
mse = (np.linalg.norm(test_Y_pred-test_Y)**2)/test_Y.shape[0]
print(f'Mean Squared Error = {mse}')

Visualization

We can make separate visualizations for each target attribute. However, a more intuitive way is to visualize the error of each prediction vector. If $\begin{bmatrix}\hat y_{i1}\\\hat y_{i2}\end{bmatrix}$ is the prediction vector for the $i^{th}$ test point, we can define the difference vector as $\begin{bmatrix}y_{i1}-\hat y_{i1}\\y_{i2}-\hat y_{i2}\end{bmatrix}$. The squared length of the difference vector is precisely the squared error for that point. For an exact solution, the difference vector is zero. When the difference vectors are plotted as points in the $xy$ plane, good predictions lie close to the origin. We can observe that most predictions fall within a radius of $5$ (the green circle).

Axes are defined as differences between predictions and ground truths. The green circle contains most of the predictions with error magnitude bounded by 5.
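To back up this visual impression numerically, we can count how many difference vectors fall inside a given radius. Below is a minimal sketch; it assumes the test_Y and test_Y_pred arrays computed in the example above are already in scope:

Python
import numpy as np

def fraction_within(Y_true, Y_pred, radius):
    # Length of each difference vector (one row per test point)
    distances = np.linalg.norm(Y_true - Y_pred, axis=1)
    # Fraction of points whose error vector lies inside the circle of this radius
    return np.mean(distances <= radius)

# Assumes test_Y and test_Y_pred from the mean-squared-error example above
for r in (5, 7, 9):
    print(f'Within radius {r}: {fraction_within(test_Y, test_Y_pred, r):.2f}')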

Putting it all together

The complete executable code of the visualization is given below:

Python 3.8
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
# Function to separate features and outputs
def getAY(data):
    y1 = data.pop('Y1')
    y1 = np.array(y1)[:, np.newaxis]
    y2 = data.pop('Y2')
    y2 = np.array(y2)[:, np.newaxis]
    Y = np.hstack((y1, y2))
    A = np.array(data)
    A = np.hstack((np.ones((A.shape[0], 1)), A))
    return A, Y
# Function to visualize results
def plotResults(Y_true, Y_pred, title=''):
    # Plot each difference vector (true minus predicted) as a point
    plt.plot(Y_true[:, 0]-Y_pred[:, 0], Y_true[:, 1]-Y_pred[:, 1], 'r+')
    # Reference circles of radius 5, 7, and 9 centered at the origin
    c1 = plt.Circle((0, 0), 5, color='g', alpha=1); plt.gca().add_patch(c1)
    c2 = plt.Circle((0, 0), 7, color='b', alpha=0.2); plt.gca().add_patch(c2)
    c3 = plt.Circle((0, 0), 9, color='r', alpha=0.2); plt.gca().add_patch(c3)
    plt.title(title); plt.axis('equal'); plt.axis('square')
    # Freeze the current axis limits so the long axis lines below don't rescale the view
    plt.xlim(plt.xlim()); plt.ylim(plt.ylim())
    plt.plot([0, 0], [-100, 100], 'k', linewidth=1.2)
    plt.plot([-100, 100], [0, 0], 'k', linewidth=1.2)
    plt.grid(color='k', linestyle='--', linewidth=0.8)
    plt.legend([c1, c2, c3], ['r=5', 'r=7', 'r=9'])
    plt.savefig('output/graph.png', dpi=300)
# Load and prepare data
df = pd.read_csv('reg_data.csv').drop('Unnamed: 0', axis=1)
train, test = train_test_split(df, test_size=0.2)
train_A, train_Y = getAY(train)
test_A, test_Y = getAY(test)
# Estimate parameters and test the model
W = np.linalg.lstsq(train_A, train_Y, rcond=None)[0]
test_Y_pred = test_A.dot(W)
mse = (np.linalg.norm(test_Y_pred-test_Y)**2)/test_Y.shape[0]
print(f'Mean Squared Error = {mse}')
# Plot the results
plotResults(test_Y, test_Y_pred, title='Multi-target Linear Regression')

Multi-target regression using sklearn

We can also use sklearn to perform multi-target linear regression as shown below:

from sklearn.linear_model import LinearRegression as LR
from sklearn.multioutput import MultiOutputRegressor as MLR
model = MLR(LR()).fit(train_A, train_Y)
test_Y_pred = model.predict(test_A)
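Note that sklearn's LinearRegression can also fit a 2-D target matrix directly, so the MultiOutputRegressor wrapper (which fits one copy of the estimator per target) is optional here; since the least-squares problems for the two targets are independent, both approaches give the same fit. A minimal alternative sketch:

Python
# LinearRegression accepts a multi-column target natively
model = LR().fit(train_A, train_Y)
test_Y_pred = model.predict(test_A)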

The code below utilizes sklearn for multi-target linear regression:

Python 3.8
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression as LR
from sklearn.multioutput import MultiOutputRegressor as MLR
from matplotlib import pyplot as plt
# Function to visualize results
def plotResults(Y_true, Y_pred, title=''):
    plt.plot(Y_true[:, 0]-Y_pred[:, 0], Y_true[:, 1]-Y_pred[:, 1], 'r+')
    c1 = plt.Circle((0, 0), 5, color='g', alpha=1); plt.gca().add_patch(c1)
    c2 = plt.Circle((0, 0), 7, color='b', alpha=0.2); plt.gca().add_patch(c2)
    c3 = plt.Circle((0, 0), 9, color='r', alpha=0.2); plt.gca().add_patch(c3)
    plt.title(title); plt.axis('equal'); plt.axis('square')
    plt.xlim(plt.xlim()); plt.ylim(plt.ylim())
    plt.plot([0, 0], [-100, 100], 'k', linewidth=1.2)
    plt.plot([-100, 100], [0, 0], 'k', linewidth=1.2)
    plt.grid(color='k', linestyle='--', linewidth=0.8)
    plt.legend([c1, c2, c3], ['r=5', 'r=7', 'r=9'])
    plt.savefig('output/graph.png', dpi=300)
# Function to separate features and output
def getAY(data):
    y1 = data.pop('Y1')
    y1 = np.array(y1)[:, np.newaxis]
    y2 = data.pop('Y2')
    y2 = np.array(y2)[:, np.newaxis]
    Y = np.hstack((y1, y2))
    A = np.array(data)
    A = np.hstack((np.ones((A.shape[0], 1)), A))
    return A, Y
# Prepare data
df = pd.read_csv('reg_data.csv').drop('Unnamed: 0', axis=1)
train, test = train_test_split(df, test_size=0.2)
train_A, train_Y = getAY(train)
test_A, test_Y = getAY(test)
# Train multi-linear regression model from sklearn
m = MLR(LR()).fit(train_A, train_Y)
# Compute results and error
test_Y_pred = m.predict(test_A)
mse = (np.linalg.norm(test_Y_pred-test_Y)**2)/test_Y.shape[0]
print(f'Mean Squared Error = {mse}')
# Plot the results
plotResults(test_Y, test_Y_pred, title='Multi-target Linear Regression')