Generalized Linear Regression for Multiple Targets

Learn multi-target linear regression through simple examples, and compare a custom implementation against sklearn by their loss.

Building on generalized linear regression, this lesson introduces techniques for simultaneously modeling several output variables. Previously, the optimal parameters were a vector ($w$); now, the optimal parameters are represented by a weight matrix ($W$). Each input vector $x_i \in \mathbb{R}^d$ now corresponds to a target vector $y_i \in \mathbb{R}^m$, where $m$ is the number of outcomes we are predicting. The model uses the form $f_W(x) = W^T\phi(x)$, where $\phi(x)$ still allows for non-linear feature transformation. This lesson focuses on formulating this multi-output prediction task as a single, efficient matrix minimization problem, often termed multi-target ridge regression, to derive the corresponding closed-form solution.

Multiple targets

Consider a regression dataset $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)\}$, where $x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}^m$. A function $f_W(x) = W^T\phi(x)$ is a generalized linear model for regression for any given mapping $\phi$ of the input features $x$, and $W$ is a matrix with $m$ columns, one for each target. Note that $f_W(x) \in \mathbb{R}^m$, meaning the model produces a vector of $m$ predicted values for each input $x$, with each component corresponding to one of the multiple target variables. This allows the generalized linear model to simultaneously predict all target outputs in a single evaluation.

Try this quiz to review what you’ve learned so far.

1. In the context of the function $f_W(\mathbf{x}) = W^T\phi(\mathbf{x})$, if $\mathbf{x} \in \mathbb{R}^d$, $\phi(\mathbf{x}) \in \mathbb{R}^k$, and $W \in \mathbb{R}^{p \times m}$, then what is the value of $p$?

A. $p = k$

B. $p = m$
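To make the shapes in $f_W(x) = W^T\phi(x)$ concrete, here is a minimal NumPy sketch. The feature map `phi` (a bias term, the raw inputs, and their squares) and the sizes `d` and `m` are illustrative assumptions, not choices made by the lesson; any mapping into $\mathbb{R}^k$ would work the same way.

```python
import numpy as np

def phi(x):
    """Hypothetical feature map: [1, x_1..x_d, x_1^2..x_d^2], so k = 2d + 1."""
    x = np.asarray(x, dtype=float)
    return np.concatenate(([1.0], x, x ** 2))

def predict(W, x):
    """Evaluate f_W(x) = W^T phi(x) for a weight matrix W with one column per target."""
    return W.T @ phi(x)

d, m = 3, 2                      # input dimension and number of targets (arbitrary)
rng = np.random.default_rng(0)
x = rng.normal(size=d)
k = phi(x).shape[0]              # feature dimension produced by phi
W = rng.normal(size=(k, m))      # rows must match phi(x) for W^T phi(x) to be defined

y_hat = predict(W, x)
print(y_hat.shape)               # (2,)
```

Each component of `y_hat` is the dot product of one column of `W` with `phi(x)`, so a single matrix-vector product yields all $m$ predictions at once.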

The optimal parameters $W^*$ can be determined by minimizing a regularized squared loss (multi-target ridge regression) as follows:

$$W^*=\operatorname*{arg\,min}_{W}\bigg\{\sum_{i=1}^n \|W^T\phi(\mathbf{x}_i)-\mathbf{y}_i\|_2^2 + \lambda \|W\|_F^2\bigg\}$$
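As a rough illustration, the objective can be evaluated in matrix form by stacking the feature vectors $\phi(\mathbf{x}_i)$ as rows of a matrix `Phi` and the targets $\mathbf{y}_i$ as rows of `Y`. These names, the random data, and the value of `lam` are assumptions made for the sketch. The helper `ridge_closed_form` uses the standard ridge minimizer, which solves $(\Phi^T\Phi + \lambda I)W = \Phi^T Y$; that formula is the textbook result, not something quoted from this section.

```python
import numpy as np

def ridge_objective(W, Phi, Y, lam):
    """Sum of squared residual norms over all samples plus lam * ||W||_F^2.

    Phi: (n, k) matrix whose i-th row is phi(x_i)
    Y:   (n, m) matrix whose i-th row is y_i
    W:   (k, m) weight matrix
    """
    residuals = Phi @ W - Y              # row i equals (W^T phi(x_i) - y_i)^T
    data_term = np.sum(residuals ** 2)   # sum_i ||W^T phi(x_i) - y_i||_2^2
    penalty = lam * np.sum(W ** 2)       # lam * ||W||_F^2
    return data_term + penalty

def ridge_closed_form(Phi, Y, lam):
    """Standard ridge minimizer: solve (Phi^T Phi + lam * I) W = Phi^T Y."""
    k = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(k), Phi.T @ Y)

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
n, k, m, lam = 50, 4, 3, 0.1
Phi = rng.normal(size=(n, k))
Y = rng.normal(size=(n, m))

W_star = ridge_closed_form(Phi, Y, lam)
W_other = W_star + 0.01 * rng.normal(size=(k, m))   # a slightly perturbed candidate

loss_star = ridge_objective(W_star, Phi, Y, lam)
loss_other = ridge_objective(W_other, Phi, Y, lam)
print(loss_star <= loss_other)
```

Because the objective is strictly convex for $\lambda > 0$, any perturbation of `W_star` should give a loss that is no smaller, which is what the final comparison checks.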

Here, $\sum_{i=1}^n \|W^T\phi(\mathbf{x}_i)-\mathbf{y}_i\|_2^2$ ...