Generalized Linear Regression
Explore generalized linear regression to model nonlinear relationships by transforming features using basis functions. Understand regularized squared loss and derive the closed-form solution for optimal parameters. Gain practical skills in vectorization and ridge regression implementation with Python and scikit-learn.
We’ve previously learned that while standard linear models are powerful, many real-world relationships are non-linear. The generalized linear model (GLM) framework solves this by introducing a basis function $\phi$ that transforms the input features into a higher-dimensional space, allowing a linear model to fit a complex, non-linear curve to the data. In this lesson, we move from conceptual understanding to practical implementation by exploring closed-form solutions for training generalized linear models.
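As a concrete illustration of a basis function (a minimal sketch; the polynomial basis here is one common choice, and the specific basis used later in the lesson may differ), a scalar input can be mapped to a higher-dimensional feature vector of its powers:

```python
import numpy as np

def polynomial_basis(x, degree):
    """Map a scalar input x to the feature vector [1, x, x^2, ..., x^degree]."""
    return np.array([x ** d for d in range(degree + 1)])

# A single scalar input becomes a 4-dimensional feature vector,
# so a linear model in this space can fit a cubic curve in x.
phi = polynomial_basis(2.0, degree=3)
print(phi)  # [1. 2. 4. 8.]
```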
Single target
The input features are vectors $\mathbf{x} \in \mathbb{R}^D$, where each data point has $D$ distinct, real-valued features (e.g., size, age). The target variable $y \in \mathbb{R}$ is a single, continuous, real-valued number (e.g., house price) that the model aims to predict, defining this as a single-target regression problem. The model is a generalized linear model (GLM). It achieves non-linear modeling by first applying a basis function (the mapping $\phi$) to transform the input features, and then making the prediction via a linear dot product with the learned parameters $\mathbf{w}$: $f(\mathbf{x}) = \mathbf{w}^\top \phi(\mathbf{x})$.
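The prediction step can be sketched as follows (a minimal illustration assuming a hypothetical cubic polynomial basis and made-up parameter values; note the model is linear in the parameters but non-linear in the input):

```python
import numpy as np

def polynomial_basis(x, degree=3):
    # Hypothetical basis for illustration: [1, x, x^2, x^3].
    return np.array([x ** d for d in range(degree + 1)])

def predict(w, x):
    """GLM prediction f(x) = w^T phi(x): a dot product in feature space."""
    return w @ polynomial_basis(x)

w = np.array([0.5, 1.0, -2.0, 0.1])  # illustrative parameter values
y_hat = predict(w, 2.0)
print(y_hat)  # 0.5*1 + 1.0*2 - 2.0*4 + 0.1*8 ≈ -4.7
```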
Try this quiz to review what you’ve learned so far.
In the context of the function $f(\mathbf{x}) = \mathbf{w}^\top \phi(\mathbf{x})$, if …, and …, then what is …?
The function $f(\mathbf{x}) = \mathbf{w}^\top \phi(\mathbf{x})$ successfully defines the structure of our generalized linear model (GLM) for any given input $\mathbf{x}$. However, this model structure is useless until we determine the ideal values for the parameter vector $\mathbf{w}$. These parameters must be chosen so that the model’s predictions best match the true target values in our training dataset $\mathcal{D}$.
To quantify how well a given set of parameters performs, we use a loss function $L(\mathbf{w})$. This function measures the total error between the model’s predictions and the actual observed values across all data points. The $\mathbf{w}$ that provides the best fit is therefore the $\mathbf{w}$ that minimizes this loss.
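The (unregularized) squared-error loss can be computed directly once the inputs have been stacked into a design matrix whose rows are the basis-transformed data points (a minimal sketch with a toy hand-made design matrix):

```python
import numpy as np

def squared_loss(w, Phi, y):
    """Sum of squared errors between predictions Phi @ w and targets y."""
    residuals = y - Phi @ w
    return float(residuals @ residuals)

# Toy design matrix: each row is phi(x_n) for one data point.
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0]])
y = np.array([0.0, 1.0, 2.0])

print(squared_loss(np.array([0.0, 1.0]), Phi, y))  # 0.0 — a perfect fit
print(squared_loss(np.array([0.0, 0.0]), Phi, y))  # 5.0
```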
The optimal parameters $\mathbf{w}^*$ can be determined by minimizing a regularized squared loss as follows:

$$\mathbf{w}^* = \arg\min_{\mathbf{w}} \sum_{n=1}^{N} \big(y_n - \mathbf{w}^\top \phi(\mathbf{x}_n)\big)^2 + \lambda \|\mathbf{w}\|_2^2$$
Here, $\sum_{n=1}^{N} \big(y_n - \mathbf{w}^\top \phi(\mathbf{x}_n)\big)^2$ is the squared error (or data loss) term, and $\lambda \|\mathbf{w}\|_2^2$ is the L2 regularization term. Their sum, ...
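Minimizing this regularized squared loss has the well-known closed-form ridge-regression solution $\mathbf{w}^* = (\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top \mathbf{y}$, where $\Phi$ stacks the basis-transformed inputs as rows. A sketch of this solution, checked against scikit-learn's `Ridge` on random toy data (the data and $\lambda$ value are made up; `fit_intercept=False` makes sklearn minimize exactly the objective above, with `alpha` playing the role of $\lambda$):

```python
import numpy as np
from sklearn.linear_model import Ridge

def ridge_closed_form(Phi, y, lam):
    """Solve w* = (Phi^T Phi + lam * I)^{-1} Phi^T y via a linear solve."""
    M = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M), Phi.T @ y)

rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 4))   # toy basis-transformed design matrix
y = rng.normal(size=50)          # toy targets
lam = 0.5

w_closed = ridge_closed_form(Phi, y, lam)
w_sklearn = Ridge(alpha=lam, fit_intercept=False).fit(Phi, y).coef_
print(np.allclose(w_closed, w_sklearn))  # True
```

Using `np.linalg.solve` rather than explicitly inverting the matrix is the standard numerically safer choice.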