
Supervised Learning: Classification

Explore the fundamentals of supervised learning focused on classification problems. Understand how logistic regression works as a linear classifier and apply it using Python's sklearn library. Learn to interpret model performance with confusion matrices and handle multiclass problems with one-vs-rest and one-vs-one approaches.

A common supervised learning problem is classification, applicable to data with discrete output labels. Examples of classification problems include:

  • Spam email detection

  • Face recognition

  • Plant species prediction

  • Human action recognition

In all of the examples above, there is a small set of classes, or discrete labels, to predict.

The following figure shows a binary classification problem with two classes:

  • The first class is represented by green circle points.

  • The second class is represented by blue square points.

Data points of both classes have two input features, x and y. The pink line represents the decision boundary that separates the two classes. Training is performed on the given data points to find the parameters of an optimal line that separates the classes with minimum error.
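The training described above can be sketched with sklearn's `LogisticRegression`, which fits exactly such a linear decision boundary. This is a minimal sketch: the data points below are invented stand-ins for the figure's two classes, not the actual plotted data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy 2-D points standing in for the figure's two classes
# (coordinates are invented for illustration only)
X = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],   # class 0: "green circles"
              [5.0, 6.0], [5.5, 5.8], [6.0, 6.5]])  # class 1: "blue squares"
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# The learned line w1*x + w2*y + b = 0 is the decision boundary
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]
print(f"Decision boundary: {w1:.2f}*x + {w2:.2f}*y + {b:.2f} = 0")

# Predict one point from each side of the boundary
print(clf.predict([[2.0, 2.0], [5.5, 6.0]]))
```

With well-separated data like this, the classifier labels the first query point as class 0 and the second as class 1.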

The commonly used classification algorithms include:

  • Logistic regression

  • Naïve Bayes classification

  • Nearest neighbor classification

  • Decision trees (applicable to both classification and regression)

Here we discuss logistic regression only, which is based on the logistic function.

Logistic regression

Despite its name, logistic regression is a classification method. It is a linear classifier closely related to linear regression. Logistic regression uses the sigmoid (logistic) function, given as:

σ(z) = 1 / (1 + e^(-z))

Here, z = Θ^T X ...
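The sigmoid function maps any real-valued z into the interval (0, 1), which is why its output can be read as a class probability. A quick sketch:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5: the point where z = Theta^T X = 0, i.e. the decision boundary
print(sigmoid(10.0))   # close to 1: confidently the positive class
print(sigmoid(-10.0))  # close to 0: confidently the negative class
```

Thresholding the output at 0.5 (equivalently, checking the sign of z) yields the predicted class label.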