Hard-Margin SVM

Learn to identify and implement hard-margin SVM models that find the maximum margin hyperplane for perfectly separable datasets. This lesson covers the mathematical formulation, the role of support vectors, convex optimization using cvxpy, and visualization of decision boundaries, preparing you for advanced SVM concepts.

In this lesson, we explore hard-margin SVM, a type of support vector machine designed to perfectly separate two classes with a maximum-margin hyperplane. Hard-margin SVM is ideal for datasets that are linearly separable and free of outliers. We'll start by understanding the concept of a linearly separable dataset and the idea of maximizing the margin. Then, we'll carefully derive the equivalent optimization problem, discuss the critical role of support vectors, and implement hard-margin SVM using cvxpy to visualize the decision boundary and the support vectors.

Linearly separable case

Given a binary classification dataset $D=\{(\bold x_1, y_1), (\bold x_2, y_2), \dots, (\bold x_n, y_n)\}$, where $\bold x_i \in \R^d$ and $y_i \in \{-1, 1\}$, if a hyperplane $\bold w^T\phi(\bold x)=0$ exists that separates the two classes, the dataset is said to be linearly separable in the feature space defined by the mapping $\phi$. We'll assume for now that the dataset $D$ is linearly separable. The goal is to find the hyperplane with the maximum margin by optimizing the following objective:

$$\max_{\bold w}\bigg[\frac{1}{\|\bold w\|}\min_{i}\big(y_i\bold w^T\phi(\bold x_i)\big)\bigg]$$
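
As a quick illustration of the separability condition defined above, here is a minimal sketch that checks whether a candidate $\bold w$ separates the two classes, i.e. whether $y_i\bold w^T\phi(\bold x_i) > 0$ for every $i$. The toy data, the choice of $\phi$ (identity features with an appended constant to absorb the bias), and the candidate $\bold w$ are all made up for illustration and are not part of the lesson's dataset:

```python
import numpy as np

# Made-up 2-D toy dataset with labels in {-1, +1} (purely illustrative).
X = np.array([[2.0, 2.0], [3.0, 3.5], [3.5, 2.5],
              [-1.0, -1.5], [-2.0, -1.0], [-1.5, -2.5]])
y = np.array([1, 1, 1, -1, -1, -1])

# Assume phi(x) appends a constant 1 so the bias is absorbed into w:
# phi(x) = [x_1, x_2, 1], and the hyperplane is w^T phi(x) = 0.
Phi = np.hstack([X, np.ones((len(X), 1))])
w = np.array([1.0, 1.0, 0.0])  # a guessed separating direction

# The dataset is linearly separable by this w if y_i * w^T phi(x_i) > 0 for all i.
print(np.all(y * (Phi @ w) > 0))  # True: this w separates the two classes
```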

The direct optimization of the above objective is complex. Here is the derivation of the equivalent, simplified optimization problem:

$$
\begin{aligned}
\max_{\bold w}\bigg[\frac{1}{\|\bold w\|}\min_{i}\big(y_i\bold w^T\phi(\bold x_i)\big)\bigg]
&=\max_{\bold w}\frac{1}{\|\bold w\|}\Big[\min_{i}\big(y_i\bold w^T\phi(\bold x_i)\big)\Big]\\
&=\max_{\bold w,\gamma}\frac{\gamma}{\|\bold w\|} &&\text{s.t. } y_i\bold w^T\phi(\bold x_i)\ge\gamma \quad \forall i\\
&=\max_{\bold w,\gamma}\frac{1}{\|\bold w/\gamma\|} &&\text{s.t. } y_i(\bold w/\gamma)^T\phi(\bold x_i)\ge 1 \quad \forall i\\
&=\max_{\bold w}\frac{1}{\|\bold w\|} &&\text{s.t. } y_i\bold w^T\phi(\bold x_i)\ge 1 \quad \forall i\\
&=\min_{\bold w}\|\bold w\| &&\text{s.t. } y_i\bold w^T\phi(\bold x_i)\ge 1 \quad \forall i
\end{aligned}
$$
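
To make the final problem concrete, here is a minimal cvxpy sketch of $\min_{\bold w}\|\bold w\|$ subject to $y_i\bold w^T\phi(\bold x_i)\ge 1$, reusing the made-up toy data from the earlier snippet. It minimizes $\|\bold w\|^2$ rather than $\|\bold w\|$, which is equivalent because squaring is monotone on nonnegative values; the feature map and the tolerance used to flag support vectors are illustrative assumptions, not the lesson's exact setup:

```python
import numpy as np
import cvxpy as cp

# Same made-up toy data as above (illustrative only).
X = np.array([[2.0, 2.0], [3.0, 3.5], [3.5, 2.5],
              [-1.0, -1.5], [-2.0, -1.0], [-1.5, -2.5]])
y = np.array([1, 1, 1, -1, -1, -1])
n, d = X.shape

# phi(x) = [x, 1]: the bias is absorbed into w, matching w^T phi(x) = 0.
Phi = np.hstack([X, np.ones((n, 1))])

# Hard-margin SVM: minimize ||w||^2 subject to y_i * w^T phi(x_i) >= 1 for all i.
w = cp.Variable(d + 1)
objective = cp.Minimize(cp.sum_squares(w))
constraints = [cp.multiply(y, Phi @ w) >= 1]
problem = cp.Problem(objective, constraints)
problem.solve()

print("Optimal w:", w.value)

# Support vectors are the points whose margin constraint is (numerically) active,
# i.e. y_i * w^T phi(x_i) is very close to 1.
margins = y * (Phi @ w.value)
support_idx = np.where(np.isclose(margins, 1.0, atol=1e-4))[0]
print("Support vector indices:", support_idx)
```

The points flagged this way lie exactly on the margin and are the support vectors discussed in the rest of the lesson; all other points satisfy their constraints with slack and do not affect the optimal hyperplane.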