Linear Separability
Understand the concept of linear separability and learn how to change the feature space to make data linearly separable.
In this lesson, we explore the concept of linear separability, which is key to understanding when a hard-margin SVM can perfectly classify data. A dataset is linearly separable if a straight hyperplane can divide the classes without errors. However, many real-world datasets are not linearly separable in their original feature space. We’ll see how this affects the feasibility of the SVM optimization problem and learn how transforming the feature space using techniques such as polynomial feature expansion can make non-linearly separable data linearly separable. This sets the foundation for understanding the kernel trick and more advanced SVM techniques.
Linear separability
If the data isn’t linearly separable, then for every possible choice of the weight vector $\mathbf{w}$ and bias $b$, at least one point $(\mathbf{x}_i, y_i)$ exists in the dataset for which the constraint $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \geq 1$ is not satisfied. In this case, the optimization problem is infeasible, as shown in the following example:
Suppose a medical researcher is working on a project to classify patient data as either “healthy” or “diseased.” They have collected data on various health parameters, such as age, blood pressure, and cholesterol levels, for a large number of patients. Their goal is to develop a model that can accurately classify new patients as either healthy or diseased based on these features. When we plot the data against two of these features, healthy and diseased patients form clusters in different areas of the graph. No straight line can separate the two groups completely, indicating that the data isn’t linearly separable, as shown in the figure. Some of the healthy data points violate the constraints and make the optimization problem infeasible.
In cvxpy, we can check for infeasibility in several ways. The following code tests whether a given dataset is linearly separable:
Here is the explanation for the code above:
- Lines 29–32: The highlighted code checks the feasibility of the optimization problem solved in the code above. If the optimal value of `w` is `None`, the optimization problem is infeasible; that is, the data points are not linearly separable in the feature space defined by `phi`. This means we couldn’t find a weight vector that classifies all the data points correctly, so we can’t draw a separating hyperplane in this feature space. Otherwise, if the optimal value of `w` is not `None`, the optimization problem is feasible, and the data points are linearly separable in the feature space defined by `phi`.
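The same feasibility test can be phrased as a linear program: find any $\mathbf{w}, b$ satisfying $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \geq 1$ for all $i$. As a minimal stand-alone sketch (using `scipy.optimize.linprog` as a stand-in solver, and a hypothetical helper name `is_linearly_separable` of our own):

```python
import numpy as np
from scipy.optimize import linprog

def is_linearly_separable(X, y):
    """Hard-margin feasibility check: does some (w, b) satisfy
    y_i * (w @ x_i + b) >= 1 for every point?  Feasible -> separable."""
    n, d = X.shape
    # Decision variables: [w (d entries), b]; constant objective (pure feasibility).
    c = np.zeros(d + 1)
    # y_i * (w @ x_i + b) >= 1  <=>  -y_i * [x_i, 1] @ [w, b] <= -1
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1), method="highs")
    return res.status == 0  # status 0 = feasible, 2 = infeasible

# Two well-separated clusters: a hyperplane exists
X_sep = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
y_sep = np.array([-1, -1, 1, 1])

# XOR-like data: famously not linearly separable
X_xor = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y_xor = np.array([-1, -1, 1, 1])

print(is_linearly_separable(X_sep, y_sep))  # True
print(is_linearly_separable(X_xor, y_xor))  # False
```

As with the cvxpy version, an infeasible solve tells us no separating hyperplane exists in the current feature space.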
Changing the feature space
If the data isn’t linearly separable in a feature space defined by a mapping $\phi$, we can try another feature space in which the data becomes linearly separable. For example, consider the following transformation of the features:
...
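To make this concrete, here is a small illustration with a hypothetical polynomial map $\phi(x_1, x_2) = (x_1, x_2, x_1 x_2)$ (our own choice for this sketch; the lesson's transformation may differ). The XOR pattern is not linearly separable in 2-D, but adding the product feature makes it separable in 3-D:

```python
import numpy as np

# XOR pattern: not linearly separable in the original 2-D space
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1, -1, 1, 1])

def phi(x):
    # Hypothetical polynomial expansion: append the product feature x1 * x2
    return np.array([x[0], x[1], x[0] * x[1]])

X3 = np.array([phi(x) for x in X])

# In the expanded space, w = (1, 1, -2), b = -0.5 separates the classes:
w, b = np.array([1.0, 1.0, -2.0]), -0.5
margins = y * (X3 @ w + b)
print(margins)  # every entry is positive, so the hyperplane separates the data
```

The product term $x_1 x_2$ captures the interaction between the two features that no linear combination of $x_1$ and $x_2$ alone can express, which is exactly why the expanded space admits a separating hyperplane.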