SVM Implementation Steps: 8 and 9

This lesson will go over steps 8-9 of implementing support vector machines.

We'll cover the following

8) Grid search

You can improve the accuracy of our model using a technique called grid search to help us find the optimal hyperparameters for this algorithm. While many hyperparameters belong to SVCSupport Vector Classifier, we will focus on C and gamma, which generally have the biggest impact on prediction accuracy when using this algorithm.

The hyperparameter C controls the cost of misclassification on the training data. In other words, C regulates the extent to which misclassified cases (placed on the wrong side of the margin) are ignored.

This flexibility in the model is referred to as a “soft margin,” and ignoring cases that cross over the soft margin can lead to a better fit. In practice, the lower C is, the more errors the soft margin is permitted to ignore. A C value of ‘0’ enforces no penalty on misclassified cases.

Gamma refers to the Gaussian radial basis function and the influence of the support vector. In general, a small gamma produces models with high bias and low variance. Conversely, a large gamma leads to low bias and high variance in the model.

Grid search allows us to list a range of values to test for each hyperparameter. An automated voting process then takes place to determine the optimal value for each hyperparameter.

Note: Grid search must examine each combination of hyperparameters. It can take a long time to run, particularly as you add more values for testing. In this exercise, you will test three values for each hyperparameter.

You should use the following code in a new cell within the same notebook.

Begin by stating the hyperparameters you wish to test.

hyperparameters = {'C':[10,25,50],'gamma':[0.001,0.0001,0.00001]}

Link your specified hyperparameters to GridSearchCV and the SVC algorithm under a new variable name.

grid = GridSearchCV(SVC(),hyperparameters)

Next, fit grid search to the X and y training data.

grid.fit(X_train, y_train)

You can now use the grid.best_params_ function to review the optimal combination of hyperparameters. This may take 30 seconds or longer to run on your machine.

grid.best_params_

Get hands-on with 1200+ tech skills courses.