Validation of Support Vector Machines (SVMs)

Learn about learning theory, VC dimension, and support vector regression.

Statistical learning theory and VC dimension

SVMs are practical and effective classification algorithms for several reasons. In particular, they are formulated as a convex optimization problem with many good theoretical properties, which can be solved with quadratic programming, and their formulation takes advantage of the kernel trick.
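The following minimal sketch (assuming scikit-learn is available; the toy dataset and hyperparameter values are purely illustrative) shows these ingredients together: fitting a kernelized SVM hands the convex quadratic program to the library's solver, the kernel trick enters through the choice of kernel, and the fitted model is represented compactly by its support vectors.

```python
# A minimal sketch, assuming scikit-learn is installed; dataset and
# hyperparameter values are illustrative, not prescriptive.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# A small, non-linearly separable toy problem.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# The solver internally sets up and solves the convex quadratic program;
# the kernel trick enters via the 'rbf' kernel choice.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)

# The decision function is represented compactly by the support vectors alone.
print("training points  :", len(X))
print("support vectors  :", len(clf.support_vectors_))
print("training accuracy:", clf.score(X, y))
```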

They have a compact representation of the decision hyperplane in terms of support vectors and turn out to be fairly robust with respect to the hyperparameters. However, in order to act as good learners, they need to moderate the overfitting problem discussed earlier. A great theoretical contribution of Vapnik and his colleagues was to embed supervised learning in statistical learning theory and to derive bounds that make statements about the average ability to learn from data. We briefly outline the ideas and state some of the results without going into too much detail, and we discuss the issue entirely in the context of binary classification.
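As an illustration of the type of bound involved (stated here without proof, in one commonly quoted form), consider a hypothesis class with VC dimension $h$ and a training set of $N$ examples. With probability at least $1-\delta$, the true risk $R(f)$ of a classifier $f$ is bounded by its empirical (training) risk $R_{\mathrm{emp}}(f)$ plus a complexity term:

$$
R(f) \;\le\; R_{\mathrm{emp}}(f) + \sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) + \ln\frac{4}{\delta}}{N}}.
$$

The smaller the VC dimension relative to the number of training examples, the tighter the bound, which is the formal counterpart of the intuition that limiting model capacity moderates overfitting.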

However, similar observations can also be made for multiclass classification and regression. This lesson uses language from probability theory that is only introduced in more detail later, so it might be best revisited at a later stage. Again, the main reason for placing it here is to outline the deeper reasoning behind specific models.
