Overview

Learn about the scikit-learn library.

About the scikit-learn library

The scikits (short for SciPy toolkits) are a family of open-source libraries that build on NumPy and SciPy to provide more domain-specific support. In this chapter, we briefly introduce the scikit-learn library, or sklearn for short. It started as a Google Summer of Code project by David Cournapeau and has grown into an open-source library that provides a wide variety of well-established machine learning algorithms. These algorithms, together with excellent documentation, are available at scikit-learn.org.

Objective

The goal of this chapter is to show how to apply machine learning algorithms in a general setting using some classic methods. In particular, we will work with the following three important machine learning algorithms:

  • Support vector classifier (SVC)
  • Random forest classifier (RFC)
  • Multilayer perceptron (MLP)
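
As a rough preview of what applying these three algorithms can look like, the sketch below trains each classifier on the small iris dataset that ships with sklearn and reports its accuracy on held-out data. The dataset, the 70/30 split, and every hyperparameter value (such as `n_estimators=100` and `hidden_layer_sizes=(50,)`) are assumptions chosen for illustration, not settings prescribed by this chapter.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Illustrative dataset (an assumption): 150 iris flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing; stratify to keep class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The three classifiers named above; hyperparameters are illustrative guesses.
models = {
    "SVC": SVC(),
    "RFC": RandomForestClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # learn from the training split
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

All three models share the same `fit` and `score` interface, which is why a single loop can train and evaluate them.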

While many of the methods studied later in this course go beyond these now-classic methods, that does not make them obsolete. Quite the contrary: many applications have only limited amounts of data, and more data-hungry techniques, such as deep learning, might not work well there. The algorithms discussed here also serve as baselines against which to discuss advanced methods like probabilistic reasoning and deep learning.

Our aim here is to demonstrate that applying machine learning methods with a library such as sklearn is not very difficult. It also gives us an opportunity to discuss evaluation techniques, which are very important in practice.
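
One common evaluation technique is k-fold cross-validation, which repeatedly holds out a different part of the data for testing. The following is a minimal sketch, assuming the iris dataset, a default `SVC`, and five folds; none of these choices come from this chapter, and the chapter itself may use a different evaluation procedure.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Illustrative dataset (an assumption), as a stand-in for real application data.
X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: train on four folds, test on the remaining one,
# and repeat so that each fold serves as the test set exactly once.
cv_scores = cross_val_score(SVC(), X, y, cv=5)

print("accuracy per fold:", cv_scores)
print(f"mean = {cv_scores.mean():.3f}, std = {cv_scores.std():.3f}")
```

Reporting the spread across folds alongside the mean gives a sense of how much the estimate depends on the particular split.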
