If you have stumbled upon this course, chances are you’ve heard of Deep Learning, which is all the rage these days. However, Deep Learning is only a subset of the field of Machine Learning. There are many traditional algorithms used in different fields, such as finance, education, medicine, and basic science. Deep Learning is not a silver bullet; many scenarios and tasks still require traditional Machine Learning.

Why do we still need traditional Machine Learning?

People may wonder why we need traditional Machine Learning when Deep Learning is so good. The simple answer is that Deep Learning has its limits.

  • Deep Learning is largely a black box, which means it is hard to explain why a model produces a given output for a given input. Some traditional Machine Learning models, such as linear regression and decision trees, are interpretable. Interpretability is important in areas where decisions must be explained.
  • Deep Learning generally requires a lot of data. However, in the real world, only a small amount of data is available for some tasks. Traditional Machine Learning often copes well with small datasets.
  • Some problems are relatively simple, and the benefits brought by Deep Learning are not significant. Simple solutions can be just as effective with less effort.
  • Some problems are a poor fit for Deep Learning altogether, and traditional Machine Learning remains the practical choice for them.
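
The interpretability and small-data points above can be illustrated with a minimal sketch. It fits a linear regression on 30 synthetic samples; the data and the "true" coefficients (3, -2, and an intercept of 1) are made up purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical, noise-free data: y = 3*x0 - 2*x1 + 1
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0

# Only 30 samples are used, far fewer than a deep network would typically need
model = LinearRegression().fit(X, y)

# Each coefficient states how much the prediction changes per unit of a feature,
# which is what makes the model easy to interpret
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
```

Because the data here has no noise, the fitted coefficients should match the ones used to generate it; with real data they would only be estimates.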

Scikit-learn is just what we need.

Scikit-learn (formerly scikits.learn and also known as sklearn) is a free, open-source Machine Learning library for the Python programming language. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN. In addition, sklearn provides tools for feature extraction, feature selection, model evaluation, model selection, parameter tuning, dimensionality reduction, and so on. These are essential for a complete Machine Learning project.
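
To give a first impression of what using the library looks like, here is a minimal sketch that trains one of the algorithms mentioned above, a random forest, on the Iris dataset bundled with sklearn. The dataset, the split size, and the parameter values are illustrative choices, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset as feature matrix X and label vector y
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to evaluate the model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Most sklearn estimators share the same interface: fit, then predict
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)
print(f"test accuracy: {acc:.2f}")
```

Swapping `RandomForestClassifier` for another classifier usually requires changing only the import and the constructor line, since the `fit` and `predict` calls stay the same.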

In this course, we will not cover the theory or mathematics behind any of these algorithms. Some background knowledge will be introduced in each lesson, but the main purpose is to help you use this library to solve real problems.

Why do we choose sklearn?

  • It’s easy to learn and use.
  • It covers most aspects of a Machine Learning project, and even includes simple neural networks.
  • It’s very versatile and powerful.
  • It is open source, with detailed documentation and an active community.
  • It is one of the most widely used Machine Learning toolkits.

What will you learn from this course?

  • Some models for supervised learning, such as Logistic Regression, SVM, and Naive Bayes
  • Tree-based models such as decision trees and gradient boosted decision trees (GBDT)
  • Clustering methods such as k-means
  • Feature engineering such as feature selection, feature extraction, and dimensionality reduction
  • Data preprocessing
  • Model evaluation
  • Hyperparameter searching
  • Simple neural networks
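
As a preview of how several of these topics fit together, the sketch below chains data preprocessing, a Logistic Regression model, hyperparameter searching, and model evaluation. The dataset (breast cancer, bundled with sklearn), the step names `scale` and `clf`, and the candidate values for `C` are assumptions chosen for illustration only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Preprocessing and model are combined so both are fitted on training folds only
pipe = Pipeline([
    ("scale", StandardScaler()),                 # data preprocessing
    ("clf", LogisticRegression(max_iter=1000)),  # supervised model
])

# Hyperparameter search: try several regularization strengths with 5-fold CV
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

# Model evaluation on data the search never saw
test_score = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"test accuracy: {test_score:.2f}")
```

Putting the scaler inside the pipeline is a deliberate choice here: it keeps the cross-validation folds from leaking statistics of the validation data into the preprocessing step.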