Feature engineering vs. feature selection
Feature engineering and feature selection are both key to building an efficient and reliable model, yet many people confuse the two concepts. The goal of this lesson is to understand the difference between them.
The difference
Feature selection and feature engineering are easy to mix up for beginners in machine learning:
- Feature engineering is the process of creating new features from the ones we already have in our dataset to help the machine learning model make more effective and accurate predictions.
- Feature selection is the process of choosing features from the feature pool (including any newly engineered features) that will help the machine learning model predict the target variable more efficiently.
In a typical ML pipeline, feature selection is applied after feature engineering is complete.
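To make the order of the two steps concrete, here is a minimal sketch in plain Python. The toy housing data, the column names, and the helper functions `engineer_features` and `select_features` are all hypothetical and exist only for this illustration. The selection step uses a simple correlation filter, which is just one of many possible selection methods and is not necessarily the one used later in this course.

```python
from math import sqrt

# Toy dataset (made-up values): each row is a house.
# The price is constructed to depend on length * width.
rows = [
    {"length": 10.0, "width": 5.0, "door_count": 3.0, "price": 100.0},
    {"length": 8.0, "width": 10.0, "door_count": 1.0, "price": 160.0},
    {"length": 12.0, "width": 4.0, "door_count": 2.0, "price": 96.0},
    {"length": 6.0, "width": 6.0, "door_count": 3.0, "price": 72.0},
    {"length": 9.0, "width": 9.0, "door_count": 1.0, "price": 162.0},
    {"length": 5.0, "width": 12.0, "door_count": 2.0, "price": 120.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def engineer_features(rows):
    """Feature engineering: ADD a column derived from existing ones."""
    return [{**row, "area": row["length"] * row["width"]} for row in rows]


def select_features(rows, target, k):
    """Feature selection: KEEP the k columns most correlated with the target."""
    ys = [row[target] for row in rows]
    candidates = [name for name in rows[0] if name != target]
    scores = {
        name: abs(pearson([row[name] for row in rows], ys))
        for name in candidates
    }
    return sorted(candidates, key=lambda name: scores[name], reverse=True)[:k]


# Step 1: engineering widens the dataset (3 features become 4).
engineered = engineer_features(rows)

# Step 2: selection narrows it again, choosing from the enlarged pool.
selected = select_features(engineered, target="price", k=2)
reduced = [
    {name: row[name] for name in selected + ["price"]} for row in engineered
]

print("After engineering:", list(engineered[0]))
print("After selection:  ", list(reduced[0]))
```

Engineering increases the number of columns, while selection reduces it. Because selection runs second, the engineered `area` column competes with the original columns and can be kept in their place.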
Here is an illustration depicting the difference between the process of feature engineering and feature selection in terms of change to the dataset itself:
Prerequisites
Learners taking this course should have some familiarity with machine learning. We will use Python as the programming language, along with data science libraries like:
You do not have to install any libraries if you use the code playground on Educative. If you prefer to work locally, however, you will need to prepare your workspace. I suggest using Anaconda, since it comes pre-equipped with Python and all of the libraries mentioned above.