Feature selection is the process of keeping the most salient variables from the raw data and removing the rest from the analysis. So, how can we determine which features to keep and which to throw out? Existing research offers some guidance on this question.

Methods for feature selection

There are two main approaches to determining which features are important. The first uses information gain, a measure based on entropy, a concept from information theory that we’ll define in more detail later. The second, also widely used, treats feature selection as a search problem, in which we search for the subset of features that best models the data. Greedy search algorithms are a good fit for this kind of search.
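To make the first approach concrete, here is a minimal sketch of information gain for a categorical feature, assuming the usual Shannon entropy measured in bits. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after grouping rows by a categorical feature."""
    total = len(labels)
    # Group the labels by the feature value that appears on the same row.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one feature separates the classes, the other does not.
labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]
uninformative = ["x", "y", "x", "y"]

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
print(gain_informative, gain_uninformative)
```

A feature with a higher gain tells us more about the label, so ranking features by gain is one simple way to decide which ones to keep.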

Greedy search algorithms use heuristics to find a solution that is locally optimal. Locally optimal means that the solution is the best among the neighboring solutions tried, though it may or may not be the best possible solution (that is, the global optimum). For feature selection, the greedy algorithm adds or removes one feature at a time, then evaluates the resulting feature set to determine whether it improves on the previous one.
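The add-then-assess loop described above can be sketched as a greedy forward selection. This is one possible variant, not a definitive implementation: it only adds features and stops when no single addition improves the score. The `toy_score` function and its `USEFULNESS` values are made up for the example; in practice the score would typically come from evaluating a model trained on the candidate subset.

```python
def greedy_forward_selection(features, score):
    """Add one feature at a time, keeping the addition that most improves
    the score. Stops when no single addition beats the current subset."""
    selected = []
    best_score = score(selected)
    remaining = list(features)
    while remaining:
        # Score every subset reachable by adding exactly one more feature.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates)
        if top_score <= best_score:
            break  # no neighbor is better: a local optimum has been reached
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical usefulness values, invented purely for illustration.
USEFULNESS = {"age": 0.30, "income": 0.25, "zip_code": 0.02, "favorite_color": 0.0}


def toy_score(subset):
    """Stand-in for a model evaluation: reward useful features and
    charge a small penalty for every feature included."""
    return sum(USEFULNESS[f] for f in subset) - 0.05 * len(subset)


selected, best = greedy_forward_selection(list(USEFULNESS), toy_score)
print(selected, round(best, 2))
```

Because this toy score is a simple sum, the greedy result here happens to be the best subset overall. With a real model, features interact, and the subset found this way may be only locally optimal.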
