# Univariate Feature Selection

Learn about univariate feature selection, a technique of testing features one by one against the response variable.

## What it does and doesn’t do

In this chapter, we have learned techniques for going through features one by one to see whether they have predictive power. This is a good first step, and if you already have features that are very predictive of the outcome variable, you may not need to spend much more time considering features before modeling. However, there are drawbacks to univariate feature selection. In particular, it does not consider the interactions between features. For example, what if the credit default rate is very high specifically for people with both a certain education level and a certain range of credit limit?

Also, with the methods we used here, only the linear effects of features are captured. If a feature is more predictive when it’s undergone some type of transformation, such as a polynomial or logarithmic transformation, or binning (discretization), linear techniques of univariate feature selection may not be effective. Interactions and transformations are examples of feature engineering, or creating new features, in these cases from existing features. The shortcomings of linear feature selection methods can be remedied by non-linear modeling techniques including decision trees and methods based on them, which we will examine later. But there is still value in looking for simple relationships that can be found by linear methods for univariate feature selection, and it is quick to do.
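As a small illustration of this limitation, here is a sketch on synthetic data (the variable names and numbers are made up for demonstration, not taken from the case study): a feature that drives the response only through its square shows almost no linear correlation, while the squared, engineered version of the same feature correlates strongly.

```
import numpy as np

rng = np.random.default_rng(seed=0)

# A symmetric non-linear relationship: y depends on x only through x**2
x = rng.uniform(-1, 1, size=1000)
y = x ** 2 + rng.normal(scale=0.05, size=1000)

# Linear (Pearson) correlation with the raw feature is near zero...
corr_raw = np.corrcoef(x, y)[0, 1]

# ...but after a simple transformation (squaring), it is strong
corr_transformed = np.corrcoef(x ** 2, y)[0, 1]

print(corr_raw)          # close to 0
print(corr_transformed)  # close to 1
```

A univariate linear screen would discard `x` here, even though an engineered version of it is highly predictive.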

## Understanding logistic regression and the sigmoid function

In this section, we will open the “black box” of logistic regression all the way: we will gain a comprehensive understanding of how it works. We’ll start off by introducing a new programming concept: functions. At the same time, we’ll learn about a mathematical function, the sigmoid function, which plays a key role in logistic regression.

### Python functions

In the most basic sense, a **function** in computer programming is a piece of code that takes inputs and produces outputs. You have been using functions throughout the course: functions that were written by someone else. Any time that you use syntax such as `output = do_something_to(input)`, you have used a function. For example, NumPy has a function you can use to calculate the mean of the input:

```
np.mean([1, 2, 3, 4, 5])
```

```
# 3.0
```

Functions *abstract* away the operations being performed so that, in our example, you don’t need to see all the lines of code that it takes to calculate a mean, every time you need to do this. For many common mathematical functions, there are already pre-defined versions available in packages such as NumPy. You do not need to “reinvent the wheel.” The implementations in popular packages are likely popular for a reason: people have spent time thinking about how to create them in the most efficient way. So, it would be wise to use them. However, because all the packages we are using are *open source*, if you are interested in seeing how the functions in the libraries we use are implemented, you are able to look at the code within any of them.

Now, for the sake of illustration, let’s learn Python function syntax by writing our own function for the arithmetic mean. Function syntax in Python is similar to `for` or `if` blocks, in that the body of a function is indented and the declaration of the function is followed by a colon. Here is the code for a function to compute the mean:

```
def my_mean(input_argument):
    output = sum(input_argument)/len(input_argument)
    return output
```

After you execute the code cell with this definition, the function is available to you in other code cells in the notebook. Take the following example:

```
my_mean([1, 2, 3, 4, 5])
```

```
# 3.0
```

The first part of defining a function, as shown here, is to start a line of code with `def`, followed by a space, followed by the name you’d like to call the function. After this come parentheses, inside which the names of the **parameters** of the function are specified. Parameters are the names of the input variables, and these names are internal to the body of the function: the variable names defined as parameters are available within the function when it is *called* (used), but not outside the function. There can be more than one parameter; they would be comma-separated. After the parentheses comes a colon.

The body of the function is indented and can contain any code that operates on the inputs. Once these operations are done, the last line should start with `return` and contain the output variable(s), comma-separated if there is more than one. We are leaving out many fine points in this very simple introduction to functions, but those are the essential parts you need to get started.

The power of a function comes when you use it. Notice how after we define the function, in a separate code block we can *call* it by the name we’ve given it, and it operates on whatever inputs we *pass* it. It’s as if we’ve copied and pasted all the code to this new location. But it looks much nicer than actually doing that. And if you are going to use the same code many times, a function can greatly reduce the overall length of your code.

As a brief additional note, you can optionally specify the inputs using the parameter names explicitly, which can be clearer when there are many inputs:

```
my_mean(input_argument=[1, 2, 3])
```

```
# 2.0
```

### The sigmoid function

Now that we’re familiar with the basics of Python functions, we are going to consider a mathematical function that’s important to logistic regression, called **sigmoid**. This function may also be called the **logistic function**. The definition of sigmoid is as follows:

$f(X) = \mathrm{sigmoid}(X) = \frac{1}{1+e^{-X}}$
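Using the Python function syntax we just learned, we can write our own version of this formula as a sketch (here using NumPy’s `np.exp` for the exponential $e^{-X}$):

```
import numpy as np

def sigmoid(X):
    # 1 / (1 + e^(-X)), computed element-wise thanks to NumPy
    return 1 / (1 + np.exp(-X))

sigmoid(0)    # 0.5, the midpoint of the sigmoid
sigmoid(10)   # close to 1
sigmoid(-10)  # close to 0
```

Because NumPy operations are element-wise, this function also works on arrays of inputs, not just single numbers.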
