Exercise: F-test and Univariate Feature Selection
Understand how to perform univariate feature selection using the F-test to evaluate each feature's predictive power individually. Learn to implement this using scikit-learn's f_classif function and SelectPercentile class. This lesson guides you through extracting features with the strongest relationship to the response variable, enhancing your logistic regression modeling skills.
Univariate feature selection using F-test
In this exercise, we'll use the F-test to examine the relationship between each feature and the response variable. We will use this method to do what is called univariate feature selection: the practice of testing features one by one against the response variable to see which ones have predictive power. Perform the following steps to complete the exercise:
- Our first step in doing the ANOVA F-test is to separate out the features and response as NumPy arrays, taking advantage of the list we created, as well as integer indexing in pandas:

  ```python
  # All columns except the last are features; the last column is the response
  X = df[features_response].iloc[:, :-1].values
  y = df[features_response].iloc[:, -1].values
  print(X.shape, y.shape)
  ```

  The output should show the shapes of the features and response:

  ```
  (26664, 17) (26664,)
  ```

  There are 17 features, and both the features and response arrays have the same number of samples, as expected.
- Import the f_classif function and feed in the features and response:

  ```python
  from sklearn.feature_selection import f_classif

  # f_classif runs an ANOVA F-test between each feature and the response
  [f_stat, f_p_value] = f_classif(X, y)
  ```

  There are two outputs from ...
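The overview above also mentions scikit-learn's SelectPercentile class, which the steps shown here stop short of. As a minimal sketch of how f_classif's two outputs (the per-feature F-statistics and their p-values) are typically used for selection, the code below ranks the features and then keeps only the top-scoring ones. The `percentile=20` threshold is an illustrative assumption, not a value prescribed by this exercise.

```python
import pandas as pd
from sklearn.feature_selection import SelectPercentile, f_classif

# Rank features by the strength of their univariate relationship with the
# response: smaller p-values indicate stronger evidence of a relationship.
f_test_df = pd.DataFrame({
    'Feature': features_response[:-1],  # feature names from the list used above
    'F statistic': f_stat,
    'p value': f_p_value
})
print(f_test_df.sort_values('p value').head())

# SelectPercentile keeps the features whose f_classif scores fall in the
# top percentile; percentile=20 is an illustrative choice (assumption).
selector = SelectPercentile(f_classif, percentile=20)
X_selected = selector.fit_transform(X, y)
print(X_selected.shape)  # same rows as X, but fewer columns

# get_support() returns a boolean mask indicating which features were kept
selected_features = [
    name for name, keep in zip(features_response[:-1], selector.get_support())
    if keep
]
print(selected_features)
```

One design note: ranking by p-value and selecting by percentile of the score are two views of the same univariate test, so inspecting the sorted DataFrame first is a useful sanity check on what SelectPercentile will keep.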