
Model Performance on the Test Set

Explore how to rigorously evaluate machine learning model performance on a reserved test set to estimate future predictive accuracy. Understand the importance of preventing data leakage, learn to calculate the ROC AUC metric on held-out data, and consider methods such as learning curves to decide whether more training data would help. This lesson also guides you in preparing your tested model for client delivery and for monitoring over time.

Rigorous estimate of expected future performance

We already have some idea of the out-of-sample performance of the XGBoost model from the validation set. However, the validation set was used in model fitting via early stopping, so it is not completely independent of the trained model. The most rigorous estimate of expected future performance must come from data that was not used at all during model fitting. This is why a test dataset was reserved from the model-building process.
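
As a minimal sketch of this workflow (not the lesson's exact code, and using synthetic data from make_classification rather than the case study dataset), the following fits an XGBoost classifier with early stopping on a validation set and then scores the untouched test set with ROC AUC. Passing early_stopping_rounds and eval_metric to the constructor assumes a recent xgboost release:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

# Synthetic stand-in for the case study data
X, y = make_classification(n_samples=5000, n_features=20, random_state=1)

# Reserve the test set first, then carve a validation set out of the remainder
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_dev, y_dev, test_size=0.25, random_state=1)

# Early stopping uses the validation set to choose the number of boosting rounds
model = XGBClassifier(n_estimators=1000, eval_metric='auc', early_stopping_rounds=10)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)

# Because the validation set influenced fitting, the test set gives the more
# rigorous estimate of expected future performance
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f'Test ROC AUC: {test_auc:.3f}')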

You may notice that we did examine the test set to some extent already, for example in the chapter "Data Exploration and Cleaning," when assessing data quality and cleaning the data. The gold standard for predictive modeling is to set aside a test set at the very beginning of a project and not examine it at all until the model is finished. This is the easiest way to ensure that no knowledge from the test set "leaks" into the training set during model development. When such leakage happens, it opens up the possibility that the test set no longer provides a truly unbiased estimate of how the model will perform on new, unseen data.
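
One common way leakage creeps in is fitting preprocessing steps (scalers, imputers, encoders) on the full dataset before splitting. As a hedged illustration of the principle rather than the lesson's own workflow, the sketch below sets the test set aside first and keeps every fitted step inside a scikit-learn Pipeline; the StandardScaler and LogisticRegression here are placeholders, not the case study's XGBoost model:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Set the test set aside at the very start of the project
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Every fitted step lives inside the pipeline, so the scaler only ever
# sees training data when the pipeline is fitted
pipe = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression(max_iter=1000)),
])
pipe.fit(X_train, y_train)

# The test set is touched exactly once, at final evaluation
print('Test ROC AUC:', roc_auc_score(y_test, pipe.predict_proba(X_test)[:, 1]))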