Model Debugging and Testing

Explore the process of building and debugging machine learning models in this lesson. Understand how to identify issues like feature distribution changes, overfitting, and feature mismatches. Learn iterative improvement techniques to enhance model performance and debug components in large-scale ML systems, preparing you to maintain and optimize effective machine learning solutions.

We'll cover the following...

Building model v1
Deploying and debugging v1 model
Iterative model improvement
- Missing important feature
- Insufficient training examples
Debugging large scale systems

Let’s go over different phases in the development of a machine learning system, potential issues that we can face, and how to debug and fix them.

We will go over two main phases of model development:

Building the first version of the model and the ML system.
Making iterative improvements on top of the first version, as well as debugging issues in large-scale ML systems.

Building model v1

The goal in this phase is to build the first version of the model. A few important steps in this stage are:

We begin by identifying a business problem and mapping it to a machine learning problem.
We then go on to explore the training data and the machine learning techniques that will work best on this problem.
Then we train the model on the available data and features and experiment with hyperparameters.
Once the model has been set up and we have early offline metrics such as accuracy, precision/recall, and AUC, we continue to experiment with features and training data strategies to improve those offline metrics.
If a heuristic or rule-based system is already in place, the offline model should perform at least as well as the current system. For example, in an ads prediction problem, we would want our ML model’s AUC to be better than that of the current rule-based ads prediction, which is based only on historical engagement rate.
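The comparison against an existing rule-based system can be sketched in a few lines. The example below is a minimal illustration, not a prescribed implementation: the labels, the baseline scores (standing in for a historical engagement rate), and the model scores are all made-up values, and roc_auc is a small helper written for this sketch that computes AUC as the fraction of positive/negative pairs ranked correctly.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out examples: 1 = clicked, 0 = not clicked.
labels = [1, 0, 1, 0, 0, 1, 0, 1]
# Hypothetical rule-based baseline: score each ad by its historical engagement rate.
baseline_scores = [0.30, 0.25, 0.10, 0.20, 0.05, 0.40, 0.35, 0.15]
# Hypothetical scores from the v1 model on the same examples.
model_scores = [0.80, 0.30, 0.45, 0.20, 0.10, 0.90, 0.50, 0.40]

baseline_auc = roc_auc(labels, baseline_scores)
model_auc = roc_auc(labels, model_scores)
print(f"baseline AUC = {baseline_auc:.3f}, model AUC = {model_auc:.3f}")
print("model matches or beats baseline:", model_auc >= baseline_auc)
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; on real evaluation sets a library routine or a rank-based formula would be the more practical choice.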

It’s important to launch version 1 in the real system quickly rather than spending too much time optimizing it. For example, if our AUC is 0.7 and the current system’s AUC is 0.68, it’s generally better to take the model online and then continue iterating to improve its quality. The main reason is that model improvement is an iterative process, and we want validation from real traffic and data in addition to offline validation. We will look at ideas that can help with this iterative development in the following sections.
