Evaluation
Explore effective approaches for evaluating machine learning models in an industrial context. Learn to use validation and test datasets to detect overfitting and measure final model performance. Understand how to apply evaluation metrics such as loss and accuracy to ensure your models generalize well beyond training data.
A. Training vs. evaluation
To measure how well our model has been trained, we evaluate it on datasets other than the training set. The datasets used for evaluation are known as the validation and test sets. Note that we don’t shuffle or repeat the evaluation datasets, since those are techniques used specifically to improve training.
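As a minimal sketch of this idea (the 80/10/10 split ratios and the NumPy-based pipeline here are illustrative assumptions, not part of the text), we can split a dataset once, hold the validation and test sets fixed, and re-shuffle only the training set between epochs:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(100)  # stand-in for 100 examples

# Shuffle once up front so the split itself is random, then freeze the
# validation and test sets: shuffling and repeating are applied only to
# the training set, never to the evaluation sets.
shuffled = rng.permutation(data)
train, val, test = shuffled[:80], shuffled[80:90], shuffled[90:]

def training_batches(train_set, batch_size, rng):
    # Re-shuffle the training set each epoch; evaluation sets stay in order.
    order = rng.permutation(len(train_set))
    for i in range(0, len(train_set), batch_size):
        yield train_set[order[i:i + batch_size]]
```

Keeping the evaluation sets in a fixed order makes evaluation results reproducible from run to run, which is exactly what we need when comparing models.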
The validation set is used to evaluate a model in between training runs. We use the validation set to tweak certain hyperparameters for a model (such as learning rate or batch size) in order to make sure training continues smoothly. We also use the validation set to detect model overfitting, so we can stop training if overfitting is detected.
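One simple way to turn validation results into a stopping rule (this "patience" heuristic is an illustrative sketch, not a method prescribed by the text) is to stop training once the validation loss has failed to improve for several evaluations in a row:

```python
def should_stop(val_losses, patience=3):
    """Illustrative overfitting check: stop when the validation loss has
    not improved on its previous best for `patience` consecutive
    evaluations."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    # If none of the last `patience` losses beat the earlier best,
    # validation performance has plateaued or worsened.
    return min(val_losses[-patience:]) >= best_before

# Validation loss falls, then rises: a typical overfitting signature.
should_stop([0.9, 0.7, 0.6, 0.65, 0.7, 0.75])  # stop
should_stop([0.9, 0.8, 0.7, 0.6, 0.5])         # keep training
```

In practice this kind of check runs after each validation pass, alongside any hyperparameter adjustments made between training runs.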
Overfitting occurs when we train a model (usually a relatively complex model) for too long on ...