Evaluation

Learn how to evaluate a pre-trained model stored in a checkpoint.

A. Training vs. evaluation

To measure how well our model has been trained, we evaluate it on datasets other than the training set. The datasets used for evaluation are known as the validation and test sets. Note that we don’t shuffle or repeat the evaluation datasets, since those are techniques used specifically to improve training.
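As a rough illustration of the point above, the sketch below splits a dataset into training, validation, and test sets, then builds batches that are shuffled and repeated only in training mode. It uses plain Python rather than any particular framework, and the names `split_dataset` and `make_batches`, the 80/10/10 split, and the batch size are all assumptions made for this example.

```python
import random


def split_dataset(examples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Split examples into disjoint training, validation, and test lists."""
    items = list(examples)
    # One-time shuffle so each split is a random sample of the data.
    # This is separate from the per-epoch shuffling used during training.
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * val_fraction)
    n_test = int(len(items) * test_fraction)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test


def make_batches(examples, batch_size, training, num_epochs=1, seed=0):
    """Yield batches; shuffle and repeat only when `training` is True."""
    rng = random.Random(seed)
    # Evaluation always makes exactly one pass over the data.
    epochs = num_epochs if training else 1
    for _ in range(epochs):
        items = list(examples)
        if training:
            rng.shuffle(items)  # new order each epoch, training only
        for i in range(0, len(items), batch_size):
            yield items[i:i + batch_size]


train, val, test = split_dataset(range(100))
train_batches = list(make_batches(train, batch_size=16, training=True, num_epochs=3))
val_batches = list(make_batches(val, batch_size=16, training=False, num_epochs=3))
```

Even though `num_epochs=3` is passed in both calls, the validation data is traversed once and in its original order, so every evaluation sees each example exactly one time.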

The validation set is used to evaluate a model in between training runs. We use the validation set to tune hyperparameters (such as learning rate or batch size) so that training continues smoothly. We also use the validation set to detect overfitting, so that we can stop training once it appears.
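One common way to act on the overfitting signal is early stopping: halt training once the validation loss has stopped improving for a few evaluations in a row. The sketch below is a minimal, framework-free version of that idea. The function name `early_stopping_epoch`, the `patience` parameter, and the loss values are hypothetical and chosen only for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs. If that never happens, the index of
    the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


# Illustrative validation losses: they fall, then start rising,
# which is the usual sign of overfitting.
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop = early_stopping_epoch(losses, patience=2)
```

With these made-up losses, the best value occurs at epoch 2, and the two following epochs are worse, so the function reports epoch 4 as the stopping point.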
