Step 5 - Rinse and Repeat!
Understand the concept of epochs as complete training iterations in gradient descent. Learn how parameter updates vary with batch, mini-batch, and stochastic gradient descent methods. Discover the trade-offs in convergence speed and stability while training models in PyTorch through repeated iterations.
Introduction to epochs
Before we continue, let us explore what exactly an epoch is and when it is considered complete, since we will be using this concept later on.
Definition
The number of epochs is a hyper-parameter that refers to the number of complete passes the training algorithm makes over the entire training set.
An epoch is complete whenever every point in the training set (N) has already been used in all steps: forward pass, computing loss, computing gradients, and updating parameters.
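To make these four steps concrete, here is a minimal sketch of a training loop in PyTorch in which each epoch performs one full pass over the training set. The toy data, model, and hyper-parameters (x_train, y_train, the linear model, the learning rate) are assumptions for illustration only:

```python
import torch

# Hypothetical toy data: N = 100 points from a noisy line (an assumption for illustration).
x_train = torch.randn(100, 1)
y_train = 2 * x_train + 1 + 0.1 * torch.randn(100, 1)

model = torch.nn.Linear(1, 1)                              # simple linear regression model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # plain gradient descent
loss_fn = torch.nn.MSELoss()

n_epochs = 20
for epoch in range(n_epochs):
    yhat = model(x_train)          # forward pass over every point in the training set
    loss = loss_fn(yhat, y_train)  # compute loss
    loss.backward()                # compute gradients
    optimizer.step()               # update parameters
    optimizer.zero_grad()          # reset gradients before the next epoch
```

Because the forward pass here uses all N points at once, this loop performs exactly one parameter update per epoch, which is the batch gradient descent case discussed next.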
Updates and gradient descent
During one epoch, we perform at least one update, but no more than N updates. The number of updates (N/n) will depend on the type of gradient descent being used:
- For batch (n = N) gradient descent, this is trivial, as it uses all points for computing the loss; one epoch is the same as one update.
- For stochastic (n = 1) gradient descent, every individual point is used to perform an update, so one epoch means N updates.
- For mini-batch (of size n), a mini-batch of n points is used to perform an update, so one epoch has N/n updates.
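As a rough sketch of how this plays out in practice, the loop below wraps the data in a PyTorch DataLoader so that the chosen batch_size determines how many updates happen within one epoch; the dataset, model, and hyper-parameter values are hypothetical:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical dataset of N = 100 points (values chosen only for illustration).
N = 100
x_train = torch.randn(N, 1)
y_train = 2 * x_train + 1 + 0.1 * torch.randn(N, 1)
dataset = TensorDataset(x_train, y_train)

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

# batch_size = N  -> batch gradient descent: 1 update per epoch
# batch_size = 1  -> stochastic gradient descent: N updates per epoch
# batch_size = n  -> mini-batch gradient descent: N/n updates per epoch
batch_size = 10
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

for epoch in range(2):
    n_updates = 0
    for x_batch, y_batch in loader:
        yhat = model(x_batch)          # forward pass on the current mini-batch
        loss = loss_fn(yhat, y_batch)  # compute loss
        loss.backward()                # compute gradients
        optimizer.step()               # update parameters
        optimizer.zero_grad()          # reset gradients for the next mini-batch
        n_updates += 1
    print(f"epoch {epoch}: {n_updates} updates")  # here: N / batch_size = 10 updates
```

Changing only batch_size in this sketch switches between the three regimes above without touching the rest of the loop, which is why the number of updates per epoch is best thought of as N/n.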