Minibatch Gradient Descent

Explore minibatch gradient descent to understand how it improves the training of machine learning models by using small data batches. Learn to balance update speed and stability, helping you efficiently optimize models on large datasets while reducing fluctuations in gradient updates.
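As a rough illustration of the idea, the following is a minimal sketch of minibatch gradient descent, assuming a linear model with a mean squared error loss. The function name `minibatch_gradient_descent`, its parameters (`batch_size`, `lr`, `epochs`, `seed`), and the synthetic data are all hypothetical choices made for this example and are not part of the lesson.

```python
import numpy as np

def minibatch_gradient_descent(X, y, batch_size=32, lr=0.1, epochs=50, seed=0):
    """Fit linear weights theta by minimizing mean squared error with minibatches."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the examples once per epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the squared error, averaged over this minibatch only
            grad = 2.0 * Xb.T @ (Xb @ theta - yb) / len(idx)
            theta -= lr * grad
    return theta

# Synthetic, noiseless data (illustrative assumption): y = 2*x0 - 3*x1
rng = np.random.default_rng(1)
X = rng.normal(size=(256, 2))
y = X @ np.array([2.0, -3.0])
theta = minibatch_gradient_descent(X, y)
```

Setting `batch_size=1` in this sketch updates on one example at a time, while `batch_size=len(X)` uses the whole dataset for every update. Values in between trade noisier but cheaper updates against smoother but costlier ones.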

Stochastic gradient descent (SGD)

Recall that to compute the gradient $\nabla_\theta J(\theta)$ of an objective $J(\theta)$, we need to aggregate the gradients $\nabla_\theta \mathcal{L}(f_\theta(x_i), y_i)$ ...