Limitations of Gradient Descent
Understand the limitations of gradient descent when applied to non-convex optimization problems in machine learning. Learn how local optima, intractability with large datasets, sensitivity to starting points, and learning rate choices affect convergence and model performance. This lesson helps you evaluate where gradient descent may fall short and why adjustments are necessary.
We have seen how well gradient descent works for convex optimization, where the objective has a single global optimum. We will now look at some of the limitations of gradient descent and address them in this chapter.
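To make the local-optima issue concrete before turning to the individual limitations, here is a minimal sketch (not from the lesson itself; the objective $f(x) = x^4 - 3x^2 + x$, the step size, and the step count are arbitrary illustrative choices) showing that plain gradient descent on a non-convex function can end up in different minima depending on where it starts:

```python
def f(x):
    # A simple non-convex objective with two local minima.
    return x**4 - 3 * x**2 + x

def grad_f(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    # Plain gradient descent: repeatedly step against the gradient.
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Starting points on opposite sides of the central bump settle into
# different minima; only one of them is the global minimum.
print(gradient_descent(-2.0))  # ~ -1.30, the global minimum
print(gradient_descent(+2.0))  # ~ +1.13, a worse local minimum
```

The same update rule, run with the same learning rate, produces two different answers; which one you get is decided entirely by the starting point.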
Intractability
Consider a machine learning problem where we want to minimize the discrepancy between the model prediction $f(x_i; \theta)$ and the true target $y_i$ over a training set of $N$ examples:

$$L(\theta) = \frac{1}{N} \sum_{i=1}^{N} \ell\big(f(x_i; \theta),\, y_i\big).$$

Here, $\theta$ denotes the model parameters, $x_i$ the $i$-th input, $y_i$ the corresponding target, and $\ell$ a per-example loss such as the squared error.

To compute the gradient

$$\nabla_\theta L(\theta) = \frac{1}{N} \sum_{i=1}^{N} \nabla_\theta\, \ell\big(f(x_i; \theta),\, y_i\big),$$

we must evaluate the per-example gradient at every one of the $N$ training points. When $N$ runs into the millions, every single parameter update requires a full pass over the dataset, which makes plain (full-batch) gradient descent intractable on large datasets.
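The following is a minimal sketch of this cost, assuming a linear model with mean squared error on synthetic data (the lesson text does not fix a specific model or loss, so these are illustrative choices). The point is that each gradient evaluation touches all $N$ examples:

```python
import numpy as np

def full_batch_gradient(theta, X, y):
    # Gradient of L(theta) = (1/N) * sum_i (x_i . theta - y_i)^2.
    # Each call makes a full pass over all N rows of X.
    N = X.shape[0]
    residual = X @ theta - y             # N predictions
    return (2.0 / N) * (X.T @ residual)  # accumulate over all N examples

rng = np.random.default_rng(0)
N, d = 1_000_000, 10                     # one million training examples
X = rng.standard_normal((N, d))
true_theta = rng.standard_normal(d)
y = X @ true_theta + 0.1 * rng.standard_normal(N)

theta = np.zeros(d)
lr = 0.1
for _ in range(10):                      # every update costs O(N * d)
    theta -= lr * full_batch_gradient(theta, X, y)
```

Ten updates here already require ten full passes over a million examples; as $N$ grows, this per-step cost is what drives the move toward cheaper, sampled gradient estimates.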