
Long Short Term Memory Networks (LSTM)

Explore the architecture and functioning of Long Short Term Memory Networks (LSTMs) to understand how they overcome the vanishing gradient problem in recurrent neural networks, enabling better sequence learning for projects like sentiment analysis. This lesson guides you through the key gates and cell state mechanisms of LSTMs, providing foundational knowledge essential for advanced deep learning applications.

What are LSTMs?

Long Short Term Memory Networks (LSTMs, for short) are modifications of the standard Recurrent Neural Network (RNN). Look at the image of a standard RNN architecture below.

In the above diagram:

  • X_{t-1} represents the input at time step t-1.
  • h_{t-1} represents the output of the RNN cell at time step t-1.
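
To make the diagram concrete, here is a minimal sketch of a single RNN cell step in NumPy. The function name, shapes, and the tanh activation are assumptions based on the standard vanilla RNN formulation, not code from this course:

```python
import numpy as np

def rnn_cell_step(x_t, h_prev, W_in, W_rec, b):
    """One step of a vanilla RNN cell (illustrative sketch).

    x_t    : input at the current time step, shape (input_dim,)
    h_prev : hidden state from the previous step, shape (hidden_dim,)
    W_in   : input-to-hidden weights, shape (hidden_dim, input_dim)
    W_rec  : recurrent (hidden-to-hidden) weights, shape (hidden_dim, hidden_dim)
    b      : bias, shape (hidden_dim,)
    """
    # The new hidden state mixes the current input with the previous state.
    return np.tanh(W_in @ x_t + W_rec @ h_prev + b)

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
W_in = rng.normal(size=(hidden_dim, input_dim))
W_rec = rng.normal(size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)         # h_{t-1} starts at zero
x = rng.normal(size=input_dim)   # X_t
h = rnn_cell_step(x, h, W_in, W_rec, b)
print(h)
```

Note how h_{t-1} feeds back into the same cell at every step: that repeated multiplication by W_rec is exactly where the gradient trouble discussed next comes from.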

This is the architecture that has the vanishing gradient problem, which we discussed in our previous lessons. We've also maintained that, as a rule of thumb, if the recurring weight (denoted by W_{rec}) is less than 1, the gradients shrink toward zero as they propagate back through time (vanishing), and if it is greater than 1, they grow without bound (exploding).
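
The effect of that rule of thumb is easy to see numerically. Below is a small sketch (the scalar setup is an illustrative assumption, not this course's code) that repeatedly multiplies a gradient by the recurrent weight, which is essentially what backpropagation through time does across steps:

```python
def gradient_after_t_steps(w_rec, t, grad=1.0):
    """Scale a gradient by the recurrent weight t times,
    mimicking the product of Jacobians in backpropagation
    through time for a scalar RNN (the activation's
    derivative is ignored here for simplicity)."""
    for _ in range(t):
        grad *= w_rec
    return grad

for w_rec in (0.9, 1.0, 1.1):
    print(f"w_rec={w_rec}: gradient after 50 steps = "
          f"{gradient_after_t_steps(w_rec, 50):.3e}")

# w_rec = 0.9 -> ~5.2e-03  (gradient vanishes)
# w_rec = 1.0 -> 1.0       (gradient is preserved)
# w_rec = 1.1 -> ~1.2e+02  (gradient explodes)
```

LSTMs address this by routing information through an additively updated cell state instead of forcing every signal through that repeated multiplication, as the rest of this lesson explains.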