
Writing a Custom LSTM Cell in PyTorch

Explore how to create a custom LSTM cell in PyTorch by translating LSTM gate equations into code. This lesson helps you understand LSTM internals and how to build sequential models using recurrent neural networks for deep learning applications.

We'll cover the following...

Creating an LSTM network in PyTorch is pretty straightforward.

import torch.nn as nn
# input_size -> N in the equations
# hidden_size -> H in the equations
layer = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

Note that the number of layers is the number of cells that are stacked on top of each other. So this network will have two LSTM cells connected together. We will see how in the next lesson. For now, we will focus on the simple LSTM cell based on the equations.
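To make the layer's behavior concrete, here is a quick sketch of running a forward pass through the two-layer network above and inspecting the tensor shapes (the sequence length and batch size below are arbitrary example values):

```python
import torch
import torch.nn as nn

# Two stacked layers: the second cell consumes the hidden
# states produced by the first at every time step.
layer = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)

# Input shape: (sequence_length, batch_size, input_size)
x = torch.randn(5, 3, 10)
output, (h_n, c_n) = layer(x)

print(output.shape)  # torch.Size([5, 3, 20]) - top layer's hidden state at each step
print(h_n.shape)     # torch.Size([2, 3, 20]) - final hidden state, one per layer
print(c_n.shape)     # torch.Size([2, 3, 20]) - final cell state, one per layer
```

Note that `output` comes only from the top layer, while `h_n` and `c_n` carry one slice per layer, which is how the two stacked cells show up in the shapes.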

It is instructive to build an LSTM cell entirely from scratch. We have our equations for each gate, so all we have to do is translate them into code and connect them. As an example, a code template as well as the input gate will be provided, and you will have to do the rest.

The originally proposed equations that we described are:

$$i_t = \sigma( W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \quad\quad(1)$$

$$f_t = \sigma( W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \quad\quad(2)$$ ...
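As one possible starting point, here is a minimal sketch of how equation (1) could be translated into a PyTorch module. This is not the lesson's provided template; the class name `InputGate` and the small random initialization are illustrative choices, while the parameter names `W_xi`, `W_hi`, `W_ci`, and `b_i` mirror the symbols in the equation:

```python
import torch
import torch.nn as nn

class InputGate(nn.Module):
    """Equation (1): i_t = sigma(W_xi x_t + W_hi h_{t-1} + W_ci c_{t-1} + b_i)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # input_size -> N, hidden_size -> H, matching the equations
        self.W_xi = nn.Parameter(torch.randn(hidden_size, input_size) * 0.1)
        self.W_hi = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.1)
        # W_ci is the peephole connection to the previous cell state
        self.W_ci = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.1)
        self.b_i = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_t, h_prev, c_prev):
        # Each term matches one term of equation (1); the sigmoid
        # squashes the gate activation into (0, 1).
        return torch.sigmoid(
            x_t @ self.W_xi.T
            + h_prev @ self.W_hi.T
            + c_prev @ self.W_ci.T
            + self.b_i
        )

gate = InputGate(input_size=10, hidden_size=20)
i_t = gate(torch.randn(3, 10), torch.zeros(3, 20), torch.zeros(3, 20))
print(i_t.shape)  # torch.Size([3, 20])
```

The forget gate of equation (2) has exactly the same structure with its own weights, so the remaining gates follow the same pattern.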