How to build an LSTM model using PyTorch
Long short-term memory (LSTM) is a special kind of recurrent neural network (RNN) that is capable of learning long-term dependencies in sequential data.
Implement the LSTM model in PyTorch
We can build the LSTM model with PyTorch by following these steps:
Step 1
Firstly, we import the PyTorch library into our project using the following code snippet:
import torch
import torch.nn as nn
Step 2
Next, we prepare and load the data set into the project.
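The exact loading code depends on the data set, so this step varies from project to project. As a minimal sketch, assuming randomly generated dummy data with 30 features per time step and 15 target classes (matching the dimensions used in step 3), it might look like this:

import torch
from torch.utils.data import TensorDataset, DataLoader

# Dummy data (an assumption for illustration): 100 sequences,
# each 28 time steps long with 30 features per step
X = torch.randn(100, 28, 30)
# One integer class label (0-14) per sequence
y = torch.randint(0, 15, (100,))

dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=16, shuffle=True)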
Step 3
Now, we proceed to create the LSTM model and define the forward pass of the LSTM. The following code demonstrates this step:
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()

        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        # batch_first=True means the input is shaped (batch_dim, seq_dim, feature_dim)

        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):
        # Initialize the hidden state with zeros
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        # Initialize the cell state with zeros
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)
Explanation
Line 1: We inherit nn.Module in the LSTMModel class.
Line 2: input_d is the number of expected features in the input, and hidden_d is the number of features in the hidden state.
Lines 5–6: We store the number of hidden dimensions and the number of layers.
Line 12: We define the read-out layer, fc, as a fully connected linear layer.
Line 14: We define the forward function to create the forward pass for the LSTM model.
Line 16: We initialize the hidden state with zeros.
Line 18: We initialize the cell state with zeros.
Line 20: We run the LSTM over every time step of the input sequence. We detach h0 and c0 because we truncate backpropagation through time (BPTT), a gradient-based technique for training certain types of RNNs; if we don't detach, we'll backpropagate all the way to the start.
Lines 25–30: Variables are declared, and an LSTM model object is created.
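To sanity-check the model, we can pass a dummy batch through it; the batch size and sequence length below are arbitrary choices for illustration:

x = torch.randn(16, 28, 30)  # (batch_dim, seq_dim, feature_dim)
out = model(x)
print(out.shape)             # torch.Size([16, 15]) -- one score per output class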
Step 4
After the model has been instantiated, we define the loss function. Here, we use cross-entropy loss:
error = nn.CrossEntropyLoss()
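Note that nn.CrossEntropyLoss expects raw, unnormalized scores (logits) from the model and integer class labels as targets. The dummy tensors below are assumptions for illustration:

logits = torch.randn(16, 15)           # raw scores for 16 samples, 15 classes
targets = torch.randint(0, 15, (16,))  # one integer label per sample
loss = error(logits, targets)          # a scalar tensor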
Step 5
Next, we instantiate the optimizer. Here, we use stochastic gradient descent (SGD):
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
Step 6
Finally, the model is trained and used to make predictions. We won't cover training in detail in this Answer, but a minimal sketch is shown below.
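For completeness, here is a minimal sketch of one training epoch, assuming a loader that yields (inputs, labels) batches, such as the hypothetical DataLoader from step 2:

for inputs, labels in loader:
    optimizer.zero_grad()          # clear gradients from the previous step
    outputs = model(inputs)        # forward pass: (batch, output_dim) scores
    loss = error(outputs, labels)  # cross-entropy loss
    loss.backward()                # backpropagation (truncated BPTT via detach)
    optimizer.step()               # update the model parameters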
Full implementation
The full implementation of the LSTM model is demonstrated below:
# step 1: importing libraries
import torch
import torch.nn as nn

# step 3: creating the model
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()

        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

# step 4: calculating cross entropy loss
error = nn.CrossEntropyLoss()

# step 5: optimizer
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)