How to build an LSTM model using PyTorch
Long short-term memory (LSTM) is a special kind of recurrent neural network (RNN) that is capable of learning long-term dependencies in sequential data.
Implement the LSTM model in PyTorch
We can build the LSTM model with PyTorch by following these steps:
Step 1
Firstly, we import the PyTorch library into our project using the following code snippet:
import torch
import torch.nn as nn
Step 2
Next, we prepare and load the data set into the project.
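The exact loading code depends on the data set, so this step varies from project to project. As a minimal sketch, assuming randomly generated dummy data with 30 features per time step and 15 target classes (matching the dimensions used in step 3), it might look like this:

import torch
from torch.utils.data import TensorDataset, DataLoader

# Dummy data (an assumption for illustration): 100 sequences,
# each 28 time steps long with 30 features per step
X = torch.randn(100, 28, 30)
# One integer class label (0-14) per sequence
y = torch.randint(0, 15, (100,))

dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=16, shuffle=True)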
Step 3
Now, we proceed to create the LSTM model and define the forward pass of the LSTM. The following code demonstrates this step:
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()

        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        # batch_first=True means the input is shaped (batch_dim, seq_dim, feature_dim)

        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):
        # Initialize the hidden state with zeros
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        # Initialize the cell state with zeros
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)
Explanation
Line 1: We inherit nn.Module in the LSTMModel class.
Line 2: input_d is the number of expected features in the input, and hidden_d is the number of features in the hidden state.
Lines 5–6: We store the number of hidden dimensions and the number of layers.
Line 12: We define the read-out layer, fc, as a fully connected linear layer.
Line 14: We define the forward function to create the forward pass for the LSTM model.
Line 16: We initialize the hidden state with zeros.
Line 18: We initialize the cell state with zeros.
Line 20: We run the LSTM over every time step of the input sequence. We detach h0 and c0 because we truncate backpropagation through time (BPTT), a gradient-based technique for training certain types of RNNs; if we don't detach, we'll backpropagate all the way to the start.
Lines 25–30: Variables are declared, and an LSTM model object is created.
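To sanity-check the model, we can pass a dummy batch through it; the batch size and sequence length below are arbitrary choices for illustration:

x = torch.randn(16, 28, 30)  # (batch_dim, seq_dim, feature_dim)
out = model(x)
print(out.shape)             # torch.Size([16, 15]) -- one score per output class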
Step 4
After the model has been instantiated, we define the loss function. Here, we use cross-entropy loss:
error = nn.CrossEntropyLoss()
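Note that nn.CrossEntropyLoss expects raw, unnormalized scores (logits) from the model and integer class labels as targets. The dummy tensors below are assumptions for illustration:

logits = torch.randn(16, 15)           # raw scores for 16 samples, 15 classes
targets = torch.randint(0, 15, (16,))  # one integer label per sample
loss = error(logits, targets)          # a scalar tensor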
Step 5
Next, we instantiate the optimizer. Here, we use stochastic gradient descent (SGD):
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
Step 6
Finally, the model is trained and used to make predictions. We won't cover training in detail in this Answer, but a minimal sketch is shown below.
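For completeness, here is a minimal sketch of one training epoch, assuming a loader that yields (inputs, labels) batches, such as the hypothetical DataLoader from step 2:

for inputs, labels in loader:
    optimizer.zero_grad()          # clear gradients from the previous step
    outputs = model(inputs)        # forward pass: (batch, output_dim) scores
    loss = error(outputs, labels)  # cross-entropy loss
    loss.backward()                # backpropagation (truncated BPTT via detach)
    optimizer.step()               # update the model parameters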
Full implementation
The full implementation of the LSTM model is demonstrated below:
# step 1: importing libraries
import torch
import torch.nn as nn

# step 3: creating the model
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()

        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

# step 4: calculating cross entropy loss
error = nn.CrossEntropyLoss()

# step 5: optimizer
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)