Named Entity Recognition with RNNs: Defining the Model

Defining hyperparameters

Now let’s define several hyperparameters needed for our RNN, as shown here:

  • max_seq_length: The maximum length of a sequence. We infer this from our training data during data exploration. It’s important to keep this length reasonable because the RNN is unrolled over every time step, so memory use grows with the sequence length.

  • embedding_size: The dimensionality of the token embeddings. Since we have a small corpus, a value below 100 will suffice.

  • rnn_hidden_size: The dimensionality of the hidden layer in the RNN. Increasing the dimensionality of the hidden layer usually leads to better performance. However, note that increasing the size of the hidden layer also enlarges all three sets of internal weights (that is, U, W, and V), resulting in a higher computational footprint.

  • n_classes: Number of unique output classes present.

  • batch_size: The batch size for the training, validation, and test data. A higher batch size often leads to better results because the model sees more data during each optimization step but, just like unrolling, it increases the memory requirement.

  • epochs: The number of epochs to train the model for.
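To make the effect of rnn_hidden_size on the weights concrete, here is a minimal sketch that counts the weights of a simple RNN. It assumes the common naming in which U maps the input to the hidden state, W maps the hidden state to itself, and V maps the hidden state to the output; the text does not spell this out, so treat that mapping, the helper name rnn_param_counts, and the example sizes as illustrative assumptions. Biases are ignored.

```python
def rnn_param_counts(embedding_size, rnn_hidden_size, n_classes):
    """Count the weights of a simple RNN, ignoring biases.

    Assumed (not stated in the text): U is input-to-hidden,
    W is hidden-to-hidden, and V is hidden-to-output.
    """
    u = embedding_size * rnn_hidden_size   # U: input -> hidden
    w = rnn_hidden_size * rnn_hidden_size  # W: hidden -> hidden (grows quadratically)
    v = rnn_hidden_size * n_classes        # V: hidden -> output
    return {"U": u, "W": w, "V": v, "total": u + w + v}


# Hypothetical sizes, chosen only to compare two hidden sizes.
small = rnn_param_counts(embedding_size=64, rnn_hidden_size=64, n_classes=9)
large = rnn_param_counts(embedding_size=64, rnn_hidden_size=128, n_classes=9)

print(small)  # {'U': 4096, 'W': 4096, 'V': 576, 'total': 8768}
print(large)  # {'U': 8192, 'W': 16384, 'V': 1152, 'total': 25728}
```

Under these assumptions, doubling the hidden size doubles U and V but quadruples W, which is why the hidden size tends to dominate the cost of the recurrent layer.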

These are defined below:
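A minimal sketch of such definitions in Python follows. The variable names come from the list above, but every value is an illustrative assumption rather than a number taken from the lesson; in practice, max_seq_length and n_classes should be derived from the dataset itself.

```python
# Illustrative values only: none of these numbers come from the lesson.

# Maximum number of tokens per sequence (assumed; infer it from the
# distribution of sentence lengths seen during data exploration).
max_seq_length = 40

# Dimensionality of the token embeddings (below 100 for a small corpus).
embedding_size = 64

# Dimensionality of the RNN hidden layer.
rnn_hidden_size = 64

# Number of unique output classes (assumed; count the distinct labels
# in the training data instead of hard-coding this).
n_classes = 9

# Number of examples per batch for training, validation, and test data.
batch_size = 64

# Number of passes over the training data.
epochs = 3
```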
