
Calculating Loss

Explore how to convert LSTM outputs into logits and apply sparse softmax cross entropy loss in language models. Understand how to use a padding mask to exclude padded time steps, so the loss is calculated only over real tokens. This lesson guides you through implementing loss calculation for sequence data in NLP models using TensorFlow.

Chapter Goals:

  • Convert your LSTM model's outputs into logits

  • Use a padding mask to calculate the overall loss

A. Logits & loss

As mentioned in earlier chapters, the task for a language model is no different from regular multiclass classification. Therefore, the loss function will still be the regular softmax cross entropy loss. We use a final fully-connected layer to convert the model's outputs into logits.
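The sketch below illustrates the idea under some assumptions: a fully-connected layer produces one logit per vocabulary word at each time step, sparse softmax cross entropy gives a per-time-step loss, and a padding mask zeroes out the loss at padded positions. The names `lstm_outputs`, `labels`, `seq_lens`, and `vocab_size` are hypothetical placeholders, not the lesson's exact variables.

```python
import tensorflow as tf

# Assumed shapes for illustration:
#   lstm_outputs: (batch_size, max_seq_len, hidden_units)
#   labels:       (batch_size, max_seq_len) integer word IDs
#   seq_lens:     (batch_size,) true (unpadded) length of each sequence
vocab_size = 1000  # assumed vocabulary size

# Final fully-connected layer: converts LSTM outputs into logits,
# one score per vocabulary word at every time step.
logits_layer = tf.keras.layers.Dense(vocab_size)

def calculate_loss(lstm_outputs, labels, seq_lens):
    logits = logits_layer(lstm_outputs)

    # Per-time-step loss; labels are integer word IDs, so the
    # sparse version of softmax cross entropy applies.
    per_step_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)

    # Padding mask: 1.0 at real time steps, 0.0 at padded ones.
    mask = tf.sequence_mask(
        seq_lens, maxlen=tf.shape(labels)[1], dtype=tf.float32)

    # Exclude padded time steps from the overall loss.
    return tf.reduce_sum(per_step_loss * mask)
```

Multiplying by the mask before summing is one straightforward way to keep padded time steps from contributing to the overall loss; the mask itself could also come from comparing the label IDs against a dedicated padding ID, depending on how the data is prepared.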