Calculating Loss
Learn how to convert LSTM outputs into logits and apply sparse softmax cross entropy loss in a language model. You will use a padding mask to exclude padded time steps so that they do not distort the loss. This lesson walks through implementing the loss calculation for sequence data in NLP models using TensorFlow.
We'll cover the following...
Chapter Goals:
Convert your LSTM model's outputs into logits
Use a padding mask to calculate the overall loss
A. Logits & loss
As mentioned in earlier chapters, the task for a language model is a multiclass classification problem, with each vocabulary word treated as a class. The loss function is therefore still the regular softmax cross entropy loss. We use a final fully-connected layer to convert model outputs into ...
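To make the arithmetic behind these two chapter goals concrete, here is a minimal NumPy sketch rather than the lesson's TensorFlow code. It assumes the LSTM outputs have shape (batch, time, hidden), that labels are integer word IDs, and that the true sequence lengths are known. The function name masked_sparse_softmax_loss, its parameters, and the choice to sum (rather than average) the masked losses are all illustrative assumptions. In TensorFlow, the per-token loss and the mask would typically come from tf.nn.sparse_softmax_cross_entropy_with_logits and tf.sequence_mask.

```python
import numpy as np

def masked_sparse_softmax_loss(lstm_outputs, weights, bias, labels, lengths):
    """Hypothetical helper, for illustration only.

    lstm_outputs: float array, shape (batch, time, hidden)
    weights, bias: fully-connected layer parameters, shapes (hidden, vocab) and (vocab,)
    labels: integer word IDs, shape (batch, time)
    lengths: true (unpadded) length of each sequence, shape (batch,)
    """
    # Fully-connected layer: one vector of vocab-size logits per time step.
    logits = lstm_outputs @ weights + bias  # (batch, time, vocab)

    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    # Sparse cross entropy: negative log-probability of the true word ID.
    token_loss = -np.take_along_axis(log_probs, labels[..., None], axis=-1).squeeze(-1)

    # Padding mask: 1.0 for real time steps, 0.0 for padded ones.
    max_time = labels.shape[1]
    mask = (np.arange(max_time)[None, :] < lengths[:, None]).astype(token_loss.dtype)

    # Padded steps contribute nothing to the overall loss.
    return (token_loss * mask).sum(), mask

# Toy inputs: batch of 2, 3 time steps, hidden size 5, vocabulary size 4.
rng = np.random.default_rng(0)
outputs = rng.normal(size=(2, 3, 5))
W = rng.normal(size=(5, 4))
b = np.zeros(4)
labels = np.array([[1, 2, 0], [3, 0, 0]])  # trailing 0s stand in for padding
lengths = np.array([2, 1])

total_loss, mask = masked_sparse_softmax_loss(outputs, W, b, labels, lengths)
print(mask)
print(total_loss)
```

Dividing the summed loss by mask.sum() would instead give the average loss per real (non-padded) token, which is another common choice.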