Model Improvement

Explore effective methods to enhance Seq2Seq encoder-decoder models, including training strategies like adjusting batch size and learning rate. Understand domain-specific tweaks for tasks such as machine translation, text summarization, and dialog systems to optimize model performance and inference accuracy.

Chapter Goals:

  • Learn strategies for improving an encoder-decoder model

  • Run the encoder-decoder model in inference mode

A. Training strategies

Strong encoder-decoder models tend to have a large number of weight parameters because they are built from large LSTM/BiLSTM layers. As a result, training one to convergence usually takes a long time.
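To get a feel for how quickly the weights add up, the parameter count of an LSTM layer can be estimated by hand. The sketch below is only illustrative: the helper names (`lstm_param_count`, `bilstm_param_count`) and the layer sizes are hypothetical, not values from this course, and it assumes the common formulation with four gates and a single bias vector per gate. Embedding and output projection weights are not included.

```python
def lstm_param_count(input_dim, hidden_units):
    """Estimate the weights in one LSTM layer (assumes one bias vector per gate)."""
    # Each of the 4 gates has input weights, recurrent weights, and a bias.
    per_gate = hidden_units * (input_dim + hidden_units) + hidden_units
    return 4 * per_gate


def bilstm_param_count(input_dim, hidden_units):
    """A BiLSTM runs a forward and a backward LSTM, so the count doubles."""
    return 2 * lstm_param_count(input_dim, hidden_units)


# Hypothetical sizes, chosen only for illustration.
embedding_dim = 256
hidden_units = 512

encoder_params = bilstm_param_count(embedding_dim, hidden_units)
decoder_params = lstm_param_count(embedding_dim, hidden_units)

print(f"BiLSTM encoder: {encoder_params:,} parameters")
print(f"LSTM decoder:   {decoder_params:,} parameters")
```

Even with these modest sizes, the recurrent layers alone hold several million weights, and the count grows roughly quadratically with the number of hidden units.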

To speed up ...