Language Model
Explore how language models assign probabilities to words in a sequence using LSTM networks. Learn to prepare training data by creating input-target pairs, apply sequence truncation to optimize learning, and understand the role of multiclass classification in predicting text sequences. This lesson equips you with foundational skills to implement and train effective language models.
Chapter Goals:
Understand how a language model works
Learn how to set up the training data for a language model
A. Word probabilities
As mentioned in the introduction, the purpose of a language model is to assign probabilities to words in sequences of text. The probability for each word is conditioned on the words that appear before it in the sequence. Though it may not seem like it at first glance, the task of calculating a word probability based on the previous words in a sequence is essentially multiclass classification.
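Formally, this is the chain rule of probability: the probability of a whole sequence factors into one conditional probability per word, each conditioned on the words before it:

$$
P(w_1, w_2, \dots, w_n) = \prod_{t=1}^{n} P(w_t \mid w_1, \dots, w_{t-1})
$$

At each step $t$, the language model's job is to output the distribution $P(w_t \mid w_1, \dots, w_{t-1})$ over the vocabulary.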
Since each word in a sequence must come from the text corpus's vocabulary, we consider each vocabulary word as a separate class. We then use the previous sequence words as input, and the model outputs a probability distribution over these classes, giving the probability of each candidate next word.
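As a rough sketch of this setup (the vocabulary size, truncation length, and layer sizes below are illustrative choices, not values from this lesson), the following Python/Keras code turns a sequence of word IDs into input-target pairs and builds an LSTM classifier with one output class per vocabulary word:

```python
import tensorflow as tf

vocab_size = 1000  # hypothetical: number of words in the corpus vocabulary
max_len = 10       # hypothetical: truncation length for input sequences

def make_pairs(token_ids, max_len):
    """Create (input, target) pairs: predict token t from the tokens before it."""
    pairs = []
    for t in range(1, len(token_ids)):
        context = token_ids[max(0, t - max_len):t]  # truncate long contexts
        pairs.append((context, token_ids[t]))       # target is the next word's ID
    return pairs

# Example: make_pairs([4, 12, 7, 9], max_len=2)
# -> [([4], 12), ([4, 12], 7), ([12, 7], 9)]

# A Keras-style LSTM classifier: one softmax output class per vocabulary word.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Note that the sparse categorical cross-entropy loss treats each target word ID as a class label, which is exactly the multiclass-classification framing described above.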