Classical Approaches to Learning Word Representations

Word representations

What is meant by the word “meaning”? This is more of a philosophical question than a technical one, so rather than trying to pin down the best answer, we’ll settle for a more modest one: meaning is the idea conveyed by, or some representation associated with, a word. For example, when we hear the word “cat,” we conjure up a mental picture of something that meows, has four legs, has a tail, and so on; if we then hear the word “dog,” we again form a mental image of something that barks, has a bigger body than a cat, has four legs, has a tail, and so on. In this new space of mental pictures, it’s far easier to see that cats and dogs are similar than it is by looking at the words alone.

Since the primary objective of NLP is to achieve human-like performance on linguistic tasks, it’s sensible to explore principled ways of representing words for machines. To achieve this, we’ll use algorithms that analyze a given text corpus and come up with good numerical representations of words (that is, word embeddings) so that words that occur in similar contexts (for example, “one” and “two,” or “I” and “we”) end up with more similar numerical representations than unrelated words do (for example, “cat” and “volcano”).
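To make this concrete, below is a minimal sketch (not one of the algorithms covered later) that builds crude word vectors from raw co-occurrence counts on a toy corpus and compares them with cosine similarity. The corpus, the window size, and the choice of similarity measure here are illustrative assumptions, not part of any particular embedding algorithm.

```python
import numpy as np

# A toy corpus; real embeddings are learned from corpora with
# millions of sentences, but the principle is the same.
corpus = [
    "i have one cat",
    "we have two cats",
    "one dog barks",
    "two dogs bark",
    "the volcano erupted",
    "a volcano is a mountain",
]

# Build the vocabulary and a word-by-word co-occurrence matrix,
# counting neighbors within a symmetric window of one token.
tokens = [sentence.split() for sentence in corpus]
vocab = sorted({word for sent in tokens for word in sent})
index = {word: i for i, word in enumerate(vocab)}

window = 1
cooc = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, word in enumerate(sent):
        lo, hi = max(0, i - window), min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[index[word], index[sent[j]]] += 1

def cosine(u, v):
    # Cosine similarity: close to 1 for vectors pointing the same
    # way, 0 for vectors with no shared dimensions.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)

# "one" and "two" appear in similar contexts, so their rows of the
# co-occurrence matrix (their crude "embeddings") are similar ...
print(cosine(cooc[index["one"]], cooc[index["two"]]))      # ~0.33
# ... while "cat" and "volcano" share no contexts at all.
print(cosine(cooc[index["cat"]], cooc[index["volcano"]]))  # 0.0
```

Running this prints a noticeably higher similarity for “one” and “two” (which share contexts such as “have”) than for “cat” and “volcano” (which share none). This is, in miniature, exactly the property that good word embeddings capture at scale.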
