Generating Document Embeddings
Explore how to generate document embeddings by averaging the word vectors produced by skip-gram, CBOW, GloVe, and the contextual ELMo model. Understand how to preprocess text and compute meaningful document representations from these word vector algorithms.
We'll cover the following...
Let’s first remind ourselves how we stored embeddings for the skip-gram, CBOW, and GloVe algorithms. The figure below depicts how these look in a pd.DataFrame object.
Note that the bottom-left corner of the figure above indicates that the DataFrame has 128 columns (i.e., the embedding size).
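To make the averaging step concrete, here is a minimal sketch of how a document embedding can be computed from such a table. The words, values, and 4-dimensional embedding size below are hypothetical stand-ins for the 128-dimensional DataFrame in the figure; the `document_embedding` helper is not from the course code:

```python
import numpy as np
import pandas as pd

# Hypothetical mini embedding table: words as the index, one column per
# embedding dimension (the lesson's table has 128 columns; we use 4 here).
embeddings = pd.DataFrame(
    np.array([
        [0.1, 0.2, 0.3, 0.4],   # "cats"
        [0.5, 0.6, 0.7, 0.8],   # "are"
        [0.9, 1.0, 1.1, 1.2],   # "great"
    ]),
    index=["cats", "are", "great"],
)

def document_embedding(tokens, embeddings):
    """Average the vectors of the tokens found in the embedding table.

    Tokens missing from the vocabulary are simply skipped.
    """
    known = [t for t in tokens if t in embeddings.index]
    return embeddings.loc[known].mean(axis=0).to_numpy()

doc_vec = document_embedding(["cats", "are", "great", "unknown"], embeddings)
print(doc_vec)  # elementwise mean of the three known word vectors
```

The same mean-pooling works identically for skip-gram, CBOW, and GloVe vectors, since each stores one fixed vector per word.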
ELMo embeddings
ELMo embeddings are an exception to this. Since ELMo generates contextualized representations, the same word receives a different vector depending on the sentence it appears in, so its embeddings cannot be stored in a single static lookup table like the one above.
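The consequence for document embeddings can be sketched as follows. Instead of looking vectors up in a table, a contextual model such as ELMo returns a `(num_tokens, embedding_dim)` array per document, which we then mean-pool. The random arrays below merely stand in for the model's output (illustration only, not the ELMo API):

```python
import numpy as np

rng = np.random.default_rng(42)

# A contextual model yields a (num_tokens, embedding_dim) array per
# document; the same word can map to different vectors in different
# sentences. Random arrays stand in for the model output here.
dim = 4
doc_token_embeddings = [
    rng.normal(size=(5, dim)),   # document 1: 5 tokens
    rng.normal(size=(8, dim)),   # document 2: 8 tokens
]

# Mean-pool each document's token vectors into one document embedding.
doc_embeddings = np.stack([tok.mean(axis=0) for tok in doc_token_embeddings])
print(doc_embeddings.shape)  # one dim-sized vector per document
```

Note that documents of different lengths still produce fixed-size document vectors, which is what makes mean pooling convenient downstream.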