The first step of entity linking is to build a representation of the terms that you can feed to the ML models. It’s also critical to use contextual information (i.e., other terms in the sentence) as you embed the terms. Let’s see why such a representation is necessary.
Contextualized text representation
The same words can refer to different entities. The context (i.e., other terms in the sentence) in which the words occur helps us figure out which entity is being referred to. Similarly, the NER and NED models require context to correctly recognize the entity type and to disambiguate the entity, respectively. Therefore, the representation of terms must take contextual information into account.
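To make this concrete, here is a minimal toy sketch in Python. It is not how a real contextual model works; it only illustrates the idea. The assumptions are all hypothetical: each word gets a fixed pseudo-random vector derived from a hash (a stand-in for a learned vector), and the helper names `static_vector` and `contextual_vector`, the window size, the 50/50 mixing weight, and the two example sentences are invented for illustration.

```python
import hashlib
import math

DIM = 8  # toy vector size (assumption; real models use hundreds of dimensions)


def static_vector(word):
    """Fixed vector per word, derived from a hash (stand-in for a learned vector)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:DIM]]


def contextual_vector(tokens, index, window=2):
    """Mix a token's fixed vector with the mean vector of its neighbours."""
    target = static_vector(tokens[index])
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    neighbours = [static_vector(tokens[i]) for i in range(lo, hi) if i != index]
    if not neighbours:
        return target
    mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
    # Equal weighting of the word and its context is an arbitrary choice.
    return [0.5 * t + 0.5 * m for t, m in zip(target, mean)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical sentences: the same word, two different entities.
animal = "the jaguar chased its prey".split()
car = "the jaguar needs new tires".split()

v_animal = contextual_vector(animal, animal.index("jaguar"))
v_car = contextual_vector(car, car.index("jaguar"))

# Without context, both mentions share one vector; with context, they differ.
print(static_vector("jaguar") == static_vector("jaguar"))
print(v_animal == v_car)
print(cosine(v_animal, v_car))
```

Without context, both mentions of "jaguar" map to the same vector, so no downstream model could tell them apart. Once the neighbouring terms are mixed in, the two mentions get different vectors, which is the property the NER and NED models rely on.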
One way to represent text is in the form of embeddings. For instance, let’s say you have the following sentences: