Text Similarity
Learn to identify sentence similarity using the Hugging Face Inference API.
Sentence similarity measures how similar (or dissimilar) two sentences or passages are. It has many uses, from plagiarism checking to information retrieval.
Find text similarity using the API
The sentence-transformers/all-MiniLM-L6-v2 model is recommended for text similarity tasks. Many other models are also available for this task; some common ones are listed below:
Models for Text Similarity
Model | Description |
| Based on |
| Based on |
| Based on |
For text similarity tasks, we send a POST request to the following endpoint, replacing the path parameter {model} with any of the models mentioned above:
https://api-inference.huggingface.co/models/{model}
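As a minimal sketch of the path-parameter substitution described above, the endpoint can be assembled like this. The helper name `build_endpoint` and the constant `BASE_URL` are illustrative assumptions, not part of the API.

```python
from urllib.parse import quote

# Base of the endpoint shown above; the model ID is appended as the path parameter.
BASE_URL = "https://api-inference.huggingface.co/models/"


def build_endpoint(model: str) -> str:
    """Return the endpoint URL for a given model ID (hypothetical helper)."""
    model = model.strip().strip("/")
    if not model:
        raise ValueError("model must be a non-empty model ID")
    # Keep the "/" between owner and model name; escape anything unusual.
    return BASE_URL + quote(model, safe="/")


endpoint = build_endpoint("sentence-transformers/all-MiniLM-L6-v2")
print(endpoint)
```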
Request parameters
The request parameters for this API call are as follows:
Parameter | Type | Category | Description |
| String | Required | Specifies the source sentence/passage to compare with other |
| String | Required | Specifies the list of sentences that will be compared with |
| Boolean | Optional | The Hugging Face Inference API has a cache mechanism to speed up requests. Use it for deterministic models. Default value is
| Boolean | Optional | Hugging Face Inference API models take time to initialize and process requests. If the value is
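The request body built from these parameters might look like the sketch below. The parameter names are missing from the table above, so the field names used here (`source_sentence` and `sentences` nested under `inputs`, and `use_cache` and `wait_for_model` nested under `options`) are assumptions taken from the public Hugging Face Inference API documentation. The helper `build_payload` is hypothetical, and it sends an option only when it is set explicitly, so the API's own defaults apply otherwise.

```python
import json
from typing import List, Optional


def build_payload(source_sentence: str,
                  sentences: List[str],
                  use_cache: Optional[bool] = None,
                  wait_for_model: Optional[bool] = None) -> dict:
    """Assemble the JSON body for a sentence-similarity request (hypothetical helper)."""
    if not sentences:
        raise ValueError("sentences must contain at least one entry")
    payload = {
        "inputs": {
            "source_sentence": source_sentence,
            "sentences": list(sentences),
        }
    }
    # Only include options that the caller set, leaving the rest to the API defaults.
    options = {}
    if use_cache is not None:
        options["use_cache"] = use_cache
    if wait_for_model is not None:
        options["wait_for_model"] = wait_for_model
    if options:
        payload["options"] = options
    return payload


# Illustrative sentences only; any strings can be used here.
example = build_payload(
    "The weather is lovely today.",
    ["It is sunny outside.", "He drove to the stadium."],
    wait_for_model=True,
)
print(json.dumps(example, indent=2))
```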
The following code checks for the similarity of source_sentence with sentences.
Let’s have a look at the highlighted lines shown in the code widget above:
Line 2: We specify the sentence-transformers/all-MiniLM-L6-v2 model for text similarity.
Lines 10–11: We set the source_sentence to compare it with the sentences.
Lines 24–31: We create a function, textSimilarity, to make the API call and handle the exceptions.
Line 33: We call the textSimilarity function to invoke the endpoint.
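The steps in this walkthrough can be sketched in Python as follows. This is not the widget's original code, and its line numbers do not match the ones listed above. The function name `text_similarity`, the `token` parameter, and the 30-second timeout are illustrative assumptions. The sketch also assumes that the API accepts a JSON body and authenticates with an `Authorization: Bearer` header, as the public Hugging Face documentation describes.

```python
import json
import urllib.error
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/{model}"


def text_similarity(model: str, payload: dict, token: str):
    """POST the payload to the model endpoint and return the parsed JSON, or None on failure."""
    request = urllib.request.Request(
        API_URL.format(model=model),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example, a bad token).
        print(f"API returned HTTP {err.code}: {err.reason}")
    except urllib.error.URLError as err:
        # The request never reached the server (DNS failure, no network, and so on).
        print(f"Could not reach the API: {err.reason}")
    return None
```

Calling `text_similarity("sentence-transformers/all-MiniLM-L6-v2", payload, token)` with a valid access token and a payload holding the source sentence and the list of sentences should return the list of scores described in the next section.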
Response fields
The API call above returns a list of scores, one for each entry in the sentences list. Each score is the semantic similarity between the source_sentence and the corresponding sentence. In this example, we compare two sentences with the source_sentence. The first sentence gets the higher score because it talks about weather conditions, which relate to the source_sentence.
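Because the scores come back in the same order as the input sentences, one way to read the response is to pair each score with its sentence and sort. The sketch below assumes the response is a flat list of floats, as described above. The helper `rank_by_similarity` is hypothetical, and the scores are made-up placeholders for illustration, not real API output.

```python
from typing import List, Tuple


def rank_by_similarity(sentences: List[str],
                       scores: List[float]) -> List[Tuple[str, float]]:
    """Pair each sentence with its score and sort from most to least similar."""
    if len(sentences) != len(scores):
        raise ValueError("expected exactly one score per sentence")
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


# Placeholder inputs and scores, used only to show the shape of the data.
sentences = ["He drove to the stadium.", "It is sunny outside."]
scores = [0.1, 0.8]

for sentence, score in rank_by_similarity(sentences, scores):
    print(f"{score:.2f}  {sentence}")
```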
Example
Try out the following example in the widget above, and observe which sentence has a high score: