Embeddings
Learn how to convert text into embedding vectors and measure the similarity between two texts.
The embeddings endpoint
An embedding represents data as a vector of continuous numbers. We can provide these vectors to machine learning algorithms and models. Texts with similar meanings have embedding vectors that lie close together, while unrelated texts have vectors that lie far apart. The OpenAI API takes text as input and returns its embedding vector.
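To make the idea of "close" vectors concrete, here is a minimal sketch that compares vectors with cosine similarity. Cosine similarity is one common way to compare embeddings and is an assumption here, since the lesson has not yet named a specific measure. The three vectors are made-up toy values, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (made-up values).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))   # high score: the vectors point the same way
print(cosine_similarity(cat, invoice))  # low score: the vectors are nearly orthogonal
```

A score near 1 means the two vectors point in almost the same direction, and a score near 0 means they are nearly unrelated. Real embeddings have hundreds or thousands of dimensions, but the calculation is the same.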
The Embeddings API call
To get an embedding vector for a piece of text, we can call the following method:
response = client.embeddings.create(input="The text whose embeddings are required", model="<model_name>")
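The call above can be wrapped in a small helper that also pulls the vector out of the response. This is a sketch under a few assumptions: it uses the official openai Python package (version 1 or later), the helper names embed_text and main are hypothetical, and text-embedding-3-small is only an example model name that you should replace with the model you intend to use.

```python
def embed_text(client, text, model):
    """Request an embedding for `text` and return it as a list of floats."""
    response = client.embeddings.create(input=text, model=model)
    # The response holds one entry in `data` per input, so a single
    # string input yields a single entry at index 0.
    return response.data[0].embedding


def main():
    # Imported here so the helper above can be reused without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
    vector = embed_text(
        client,
        "The text whose embeddings are required",
        "text-embedding-3-small",  # assumed example model name
    )
    print(len(vector))  # number of dimensions in the returned embedding
```

Calling main() sends a real request, so it needs a valid API key and network access. Passing the client in as a parameter keeps embed_text easy to exercise with a stand-in object during testing.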
Understanding the embeddings endpoint
Let’s look at the embeddings endpoint in more detail, reviewing the request parameters and the response parameters.
Request parameters
The following parameters can be sent in a request to the embeddings endpoint.
| Field | Format | Required | Description |
|---|---|---|---|
| `model` | String | Required | The ID of the model to use for this request. |
| `input` | String/Array | Required | Input text to embed, encoded as either a string or an array of tokens. To embed multiple inputs in a single request, provide an array of strings or an array of token arrays. The input must not exceed the maximum input token limit for the model (8192 tokens for `text-embedding-ada-002`). |
| `encoding_format` | String | Optional | The format in which the embeddings are returned. The returned format can be either `float` or `base64`. |
| `dimensions` | Integer | Optional | The number of dimensions for the resulting output embeddings. This feature is supported starting from the `text-embedding-3` models. |
| `user` | String | Optional | A unique identifier that represents your end user, helping OpenAI monitor and identify potential misuse or abuse. |
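As an illustration of how the required and optional request fields fit together, the sketch below assembles the keyword arguments for client.embeddings.create and leaves out any optional field that was not set. The helper build_embedding_request is hypothetical, it handles string inputs only (not token arrays), and the model name, dimension count, and user ID are assumed example values.

```python
def build_embedding_request(model, texts, encoding_format=None, dimensions=None, user=None):
    """Assemble keyword arguments for client.embeddings.create.

    `texts` is a single string or a list of strings. Optional fields are
    omitted unless a value is given, so the API applies its own defaults.
    """
    if texts == "" or texts == [] or (not isinstance(texts, str) and "" in texts):
        raise ValueError("input must not be empty")
    if encoding_format not in (None, "float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    if dimensions is not None and dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")

    params = {"model": model, "input": texts}
    optional = {"encoding_format": encoding_format, "dimensions": dimensions, "user": user}
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


params = build_embedding_request(
    model="text-embedding-3-small",       # assumed example model name
    texts=["first text", "second text"],  # two inputs in one request
    dimensions=256,                       # assumed example size
    user="user-1234",                     # hypothetical end-user ID
)
print(params)
# With a configured client: response = client.embeddings.create(**params)
# Each entry in response.data has an `index` identifying the input it belongs to.
```

Building the arguments separately from the call makes it easy to inspect or log a request before sending it.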
Note ...