Practice Using ChromaDB for Multimodal Embeddings

So far in this chapter, we’ve explored vector databases and their importance in efficiently storing and retrieving high-dimensional data. In this lesson, we’ll dive deeper into using an open-source vector database by practicing with ChromaDB. Using the same image-and-descriptions dataset from the multimodal embeddings lesson (thirty images, mostly of fruits, plus a few animals, feathers, and an image representing an artificial neural network, each paired with a text description), we’ll generate multimodal embeddings with Chroma’s utility functions, store these embeddings in the database, and query the database to find semantically similar results across different data modalities (images and text). Let’s start!

Import necessary libraries and modules

First, we import chromadb to manage embeddings and collections.
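For reference, here is a minimal sketch of this step. The in-memory client shown is an assumption on our part; a persistent client would work just as well:

```python
import chromadb

# An in-memory client for experimentation; a PersistentClient could be
# used instead to keep collections on disk (this choice is an
# assumption, not the lesson's required setup).
client = chromadb.Client()
```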

We can generate embeddings outside Chroma or use embedding functions from Chroma’s embedding_functions module. We have already explored the first approach. Fortunately, Chroma also supports multimodal embedding functions, which embed data from different modalities into a unified embedding space. So, we’ll use a multimodal embedding model from Chroma’s embedding_functions module to generate embeddings for our multimodal data. To do this, we import OpenCLIPEmbeddingFunction from chromadb.utils.embedding_functions.
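A short sketch of the import and instantiation (using the function’s default OpenCLIP model, which is an assumption here):

```python
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction

# A single function that embeds both images and text into the same
# embedding space, enabling cross-modal similarity search.
embedding_function = OpenCLIPEmbeddingFunction()
```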

We’ll store the embeddings in Chroma while the data itself remains outside it. For such externally stored data, Chroma provides data loaders that load and save the data via URIs. Chroma does not store this data directly; instead, it stores the URI and loads the data from it as needed. So, to let Chroma access this data when required, we import ImageLoader from chromadb.utils.data_loaders.
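Here is a minimal sketch of how these pieces fit together; the collection name multimodal_collection and the path images/apple.jpg are hypothetical:

```python
import chromadb
from chromadb.utils.data_loaders import ImageLoader
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction

# The ImageLoader reads image data from stored URIs when Chroma needs it.
data_loader = ImageLoader()

client = chromadb.Client()

# Wire the multimodal embedding function and the data loader into the
# collection so images can be embedded directly from their URIs.
collection = client.create_collection(
    name="multimodal_collection",
    embedding_function=OpenCLIPEmbeddingFunction(),
    data_loader=data_loader,
)

# Chroma stores only the URI, not the image bytes; the loader fetches
# the image to compute its embedding. The path here is hypothetical.
collection.add(ids=["img_0"], uris=["images/apple.jpg"])
```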

Finally, we import the os module for interacting with the operating system, particularly for file handling, and pandas for data manipulation and loading CSV files.
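A sketch of these imports in use; the directory images and the file descriptions.csv are hypothetical names, not necessarily the lesson’s actual files:

```python
import os

import pandas as pd

# Hypothetical file names; substitute the lesson's actual paths.
IMAGE_DIR = "images"
descriptions = pd.read_csv("descriptions.csv")

# Collect image paths so each one can later be added to the collection
# as a URI.
image_paths = [
    os.path.join(IMAGE_DIR, name)
    for name in sorted(os.listdir(IMAGE_DIR))
    if name.lower().endswith((".jpg", ".jpeg", ".png"))
]
```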
