Traditional search methods are only effective for exact or keyword-based matches. However, in modern applications such as e-commerce, social platforms, and digital content systems, users expect results that align with their intent, not just the words they type.
This is where vector search transforms the experience. It represents data such as product descriptions, images, or text as numerical embeddings in a high-dimensional space. These embeddings capture semantic relationships, allowing the system to recognize that “blue running shoes” and “navy sports sneakers” are conceptually similar, even if they share no common keywords.
By embedding this capability directly into their database, AWS enables developers to store, index, and query embeddings within the same document database that holds their structured application data. The result is faster, more relevant, and context-aware search and recommendation features, all without the need to manage a separate vector database.
The ability to store and perform similarity searches on vector embeddings is a foundational requirement for retrieval-augmented generation (RAG) workflows. However, the best place to host these vectors depends entirely on your specific workload and architectural needs. AWS offers a diverse portfolio of fully managed services, allowing you to choose a solution optimized for hybrid search, transactional integrity, serverless capacity, or managed enterprise knowledge discovery.
Amazon OpenSearch Service (managed OpenSearch): OpenSearch supports knn_vector fields and k-NN/approximate k-NN search, using algorithms such as HNSW and IVF. It is a good fit when you need to combine keyword relevance with vector similarity (hybrid search) on a provisioned cluster.
Amazon OpenSearch Serverless (vector search collections): This is a serverless flavor of OpenSearch that provides vector search collections that scale automatically and remove cluster management. It is the ideal choice for variable or unpredictable vector workloads where you need fully managed scaling combined with OpenSearch's rich hybrid search functionality, all without provisioning nodes.
Amazon Aurora PostgreSQL + pgvector: Aurora supports the pgvector extension so you can store and index embeddings inside a PostgreSQL-compatible relational database. This is a good fit when your vectors are tightly coupled to relational schemas, transactions, and analytics, or when you prefer SQL tooling and ACID semantics. It can be very cost-effective for smaller or hybrid workloads, but typically requires relational query design (and careful indexing) to achieve vector search scale.
Amazon Kendra (GenAI/enterprise search): Kendra provides a managed, high-quality semantic document retrieval service (with re-rankers and retrieval models), aimed at enterprise search and RAG workflows. It handles the complexity of embedding generation, document chunking, and retrieval automatically, returning high-quality passage or document matches. However, it is not a general-purpose vector database for large, custom vector workloads, as it is optimized specifically for document discovery and enterprise-knowledge use cases.
Amazon DocumentDB’s vector search: Amazon DocumentDB’s vector search capability is a significant architectural offering, moving beyond a simple storage feature. It provides a production-grade approximate nearest neighbor (ANN) indexing engine directly within your operational document store. By embedding vector search efficiently in the database, AWS provides a compelling, integrated solution that eliminates the complexities traditionally associated with multi-database architectures for generative AI applications. This approach uses the familiarity and operational benefits of DocumentDB, making it a highly pragmatic choice for developers. It is especially beneficial for anyone looking to add semantic search or recommendation features to applications that already use DocumentDB as the system of record.
Amazon DocumentDB’s vector search is different in several concrete, technical ways that are important when you evaluate trade-offs.
Unified document and vector storage (no external sync): DocumentDB lets you store embeddings as fields inside the same JSON documents that hold your product metadata, user profiles, or content. This eliminates the operational complexity and eventual-consistency problems of synchronizing a separate vector database with your primary document store (no ETL or change-data-capture pipeline required). This is often a decisive advantage if your application already treats DocumentDB as the system of record.
Built-in ANN indexing algorithms (HNSW and IVFFlat) and tuning knobs: DocumentDB supports efficient approximate-nearest-neighbor indexes (including HNSW and IVFFlat options), with parameters that you can tune (index build settings and query-time parameters such as efSearch/probes). This gives you the usual vector-DB controls over build time, memory, recall, and query latency. In other words, this is not merely “DocumentDB storing vectors”; it provides production-grade ANN indexing.
MongoDB-compatible API and developer ecosystem: As DocumentDB is MongoDB-compatible, you can continue using existing drivers, tooling, ORMs, and developer patterns (aggregation pipelines, BSON documents). This lowers migration and development effort compared to adopting a new API or SDK for a dedicated vector DB. It also means you can combine vector search inside aggregation pipelines and apply existing $match filters or other pipeline stages to implement faceted, or filtered semantic search.
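As a sketch of that filtered-semantic-search pattern, the pipeline below pairs a vector `$search` stage with an ordinary `$match` stage. The collection, field names (`embedding`, `category`, `inStock`), and vector values are illustrative assumptions, not from the original text:

```python
def build_filtered_vector_search(query_vector, category, k=5):
    """Aggregation pipeline: ANN vector search, then a normal metadata filter."""
    return [
        # Approximate nearest-neighbor search over the 'embedding' field.
        {"$search": {"vectorSearch": {
            "vector": query_vector,
            "path": "embedding",
            "similarity": "cosine",
            "k": k,
        }}},
        # Post-filter the candidates with a regular $match stage.
        {"$match": {"category": category, "inStock": True}},
        {"$project": {"name": 1, "price": 1, "_id": 0}},
    ]

pipeline = build_filtered_vector_search([0.12, -0.56, 0.33], "footwear")
# With pymongo: results = products.aggregate(pipeline)
```

Because the filter runs as a separate stage after the ANN search, it narrows the `k` candidates the vector stage returns rather than expanding the search, which is worth keeping in mind when choosing `k`.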
Managed AWS operational model and security posture: DocumentDB inherits AWS managed-service features: VPC isolation, IAM controls, KMS encryption, automated backups, and availability across AZs. If you already rely on DocumentDB’s operational model and AWS networking/security controls, adding vectors keeps everything under the same operational umbrella and compliance posture. This reduces the number of separate services to secure and monitor.
When choosing DocumentDB’s vector search, it is critical to understand its constraints and when alternative services might be a better fit. If your application already uses DocumentDB as the operational, metadata store, and you want to add semantic search/recommendations without creating a separate vector pipeline, DocumentDB is often the simplest, most maintainable option.
DocumentDB’s vector feature has specific constraints and behaviors you must design around: indexed vector dimension limits apply (practical indexing up to roughly 2,000 dimensions), there are index build time and memory trade-offs for HNSW, and it requires DocumentDB 5.0 instance-based clusters.
Additionally, DocumentDB’s vector API is surfaced via aggregation/search stages (so your query patterns should be designed accordingly). These constraints mean DocumentDB is excellent when vectors are a feature of your document workload, but for massive, vector-only workloads or highly specialized vector features, a purpose-built vector engine (OpenSearch Serverless or a dedicated vector DB) may be better.
DocumentDB’s vector store is certainly not a one-size-fits-all solution. Let’s look at some scenarios where it might not be the best choice for a vector store.
If you need advanced search, hybrid ranking, or scalable serverless vector storage, OpenSearch or a dedicated vector DB may be a better fit.
If your data is highly relational and you prefer SQL/ACID semantics with vector capabilities, Aurora and pgvector can be an effective alternative.
DocumentDB’s vector search shines when you need semantic capabilities tightly integrated with existing JSON workloads. Its core strengths include unified document and vector storage, built-in ANN indexing, MongoDB-compatible APIs, and AWS-managed security and scaling. For large-scale or highly specialized vector workloads, however, services like OpenSearch Serverless or Aurora pgvector may deliver better performance and flexibility.
To understand how vector search works in practice, let’s walk through a real-world example of building a product recommendation system, using Amazon Bedrock for embedding generation and Amazon DocumentDB for semantic retrieval.
This architecture allows applications to recommend items based on semantic similarity, not just keyword matching, enabling smarter and more intuitive search experiences.
The first step is to generate product embeddings using the foundation models in Amazon Bedrock. We can use the Amazon Titan Embeddings model to represent each product as a high-dimensional vector that captures its meaning and relationships with similar items.
For example, the product description retrieved from DocumentDB is converted into an embedding vector that the system can later use to find semantically similar products.
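A minimal sketch of this step follows. It assumes pymongo and boto3, the Amazon Titan Text Embeddings V2 model, and placeholder endpoint, database, and collection names; treat all of these as illustrative rather than prescriptive:

```python
import json

def build_titan_request(text, dimensions=256):
    """Request body for the Amazon Titan Text Embeddings V2 model."""
    return json.dumps({"inputText": text, "dimensions": dimensions, "normalize": True})

def embed_product_description(product_name):
    # Requires boto3 and pymongo; endpoint and credentials are placeholders.
    import boto3
    from pymongo import MongoClient

    # Fetch the product document from DocumentDB.
    client = MongoClient("mongodb://<user>:<password>@<cluster-endpoint>:27017/?tls=true")
    product = client["shop"]["products"].find_one({"name": product_name})

    # Convert its description into an embedding via Amazon Bedrock.
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=build_titan_request(product["description"]),
    )
    return json.loads(response["body"].read())["embedding"]

body = json.loads(build_titan_request("blue running shoes"))
```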
Here, we first connect to DocumentDB to fetch the product details, then invoke the embedding model to convert the description into a high-dimensional vector that captures its semantic meaning. This is useful for similarity searches or recommendations within the same database.
Next, we store these embeddings in an Amazon DocumentDB collection, and update the existing product document in Amazon DocumentDB to include its newly generated embedding.
This keeps both structured product details and unstructured vector data in one place, simplifying architecture and avoiding the need for external vector databases.
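The update itself can be sketched as follows, assuming pymongo, a `products` collection, and an `embedding` field name (all illustrative):

```python
def embedding_update(vector):
    """Update document that writes the vector under the 'embedding' field."""
    return {"$set": {"embedding": vector}}

def attach_embedding(collection, product_name, vector):
    # Locate the product by name and add (or overwrite) its embedding field.
    return collection.update_one({"name": product_name}, embedding_update(vector))

# Example usage, with 'products' as a pymongo Collection and 'embedding'
# as the vector returned by the Bedrock model:
# attach_embedding(products, "Blue Running Shoes", embedding)
update = embedding_update([0.12, -0.56, 0.33])
```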
This code connects to Amazon DocumentDB and updates an existing product document by adding the generated embedding vector. It locates the product using its name (“Blue Running Shoes”) and stores the new vector under the embedding field, enabling semantic similarity searches directly within the same collection.
Now, the product document in DocumentDB includes both its traditional fields (like name, category, and price) and an embedding vector for semantic similarity searches.
When a user interacts with the application, for example, by searching for or viewing a product, you can generate an embedding for that query. You can then use Amazon DocumentDB’s vector search capability to find items with embeddings similar to the query vector.
Before performing vector search, we create an HNSW index on the embedding field to enable fast similarity-based retrieval as follows:
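A sketch of the index creation, shown with pymongo (the `vectorOptions` keys mirror AWS's documented createIndex options; the index name, `m`, and `efConstruction` values are illustrative defaults):

```python
def hnsw_index_options(dimensions=256, m=16, ef_construction=64):
    """vectorOptions for an HNSW vector index in Amazon DocumentDB."""
    return {
        "type": "hnsw",                      # ANN algorithm
        "dimensions": dimensions,            # must match the embedding model's output size
        "similarity": "cosine",              # distance metric for comparisons
        "m": m,                              # max connections per graph node
        "efConstruction": ef_construction,   # build-time candidate list size
    }

# With 'products' as a pymongo Collection on a DocumentDB 5.0 cluster:
# products.create_index(
#     [("embedding", "vector")],
#     name="embedding_hnsw_idx",
#     vectorOptions=hnsw_index_options(),
# )
options = hnsw_index_options()
```

Larger `m` and `efConstruction` values generally improve recall at the cost of build time and memory.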
Here, we create an HNSW vector index on the embedding field in Amazon DocumentDB. The index is configured with 256 dimensions (matching the embedding size) and uses the cosine similarity metric to measure how close two vectors are.
Behind the scenes, DocumentDB uses an HNSW-based approximate nearest neighbor (ANN) index to efficiently retrieve the most relevant matches.
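The query path can be sketched like this, again assuming boto3, pymongo, the Titan Text Embeddings V2 model, and an `embedding` field (illustrative names throughout):

```python
import json

def vector_search_stage(query_vector, k=5):
    """$search stage that compares a query vector with stored embeddings."""
    return {"$search": {"vectorSearch": {
        "vector": query_vector,
        "path": "embedding",
        "similarity": "cosine",
        "k": k,                # number of nearest neighbors to return
    }}}

def query_similar_products(collection, bedrock, query_text, k=5):
    # Embed the query with the same Titan model used at ingestion time,
    # so query and document vectors live in the same space.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": query_text, "dimensions": 256, "normalize": True}),
    )
    query_vector = json.loads(resp["body"].read())["embedding"]
    return list(collection.aggregate([vector_search_stage(query_vector, k)]))

# e.g. query_similar_products(products, bedrock, "Lightweight jogging sneakers")
stage = vector_search_stage([0.1, 0.2, 0.3])
```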
Here, we generate an embedding for a query (e.g., “Lightweight jogging sneakers”) and perform a vector search in Amazon DocumentDB.
The $search stage compares this query vector with stored embeddings and returns the top 5 most semantically similar products.
Possible matches might include:
“Navy performance sneakers.”
“Breathable sports trainers.”
“Lightweight mesh running shoes.”
Each result is ranked based on its vector similarity score, representing how closely related it is to the query in meaning.
To get the most out of Amazon DocumentDB’s vector search, follow these best practices to optimize performance, accuracy, and scalability.
Align embedding dimensions with your model: Ensure that your vector field dimensions match those produced by your embedding model (for example, 256 for Amazon Titan Embeddings). This alignment prevents schema mismatches and ensures optimal vector indexing and retrieval.
Use HNSW indexing for large datasets: The Hierarchical Navigable Small World (HNSW) algorithm provides a balanced trade-off between recall accuracy and query speed. This makes it well-suited for large-scale, high-performance semantic search.
Batch insert vectors during ingestion: Load embeddings in bulk instead of one at a time to reduce API call overhead, minimize network latency, and speed up initial data population.
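A minimal sketch of batched ingestion, assuming pymongo and a hypothetical `named_vectors` mapping of product names to embeddings:

```python
def chunked(items, size=100):
    """Split a list into fixed-size batches for bulk writes."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def bulk_attach_embeddings(collection, named_vectors, batch_size=100):
    # Requires pymongo; one round trip per batch instead of per document.
    from pymongo import UpdateOne
    for batch in chunked(list(named_vectors.items()), batch_size):
        ops = [UpdateOne({"name": name}, {"$set": {"embedding": vec}})
               for name, vec in batch]
        collection.bulk_write(ops, ordered=False)

batches = chunked(list(range(250)), 100)
```

`ordered=False` lets the server apply the remaining writes in a batch even if one fails, which suits idempotent embedding updates.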
Monitor performance metrics in Amazon CloudWatch: Track key metrics like query latency, index build time, and memory utilization. Configure CloudWatch alarms to automatically trigger actions, such as scaling up instance size, adding read replicas, or rebuilding vector indexes if latency or resource usage exceeds defined thresholds. This ensures consistent performance as your dataset grows.
Refresh embeddings periodically: Set up an automated workflow (for example, an AWS Lambda function triggered by Amazon DocumentDB change streams or Amazon EventBridge) to recompute and update embeddings whenever source content, such as product descriptions, user profiles, or documents, changes. This ensures that your vector index always reflects the latest data, maintaining accurate and context-aware search results.
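A sketch of such a refresh handler is shown below. The event shape and field names are illustrative assumptions about a DocumentDB change-stream trigger, and `re_embed_and_store` is a hypothetical helper, not a real API:

```python
def lambda_handler(event, context):
    """Recompute embeddings for documents changed in DocumentDB.

    Assumes a change-stream event source mapping; the event structure
    below ('events' -> 'event' -> change fields) is illustrative.
    """
    updated = 0
    for record in event.get("events", []):
        change = record.get("event", {})
        if change.get("operationType") in ("insert", "update", "replace"):
            doc = change.get("fullDocument", {})
            if "description" in doc:
                # re_embed_and_store() is a hypothetical helper that would call
                # Bedrock and write the fresh vector back to DocumentDB.
                # re_embed_and_store(doc)
                updated += 1
    return {"updated": updated}

result = lambda_handler(
    {"events": [{"event": {"operationType": "update",
                           "fullDocument": {"description": "Blue Running Shoes"}}}]},
    None,
)
```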
AI-driven personalization is redefining how users interact with applications, and Amazon DocumentDB’s vector search brings that intelligence directly to your data layer.
It allows developers to store, index, and query embeddings alongside structured data, eliminating the need for complex external vector stores and streamlining the entire retrieval-augmented generation workflow. The result is a simpler, more unified architecture for building semantic search, product recommendations, and intelligent data exploration, all within the same managed database environment.
As organizations increasingly move beyond traditional keyword search toward context-aware intelligence, Amazon DocumentDB’s vector search stands out as a practical and scalable solution. Whether you’re enhancing existing applications or building AI-optimized systems from the ground up, this is the right time to explore what DocumentDB can do for your AI-powered workloads.
It's also a great time to explore the following Cloud Labs: