A practical guide to vector search in Amazon DocumentDB

Discover how Amazon DocumentDB brings vector search natively to your document database—enabling intent-based and semantic search without managing a separate vector store.
10 mins read
Nov 21, 2025

Traditional search methods are only effective for exact or keyword-based matches. However, in modern applications such as e-commerce, social platforms, and digital content systems, users expect results that align with their intent, not just the words they type.

This is where vector search transforms the experience. It represents data such as product descriptions, images, or text as numerical embeddings in a high-dimensional space. These embeddings capture semantic relationships, allowing the system to recognize that “blue running shoes” and “navy sports sneakers” are conceptually similar, even if they share no common keywords.
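Under the hood, "conceptually similar" usually means a high cosine similarity between embedding vectors. The pure-Python sketch below uses made-up 4-dimensional vectors (real embeddings have hundreds of dimensions) to illustrate the metric that most vector indexes rank by:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 mean
    the vectors point in nearly the same direction (similar meaning)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically similar products point in similar directions.
blue_running_shoes = [0.9, 0.8, 0.1, 0.0]
navy_sports_sneakers = [0.85, 0.75, 0.15, 0.05]
cast_iron_skillet = [0.05, 0.1, 0.9, 0.8]

print(cosine_similarity(blue_running_shoes, navy_sports_sneakers))  # high, near 1.0
print(cosine_similarity(blue_running_shoes, cast_iron_skillet))     # much lower
```

Even though "blue running shoes" and "navy sports sneakers" share no keywords, their vectors sit close together in the embedding space, which is exactly what a vector index exploits.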

By embedding this capability directly into their database, AWS enables developers to store, index, and query embeddings within the same document database that holds their structured application data. The result is faster, more relevant, and context-aware search and recommendation features, all without the need to manage a separate vector database.

What are the available options for vector search in AWS?

The ability to store and perform similarity searches on vector embeddings is a foundational requirement for retrieval-augmented generation (RAG) workflows. However, the best place to host these vectors depends entirely on your specific workload and architectural needs. AWS offers a diverse portfolio of fully managed services, allowing you to choose a solution optimized for hybrid search, transactional integrity, serverless capacity, or managed enterprise knowledge discovery.

  • Amazon OpenSearch Service (managed OpenSearch): OpenSearch supports knn_vector fields and k-NN/approximate k-NN search, using algorithms such as HNSW (Hierarchical Navigable Small World, an algorithm for approximate nearest neighbor search in high-dimensional datasets). It is a mature search engine that seamlessly combines full-text search, structured filtering, aggregations, and vector similarity, and it supports very high vector dimensionality and rich query/score customization. OpenSearch is a natural choice when you need advanced search features and hybrid text-and-vector ranking, backed by a rich analytics and search surface.

  • Amazon OpenSearch Serverless (vector search collections): This serverless flavor of OpenSearch provides vector search collections that auto-scale and remove cluster management. It’s designed for variable or unpredictable workloads and for teams that want vector search without managing capacity, while retaining OpenSearch’s hybrid search capabilities, all without provisioning nodes.

  • Amazon Aurora PostgreSQL + pgvector: Aurora supports the pgvector extension so you can store and index embeddings inside a PostgreSQL-compatible relational database. This is a good fit when your vectors are tightly coupled to relational schemas, transactions, and analytics, or when you prefer SQL tooling and ACID semantics. It can be very cost-effective for smaller or hybrid workloads, but typically requires relational query design (and careful indexing) to achieve vector search scale.

  • Amazon Kendra (GenAI/enterprise search): Kendra provides a managed, high-quality semantic document retrieval service (with re-rankers and retrieval models), aimed at enterprise search and RAG workflows. It handles the complexity of embedding generation, document chunking, and retrieval automatically, returning high-quality passage or document matches. However, it is not a general-purpose vector database for large, custom vector workloads, as it is optimized specifically for document discovery and enterprise-knowledge use cases.

  • Amazon DocumentDB’s vector search: Amazon DocumentDB’s vector search is a significant architectural offering that goes beyond simple vector storage: it provides a production-grade approximate nearest neighbor (ANN) indexing engine directly within your operational document store. By embedding vector search in the database itself, AWS offers a compelling, integrated solution that eliminates the complexity traditionally associated with multi-database architectures for generative AI applications. This approach leverages the familiarity and operational benefits of DocumentDB, making it a pragmatic choice for developers, especially those looking to add semantic search or recommendation features to applications that already use DocumentDB as the system of record.

Amazon DocumentDB’s vector search is different in several concrete, technical ways that are important when you evaluate trade-offs.

  1. Unified document and vector storage (no external sync): DocumentDB lets you store embeddings as fields inside the same JSON documents that hold your product metadata, user profiles, or content. This eliminates the operational complexity and eventual-consistency problems of synchronizing a separate vector database with your primary document store (no ETL or change-data-capture pipeline required). This is often a decisive advantage if your application already treats DocumentDB as the system of record.

  2. Built-in ANN indexing algorithms (HNSW and IVFFlat) and tuning knobs: DocumentDB supports efficient approximate nearest neighbor (ANN) indexes (including HNSW and IVFFlat options), with parameters you can tune, such as index build settings and query-time parameters like efSearch and probes. This gives you the usual vector-database controls over build time, memory, recall, and query latency. In other words, this is not merely "DocumentDB storing vectors"; it provides production-grade ANN indexing.

  3. MongoDB-compatible API and developer ecosystem: Because DocumentDB is MongoDB-compatible, you can continue using existing drivers, tooling, ORMs, and developer patterns (aggregation pipelines, BSON documents). This lowers migration and development effort compared to adopting a new API or SDK for a dedicated vector DB. It also means you can run vector search inside aggregation pipelines and apply existing $match filters or other pipeline stages to implement faceted or filtered semantic search.

  4. Managed AWS operational model and security posture: DocumentDB inherits AWS managed-service features: VPC isolation, IAM controls, KMS encryption, automated backups, and availability across AZs. If you already rely on DocumentDB’s operational model and AWS networking/security controls, adding vectors keeps everything under the same operational umbrella and compliance posture. This reduces the number of separate services to secure and monitor.
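As a concrete illustration of point 3 above, the sketch below builds an aggregation pipeline that runs a vector similarity stage and then post-filters the candidates with an ordinary $match. The field names inside vectorSearch follow the DocumentDB vector search API as documented by AWS; category, price, and product_name are assumed fields in a hypothetical catalog:

```python
def build_filtered_vector_pipeline(query_vector, category, k=5):
    """Sketch of a filtered semantic search: ANN retrieval first,
    then a structural $match over the candidates, then a projection."""
    return [
        {
            "$search": {
                "vectorSearch": {
                    "vector": query_vector,   # the query embedding
                    "path": "embedding",      # field holding stored embeddings
                    "similarity": "cosine",   # metric used by the index
                    "k": k,                   # number of nearest neighbors
                }
            }
        },
        {"$match": {"category": category}},          # ordinary filter stage
        {"$project": {"product_name": 1, "price": 1, "_id": 0}},
    ]

pipeline = build_filtered_vector_pipeline([0.1, 0.2, 0.3], "footwear")
```

You would pass this list to `collection.aggregate(pipeline)` exactly as with any other MongoDB-style pipeline, which is the practical payoff of the API compatibility.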

Important considerations before using DocumentDB as a vector store

When choosing DocumentDB’s vector search, it is critical to understand its constraints and when alternative services might be a better fit. If your application already uses DocumentDB as its operational metadata store and you want to add semantic search or recommendations without creating a separate vector pipeline, DocumentDB is often the simplest, most maintainable option.

Practical limits and constraints that you must consider

DocumentDB’s vector feature has specific constraints and behaviors you must design around: indexed vector dimension limits (indexing guidance and limits apply; in practice, indexing supports up to a couple of thousand dimensions), index build time and memory trade-offs for HNSW, and the requirement to run on DocumentDB 5.0 instance-based clusters.

Additionally, DocumentDB’s vector API is surfaced via aggregation/search stages (so your query patterns should be designed accordingly). These constraints mean DocumentDB is excellent when vectors are a feature of your document workload, but for massive, vector-only workloads or highly specialized vector features, a purpose-built vector engine (OpenSearch Serverless or a dedicated vector DB) may be better.

When DocumentDB isn’t the right choice

DocumentDB’s vector store is certainly not a one-size-fits-all solution. Let’s look at some scenarios where it might not be the best choice for a vector store.

  • If you need advanced search, hybrid ranking, or scalable serverless vector storage, OpenSearch or a dedicated vector DB may be a better fit.

  • If your data is highly relational and you prefer SQL/ACID semantics with vector capabilities, Aurora and pgvector can be an effective alternative.

DocumentDB’s vector search shines when you need semantic capabilities tightly integrated with existing JSON workloads. Its core strengths include unified document and vector storage, built-in ANN indexing, MongoDB-compatible APIs, and AWS-managed security and scaling. For large-scale or highly specialized vector workloads, however, services like OpenSearch Serverless or Aurora pgvector may deliver better performance and flexibility.

Recommendation system with Amazon DocumentDB

To understand how vector search works in practice, let’s walk through a real-world example of building a product recommendation system, using Amazon Bedrock for embedding generation and Amazon DocumentDB for semantic retrieval.

This architecture allows applications to recommend items based on semantic similarity, not just keyword matching, enabling smarter and more intuitive search experiences.

High-level architecture of a semantic recommendation system

Step 1: Generate product embeddings with Amazon Bedrock

The first step is to generate product embeddings using the foundation models in Amazon Bedrock. We can use the Amazon Titan Embeddings model to represent each product as a high-dimensional vector that captures its meaning and relationships with similar items.

For example, the product description retrieved from DocumentDB is converted into an embedding vector that the system can later use to find semantically similar products.

import boto3
import json
from pymongo import MongoClient

# Initialize a boto3 session (credentials shown as placeholders)
session = boto3.session.Session(
    aws_access_key_id="{{access_key_id}}",
    aws_secret_access_key="{{secret_access_key}}",
    region_name="us-east-1"
)
bedrock = session.client("bedrock-runtime")

# Connect to Amazon DocumentDB
client = MongoClient("mongodb://<your-docdb-endpoint>")
db = client["productdb"]
collection = db["items"]

# Fetch the product document from DocumentDB
product = collection.find_one({"product_name": "Blue Running Shoes"})
product_description = product["description"]

# Generate an embedding for the product description
# ("dimensions" must match the vector index created later)
response = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": product_description, "dimensions": 256})
)

# Extract the embedding vector
embedding = json.loads(response["body"].read())["embedding"]

Here, we first connect to DocumentDB to fetch the product details, then invoke the embedding model to convert the description into a high-dimensional vector that captures its semantic meaning. This is useful for similarity searches or recommendations within the same database.

Step 2: Store embeddings in Amazon DocumentDB

Next, we store the embedding in Amazon DocumentDB by updating the existing product document to include its newly generated vector.

This keeps both structured product details and unstructured vector data in one place, simplifying architecture and avoiding the need for external vector databases.

from pymongo import MongoClient

# Connect to Amazon DocumentDB
client = MongoClient("mongodb://<your-docdb-endpoint>")
db = client["productdb"]
collection = db["items"]

# Update the existing product document with its embedding
collection.update_one(
    {"product_name": "Blue Running Shoes"},
    {"$set": {"embedding": embedding}}
)

This code connects to Amazon DocumentDB and updates an existing product document by adding the generated embedding vector. It locates the product using its name (“Blue Running Shoes”) and stores the new vector under the embedding field, enabling semantic similarity searches directly within the same collection.

Now, the product document in DocumentDB includes both its traditional fields (like name, category, and price) and an embedding vector for semantic similarity searches.
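A stored document might now look like the following (a hypothetical example; only the embedding field matters to the vector index, and the vector is truncated to three values for readability):

```python
# Hypothetical shape of a product document after Step 2.
product_document = {
    "product_name": "Blue Running Shoes",
    "category": "footwear",
    "price": 89.99,
    "description": "Lightweight blue running shoes with breathable mesh.",
    # In reality this is a 256-dimensional vector; first 3 values shown.
    "embedding": [0.0132, -0.0457, 0.0981],
}
```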

When a user interacts with the application, for example, by searching for or viewing a product, you can generate an embedding for that query. You can then use Amazon DocumentDB’s vector search capability to find items with embeddings similar to the query vector.

Before performing vector search, we create an HNSW index on the embedding field to enable fast similarity-based retrieval as follows:

# Create an HNSW vector index on the 'embedding' field
collection.create_index(
    [("embedding", "vector")],
    name="embedding_hnsw_index",
    vectorOptions={
        "type": "hnsw",
        "dimensions": 256,       # must match the embedding model's output size
        "similarity": "cosine"   # distance metric used for ranking
    }
)

Here, we create an HNSW vector index on the embedding field in Amazon DocumentDB. The index is configured with 256 dimensions (matching the embedding size) and uses the cosine similarity metric to measure how close two vectors are.

Behind the scenes, DocumentDB uses an HNSW-based approximate nearest neighbor (ANN) index to efficiently retrieve the most relevant matches.

# get_embedding is an assumed helper that wraps the Bedrock
# invoke_model call from Step 1 and returns the embedding list
query_vector = get_embedding("Lightweight jogging sneakers")

results = collection.aggregate([
    {
        "$search": {
            "vectorSearch": {
                "vector": query_vector,   # the input query vector
                "path": "embedding",      # field containing vector embeddings
                "similarity": "cosine",   # must match the index's metric
                "k": 5                    # retrieve the top 5 most similar items
            }
        }
    }
])

Here, we generate an embedding for a query (e.g., “Lightweight jogging sneakers”) and perform a vector search in Amazon DocumentDB.

The $search stage compares this query vector with stored embeddings and returns the top 5 most semantically similar products.
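Recall and latency can also be traded off at query time. The sketch below builds the same stage with an explicit efSearch value, the HNSW query-time knob (field names per the DocumentDB vector search API; a larger efSearch widens the candidate list scanned per query, raising recall at the cost of latency):

```python
def build_vector_search_stage(query_vector, k=5, ef_search=64):
    """Sketch of a DocumentDB $search stage with an explicit efSearch.
    Increase ef_search for better recall, decrease it for lower latency."""
    return {
        "$search": {
            "vectorSearch": {
                "vector": query_vector,
                "path": "embedding",
                "similarity": "cosine",
                "k": k,
                "efSearch": ef_search,
            }
        }
    }

# A wider candidate list for a recall-sensitive query:
stage = build_vector_search_stage([0.1] * 256, k=5, ef_search=128)
```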

Possible matches might include:

  • “Navy performance sneakers.”

  • “Breathable sports trainers.”

  • “Lightweight mesh running shoes.”

Each result is ranked based on its vector similarity score, representing how closely related it is to the query in meaning.
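Conceptually, that ranking works as follows. This pure-Python illustration uses toy 3-dimensional vectors; DocumentDB performs the equivalent computation over its ANN index instead of scanning every document:

```python
import math

def cosine_similarity(a, b):
    """Similarity score in [-1, 1]; higher means closer in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

query = [0.9, 0.8, 0.1]  # toy embedding for "Lightweight jogging sneakers"
catalog = {
    "Navy performance sneakers": [0.85, 0.8, 0.2],
    "Breathable sports trainers": [0.8, 0.7, 0.3],
    "Cast iron skillet": [0.1, 0.05, 0.9],
}

# Rank products by similarity to the query vector, highest first.
ranked = sorted(catalog.items(),
                key=lambda kv: cosine_similarity(query, kv[1]),
                reverse=True)
for name, vec in ranked:
    print(name, round(cosine_similarity(query, vec), 3))
```

The footwear items score near 1.0 while the unrelated kitchen item falls to the bottom, mirroring how the similarity score orders real search results.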

Best practices for implementing vector search in DocumentDB

To get the most out of Amazon DocumentDB’s vector search, follow these best practices to optimize performance, accuracy, and scalability.

  • Align embedding dimensions with your model: Ensure that your vector field dimensions match those produced by your embedding model (for example, 256 for Amazon Titan Embeddings). This alignment prevents schema mismatches and ensures optimal vector indexing and retrieval.

  • Use HNSW indexing for large datasets: The Hierarchical Navigable Small World (HNSW) algorithm provides a balanced trade-off between recall accuracy and query speed. This makes it well-suited for large-scale, high-performance semantic search.

  • Batch insert vectors during ingestion: Load embeddings in bulk instead of one at a time to reduce API call overhead, minimize network latency, and speed up initial data population.

  • Monitor performance metrics in Amazon CloudWatch: Track key metrics like query latency, index build time, and memory utilization. Configure CloudWatch alarms to automatically trigger actions, such as scaling up instance size, adding read replicas, or rebuilding vector indexes if latency or resource usage exceeds defined thresholds. This ensures consistent performance as your dataset grows.

  • Refresh embeddings periodically: Set up an automated workflow, for example, an AWS Lambda function triggered by Amazon DocumentDB change streams or Amazon EventBridge. Do this to recompute and update embeddings whenever source content such as product descriptions, user profiles, or documents changes. This ensures that your vector index always reflects the latest data, maintaining accurate and context-aware search results.
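To make the batching advice concrete, here is a minimal chunking helper (pure Python; the bulk call itself, e.g. pymongo's insert_many or bulk_write, is only referenced in a comment, and the document shapes are illustrative):

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive fixed-size batches from an iterable."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

# Sketch: instead of one write per product, group documents into batches
# and send each batch with a single bulk call (e.g. collection.insert_many).
documents = [{"product_id": i, "embedding": [0.0] * 4} for i in range(10)]
batches = list(chunked(documents, 4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Batch sizes of a few hundred documents are a common starting point; tune against your instance's memory and write throughput.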

Final thoughts

AI-driven personalization is redefining how users interact with applications and Amazon DocumentDB’s vector search brings that intelligence directly to your data layer.

It allows developers to store, index, and query embeddings alongside structured data, eliminates the need for complex external vector stores, and streamlines the entire retrieval-augmented workflow. The result is a simpler, more unified architecture for building semantic search, product recommendations, and intelligent data exploration, all within the same managed database environment.

As organizations increasingly move beyond traditional keyword search toward context-aware intelligence, Amazon DocumentDB’s vector search stands out as a practical and scalable solution. Whether you’re enhancing existing applications or building AI-integrated/optimized systems from the ground-up, this is the right time to explore what DocumentDB can do for your AI-powered workloads.


Written By:
Fahim ul Haq