Vector Search, Text Search, and Hybrid Search
Explore the strengths and limitations of vector search, BM25 keyword search, and hybrid search methods used in vector databases. Learn how to apply Reciprocal Rank Fusion to combine retrieval results for improved precision and recall. Understand when to use each retrieval strategy based on query types and corpus characteristics.
When a user types “What is the GDPR data-residency requirement for EU citizens?” into a RAG-powered application, the query carries two distinct signals. “GDPR” is a precise acronym that must match documents containing that exact term. “Data-residency requirement,” on the other hand, is a conceptual phrase whose meaning could appear in passages worded as “obligation to store personal data within European borders” without ever using the word “residency.” No single retrieval method handles both signals optimally. Pick the wrong one, and the system either returns semantically related but irrelevant passages or drops the one document that contains the critical acronym alongside the right context.
With a vector database selected, the next decision that directly shapes answer quality is how you search it. This lesson walks through three retrieval strategies: pure vector search, BM25 keyword search, and hybrid search. It explains the mechanics of each, when to reach for one over another, and how Reciprocal Rank Fusion (RRF) merges their results into a single high-quality ranked list.
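RRF itself is a small algorithm, and seeing it in code clarifies what “merging” means here. The sketch below is illustrative only (the document IDs are hypothetical, and the constant k = 60 is the value commonly used in practice): each retriever contributes 1 / (k + rank) per document, and documents ranked highly by multiple retrievers rise to the top.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k: int = 60):
    """Fuse ranked lists of doc IDs; ranks are 1-based per RRF convention."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: one list from vector search, one from BM25.
vector_hits = ["doc_A", "doc_B", "doc_C"]
bm25_hits = ["doc_C", "doc_A", "doc_D"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # doc_A wins: it is ranked highly by both retrievers
```

Note that RRF needs only the rank positions, never the raw scores, which is why it can fuse a cosine-similarity ranking with a BM25 ranking despite their incompatible score scales.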
Pure vector search
Vector search, also called semantic search, works by converting both the user query and every stored document chunk into numerical vectors that live in the same high-dimensional space. At query time, the system computes a similarity metric, typically cosine similarity, between the query vector and each stored vector, then returns the chunks whose vectors lie closest to the query.
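The similarity computation itself is simple linear algebra. The sketch below is illustrative only: it uses toy 4-dimensional vectors and a brute-force scan, whereas real embeddings have hundreds of dimensions and production systems use approximate nearest-neighbor indexes instead of comparing against every stored vector.

```python
import numpy as np

def cosine_similarity(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of doc vectors."""
    query_unit = query / np.linalg.norm(query)
    doc_units = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return doc_units @ query_unit

# Toy embeddings: each row is one stored document chunk.
docs = np.array([
    [0.9, 0.1, 0.0, 0.1],  # doc 0: points in nearly the query's direction
    [0.0, 1.0, 0.2, 0.0],  # doc 1: unrelated direction
    [0.8, 0.2, 0.1, 0.2],  # doc 2: also close to the query
])
query = np.array([1.0, 0.1, 0.0, 0.1])

scores = cosine_similarity(query, docs)
ranked = np.argsort(-scores)  # indices sorted by descending similarity
print(ranked)  # docs 0 and 2 outrank doc 1
```

Because cosine similarity measures the angle between vectors rather than their magnitude, a short chunk and a long chunk about the same topic can score equally well.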
Because embeddings encode meaning rather than surface tokens, vector search handles vocabulary mismatch gracefully. A query containing “automobile” retrieves documents that only mention “car.” It also works across languages when multilingual embedding models are used, and it excels at open-ended or conversational queries where intent matters more than any single keyword.
The limitations are equally important to understand. Rare proper nouns, product SKUs, error codes, and domain-specific identifiers are often underrepresented during embedding model training. The model compresses these tokens into a generic region of the vector space, losing the precise surface form. Retrieval quality also depends heavily on alignment between the embedding model’s training domain and the target corpus. A general-purpose model applied to a specialized legal corpus will degrade recall on domain-specific terminology.
Attention: A high cosine similarity score does not guarantee factual relevance. Two passages can be semantically close in embedding space yet discuss different entities. Always pair vector search with downstream reranking or validation in production RAG pipelines.
The following quiz tests whether you can identify this exact-match limitation in a realistic scenario.
Lesson Quiz
A RAG application fails to retrieve relevant documents when users search for the error code 'ERR_SSL_PROTOCOL_ERROR' despite those documents existing in the corpus. What is the most likely cause when using pure vector search?
The vector index is misconfigured and needs to be rebuilt with different parameters
Embedding models encode meaning rather than exact token sequences, causing rare identifiers to map to generic vector regions
Cosine similarity is the wrong metric for this type of query and should be replaced with dot product
The metadata filters are excluding documents that contain the error code
BM25 keyword search
BM25 (Best Matching 25) is a probabilistic ranking function that scores documents based on term frequency (how often each query term appears in a document), inverse document frequency (how rare that term is across the corpus), and a length normalization that keeps long documents from dominating the ranking.
At indexing time, BM25 tokenizes every document and builds an inverted index: a mapping from each term to the documents that contain it, along with per-document term frequencies. At query time, only documents that share at least one token with the query need to be scored, which keeps lookups fast even over large corpora.
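A minimal scorer makes these mechanics concrete. The sketch below is deliberately simplified (whitespace tokenization, no stemming or stopword removal) and uses the common default parameters k1 = 1.5 and b = 0.75; production systems rely on mature implementations such as Lucene's rather than hand-rolled code.

```python
import math
from collections import Counter

def bm25_scores(query: str, documents: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with the BM25 ranking function."""
    tokenized = [doc.lower().split() for doc in documents]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # literal token match only -- no synonyms
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
            score += idf * norm
        scores.append(score)
    return scores

docs = [
    "the car would not start this morning",
    "my automobile needs a new battery",
    "we watched the race on television",
]
print(bm25_scores("car battery", docs))
# Doc 0 matches "car", doc 1 matches "battery"; the automobile/car
# synonym in doc 1 earns it no extra credit, and doc 2 scores zero.
```

Raising b pushes the length normalization harder, penalizing long documents; setting b to 0 disables it entirely.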
The strengths and limitations of BM25 mirror those of vector search almost exactly, but in reverse.
Exact-match precision: BM25 excels when queries contain acronyms, legal statute numbers, error codes, or product names, because it matches on the literal token.
Computational efficiency: An inverted-index lookup is fast and well understood, having powered traditional search engines for decades.
Synonym blindness: “Automobile” and “car” are treated as completely unrelated tokens, so BM25 misses relevant documents that use different vocabulary.
Conversational query degradation: Long, natural-language queries dilute the keyword signal, causing ranking quality to drop compared to vector search.
This contrast between ...