
Vector Search, Text Search, and Hybrid Search

Explore the strengths and limitations of vector search, BM25 keyword search, and hybrid search methods used in vector databases. Learn how to apply Reciprocal Rank Fusion to combine retrieval results for improved precision and recall. Understand when to use each retrieval strategy based on query types and corpus characteristics.

When a user types “What is the GDPR data-residency requirement for EU citizens?” into a RAG-powered application, the query carries two distinct signals. “GDPR” is a precise acronym that must match documents containing that exact term. “Data-residency requirement,” on the other hand, is a conceptual phrase whose meaning could appear in passages worded as “obligation to store personal data within European borders” without ever using the word “residency.” No single retrieval method handles both signals optimally. Pick the wrong one, and the system either returns semantically related but irrelevant passages or drops the one document that contains the critical acronym alongside the right context.

With a vector database selected, the next decision that directly shapes answer quality is how you search it. This lesson walks through three retrieval strategies: pure vector search, BM25 keyword search, and hybrid search. It explains the mechanics of each, when to reach for one over another, and how Reciprocal Rank Fusion (RRF) merges their results into a single high-quality ranked list.

Pure vector search

Vector search, also called semantic search, works by converting both the user query and every stored document chunk into numerical vectors that live in the same high-dimensional space. At query time, the system computes a similarity metric between the query vector and every document vector, typically cosine similarity (a measure of the angle between two vectors, where a value of 1 means the vectors point in the same direction, maximum similarity, and 0 means they are orthogonal, no similarity) or the dot product, then returns the top-k most similar results.
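To make the mechanics concrete, here is a minimal sketch of cosine-similarity scoring and top-k selection over raw vectors. The function names, toy vectors, and document IDs are illustrative; a real system would use an approximate nearest-neighbor index rather than scanning every vector.

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|): 1 means same direction, 0 means orthogonal
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    # Score every stored document vector against the query, return the k best IDs.
    # A production system replaces this linear scan with an ANN index (HNSW, IVF, etc.).
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy 2-dimensional "embeddings" standing in for real model output
docs = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [0.9, 0.2]}
print(top_k([1.0, 0.0], docs, k=2))
```

The brute-force scan is O(n) per query, which is exactly why dedicated vector databases exist: they trade a little recall for sub-linear lookup via approximate indexes.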

Because embeddings encode meaning rather than surface tokens, vector search handles vocabulary mismatch gracefully. A query containing “automobile” retrieves documents that only mention “car.” It also works across languages when multilingual embedding models are used, and it excels at open-ended or conversational queries where intent matters more than any single keyword.

The limitations are equally important to understand. Rare proper nouns, product SKUs, error codes, and domain-specific identifiers are often underrepresented during embedding model training. The model compresses these tokens into a generic region of the vector space, losing the precise surface form. Retrieval quality also depends heavily on alignment between the embedding model’s training domain and the target corpus. A general-purpose model applied to a specialized legal corpus will degrade recall on domain-specific terminology.

Attention: A high cosine similarity score does not guarantee factual relevance. Two passages can be semantically close in embedding space yet discuss different entities. Always pair vector search with downstream reranking or validation in production RAG pipelines.

The following quiz tests whether you can identify this exact-match limitation in a realistic scenario.

Lesson Quiz

1. A RAG application fails to retrieve relevant documents when users search for the error code 'ERR_SSL_PROTOCOL_ERROR' despite those documents existing in the corpus. What is the most likely cause when using pure vector search?

   A. The vector index is misconfigured and needs to be rebuilt with different parameters

   B. Embedding models encode meaning rather than exact token sequences, causing rare identifiers to map to generic vector regions

   C. Cosine similarity is the wrong metric for this type of query and should be replaced with dot product

   D. The metadata filters are excluding documents that contain the error code



BM25 keyword search

BM25 (Best Matching 25) is a probabilistic ranking function that scores documents based on term frequency–inverse document frequency (TF-IDF), a weighting scheme in which a term's importance increases with its frequency in a document but decreases with its frequency across the entire corpus, so common words like "the" carry little weight. The core intuition is straightforward: a term that appears frequently in a specific document but rarely across the entire corpus is a strong signal that the document is relevant to a query containing that term.

At indexing time, BM25 tokenizes every document and builds an inverted index, a data structure that maps each unique term to the list of documents (and positions) where it appears, enabling fast lookup by keyword. At query time, the system tokenizes the query, looks up each token in the inverted index, and computes a score for every matching document.
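The two phases can be sketched in a few dozen lines. This is a simplified BM25 using the standard Okapi formula with default parameters k1 = 1.5 and b = 0.75; the class name and corpus are illustrative, and real engines add stemming, stop-word handling, and positional data that this sketch omits.

```python
import math
from collections import Counter, defaultdict

class BM25Index:
    """Minimal BM25 over a pre-tokenized corpus; a teaching sketch, not production code."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = docs                                   # doc_id -> list of tokens
        self.doc_len = {d: len(toks) for d, toks in docs.items()}
        self.avgdl = sum(self.doc_len.values()) / len(docs)
        self.tf = {d: Counter(toks) for d, toks in docs.items()}
        # Inverted index: term -> set of doc ids containing it
        self.index = defaultdict(set)
        for d, tokens in docs.items():
            for tok in tokens:
                self.index[tok].add(d)

    def idf(self, term):
        # Rare terms get high weight; terms in every document get almost none
        n, N = len(self.index.get(term, ())), len(self.docs)
        return math.log((N - n + 0.5) / (n + 0.5) + 1)

    def score(self, query_tokens, doc_id):
        s = 0.0
        for q in query_tokens:
            f = self.tf[doc_id][q]                         # term frequency in this doc
            if f == 0:
                continue
            # Length normalization: long documents are penalized via b
            norm = self.k1 * (1 - self.b + self.b * self.doc_len[doc_id] / self.avgdl)
            s += self.idf(q) * f * (self.k1 + 1) / (f + norm)
        return s

    def search(self, query_tokens, k=3):
        # Only documents sharing at least one token can score above zero,
        # so the inverted index prunes the candidate set before scoring
        candidates = set().union(*(self.index.get(q, set()) for q in query_tokens))
        ranked = sorted(candidates, key=lambda d: self.score(query_tokens, d), reverse=True)
        return ranked[:k]

corpus = {
    "d1": "gdpr data residency rules".split(),
    "d2": "store personal data within european borders".split(),
    "d3": "car insurance policy terms".split(),
}
idx = BM25Index(corpus)
print(idx.search(["gdpr"]))
```

Note how the candidate-pruning step in `search` is what makes the inverted index fast: the query "gdpr" never touches documents that lack the token, which is precisely the behavior that gives BM25 its exact-match precision and its synonym blindness.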

The strengths and limitations of BM25 mirror those of vector search almost exactly, but in reverse.

  • Exact-match precision: BM25 excels when queries contain acronyms, legal statute numbers, error codes, or product names, because it matches on the literal token.

  • Computational efficiency: An inverted-index lookup is fast and well understood, having powered traditional search engines for decades.

  • Synonym blindness: “Automobile” and “car” are treated as completely unrelated tokens, so BM25 misses relevant documents that use different vocabulary.

  • Conversational query degradation: Long, natural-language queries dilute the keyword signal, causing ranking quality to drop compared to vector search.

This contrast between ...