Semantic Search: Problem Framing & Requirements

Explore how to frame semantic search system design by understanding query interpretation, key business metrics, and scale challenges. This lesson helps you decompose query understanding into intent classification, entity recognition, and disambiguation, set success metrics like NDCG and zero-result rate, and grasp latency budgets and embedding drift. Gain the foundation needed to align design decisions with production constraints and interview expectations.

Search is among the most frequently asked ML system design topics in MAANG interviews, and for good reason. It cuts across every layer of the ML stack, from data ingestion and feature engineering to model serving and online evaluation. Whether the prompt is “Design a semantic search system for Google,” “Build product search for Amazon,” or “Design people search for LinkedIn,” the interviewer is testing whether you can frame the problem before you solve it.

This lesson walks through that framing step for semantic search. We will define query understanding as an upstream ML problem, establish the business metrics that quantify search quality, and set the scale constraints that rule out naive solutions. These three pillars form the foundation that every downstream design decision, from model architecture to serving infrastructure, must respect.

The industry-standard architecture for semantic search pairs dense vector embeddings with approximate nearest neighbor (ANN) algorithms. Understanding why that architecture exists requires understanding the problem space first. Candidates who skip problem framing and jump straight to model selection almost always optimize for the wrong objective or violate latency constraints.

Problem framing is the step most candidates underprepare.

Query understanding as a prerequisite ML problem

Retrieval quality in a semantic search system is bounded by how well the system understands the query. A query like “apple” is ambiguous to a retrieval engine without context: it could refer to the fruit, the company, or one of the company's products. Query understanding is the upstream ML problem that resolves this ambiguity and routes the request to the right retrieval pipeline. It decomposes into three sub-problems: intent classification, entity recognition, and disambiguation.

Intent classification

Every query carries an implicit intent. Intent classification is a text classification task that assigns a query to one of several predefined categories, most commonly navigational, informational, or transactional, to determine which retrieval and ranking pipeline should handle the request. Consider these examples across platforms.

  • Navigational queries aim to reach a specific destination. “LinkedIn login” on Google or “my orders” on Amazon are navigational because the user already knows where they want to go.

  • Informational queries seek knowledge or comparison. “Best noise-cancelling headphones” on Amazon or “how to prepare for ML interviews” on LinkedIn fall here.

  • Transactional queries ...
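
To make the text-classification framing concrete, here is a minimal sketch, assuming a bag-of-words multinomial Naive Bayes model with Laplace smoothing. Everything named here is hypothetical and chosen for illustration: the class `NaiveBayesIntentClassifier`, the pipeline names in `PIPELINE_BY_INTENT`, and the toy training queries (a few borrow the examples above, the rest are invented). A real system would need labeled query logs and most likely a stronger model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set; real systems learn from labeled query logs.
TRAINING_QUERIES = [
    ("linkedin login", "navigational"),
    ("my orders", "navigational"),
    ("facebook home page", "navigational"),
    ("gmail sign in", "navigational"),
    ("best noise-cancelling headphones", "informational"),
    ("how to prepare for ml interviews", "informational"),
    ("what is semantic search", "informational"),
    ("compare laptops for programming", "informational"),
    ("buy iphone 15 case", "transactional"),
    ("order running shoes online", "transactional"),
    ("book flight to london", "transactional"),
    ("subscribe to premium plan", "transactional"),
]

# Hypothetical pipeline names, one per intent category.
PIPELINE_BY_INTENT = {
    "navigational": "exact_match_lookup",
    "informational": "semantic_retrieval",
    "transactional": "product_ranking",
}


def tokenize(query):
    """Lowercase and split on whitespace; deliberately simplistic."""
    return query.lower().split()


class NaiveBayesIntentClassifier:
    """Multinomial Naive Bayes over query tokens with add-one smoothing."""

    def __init__(self, examples):
        self.token_counts = defaultdict(Counter)  # intent -> token frequencies
        self.class_counts = Counter()             # intent -> number of queries
        for query, intent in examples:
            self.class_counts[intent] += 1
            self.token_counts[intent].update(tokenize(query))
        self.vocabulary = {
            token for counts in self.token_counts.values() for token in counts
        }

    def log_scores(self, query):
        """Return the unnormalized log-probability of each intent."""
        total_examples = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for intent, n_examples in self.class_counts.items():
            counts = self.token_counts[intent]
            n_tokens = sum(counts.values())
            score = math.log(n_examples / total_examples)  # class prior
            for token in tokenize(query):
                if token not in self.vocabulary:
                    continue  # skip tokens never seen in training
                score += math.log((counts[token] + 1) / (n_tokens + vocab_size))
            scores[intent] = score
        return scores

    def predict(self, query):
        scores = self.log_scores(query)
        return max(scores, key=scores.get)


def route(query, classifier):
    """Classify the query, then look up the pipeline that should serve it."""
    intent = classifier.predict(query)
    return intent, PIPELINE_BY_INTENT[intent]


if __name__ == "__main__":
    classifier = NaiveBayesIntentClassifier(TRAINING_QUERIES)
    for q in ["amazon login", "how to choose headphones", "buy running shoes"]:
        print(q, "->", route(q, classifier))
```

The point of the sketch is the contract rather than the model: the classifier emits one label per query, and that label alone decides which downstream pipeline runs, so a misclassified intent sends the query to the wrong pipeline before retrieval starts.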