Identifying the ML Task Type

Discover how to correctly identify one of the seven primary machine learning task types based on problem requirements and output contracts. Understand their distinct input-output behaviors, recognition signals, and implications for model architecture, loss functions, evaluation metrics, and serving constraints. Learn a three-step framework to justify task-type decisions in interviews by aligning business metrics, task outputs, and production considerations. This lesson helps you avoid costly early mistakes and demonstrates the senior-level awareness needed to design robust ML systems.

We'll cover the following...

Defining each task type
- Recognition heuristics for each type
- Latency implications of task-type choice
When one problem maps to multiple types
- A three-step decision framework
A framework for defending your choice
Conclusion

When an interviewer says “Design a system to rank YouTube search results,” the very next move you make determines the trajectory of your entire design. Pick the wrong ML task type and every downstream decision, including loss function, evaluation metric, and serving architecture, lands on a flawed foundation. This lesson gives you a repeatable method for making that decision and defending it.

With clarifying questions answered and requirements locked from the previous phase, the critical next step is translating those requirements into a concrete ML task type. Consider the YouTube example again. The output the system must produce is an ordered list of videos, not a binary label indicating whether a single video is relevant. That distinction alone separates a ranking formulation from a classification formulation, and it changes the loss function from cross-entropy to a pairwise or listwise ranking loss. Experienced interviewers evaluate whether you can make this distinction and justify it with reasoning tied to the problem’s input-output contract and business objective.
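To make the loss-function distinction concrete, here is a minimal sketch contrasting a pointwise cross-entropy loss with a pairwise logistic loss. The function names and the toy scores are assumptions chosen for illustration; production ranking losses are usually computed over batches inside a training framework.

```python
import math


def binary_cross_entropy(prob: float, label: int) -> float:
    """Pointwise loss: judges one (query, video) pair in isolation."""
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def pairwise_logistic_loss(score_relevant: float, score_irrelevant: float) -> float:
    """Pairwise loss: depends only on the score gap between two candidates."""
    return math.log(1.0 + math.exp(-(score_relevant - score_irrelevant)))


# Classification view: each video is scored on its own (toy probability).
pointwise = binary_cross_entropy(prob=0.8, label=1)

# Ranking view: the same two toy scores, in the right and the wrong order.
correct_order = pairwise_logistic_loss(score_relevant=2.0, score_irrelevant=0.5)
wrong_order = pairwise_logistic_loss(score_relevant=0.5, score_irrelevant=2.0)

print(f"pointwise={pointwise:.3f} correct={correct_order:.3f} wrong={wrong_order:.3f}")
```

The pairwise loss never looks at an absolute score, only at whether the relevant video outscores the irrelevant one, which is the property an ordered-list output cares about.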

This lesson covers seven primary ML task types that appear repeatedly in system design interviews. Each one has a distinct input-output contract, recognition signals in the problem statement, and architectural implications for serving.

Note: Choosing the wrong task type is one of the most costly early mistakes in an ML system design interview. It cascades into misaligned metrics, incorrect model architectures, and flawed evaluation strategies.

The following map provides a bird’s-eye view of seven task types, their definitions, and one canonical example for each.

With this taxonomy in view, the next step is understanding what makes each task type distinct and how to spot it from a problem statement.

Defining each task type

Recognizing the correct task type from an interview prompt requires understanding the defining characteristics of each one. The differences come down to what the model outputs, what signals appear in the problem statement, and what architectural constraints follow.

Recognition heuristics for each type

The seven task types each carry specific linguistic and structural signals that surface during the problem statement or clarifying questions.

Ranking produces an ordered list scored by relevance or utility. Look for phrases like “show the most relevant,” “order by,” or “top-K results.” Canonical examples include search ranking, feed ranking, and ad auction ordering. The model scores each candidate, and a sorting step produces the final output.
Retrieval efficiently narrows a massive candidate pool, millions to billions of items, down to a manageable set. Look for scale signals and phrases like “find candidates” or “shortlist.” Examples include approximate nearest neighbor (ANN) retrieval in recommendation candidate generation and document retrieval in RAG pipelines.
Classification assigns a discrete label, either binary or multi-class, to an input. Look for “is this X or Y,” “detect,” “flag,” or “categorize.” Examples include spam detection, content moderation, and sentiment analysis. A single forward pass through the model produces a probability distribution over labels.
Regression predicts a continuous numeric value. Look for “predict the value of,” “estimate,” or “forecast.” Examples include Uber ETA prediction, ad bid price estimation, and demand forecasting.
Generation produces new content such as text, images, or code. Look for “generate,” “compose,” “summarize,” or “translate.” Examples include chatbot responses, image synthesis, and code completion. Generation tasks typically involve autoregressive decoding (a sequential process where the model generates one token at a time, conditioning each new token on all previously generated tokens), which introduces fundamentally different latency characteristics compared to a single-forward-pass classifier.
Anomaly detection identifies rare events that deviate from learned normal behavior. Look for extreme class imbalance, “detect fraud,” “identify outliers,” or “flag unusual activity.” Examples include credit card fraud detection and infrastructure anomaly monitoring.
Clustering groups unlabeled data points by similarity without predefined categories. Look for “segment,” “group,” or “discover patterns.” Examples include user segmentation for marketing and topic discovery in document corpora.
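Retrieval and ranking often appear together, so it helps to see how their contracts differ in code. The following is a rough sketch under stated assumptions: the embeddings, item IDs, and relevance scores are made up, and the brute-force scan stands in for the ANN index a real system would use.

```python
from typing import Callable

Vector = list[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve(query: Vector, corpus: dict[str, Vector], k: int) -> list[str]:
    """Stage 1 (retrieval): keep the k items whose embeddings best match the query.

    Brute-force scan for clarity; this is where an ANN index would sit.
    """
    by_similarity = sorted(corpus, key=lambda item_id: dot(query, corpus[item_id]), reverse=True)
    return by_similarity[:k]


def rank(candidates: list[str], score: Callable[[str], float]) -> list[str]:
    """Stage 2 (ranking): score each surviving candidate, then sort."""
    return sorted(candidates, key=score, reverse=True)


# Hypothetical two-dimensional embeddings.
corpus = {
    "video_a": [0.9, 0.1],
    "video_b": [0.8, 0.3],
    "video_c": [0.1, 0.9],
    "video_d": [0.7, 0.2],
}
query = [1.0, 0.0]

# Hypothetical scores from a heavier ranking model.
relevance = {"video_a": 0.4, "video_b": 0.9, "video_c": 0.95, "video_d": 0.6}

shortlist = retrieve(query, corpus, k=3)
final_order = rank(shortlist, score=lambda item_id: relevance[item_id])
print(shortlist, final_order)
```

In this toy data, video_c has the highest ranking score but never reaches the ranker because retrieval filtered it out, which illustrates why retrieval is usually judged on recall and ranking on ordering quality.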

Latency implications of task-type choice

A critical nuance from production practice is that task-type choice constrains your serving architecture. A generation task with autoregressive decoding runs one forward pass per output token, so latency grows with response length and may reach hundreds of milliseconds or more per response; meeting a latency budget can demand techniques like model sharding and speculative decoding. A classification task with a single forward pass can often serve predictions in single-digit milliseconds. When you name a task type in an interview, you are implicitly committing to a latency profile.
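A back-of-the-envelope calculation shows where the gap comes from. This is a simplified sketch: the two-term latency model and every number in it are assumptions for illustration, not measurements of any real model.

```python
def single_pass_latency_ms(forward_pass_ms: float) -> float:
    """Classifier-style serving: one forward pass per request."""
    return forward_pass_ms


def autoregressive_latency_ms(prefill_ms: float, per_token_ms: float, output_tokens: int) -> float:
    """Generation-style serving: process the prompt once, then one decoding step per output token."""
    return prefill_ms + per_token_ms * output_tokens


# Hypothetical numbers chosen only to show the shape of the trade-off.
classifier_ms = single_pass_latency_ms(forward_pass_ms=5.0)
generator_ms = autoregressive_latency_ms(prefill_ms=50.0, per_token_ms=20.0, output_tokens=20)

print(f"classifier: {classifier_ms} ms, generator: {generator_ms} ms")
```

Because the decoding term scales with the number of output tokens, capping response length is often the simplest latency lever to mention before reaching for heavier optimizations.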

Practical tip: After naming the task type, immediately state its latency implication. Saying “This is a generation task, so we need to budget for sequential decoding latency” signals production awareness.

The following table consolidates the recognition signals, input-output contracts, and typical loss functions for quick reference during interview preparation.

Machine Learning Task Types Overview

Task Type	Input-Output Contract	Key Recognition Signal	Canonical Interview Problem	Typical Loss Function
Ranking	Query + candidate set → ordered list	"Order by relevance"	Search ranking	Pairwise or listwise loss
Retrieval	Query → candidate subset from large corpus	"Narrow billions to hundreds"	Recommendation candidate generation	Contrastive loss
Classification	Input → discrete label	"Detect, flag, categorize"	Spam detection	Cross-entropy
Regression	Input → continuous value	"Predict, estimate, forecast"	ETA prediction	MSE or MAE
Generation	Input → new content	"Generate, compose, summarize"	Chatbot response	Token-level cross-entropy (trained with teacher forcing)
Anomaly Detection	Input → normal vs. anomalous flag	"Extreme imbalance, flag outliers"	Fraud detection	Reconstruction error or one-class loss
Clustering	Unlabeled data → groups	"Segment, discover patterns"	User segmentation	Intra-cluster distance
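One way to read the input-output contract column is as a set of function signatures. The sketch below expresses each contract as a Python type alias; all alias names, feature keys, and the two toy models are hypothetical and exist only to show the shape of each contract.

```python
from typing import Callable, Sequence

# Hypothetical aliases for illustration.
Query = str
Item = str
Features = dict[str, float]

RankingFn = Callable[[Query, Sequence[Item]], list[Item]]  # query + candidates -> ordered list
RetrievalFn = Callable[[Query], list[Item]]                # query -> candidate subset
ClassificationFn = Callable[[Features], str]               # input -> discrete label
RegressionFn = Callable[[Features], float]                 # input -> continuous value
GenerationFn = Callable[[str], str]                        # prompt -> new content
AnomalyFn = Callable[[Features], bool]                     # input -> anomalous or not
ClusteringFn = Callable[[Sequence[Features]], list[int]]   # unlabeled points -> group ids


def toy_eta_regressor(features: Features) -> float:
    """Made-up rule, not a trained model: minutes = distance / speed * 60."""
    return features["distance_km"] / features["speed_kmh"] * 60.0


def toy_spam_classifier(features: Features) -> str:
    """Made-up rule, not a trained model: many links means spam."""
    return "spam" if features["link_count"] > 3 else "not_spam"


eta_model: RegressionFn = toy_eta_regressor
spam_model: ClassificationFn = toy_spam_classifier

print(eta_model({"distance_km": 15.0, "speed_kmh": 30.0}))
print(spam_model({"link_count": 5.0}))
```

If the consumer of the output cannot be described by one of these signatures, that is usually a sign the problem combines several task types.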

With each task type clearly defined, the next challenge is handling situations where a single problem legitimately fits more than one type.

When one problem maps to multiple types

Not every interview problem maps cleanly to a single task type. Consider designing Airbnb search. You could frame it as ranking (order listings by booking probability), classification (predict book vs. not-book for each listing), or regression (predict expected booking value per listing). All three framings are technically valid, and the interviewer is watching to see how you navigate this ambiguity.

A three-step decision framework

The right framing depends on the business objective you surfaced during clarifying questions. A repeatable three-step framework helps you choose and defend your decision.

1. Identify the output contract: Ask what the consumer of the model output actually needs. If the search page must display an ordered list of listings, the consumer needs a ranked output, not just a binary label per listing.
2. Align with the business metric: If the KPI is revenue, regression on expected booking value may dominate because it directly optimizes for dollars. If the KPI is engagement or conversion rate, ranking by booking probability may be more appropriate.
3. Consider serving constraints: Ranking requires scoring and sorting a candidate set within a latency budget, which may demand a multi-stage retrieval-then-rank pipeline. A binary classifier can score items independently, simplifying the serving layer but losing the ordering signal.
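The effect of the business metric on the framing can be seen with a small worked example. The listings, probabilities, and prices below are invented for illustration; the point is only that the same candidates sort differently under a conversion objective and a revenue objective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    listing_id: str
    booking_probability: float  # hypothetical model output
    booking_value: float        # hypothetical booking value in dollars


listings = [
    Listing("budget_studio", booking_probability=0.30, booking_value=80.0),
    Listing("family_house", booking_probability=0.12, booking_value=400.0),
    Listing("city_loft", booking_probability=0.20, booking_value=150.0),
]

# KPI = conversion rate: order by probability of booking.
by_conversion = sorted(listings, key=lambda item: item.booking_probability, reverse=True)

# KPI = revenue: order by expected booking value (probability times value).
by_revenue = sorted(
    listings,
    key=lambda item: item.booking_probability * item.booking_value,
    reverse=True,
)

print([item.listing_id for item in by_conversion])
print([item.listing_id for item in by_revenue])
```

With these toy numbers the two objectives produce opposite orderings, which is the kind of trade-off worth naming out loud before committing to a framing.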

Attention: Silently picking one framing without acknowledging alternatives is a missed opportunity. Interviewers reward candidates who name the alternative framings and then justify their choice with explicit trade-off reasoning.

It is also worth noting that concept drift (a gradual change in the statistical relationship between input features and the target variable, causing model performance to degrade over time even if the input data distribution remains stable) affects each task type differently. A ranking model may silently degrade as user preferences shift, while an anomaly detection model may see its false-positive rate spike as normal transaction patterns evolve. Acknowledging task-type-specific monitoring needs demonstrates senior-level production awareness.
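As one possible illustration of task-type-specific monitoring, the sketch below tracks the false-positive rate of an anomaly detector across two time windows. The function names, the toy data, and the 2x alert threshold are all assumptions; a real monitor would also have to cope with ground-truth labels that arrive late.

```python
def false_positive_rate(flagged: list[bool], is_anomaly: list[bool]) -> float:
    """Share of truly normal events that the model flagged as anomalous."""
    normal_flags = [flag for flag, anomalous in zip(flagged, is_anomaly) if not anomalous]
    if not normal_flags:
        return 0.0
    return sum(normal_flags) / len(normal_flags)


def fpr_spiked(baseline_fpr: float, current_fpr: float, max_ratio: float = 2.0) -> bool:
    """Alert when the current rate exceeds the baseline by more than max_ratio (hypothetical threshold)."""
    return current_fpr > baseline_fpr * max_ratio


# Toy windows: every event is truly normal, so every flag is a false positive.
baseline_flagged = [True] + [False] * 9
baseline_truth = [False] * 10
current_flagged = [True] * 4 + [False] * 6
current_truth = [False] * 10

baseline_fpr = false_positive_rate(baseline_flagged, baseline_truth)
current_fpr = false_positive_rate(current_flagged, current_truth)
alert = fpr_spiked(baseline_fpr, current_fpr)

print(baseline_fpr, current_fpr, alert)
```

A ranking model would need a different signal, such as a drop in click-through rate on top positions, because it has no false-positive rate to watch.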

The final piece is a repeatable verbal template for defending your choice in a live interview.

A framework for defending your choice

Knowing the right task type is necessary but not sufficient. You must also articulate your reasoning concisely under interview pressure. A three-sentence template provides structure without sounding rehearsed.

The template works as follows. First, state the output contract and map it to the task type. Second, name an alternative framing and explain why you rejected it. Third, connect your choice to a specific loss function and evaluation metric that align with the business KPI.

Here is the template applied to a concrete example: designing a notification relevance system for a social media app. “The output contract for this problem is an ordered list of notifications scored by likelihood of user engagement, which maps to a ranking task. I considered framing it as binary classification (will the user tap or not), but the product surface requires ordering notifications within a feed, so ranking better serves the UX. This choice implies a pairwise ranking loss and NDCG as the offline evaluation metric, which align with the engagement-rate KPI we established during requirements.” Here, a pairwise ranking loss is a loss function that compares pairs of items and penalizes the model when a less relevant item is scored higher than a more relevant one.
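Since the template names NDCG as the offline metric, it helps to know how it is computed. This is a minimal sketch using the linear-gain variant; some references use an exponential gain of 2^rel - 1 instead, and the relevance grades below are made up.

```python
import math


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain: relevance at each position, discounted by log2(position + 1)."""
    return sum(rel / math.log2(position + 1) for position, rel in enumerate(relevances, start=1))


def ndcg(relevances_in_predicted_order: list[float]) -> float:
    """DCG of the predicted order divided by the DCG of the ideal (descending) order."""
    ideal = dcg(sorted(relevances_in_predicted_order, reverse=True))
    if ideal == 0:
        return 0.0
    return dcg(relevances_in_predicted_order) / ideal


# Hypothetical relevance grades for three notifications, listed in the order the model ranked them.
perfect_ranking = ndcg([3.0, 2.0, 1.0])
reversed_ranking = ndcg([1.0, 2.0, 3.0])

print(f"perfect={perfect_ranking:.3f} reversed={reversed_ranking:.3f}")
```

The log discount means mistakes near the top of the list cost more than mistakes further down, which matches how users actually consume a ranked notification feed.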

This task-type decision feeds directly into the next phase of the interview, where you define functional and non-functional requirements. The next lesson covers that decomposition in depth.

Most candidates can name a task type. The ones who stand out at L4 through Staff+ levels are those who justify it against alternatives and connect it to loss functions, metrics, and serving constraints in a single coherent argument.

The following flowchart captures the full decision process in a visual format you can internalize for interview day.

Conclusion

This lesson covered the seven ML task types (ranking, retrieval, classification, regression, generation, anomaly detection, and clustering) along with their defining input-output contracts and the recognition signals that surface them from interview prompts. The three-step decision framework (output contract, business metric alignment, serving constraints) gives you a repeatable method for choosing and defending a task-type framing when multiple options exist. Remember that this decision is the bridge between the requirements phase and the modeling phase; it determines your loss function, evaluation metric, and architectural patterns. Production-grade systems must also account for silent degradation from data drift and concept drift, and acknowledging task-type-specific monitoring needs demonstrates the senior-level awareness interviewers look for. With the task type identified, the next step is decomposing the problem into functional requirements and non-functional requirements, including latency SLAs, throughput, fairness, and cost, that govern how the system must behave.
