

Cracking the machine learning interview: System design approaches

Nov 23, 2021 - 11 min read
Jerry Ejonavi

Note: This post was originally published in 2020 and has been updated as of Nov. 23, 2021.

Machine learning (ML) is the study of computer algorithms that improve automatically through experience. ML is a lucrative field that is growing quickly: the market is predicted to reach $30.6 billion by 2024. If you’re pursuing a data scientist or software engineering role, you’ll go through a competitive interview process. You may be tested on your programming, data analysis, critical thinking, and system design skills in your interview.

System design skills can set you apart from other engineers. Top tech companies ask system design interview questions to see if you can efficiently solve real-world problems. Today we’ll discuss how you can ace machine learning interviews using system design concepts.


What is the ML interview?

ML aims to solve a multitude of complex problems. It has made rapid progress in areas like speech understanding, search ranking, and credit card fraud detection. Companies are leveraging these technologies across industries from healthcare and agriculture to manufacturing and retail.


A high level of technical skill is required in the machine learning field, particularly for machine learning engineers. In a machine learning interview, you’ll be asked open-ended questions that test your ability to solve ML system design problems, similar to system design interviews.

In an interview, you’ll be tested on the following:

  • Technical and programming skills
  • Data analysis skills, including multiple approaches and technologies
  • System design concepts
  • Your ability to apply machine learning theories effectively
  • Communication skills and cultural fit

During your interview, you may be asked to:

  • Build a recommendation system that shows relevant products to users
  • Build a visual understanding system for a self-driving car
  • Build a search-ranking system

Overview of ML interview concepts and techniques

Performance and capacity considerations

Our goal when working on an ML-based system is to improve our metrics. We also want to ensure that we meet the capacity and performance Service Level Agreement (SLA). A performance-based SLA ensures that we return results within a given time frame (e.g. 500ms) for 99% of queries. Capacity refers to the load that our system can handle (e.g. the system supports 1000 queries per second).
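As a rough illustration of the performance SLA described above, the following sketch checks whether 99% of queries finish within 500ms. The function names (percentile, meets_latency_sla) and the sample latencies are hypothetical, and the nearest-rank percentile is just one common convention, not something the SLA itself prescribes.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_sla(latencies_ms, limit_ms=500, pct=99):
    """True if pct percent of queries returned within limit_ms."""
    return percentile(latencies_ms, pct) <= limit_ms


# Hypothetical measurements: 98 fast queries, one slow, one very slow.
latencies = [120] * 98 + [450, 900]
print(percentile(latencies, 99))
print(meets_latency_sla(latencies))
```

A single very slow query out of 100 does not break this SLA, because only the 99th percentile is compared against the limit.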

There are two important discussions regarding performance and capacity when building an ML system:

  • Training time: How much training data and capacity is needed to build our predictor?
  • Evaluation time: What SLA do we have to meet while serving the model, and what are the capacity needs?

The layered/funnel modeling approach is the best way to solve for scale and relevance while keeping performance and capacity in check. You’ll start with a relatively fast model when you have the highest number of documents (e.g. 100 million documents in the case of the search query “computer science”). In each later stage, you increase the model’s complexity (i.e. a more accurate but slower model) and its execution time, while the model runs on a reduced number of documents (e.g. your first stage could use a linear model and the final stage could use a deep neural network).
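The funnel idea can be sketched in a few lines. This is a toy illustration under stated assumptions: cheap_score and expensive_score are hypothetical stand-ins for a fast linear model and a slower neural ranker, and the documents and click counts are made up.

```python
def funnel_rank(documents, query_terms, stages):
    """Pass candidates through successive (scorer, keep_top_k) stages.
    Each stage sees only the survivors of the previous one."""
    candidates = list(documents)
    for score, keep_top_k in stages:
        ranked = sorted(candidates, key=lambda d: score(d, query_terms), reverse=True)
        candidates = ranked[:keep_top_k]
    return candidates


def cheap_score(doc, query_terms):
    # Stage 1 stand-in for a fast model: number of query terms present.
    return len(query_terms & set(doc["text"].lower().split()))


def expensive_score(doc, query_terms):
    # Stage 2 stand-in for a slower model: term density plus popularity.
    words = doc["text"].lower().split()
    density = sum(word in query_terms for word in words) / len(words)
    return density + 0.001 * doc["clicks"]


docs = [
    {"id": 1, "text": "Computer science degree overview", "clicks": 40},
    {"id": 2, "text": "Cooking pasta at home", "clicks": 900},
    {"id": 3, "text": "Computer science", "clicks": 10},
    {"id": 4, "text": "Science of sleep", "clicks": 300},
]
top = funnel_rank(docs, {"computer", "science"}, [(cheap_score, 3), (expensive_score, 2)])
print([doc["id"] for doc in top])
```

The expensive scorer never runs on document 2, because the cheap first stage has already filtered it out. That is the property that keeps latency and capacity in check as the corpus grows.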

Training data collection strategies

An ML model learns directly from the data it’s provided. It creates and refines its rules on a given task based on that data, which is called training data. This makes it crucial to avoid inadequate, irrelevant, or biased data. For instance, a machine learning model based on racially biased data will simply learn to automate racial bias. Even the most performant algorithms are useless if they are not based on a quality dataset.

The quality and quantity of training data are big factors in determining how far you can go in your machine learning optimization task. Data collection techniques primarily involve user interactions, human labelers, or specialized labelers.

You can also make use of other creative data collection techniques. For example, you can build a personalized experience in your product by collecting data from users. If you’re working with a system that uses visual data, such as object detectors or image segmenters, you can use GANs (generative adversarial networks) to enhance the training data. Other things to consider include:

  • Data splits
  • Data training
  • Test/validation
  • Data quantity
  • Data filtering
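To make the data-split consideration concrete, here is a minimal sketch of a shuffled train/validation/test split. The 80/10/10 proportions and the split_dataset name are assumptions for illustration. For time-dependent data, you may prefer to split by time rather than at random.

```python
import random


def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once with a fixed seed, then cut into train/validation/test.
    Whatever is left after the train and validation cuts becomes the test set."""
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Fixing the seed keeps the split reproducible, so offline metrics from different model versions are compared on the same held-out examples.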

Online experimentation

“Success” can be measured in numerous ways in machine learning system design. A successful machine learning system must gauge its performance by testing different scenarios. This can make a model’s design more innovative.

To run an online experiment, A/B testing is a great way to assess the impact of new features or changes in the system. In an A/B experiment, a second modified version of a webpage or screen is created. The original version is known as the control, and the modified version is the variation. From here, we can formulate two hypotheses:

  • Null hypothesis
  • Alternative hypothesis

We can also use this stage to measure long-term effects with backtesting and long-running A/B tests.
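One common way to decide between the null and alternative hypotheses is a two-proportion z-test on a conversion-style metric. The sketch below is one possible approach, not the only valid test. The function name and the traffic numbers are hypothetical, and it assumes samples large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(conv_control, n_control, conv_variation, n_variation):
    """Return (z, two-sided p-value) for the difference in conversion rates.
    Null hypothesis: control and variation share the same underlying rate."""
    p_control = conv_control / n_control
    p_variation = conv_variation / n_variation
    pooled = (conv_control + conv_variation) / (n_control + n_variation)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variation))
    z = (p_variation - p_control) / std_err
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10.0% vs 11.0% conversion on 10,000 users each.
z, p_value = two_proportion_z_test(1000, 10000, 1100, 10000)
print(round(z, 2), round(p_value, 3))
print("reject null at 0.05" if p_value < 0.05 else "fail to reject null")
```

With these made-up numbers the p-value falls below 0.05, so the null hypothesis would be rejected at that significance level.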

Experimental framework stages


Embeddings

Embeddings enable us to encode entities (e.g., words, docs, images, people) in a low-dimensional vector space in order to capture their semantic information. Two popular models used for word embeddings are:

  • CBOW: The continuous bag-of-words (CBOW) model predicts the current word from its surrounding words.
  • Skip-gram: In this architecture, we try to predict the surrounding words from the current word.
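The difference between the two architectures is easiest to see in the training examples they produce. The sketch below only generates (input, target) pairs and does not train any embedding. The helper names and the window size are assumptions for illustration.

```python
def context_window(tokens, index, window):
    """Words up to `window` positions to the left and right of tokens[index]."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


def cbow_examples(tokens, window=2):
    """CBOW: input is the list of surrounding words, target is the current word."""
    return [(context_window(tokens, i, window), tokens[i]) for i in range(len(tokens))]


def skipgram_examples(tokens, window=2):
    """Skip-gram: input is the current word, target is one surrounding word."""
    return [
        (tokens[i], context_word)
        for i in range(len(tokens))
        for context_word in context_window(tokens, i, window)
    ]


tokens = "the quick brown fox".split()
print(cbow_examples(tokens, window=1))
print(skipgram_examples(tokens, window=1))
```

CBOW yields one example per position, while skip-gram yields one example per (word, neighbor) pair, so skip-gram produces more training examples from the same text.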

Other ML interview concepts and techniques

We’ve gone over the main concepts and techniques used in ML interviews and system design. This is just an introduction to the techniques you will need to be successful in machine learning system design and interviews. More topics you’ll want to know are:

  • Transfer learning
  • Model debugging and testing
  • Training data filtering
  • Building models & iterative model improvement


How to set up an ML system

You’ll be expected to set up a system effectively in an ML interview. Let’s discuss the thought process required to answer an interviewer’s questions.

Setting up the problem

Interviewers will generally ask you to design a machine learning system for a particular task. This question is usually broad. The first thing you need to do is ask questions to narrow down the scope of the problem and pin down your system’s requirements. You should also ask questions about the performance and capacity considerations of the system.

The answers to these questions will guide your system’s architecture. For example, knowing that you need to return results quickly will influence the depth and complexity of your models.

Defining the metrics of the problem

After asking questions, you should carefully choose your system’s performance metrics for both online and offline testing. These metrics will differ depending on the problem your system is trying to solve.

For example, if you are performing binary classification, you will use the following offline metrics: Area Under Curve (AUC), log loss, precision, recall, and F1-score.
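As a sketch of how these offline metrics are computed, the snippet below implements them in plain Python. It assumes AUC means the area under the ROC curve, uses a hypothetical 0.5 decision threshold, and runs on made-up labels and scores. In practice you would likely use a library implementation.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Precision, recall, F1 at a threshold, plus log loss on the probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    eps = 1e-15  # clip probabilities so log(0) never occurs
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p) for t, p in zip(y_true, clipped)
    ) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "log_loss": log_loss}


def roc_auc(y_true, y_prob):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [p for t, p in zip(y_true, y_prob) if t == 1]
    negatives = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in positives
        for neg in negatives
    )
    return wins / (len(positives) * len(negatives))


y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
print(binary_metrics(y_true, y_prob))
print(roc_auc(y_true, y_prob))
```

Precision, recall, and F1 depend on the chosen threshold, while AUC and log loss are computed from the predicted probabilities directly.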

When deciding on online metrics, you may need both component-wise and end-to-end metrics. Component-wise metrics are used to evaluate the performance of ML systems that are plugged into, and used to improve, other ML systems. End-to-end metrics evaluate a system’s performance after an ML model has been applied. For example, a metric for a search engine would be the users’ engagement and retention rate after your model has been plugged in.


Architecture discussion

The next step is to design your system’s architecture. You need to think about the system’s components and how the data will flow through those components. In this step, your aim is to design a model that can scale easily.

Architectural components for ML system of search engine

To build a scalable system, your design needs to efficiently deal with a large and continually increasing amount of data. For instance, an ML system that displays relevant ads to users can’t process every ad in the system at once. You could use the funnel approach, wherein each stage has fewer ads to process. This will yield a scalable system that quickly determines relevant ads for users despite the increase in data.

When you have nailed down all of your ML system’s requirements, you can proceed to building your model. This involves:

  1. Training data generation: This involves sourcing data for use in training your models. This data could be either manually labelled or collected from a user’s interaction with the pre-existing system.
  2. Feature engineering: In order to implement a feature, you would need to identify the primary actors involved in the given task. You’ll individually inspect these actors and explore their relationships.
  3. Model training: You will make a decision on what model to use for your system.
  4. Offline evaluation: This is beneficial because it allows you to quickly test many different models.
  5. Online execution, evaluation and iterative improvement: Only the most promising models are selected for this step, which is a slower process.
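Steps 4 and 5 can be pictured as a cheap offline filter in front of a slower online experiment. The following sketch, with hypothetical threshold "models" and a tiny made-up validation set, shortlists only the best offline performers for online testing. Accuracy is used as the offline metric purely for simplicity.

```python
def accuracy(model, examples):
    """Fraction of (features, label) examples the model labels correctly."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def shortlist(models, validation_set, keep=1):
    """Rank candidate models by offline accuracy and keep the top `keep` names."""
    scored = sorted(
        ((accuracy(model, validation_set), name) for name, model in models.items()),
        reverse=True,
    )
    return [name for _, name in scored[:keep]]


# Hypothetical validation data: (score feature, true label).
validation_set = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
candidate_models = {
    "threshold_0.5": lambda x: int(x >= 0.5),
    "threshold_0.3": lambda x: int(x >= 0.3),
    "always_positive": lambda x: 1,
}
print(shortlist(candidate_models, validation_set, keep=2))
```

Only the shortlisted models would then go through online evaluation, such as an A/B test, which is slower and consumes real user traffic.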

Now, we’ll move on to the task of building an entity linking system.

Building an entity linking system

Named entity linking (NEL) is the process of detecting and linking named entities in a given text to corresponding entities in a target knowledge base. There are two parts to entity linking:

  • Named-entity recognition (NER): NER detects and classifies potential entity mentions into predefined categories. These categories can include a person, organization, location, medical code, and time expression.
  • Disambiguation: This process disambiguates each detected entity by linking it to its corresponding entity in the knowledge base.

Let’s see entity linking in action in the following example:


The text says, “Michael Jordan is a machine learning professor at UC Berkeley.” First, NER detects and classifies the named entities Michael Jordan and UC Berkeley as person and organization. Next, disambiguation takes place. Assume that there are two ‘Michael Jordan’ entities in the given knowledge base: the UC Berkeley professor and the athlete. Michael Jordan in the text is linked to the UC Berkeley professor entity in the knowledge base. Similarly, UC Berkeley in the text is linked to the University of California entity in the knowledge base.
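The example above can be mimicked with a toy linker. This is a deliberately simplified sketch, not a production approach: the knowledge base entries, keyword sets, and function names are all hypothetical, exact name matching stands in for a trained NER model, and keyword overlap stands in for a learned disambiguation model. When no candidate is supported by the context, the sketch returns None.

```python
import re

# Hypothetical knowledge base: entity id -> surface name, type, and context keywords.
KNOWLEDGE_BASE = {
    "michael_jordan_professor": {
        "name": "Michael Jordan",
        "type": "person",
        "keywords": {"machine", "learning", "professor", "statistics"},
    },
    "michael_jordan_athlete": {
        "name": "Michael Jordan",
        "type": "person",
        "keywords": {"basketball", "nba", "bulls"},
    },
    "university_of_california_berkeley": {
        "name": "UC Berkeley",
        "type": "organization",
        "keywords": {"university", "california", "professor", "research"},
    },
}


def recognize(text, kb):
    """Stand-in for NER: return knowledge base names that appear verbatim in the text."""
    names = {entry["name"] for entry in kb.values()}
    return sorted(name for name in names if name.lower() in text.lower())


def disambiguate(mention, text, kb, min_overlap=1):
    """Pick the candidate whose keywords overlap most with the words in the text.
    Returns None when no candidate reaches min_overlap."""
    context = set(re.findall(r"[a-z]+", text.lower()))
    best_id, best_overlap = None, min_overlap - 1
    for entity_id, entry in kb.items():
        if entry["name"] != mention:
            continue
        overlap = len(entry["keywords"] & context)
        if overlap > best_overlap:
            best_id, best_overlap = entity_id, overlap
    return best_id


def link_entities(text, kb):
    """Recognize mentions, then link each one to a knowledge base id or None."""
    return {mention: disambiguate(mention, text, kb) for mention in recognize(text, kb)}


text = "Michael Jordan is a machine learning professor at UC Berkeley."
print(link_entities(text, KNOWLEDGE_BASE))
```

The words “machine”, “learning”, and “professor” in the sentence are what steer the mention toward the professor entity rather than the athlete.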


Entity linking has applications in many natural language processing tasks. Use cases can be broadly categorized as information retrieval, information extraction, and building knowledge graphs. These can be used in many systems, such as:

  • Semantic search
  • Content analysis
  • Chatbots, virtual assistants, and other systems that answer questions

The aforementioned applications require a high-level representation of text. In this high-level representation, the concepts relevant to the application are separated from the text and other non-meaningful data.

Problem statement

The interviewer has asked you to design an entity linking system that:

  • Identifies potential named entity mentions in the text
  • Searches for possible corresponding entities in the target knowledge base for disambiguation
  • Returns either the best candidate corresponding entity or nil

The problem statement translates to the following machine learning problem:

“Given a text and knowledge base, find all the entity mentions in the text (Recognize) and then link them to the corresponding correct entry in the knowledge base (Disambiguate).”

Interview questions for entity linking

These are some of the questions that an interviewer can put forth during a discussion on entity linking systems.

  • How would you build an entity recognizer system?
  • How would you build a disambiguation system?
  • Given a piece of text, how would you extract all persons, countries, and businesses mentioned in it?
  • How would you measure the performance of a disambiguator/entity recognizer/entity linker?
  • Given multiple disambiguators/recognizers/linkers, how would you figure out which is the best one?

What to learn next

Congrats! You have learned about implementing introductory ML system concepts and how to approach interview questions based on system design concepts. There’s still a lot to learn about ML system design.

You’ll need to master the following systems:

  • Ad prediction system
  • Self-driving car systems
  • Recommendation system
  • Feed-based system
  • Search ranking

To help you master these concepts and strategies, check out Educative’s Grokking the Machine Learning Interview course. You’ll master machine learning system design and answer some of the most popular interview problems at big tech companies. You should come out of the course with the ability to impress interviewers by thinking about systems at a high level.

If you want even more practice with system design questions for machine learning interviews, check out Machine Learning System Design.
