Metrics Sense: Designing a Metric II

Explore techniques to define and implement latency SLAs for services with diverse client requests. Understand various SLA metric options, their pros and cons, and how to instrument systems to capture and measure these metrics effectively. This lesson prepares you to tackle TPM interview questions requiring practical metric design and technical implementation skills.

Question

You are the TPM for a service whose API receives many (often large) requests from multiple customer clients and returns a response to each. You want to set a latency SLA (Service Level Agreement) for your clients. How would you go about setting this SLA and instrumenting your service to measure it?

Background

This question combines a metrics question and a technical question, making it excellent practice for both skills needed in a TPM interview. Setting SLAs for platforms with many different clients is also a common requirement, and TPMs should at least be familiar with the process, since SLAs are important for managing cross-team and external relationships and expectations.

Solution approach

This is a two-part question: we need to define the SLA (the metrics portion) and then discuss an implementation option.

  • For the metrics portion, we’ll start by listing several options and the pros and cons of each option. Then, we’ll provide a recommendation of which option we should go with and why.
  • For the implementation portion, we’ll discuss a way to capture the necessary events
...