Metrics Sense: Designing a Metric II
Explore techniques to define and implement latency SLAs for services with diverse client requests. Understand various SLA metric options, their pros and cons, and how to instrument systems to capture and measure these metrics effectively. This lesson prepares you to tackle TPM interview questions requiring practical metric design and technical implementation skills.
Question
You are the TPM for a service with multiple customer clients that send many (often large) requests to the service’s API and receive a response to each. You want to set a latency
Background
This question combines a metrics question and a technical question, which makes it excellent practice for both skills needed in a TPM interview. Setting SLAs for platforms with many different clients is also a common requirement, and TPMs should at least be familiar with the practice because SLAs are important for managing cross-team and external relationships and expectations.
Solution approach
This is a two-part question: we need to define the SLA (the metrics portion) and then discuss an implementation option.
- For the metrics portion, we’ll start by listing several options along with the pros and cons of each. Then, we’ll recommend one option and explain why.
- For the implementation portion, we’ll discuss a way to capture the necessary events
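
To make the two parts concrete, here is a minimal sketch of what "capture the events, then compute a candidate metric" could look like. Everything in it is an assumption for illustration: the `RequestEvent` record, its field names, and the choice of mean, median, and 99th-percentile latency as candidate metrics are hypothetical and are not taken from the lesson's solution.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestEvent:
    """One captured request (hypothetical record; field names are assumptions)."""
    client_id: str
    start_ms: int  # time the request reached the service, in milliseconds
    end_ms: int    # time the response was sent, in milliseconds

    @property
    def latency_ms(self) -> int:
        return self.end_ms - self.start_ms


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(events):
    """Compute a few candidate latency metrics over the captured events."""
    latencies = [e.latency_ms for e in events]
    return {
        "mean_ms": sum(latencies) / len(latencies),
        "p50_ms": percentile(latencies, 50),
        "p99_ms": percentile(latencies, 99),
    }


# Usage: ten requests from one client, with latencies of 10, 20, ..., 100 ms.
events = [RequestEvent("client-a", 0, ms) for ms in range(10, 101, 10)]
summary = summarize(events)
```

Computing several candidates side by side from the same captured events is one way to compare their trade-offs: a single very slow request can shift the mean noticeably while leaving the median unchanged.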