Design Considerations of a Distributed Task Scheduler

Learn about the design considerations for the distributed task scheduler.

We'll cover the following...

Queueing
Execution cap
Prioritization
Resource capacity optimization
Task idempotency
Schedule and execute untrusted tasks

A distributed queue is a major building block of a scheduler. The simplest scheduling approach is to push tasks into the queue and serve them on a first-come, first-served basis. If there are 10,000 nodes (resources) in a cluster (cloud), the task scheduler quickly extracts tasks from the queue and schedules them on the nodes. But if all the resources are currently busy, tasks have to wait in the queue, and a small task can be stuck behind long-running tasks that arrived earlier.
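To make the waiting problem concrete, here is a minimal sketch of first-come, first-served scheduling. It assumes all tasks arrive at time 0 and that each task's duration is known in advance. The function name `fcfs_wait_times` and the numbers are hypothetical, chosen only for illustration.

```python
import heapq
from collections import deque


def fcfs_wait_times(durations, num_nodes):
    """Return how long each task waits in the queue before it starts.

    Assumption: every task arrives at time 0 and runs for a known duration.
    """
    queue = deque(enumerate(durations))  # FIFO order of (task_id, duration)
    free_at = [0] * num_nodes            # min-heap of times when each node is free
    heapq.heapify(free_at)
    waits = [0] * len(durations)

    while queue:
        task_id, duration = queue.popleft()
        start = heapq.heappop(free_at)   # earliest node to become available
        waits[task_id] = start           # arrival time is 0, so wait == start
        heapq.heappush(free_at, start + duration)
    return waits


# Two long tasks occupy both nodes, so the short third task waits for 100 units.
waits = fcfs_wait_times([100, 100, 1], num_nodes=2)
print(waits)  # [0, 0, 100]
```

The short task needs only 1 time unit, yet it waits 100 units because the queue order alone decides who runs next.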

This scheduling mechanism can affect the reliability and availability of the system, as well as the priority of tasks. There could be cases where we want urgent execution of a task, for example, a task that notifies a user that their account was accessed from an unrecognized device. So, we can’t rely only on first-come, first-served ordering to schedule tasks. Instead, we categorize the tasks and set appropriate priorities. We have the following three categories for our tasks:

Tasks that can’t be delayed.
Tasks that can be delayed.
Tasks that need to be executed periodically (for example, every 5 minutes, or every hour, or every day).
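One way to act on these categories is a priority queue that always hands out the most urgent task first. The sketch below is illustrative only: the `Category` names, the `PriorityTaskQueue` class, and the relative ordering of periodic versus delayable tasks are assumptions, not something the lesson specifies.

```python
import heapq
import itertools
from enum import IntEnum


class Category(IntEnum):
    # Lower value = served first. The ordering of PERIODIC vs. CAN_DELAY
    # is an assumption made for this sketch.
    CANNOT_DELAY = 0
    PERIODIC = 1
    CAN_DELAY = 2


class PriorityTaskQueue:
    """Serve tasks by category; within a category, keep arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves FIFO order

    def push(self, name, category):
        heapq.heappush(self._heap, (category, next(self._counter), name))

    def pop(self):
        category, _, name = heapq.heappop(self._heap)
        return name, category

    def __len__(self):
        return len(self._heap)


queue = PriorityTaskQueue()
queue.push("generate monthly report", Category.CAN_DELAY)
queue.push("refresh cache every 5 minutes", Category.PERIODIC)
queue.push("notify user of unrecognized login", Category.CANNOT_DELAY)

order = [queue.pop()[0] for _ in range(len(queue))]
print(order)
```

Even though the login notification was submitted last, it is the first task popped, which is the behavior a plain first-come, first-served queue cannot provide.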
