Focus on Client-Side Errors in a Monitoring System

Understand the critical difference between internal server errors and client-side failures, such as routing bugs or DNS issues. Learn why these external problems are often invisible to standard server monitoring, and why detecting them requires a dedicated design if the monitoring system is to cover availability fully.

Client-side errors

In distributed systems, clients commonly interact with services over HTTP. Server-side failures can be identified by monitoring web and application server logs for an elevated rate of HTTP 5xx responses, such as 500 Internal Server Error.
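As a rough illustration of the server-side check, the sketch below computes the share of server-error responses from a batch of status codes already parsed out of the logs. The function names and the 5% threshold are assumptions made for this example, not values from any particular monitoring tool. It counts every 5xx status as a server-side error, which is broader than status 500 alone.

```python
from typing import Iterable


def server_error_rate(status_codes: Iterable[int]) -> float:
    """Return the fraction of responses whose status is in the 5xx range."""
    codes = list(status_codes)
    if not codes:
        # No traffic in this window: report 0.0 rather than dividing by zero.
        return 0.0
    errors = sum(1 for code in codes if 500 <= code <= 599)
    return errors / len(codes)


def is_elevated(status_codes: Iterable[int], threshold: float = 0.05) -> bool:
    """Flag the window when the 5xx rate exceeds the (assumed) threshold."""
    return server_error_rate(status_codes) > threshold
```

Note that an empty window reports a rate of 0.0, so this check stays silent when no requests arrive at all, which is exactly the situation a client-side failure can produce.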

However, client-side errors are difficult to detect because the service has no visibility into the client’s environment: a request that never reaches the server leaves no entry in its logs. Engineers might look for dips in traffic load instead, but this metric is unreliable. Natural load variability can look like an outage (a false positive), while a failure that affects only a small segment of users may not move the total enough to be noticed (a false negative).
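To make the weakness of the traffic-dip signal concrete, here is a minimal sketch of such a detector. The function name, the 20% drop threshold, and the request rates used below are all hypothetical numbers chosen for illustration.

```python
def traffic_dip(baseline_rps: float, current_rps: float,
                drop_threshold: float = 0.2) -> bool:
    """Return True when traffic fell by at least drop_threshold of baseline."""
    if baseline_rps <= 0:
        # Without a meaningful baseline there is nothing to compare against.
        return False
    drop = (baseline_rps - current_rps) / baseline_rps
    return drop >= drop_threshold


# An ordinary overnight lull (1000 -> 700 requests/s) trips the alarm
# even though nothing is broken: a false positive.
overnight_lull = traffic_dip(1000, 700)

# A failure that cuts off 2% of users (1000 -> 980 requests/s) stays
# under the threshold and goes unnoticed: a false negative.
small_segment_outage = traffic_dip(1000, 980)
```

Here `overnight_lull` evaluates to `True` and `small_segment_outage` to `False`. Lowering the threshold to catch the second case would make the first kind of false alarm more frequent.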

Several factors can prevent clients from reaching the server, including:

  • DNS resolution failures.

  • Routing failures between the client and the service provider.

  • Third-party infrastructure failures (e.g., middleboxes or CDNs).
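Because these failures happen before a request reaches the server, only something running on the client's side of the network can observe them. The sketch below is one hedged way a client-side probe might sort a failed connection attempt into coarse categories using Python's standard `socket` module. The category names and the `probe` helper are assumptions for illustration, and the mapping is approximate: a timeout, for example, suggests a path problem but does not prove one.

```python
import socket


def classify_failure(exc: BaseException) -> str:
    """Map a connection exception to a coarse (assumed) failure category."""
    # gaierror is raised when name resolution fails; check it first because
    # it is also a subclass of OSError.
    if isinstance(exc, socket.gaierror):
        return "dns_resolution"
    # A timeout often points at a routing problem or a middlebox silently
    # dropping packets, but this is a heuristic, not a diagnosis.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "network_path"
    # Any other OS-level error (refused, reset, unreachable, ...).
    if isinstance(exc, OSError):
        return "connection"
    return "unknown"


def probe(host: str, port: int = 443, timeout: float = 3.0) -> str:
    """Resolve the host, then attempt a TCP connection; return "ok" or a category."""
    try:
        addr_info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return classify_failure(exc)
    # Use the first resolved address; IPv6 sockaddr tuples carry extra fields,
    # so keep only (address, port).
    sockaddr = addr_info[0][4]
    try:
        with socket.create_connection(sockaddr[:2], timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_failure(exc)
```

A probe like this only reports what it sees from its own vantage point, so results from one location say little about clients in other networks or regions.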