
Network Errors and Resilience

Explore techniques to handle network errors effectively in Python applications. Learn to implement timeouts, catch specific exceptions, use retries with exponential backoff, and apply logging for diagnostics. This lesson equips you to build resilient programs that can recover gracefully from transient network failures and avoid hanging processes.

Computer networks are not always reliable. Infrastructure can fail, connections may time out, packets can be lost, and servers may become overloaded during periods of high demand. If programs assume that every request succeeds immediately and without interruption, they will break when network failures occur.

In this lesson, we will move beyond idealized “happy-path” scenarios and examine the principles of defensive networking. We will learn how to anticipate transient failures, implement timeouts and retries, handle exceptions gracefully, and provide meaningful error reporting.

The necessity of timeouts

By default, the requests library will wait indefinitely for a server to respond. If a server becomes overwhelmed or unresponsive, or a firewall or other network device silently drops packets, the request may never complete. Our program could then block for minutes or even hours, preventing all further progress. The process is still running but does nothing useful, which is usually described as a hung or stalled process.

To avoid this, we should always specify a timeout when making network requests. A timeout defines the maximum number of seconds the client is willing to wait on the server before abandoning the attempt and raising an exception. In requests, this limit applies to establishing the connection and to each wait for data from the server, not to the total duration of the download. Setting it ensures that our program fails predictably rather than hanging indefinitely, and it allows us to recover through retries, fallback logic, or user-facing error messages.

Python
import requests

url = "https://httpbin.org/delay/5"  # This endpoint waits 5 seconds before responding

try:
    # We set a timeout of 2 seconds. Since the server takes 5, this will fail.
    response = requests.get(url, timeout=2)
    print("Success:", response.status_code)
except requests.exceptions.Timeout:
    print("The request timed out. The server took too long to respond.")
  • Line 7: The timeout=2 argument sets the maximum time the client will wait on the server. If the server stays silent for more than 2 seconds, requests abandons the request and raises requests.exceptions.Timeout, so our program regains control instead of hanging. ...
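
Once a timeout surfaces as an exception, the recovery through retries mentioned earlier can be sketched as a small helper. This is only a minimal illustration under stated assumptions, not part of the requests API: the name retry_with_backoff and its parameters (operation, retry_on, max_attempts, base_delay, sleep) are hypothetical and chosen for this example. The helper uses only the standard library and doubles the wait after each failed attempt.

```python
import time


def retry_with_backoff(operation, retry_on, max_attempts=4, base_delay=0.5,
                       sleep=time.sleep):
    """Call operation() and retry it when a retryable exception is raised.

    Hypothetical helper for illustration only.
    operation    -- zero-argument callable that performs the request
    retry_on     -- tuple of exception classes treated as transient
    max_attempts -- total number of tries before giving up
    base_delay   -- wait in seconds after the first failure; doubles each time
    sleep        -- injectable so the waiting can be replaced in tests
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller see the last error.
            delay = base_delay * 2 ** (attempt - 1)  # 0.5, 1.0, 2.0, ...
            print(f"Attempt {attempt} failed ({exc!r}); retrying in {delay}s")
            sleep(delay)


# With requests installed, usage might look like this (assumption: only
# timeouts and connection errors are worth retrying here):
#
#   import requests
#   response = retry_with_backoff(
#       lambda: requests.get("https://httpbin.org/delay/5", timeout=2),
#       retry_on=(requests.exceptions.Timeout,
#                 requests.exceptions.ConnectionError),
#   )
```

Passing the exception classes in explicitly keeps the helper independent of any one HTTP library, and capping the number of attempts means a persistent outage still ends in an exception rather than an endless loop.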