Production Issues
Explore strategies for identifying and resolving production outages in software systems. Understand how to handle partial failures, secure application services, and improve logging for faster incident response. This lesson equips you to minimize disruption and enhance system resilience during production crises.
Overview
In this lesson, we'll look at how complex issues in a production environment can be, and then discuss security concerns within an application.
Identifying and resolving a production issue
At some point, you might encounter a partial or complete production outage of a system you're supporting. Start by determining the scope and the source of the problem. Don't assume the most recent deployment is the cause of the outage; it may be unrelated.
Don't start by restarting everything, as this can make things worse. First look for outages in related systems:
Does your vendor have an outage?
Are other customers of your vendor down too?
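One way to make this "look before you restart" step repeatable is a small read-only triage script. The sketch below is a minimal illustration, not a prescribed tool: the names `ProbeResult`, `run_probes`, and `unhealthy_names` are hypothetical, and the probes are stubs standing in for whatever checks you actually have (a request to your own health endpoint, a look at a vendor's status page, a database ping).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeResult:
    name: str
    healthy: bool
    detail: str = ""


def run_probes(probes: Dict[str, Callable[[], bool]]) -> List[ProbeResult]:
    """Run every probe once and record the outcome; nothing is restarted."""
    results = []
    for name, probe in probes.items():
        try:
            healthy = bool(probe())
            detail = "" if healthy else "probe reported unhealthy"
        except Exception as exc:  # a probe that crashes counts as unhealthy
            healthy = False
            detail = f"probe raised {type(exc).__name__}: {exc}"
        results.append(ProbeResult(name, healthy, detail))
    return results


def unhealthy_names(results: List[ProbeResult]) -> List[str]:
    """Names of the systems worth investigating first."""
    return [r.name for r in results if not r.healthy]


def vendor_timeout() -> bool:
    # Stub: simulates a vendor status check that never answers.
    raise TimeoutError("no response")


# Hypothetical stub probes; replace each with a real, read-only check.
probes = {
    "our_api": lambda: False,
    "vendor_status_page": vendor_timeout,
    "database": lambda: True,
}

results = run_probes(probes)
for r in results:
    print(f"{r.name}: {'ok' if r.healthy else 'DOWN'} {r.detail}".rstrip())
print("Investigate first:", unhealthy_names(results))
```

Because the probes only observe, running the script cannot make the outage worse, and a failing vendor probe alongside a failing local probe is a hint that the root cause may sit outside your own system.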
Your vendor may not be aware of the issue yet. Once you have identified a solution and tested it live, it may require temporarily ignoring a subset of the unit tests (don't do this in a regulated environment). This ...