Bulkhead Pattern


This pattern enforces resource partitioning and damage containment to preserve partial functionality when a failure occurs. The Bulkhead pattern is also known as the Failure Containment Principle and the Damage Control Principle.
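To make resource partitioning concrete, here is a minimal sketch in Python. It assumes the resource being partitioned is the number of concurrent calls, and it uses one semaphore per compartment. This is only one common way to build a bulkhead, not an implementation taken from the text. The names Bulkhead, BulkheadFullError, and max_concurrent are hypothetical and chosen for illustration.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a compartment has no free slots (hypothetical name)."""


class Bulkhead:
    """Caps the number of concurrent calls allowed into one compartment."""

    def __init__(self, name, max_concurrent):
        self.name = name
        # Assumption: the partitioned resource is "concurrent calls".
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Fail fast instead of queueing, so a saturated compartment
        # rejects new work rather than holding its callers hostage.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError(f"bulkhead '{self.name}' is full")
        try:
            return func(*args, **kwargs)
        finally:
            # Always free the slot, even if func raised.
            self._slots.release()


if __name__ == "__main__":
    reports = Bulkhead("reports", max_concurrent=1)
    payments = Bulkhead("payments", max_concurrent=1)

    def report_job():
        # While this job holds the only "reports" slot, a second report
        # is rejected, but the "payments" compartment is unaffected.
        try:
            reports.call(lambda: "second report")
        except BulkheadFullError as exc:
            print(exc)
        return payments.call(lambda: "payment ok")

    print(reports.call(report_job))
```

Each compartment owns its own slots, so exhausting the reports compartment leaves payments available: the failure is contained and partial functionality survives. Rejecting calls immediately is a design choice made for this sketch; a bounded wait with a timeout would be an equally valid variant.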

Context and problem

The Titanic disaster has been studied extensively over the years, and it offers many lessons for the IT industry. The ship sank for many reasons, including the following:

  • Design flaws (the watertight compartments did not reach high enough, a choice made to allow more living space in first class).
  • Implementation/construction faults (the three million rivets that held the different parts of the Titanic together were found to be made of substandard iron, and the collision with the iceberg badly damaged them).
  • Operational failures (the iceberg warning was given too late, and the ship was traveling too fast to react to any warning).

We can identify similar issues in software projects today. Fortunately, they have not cost as many human lives so far, and as a consequence, our industry learns more slowly than others. That does not mean software failures have no consequences, however. Plenty of examples, such as the Therac-25 machine, show that software defects can cause human deaths.
