Expanding System Lifespan

Learn about the system's longevity and factors that affect it.

System longevity

The major dangers to any system’s longevity are memory leaks and data growth. Both kinds of sludge can stop a system in production and both are rarely caught during testing.

Testing makes problems visible so we can fix them. Following Murphy’s Law, whatever we do not test against will happen. Therefore, if we do not test for crashes right after midnight or out-of-memory errors in the application’s forty-ninth hour of uptime, those crashes will happen. If we do not test for memory leaks that show up only after seven days, we will have memory leaks after seven days.

The trouble is that applications never run long enough in the development environment to reveal their longevity bugs. How long do we usually keep an application server running in our development environment? A safe bet is that the average life span is less than the length of a sitcom on Netflix.
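To see why short runs hide these bugs, it helps to do the arithmetic. The sketch below estimates how many hours of uptime a steady leak needs before it consumes the available memory. The function name and every number in it (1 GiB of headroom, 2 KiB leaked per request, the two request rates) are hypothetical values chosen for illustration, not measurements from any real system.

```python
def hours_until_exhaustion(free_heap_bytes, leak_bytes_per_request, requests_per_second):
    """Estimate hours of uptime before a steady leak consumes the free heap."""
    if leak_bytes_per_request <= 0 or requests_per_second <= 0:
        raise ValueError("leak size and request rate must be positive")
    leaked_per_hour = leak_bytes_per_request * requests_per_second * 3600
    return free_heap_bytes / leaked_per_hour


# Hypothetical numbers: 1 GiB of headroom, 2 KiB leaked per request.
headroom = 1024 ** 3
leak = 2 * 1024

# A developer clicking around now and then vs. steady production traffic.
dev_hours = hours_until_exhaustion(headroom, leak, requests_per_second=0.1)
prod_hours = hours_until_exhaustion(headroom, leak, requests_per_second=3)
print(f"dev: {dev_hours:.0f} h, production: {prod_hours:.1f} h")
```

With these assumed numbers, the development server would need roughly two months of uninterrupted uptime to fail, while the same leak takes production down in about two days. A server that is restarted every few hours never gets close.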

In QA, it might run a little longer but is probably still recycled at least daily, if not more often. Even when it is up and running, it’s not under continuous load. These environments are not conducive to long-running tests, such as leaving the server running for a month under daily traffic.

These sorts of bugs usually aren’t caught by load testing either.

Load testing

A load test runs for a specified period of time and then quits. Load-testing vendors charge a lot of money per hour, so nobody asks them to keep the load running for a week at a time. Our development team probably shares the corporate network, so we can't disrupt vital corporate activities such as email and web browsing for days at a time.

So, how do we find these kinds of bugs? The only way to catch them before they bite us in production is to run our own longevity tests. If we can, we should set aside a developer machine and have it run JMeter, Marathon, or some other load-testing tool.
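If a dedicated tool is not available, even a small script can keep a trickle of requests flowing for days. The following is a minimal sketch of that idea using only the Python standard library. It is a stand-in for a real load-testing tool, not a replacement for one. The names run_longevity_test and fetch_home_page, the URL, and the timing values are all assumptions made for this example.

```python
import time
import urllib.request


def run_longevity_test(send_request, duration_seconds, interval_seconds,
                       clock=time.monotonic, sleep=time.sleep):
    """Call send_request at a steady, gentle pace until duration_seconds elapse.

    Returns counts of successes and failures plus the slowest latency seen.
    """
    stats = {"ok": 0, "failed": 0, "max_latency": 0.0}
    deadline = clock() + duration_seconds
    while clock() < deadline:
        started = clock()
        try:
            send_request()
            stats["ok"] += 1
        except Exception:
            # A longevity run should survive individual failures and keep going.
            stats["failed"] += 1
        stats["max_latency"] = max(stats["max_latency"], clock() - started)
        sleep(interval_seconds)
    return stats


def fetch_home_page():
    # Hypothetical target URL; point this at the system under test.
    with urllib.request.urlopen("http://localhost:8080/", timeout=10) as response:
        response.read()
```

A week-long run at one request every two seconds would then be run_longevity_test(fetch_home_page, duration_seconds=7 * 24 * 3600, interval_seconds=2). The load is deliberately light. The point is elapsed time under continuous traffic, not peak throughput, so a rising failure count or a creeping maximum latency late in the run is the signal to investigate.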
