Why Property-based Testing?

Take a look at property-based testing and its advantages over standard testing techniques.

Testing

Testing can get pretty boring, but it is a necessity that we can’t avoid. Tests are critical for the safety and longevity of programs, especially for those that change over time. They can also help us design programs properly, because they make us approach the code as users as well as implementers.

Take a look at this example test that checks that an Erlang function can take a list of presorted lists and always return them merged as one single sorted list:

merge_test() ->
    [] = merge([]),
    [] = merge([[]]),
    [] = merge([[],[]]),
    [] = merge([[],[],[]]),
    [1] = merge([[1]]),
    [1,1,2,2] = merge([[1,2],[1,2]]),
    [1] = merge([[1],[],[]]),
    [1] = merge([[],[1],[]]),
    [1] = merge([[],[],[1]]),
    [1,2] = merge([[1],[2],[]]),
    [1,2] = merge([[1],[],[2]]),
    [1,2] = merge([[],[1],[2]]),
    [1,2,3,4,5,6] = merge([[1,2],[],[5,6],[],[3,4],[]]),
    [1,2,3,4] = merge([[4],[3],[2],[1]]),
    [1,2,3,4,5] = merge([[1],[2],[3],[4],[5]]),
    [1,2,3,4,5,6] = merge([[1],[2],[3],[4],[5],[6]]),
    [1,2,3,4,5,6,7,8,9] = merge([[1],[2],[3],[4],[5],[6],[7],[8],[9]]),
    Seq = seq(1,100),
    true = Seq == merge(map(fun(E) -> [E] end, Seq)),
    ok.

This is slightly modified code taken from the Erlang/OTP test suites for the lists module, one of the most central libraries in the entire language. The developer is trying to think of all the possible ways the code can be used and make sure that the result is predictable. We could probably think of another ten or thirty lines to add, and each could still be significant and explore the same code in a somewhat different way. Nevertheless, it’s perfectly reasonable, usable, readable, and effective test code. The problem is that it’s so repetitive that a machine could do it. In fact, that’s exactly why traditional tests are boring. They’re carefully laid-out instructions that tell the machine which test to run every time, with no variation, as a safety check.

Writing tests like this by hand is repetitive and invites human error. Let’s learn how to write a property test that will do the work of checking line by line for us.

Property-based testing

Property-based testing is one of the software development practices that has generated the most excitement in the last few years. It promises better, more solid tests than nearly any other tool out there, with very little code. In turn, the software we develop with it should improve as well. Although the learning curve is steep, property-based testing offers an automated way to build and maintain quality software. Here’s what an equivalent property-based test could look like:

sorted_list(N) -> ?LET(L, list(N), sort(L)).
prop_merge() ->
    ?FORALL(List, list(sorted_list(pos_integer())),
            merge(List) == sort(append(List))).

Not only is this test shorter, at just four lines of code, but it also covers more cases. In fact, it can cover hundreds of thousands of cases. Right now, the property-based test probably looks like a bunch of gibberish that can’t be executed, at least not without the PropEr framework. But in due time, this should become easy for us to read, while taking less time than a traditional test.
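To make the idea less opaque, here is a minimal sketch in plain Erlang of the loop that a property-based framework automates: generate a random input, check that the rule holds, and repeat. It is not how PropEr is implemented, and it leaves out conveniences such as reducing a failing input to a smaller one. The module name merge_prop_sketch, the helper function names, and the size limits (up to 5 lists of up to 10 integers between 1 and 100) are all assumptions made for illustration.

```erlang
-module(merge_prop_sketch).
-export([check/1, prop_holds/1, random_sorted_lists/0]).

%% Hypothetical generator: a list of 0 to 5 sorted lists.
%% lists:seq(1, 0) returns [], so an empty outer list is possible.
random_sorted_lists() ->
    [random_sorted_list() || _ <- lists:seq(1, rand:uniform(6) - 1)].

%% Hypothetical generator: a sorted list of 0 to 10 positive integers.
random_sorted_list() ->
    lists:sort([rand:uniform(100) || _ <- lists:seq(1, rand:uniform(11) - 1)]).

%% The rule from the property in the text: merging presorted lists
%% gives the same result as appending them all and sorting once.
prop_holds(Lists) ->
    lists:merge(Lists) == lists:sort(lists:append(Lists)).

%% Check the rule against N random inputs. Returns ok if every input
%% passes, or {failed, Input} for the first input that breaks the rule.
check(0) ->
    ok;
check(N) when N > 0 ->
    Lists = random_sorted_lists(),
    case prop_holds(Lists) of
        true -> check(N - 1);
        false -> {failed, Lists}
    end.
```

Calling merge_prop_sketch:check(1000) in the Erlang shell checks the rule against 1,000 random inputs and returns ok when all of them pass. The four-line property expresses the same rule, while the framework supplies the generation and the loop.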

In this chapter, we’ll see the results that we should expect from property-based testing, and cover the principles behind the practice and how they influence the way we write tests. We’ll also pick the tools we need to get started since property-based testing does require a framework to be useful.

Promises of property-based testing

Property-based tests are different from traditional tests and require a different skill set. Good property-based testing is a learned and practiced skill, much like playing a musical instrument or using a paintbrush. We’ll always have areas to improve and constantly be finding ways to innovate our approach. Experts at property-based testing can do some pretty amazing stuff with code.

Even beginners can benefit greatly from property-based testing. We’ll be able to write short, simple tests that automatically comb through code the way only the most thorough tester could. Our code coverage should rise even as we modify the program without changing the tests. We’ll even be able to use these tests to find new edge cases without needing to modify anything.

With a bit more experience, we’ll be able to write straightforward integration tests of stateful systems that find complex and convoluted bugs no one would even think to look for. Property testing also teaches us what to look out for and helps us uncover hidden bugs.

Example 1: Project FIFO

Overall, we’ll find that property-based testing doesn’t just involve using a bunch of tools to automate boring tasks, but is actually a wholly different way to approach testing and software design itself. For example, Thomas Arts’ slide set and presentation from the Erlang Factory 2016 conference mentions using QuickCheck, the canonical property-based testing tool, to run tests on Project FIFO, an open-source cloud project. With a mere 460 lines of property tests, they covered 60,000 lines of production code and uncovered twenty-five important bugs, including:

  • Timing errors
  • Race conditions
  • Type errors
  • Incorrect use of library APIs
  • Errors in documentation
  • Errors in the program logic
  • System limits errors
  • Errors in fault handling
  • One hardware error

Considering that some studies estimate that developers average six software faults per 1,000 lines of code, finding twenty-five important bugs using 460 lines of tests is quite a feat. That’s over fifty bugs per 1,000 lines of test code (25 / 460 × 1,000 ≈ 54), with each test line covering roughly 130 lines of production code (60,000 / 460 ≈ 130).

Example 2: Google’s levelDB

Let’s take a look at some more expert work. Joseph Wayne Norton ran a QuickCheck suite of under 600 lines over Google’s levelDB to find specific sequences of seventeen and thirty-one calls that could corrupt databases with ghost keys. No matter how dedicated someone is to the task, it would have been very difficult to come up with the proper sequence of thirty-one calls required to corrupt a database. Again, this required a surprisingly small amount of code to find a high number of nontrivial errors in software that was otherwise already tested and running in production.

Property-based testing is so impressive that it has wedged itself into multiple industries, including:

  • Mission-critical telecommunication components
  • Databases
  • Components of cloud providers’ routing
  • Certificate-management layers
  • IoT platforms
  • Cars

It’s important to remember that property-based testing is not reserved for advanced programmers. The effort required to improve is continuous, but the benefits of property-based testing make that effort well worth it. Just remember that each wall we hit reveals an opportunity for improvement. We’ll get there together, one step at a time.