Acceptance Tests Trump Unit Tests

At work, we have been practising something approximating Acceptance Test Driven Development now for several months. This means that pretty much every feature of the system that a user would expect to be there, has an automated test to ensure that it really is.

It has given me a whole new perspective on the value of tests as artefacts produced by a project.

I made a pledge to myself when I started this new job in August that I would not (knowingly) check in a single line of code that wasn’t driven out by a failing test. At the time, I thought this would always mean a failing unit test, but I’m starting to see that this isn’t always necessary, or in fact even wise.

Don’t get me wrong. Unit testing is extremely important, and there’s no doubt that practising TDD helps you to write well-structured, low-defect code in an really satisfying manner. But I do feel like the extent to which TDD, at the level of unit testing alone, allows for subsequent changes to the behaviour of the code, has been oversold.

If you think you’re doing TDD, and you’re only writing unit tests, I think you’re doing it wrong.

As new requirements come in, the tensions influencing the design of the code shift. Refactoring eases these tensions, but by definition means that the design has to change. This almost certainly means that some, often significant, portion of the unit tests around that area of the code will have to change too.

I struggled with this for a long time. I had worked hard on those tests, for one thing, and was intuitively resistant to letting go of them. More than that, I knew that somewhere in there, they were testing behaviour that I wanted to preserve: if I threw them out, how would I know it still worked?

Yet those old unit tests were so coupled to the old design that I wanted to change…

Gulliver

In my mind, I have started to picture the tests we write to drive out a system like little strings, each one pulling at the code in a slightly different direction. The sum total of these tensions is, hopefully, the system we want right now.

While these strings are useful to make sure the code doesn’t fall loose and do something unexpected, they can sometimes mean that the code, like Gulliver in the picture above, is to restrained and inflexible to change.

The promise of writing automated tests up front is regression confidence: if every change to the system is covered by a test, then it’s impossible to accidentally reverse that change without being alerted by a failing test. Yet how often do unit tests really give us regression alerts, compared to the number of times they whinge an whine when we simply refactor the design without altering the behaviour at all? Worse still, how often do they fail to let us know when the mocks or stubs for one unit fail to accurately simulate the actual behaviour of that unit?

Enter acceptance tests.

By working at a higher level, acceptance tests give you a number of advantages over unit tests:

  • You get a much larger level of coverage per test
  • You get more space within which to refactor
  • You will test through layers to ensure they integrate correctly
  • They remain valuable even as underlying implementation technology changes

Admittedly, the larger level of coverage per test has a downside: When you get a regression failure, the signpost to the point of failure isn’t as clear. This is where unit tests come in: if you haven’t written any at all yet, you can use something like the saff squeeze to isolate the fault and cover it with a new test.

They’re also much slower to run, which can be important when you’re iterating quickly over changes to a specific part of the system.

To be clear, I’m not advocating that you stop unit testing altogether. I do feel there’s a better balance to strike, though, than forcing yourself to get 100% coverage from unit tests alone. They’re not always the most appropriate tool for the job.

To go back to the metaphor of the pulling strings, I think of acceptance tests as sturdy ropes, anchoring the system to the real world. While sometimes the little strings will need to be cut in order to facilitate a refactoring, the acceptance tests live on.

The main thing is to have the assurance that if you accidentally regress the behaviour of the system, something will let you know. As long as every change you make is driven out by some kind of automated test, be it at the system level or the unit level, I think you’re on the right track.