Two ways to react

Every organisation that makes software, makes mistakes. Sometimes, despite everybody’s best efforts, you end up releasing a bug into production. Customers are confused and angry; stakeholders are panicking.

Despite the pressure, you knuckle down and fix the bug.

Now it gets interesting: you have to deploy your fix to production. Depending on how your organisation works, this could take anywhere between a couple of minutes and a couple of weeks. You might simply run a single command, or you might spend hours shuffling emails around between different managers trying to get your change signed off. Meanwhile, your customers are still confused and angry. Perhaps they’ve started leaving you for your competitors. Perhaps you feel like leaving your job.

With the bug fixed, you sit down and try to learn some lessons. What can you do to prevent this from happening again?

There are two very different ways you can react to this kind of incident. The choice you make speaks volumes about your organisation.

Make it harder to make mistakes

A typical response to this kind of incident is to go on the defensive. Add more testing phases, hire more testers. Introduce mandatory code review into the development cycle. Add more bureaucracy to the release cycle, to make sure nobody could ever release buggy code into production again.

This is a reaction driven by fear.

Make it easier to fix mistakes

The alternative is to accept the reality. People will always make mistakes, and mistakes will sometimes slip through. From that perspective, it’s clear that what matters is to make it easier to fix your mistakes. This means cutting down on bureaucracy, and trusting developers to have access to their production environments. It means investing in test automation, to allow code to be tested quickly, and building continuous delivery pipelines to make releases happen at the push of a button.

This is a reaction driven by courage.

I know which kind of organisation I’d like to work for.

Cucumber 1.3.1 released

Over the weekend we released Cucumber version 1.3.0. This was quickly replaced by 1.3.1 when we realised there was a bug in 1.3.0 that only appeared on Windows.

Along with masses of bugfixes, this release contains the early stages of some serious internal refactoring work planned for release in version 2.0. Although our tests continue to pass, we may have changed untested behaviour, which is why we’ve bumped the minor release number. We’re already aware of one minor bug that’s been introduced[1]. Please let us know if you find any other issues.

New Features

  • Faster features, thanks to in-process Aruba. (Aslak Hellesøy)
  • Add lib to default load path
    (#162 Steve Tooke)
  • Add snippet type to support different types of Ruby snippets.
    (#390 Roel van Dijk)
  • Call nested steps using any language keywords (#433 Tomohiko Himura)

Bugfixes

  • Update WATIR example (#427 Luiz Guilherme D’Abruzzo Pereira)
  • Ensure that cucumber.yml is only parsed once (#416 Steve Tooke)
  • Improve rake task report on failure (#400 Andrey Vakarev)
  • Show details of nested steps in backtraces
    (#69 Steve Tooke)
  • Filter out control characters from CDATA section of JUnit formatter output.
    (#383 @teacup-on-rockingchair)
  • Fix problem with non-ascii characters in file path
    (#150 Oleg Sukhodolsky)
  • Fix problem loading ruby files in project root directory
    (#269 Steve Tooke)
  • Fix JsonPretty formatter problem
    (#197 Oleg Sukhodolsky)
  • Don’t display multi-line strings when --no-multiline is passed
    (#201 David Kowis)
  • Moved the profile information output to a callback of the formatter
    (#175 David Kowis)
  • Fix html formatter to not mark skipped/unimplemented steps as failed
    (#337 Oleg Sukhodolsky)
  • Allow duplication for format+output pair in command line
    (#434 David Lantos)
  • Better delegation to IO in Cucumber::Formatter::Interceptor::Pipe
    (#312 Oleg Sukhodolsky)

[1] https://github.com/cucumber/cucumber/issues/438

A coding dojo story

It was 2008, and I was at the CITCON conference in Amsterdam. I’d only started going to conferences that year, and was feeling as intimidated as I was inspired by the depth of experience in the people I was meeting. It seemed like everyone at CITCON had written a book, their own mocking framework, or both.

I found myself in a session on refactoring legacy code. The session used a format that was new to me, and to most of the people in the room: a coding dojo.

Our objective, I think, was to take some very ugly, coupled code, add tests to it, and then refactor it into a better design. We had a room full of experts in TDD, refactoring, and code design. What could possibly go wrong?

One thing I learned in that session is the importance of the “no heckling on red” rule. I watched as Experienced Agile Consultant after Experienced Agile Consultant cracked under the pressure of criticism from the baying crowd of Other Experienced Agile Consultants. With so many egos in the room, everyone had an opinion about the right way to approach the problem, and nobody was shy of sharing his opinion. It was chaos!

We got almost nowhere. As each pair switched, the code lurched back and forth between different ideas for the direction it should take. When my turn came around, I tried to shut out the noise from the room, control my quivering fingers, and focus on what my pair was saying. We worked in small steps, inching towards a goal that was being ridiculed by the crowd as we worked.

The experience taught me how much a coding dojo is about collaboration. The rules about when to critique code and when to stay quiet help to keep a coding dojo fun and satisfying, but they also teach you bigger lessons about working with each other day to day.

Please consider supporting my work on Cucumber through gittip

My first commit to Cucumber was in 2008. Since then I’ve poured countless hours into the project and the community around it, whether directly as commits to the code, or answering questions on this mailing list, or writing blog articles. I am independent, so those hours have all been put in on my own time.

Why do I do that?

It’s a complicated question. My consulting business is built around BDD, so I have a vested interest in the success of the Cucumber project. It’s more than that though. I have a firm belief that the difference between a software project that’s fun to work on and one that’s miserable is the communication between people. I’ve seen how Cucumber can improve that communication, or at least make it clear to people when it needs to improve. It’s not a silver bullet, but I do genuinely think it can help to make a software team more enjoyable to work on. I also love getting to collaborate with so many of you on code that nobody else owns, that we can all enjoy.

I have a family now, and as anyone else with young kids will know, my time feels extremely precious. Yet I feel a responsibility to all of you to keep Cucumber-Ruby’s code healthy and full of the features you need.

Will you help me?

A couple of weeks ago Olaf Lewitz pointed me to this TED talk by Amanda Palmer about “The Art of Asking”, and it occurred to me to ask all of you, if you appreciate the work I do on the Cucumber project, to consider making a regular donation to me on gittip. You can donate as little as 25 cents per week, but each contribution keeps me motivated and tells me you appreciate my work.

Thanks!

Optimising a slow build? You’re solving the wrong problem

At the time I left Songkick, it took 1.5 hours to run all the cukes and rspec ‘unit’ tests on the big ball of Rails. We were already parallelising over a few in-house VMs at the time to make this manageable, but it still took 20 minutes or so to get feedback. After I left, the team worked around this by getting more slave nodes from EC2, and the build time went down to under 10 minutes.

Then guess what happened?

They added more features to the product, more tests for those features, and the build time went up again. So they added more test slave nodes. In the end, I think the total build time was something like 15 hours. 15 fucking hours! You’re hardly going to run all of that on your laptop before you check in.

The moral of this story: if you optimise your build, all you’ll do is mask the problem. You haven’t changed the trajectory of your project, you’ve just deferred the inevitable.

The way Songkick solved this took real courage. First, they started with heart-to-heart conversations with their stakeholders about removing rarely-used features from the product. Those features were baggage, and once the product team saw what it was costing them to carry that baggage, they were persuaded to remove them.

Then, with a slimmed-down feature set, they set about carving up their architecture, so that many of those slow end-to-end Cucumber scenarios became fast unit tests for simple, decoupled web service components. Now it takes them 15 seconds to run the tests on the main Rails app. That’s more like it!

So by all means, use tricks to optimise and speed up the feedback you get from your test suite. In the short term, it will definitely help. But realise that the real problem is your architecture: if your tests take too long, the code you’re testing has too many responsibilities. The sooner you start tackling this problem head-on, the sooner you can start enjoying the benefits.

Cucumber 1.2.2 Released

This is a maintenance release, but marks a new period in Cucumber’s life as it was released by our new team member Oleg Sukhodolsky. Oleg has been doing a fantastic job since he joined the team a few weeks ago, closing tickets like a boss.

Here’s a summary of what’s in the release:

New Features

  • Ruby 2.0.0 support (#377 Matt Wynne & #357 @charliesome)
  • Capture duration value for json formatter (#329 Rick Beyer)
  • Added support for Hindi (hi), although some systems may need to install fonts which support the Devanagari script.
  • Obey program suffix when invoking bundler (#324 Eric Hodel)

Bugfixes

  • Fix class loading problems with the --format option (#345, #346 @ksylvest)
  • Exit with failure status when interrupted (#299 @aaronjensen)
  • Cannot map table headers after table hashes is referenced (#275 @chrisbloom7 / Matt Wynne)
  • (before|after)_step aren’t called when scenario outline’s table is processed (#284 Oleg Sukhodolsky)
  • Raise exception when remote socket end disconnects using wire protocol (#348 @rdammkoehler)
  • Fix --dry-run option ignored when set via profile (#248 / #255 Igor Afonov)
  • Clearer suggested Ruby code for undefined steps (#328 / #331 @martco)
  • Fix exception in Html formatter with --expand mode and undefined steps (#336 Roberto Decurnex)
  • Fix Table.diff! problem with :surplus_row => false and interleaved surplus rows (#220)

Building software backwards

I am utterly dismayed by the number of so-called Agile teams I meet who are still, after all this time, building software backwards.

[Image: Toast]

What do I mean by that? Let’s defer to the great W. Edwards Deming as he ridiculed the approach of 1970s American manufacturing to quality:

Let’s make toast the American way! I’ll burn it, you scrape it.

What an incredibly wasteful way to make things: Have one team cobble it together, have another team find all the mistakes, then send all the mistakes back to the first team to be fixed. Still, as long as we’re tracking it all with story points on our burn-down chart, and talking about it in our stand-up meetings, we must be Agile, right?

Wrong. A really Agile team would learn how to prevent those mistakes from happening in the first place.

Yet time after time, I come across Scrum teams with a separate test team working an iteration behind the developers, finding all their mistakes.

Here’s why I think this is such a bad idea:

  1. The cost of fixing a defect increases exponentially with time. Defects that fester undetected in the code become harder and harder to remove safely. The longer the defective code is there, the more chance there is that other code will be built on top of it. When the original mistake is rectified, that other code may be broken by the change.

  2. The cheapest fix is the one you never had to make. When you write tests up-front, you often spot edge-cases as you write the tests; then you build in support for those edge-cases as you go. If you build software backwards, leaving a team of testers to discover those edge cases, you’ll pay a large penalty for having to go back and introduce the change to existing code.

  3. You discover requirements too late. When you practise test-driven development, you prepare for any change to the software by first defining how it should behave when you’re done. If you build software backwards, you may never discover the behaviour you really want until the testers get their hands on it and start asking all those awkward questions. The toast has already been burned.

  4. Continuous Delivery will always be a pipe-dream. Many companies, perhaps your competitors, deploy to production several times a day, just minutes after a developer checks in their change. If you have to wait for a separate team of testers to scrape all your burned toast, this will only ever be a pipe-dream for you.

When I talk to people about adopting BDD, the most frequent objection I hear is that it must take longer. This is true, in a sense, because it does take longer for a change to get to the stage where a developer is done with it. If you’re used to burning toast you’ll find this frustrating, because you don’t realise yet that the time and effort you’re putting in to write the tests up-front won’t pay off until later, when you hand the change to your testers and they can’t find anything wrong with it.

To do this takes a leap of faith, so hold your nerve. It will be worth it.


Stop building software backwards: come to BDD Kickstart

If you’d like to hear more about these ideas and learn concrete techniques to make them work in your organisation, I’m teaching a course from March 11-13 in Edinburgh, UK just for you. Click here to find out more.

The problem with solutions

First date. You’re out for dinner and the meal has gone well. Good food, great conversation. The waiter brings the bill, and you ask to pay with your credit card:

May I submit my billing info to your merchant account via your payment gateway?

Your date looks at you askance: nobody talks like that. Do they?

In any software project, there are two main domains: the problem domain and the solution domain. The problem domain is where your customers live. The programmers and other technicians working on solving that problem operate in the solution domain.

Both of those domains contain funny terminology, a dialect of English you can only learn by spending time amongst the people who speak it. That means you can figure out which domain you’re in by listening to the language being used.

Here are some examples:

[Image: Solutions vs Problems]

This specialised terminology is inevitable. Both business people and technicians need to use specialist language to communicate effectively with their peers. But what about when those two tribes need to speak to each other?

David West’s Object Thinking eloquently describes the natural divide between these two domains, and the problems it causes. In his view, our job as programmers is to build a model of the problem, and allow a solution to naturally fall out of that model. As soon as we start thinking in solutions, we lose sight of the problem and create ugly solutions. This implies that, as programmers, we need to immerse ourselves in the problem domain.

Using executable acceptance tests written in plain language helps you to keep your language and thinking rooted in the problem domain. This is why I have such an objection to Cucumber scenarios that talk about clicking buttons and filling in fields: they’re jumping into the solution domain too early.

Instead, I like to use Cucumber features as the place where we document our understanding of the problem domain. Writing those features collaboratively with people from both domains helps us to grow that understanding, and increase the overlap between the two domains.

TDD vs BDD

I regularly find myself explaining to people the difference between TDD (Test-Driven Development) and BDD (Behaviour-Driven Development). There still seems to be a lot of confusion over this, so I wanted to write this up for reference.

Late last year I was interviewed for a virtual panel on InfoQ along with Dan, Gojko, and Liz. Probably the most interesting part of that conversation covered the difference between TDD and BDD. Or rather the lack of any great difference.

We’ll start with some snippets from that discussion.

Both TDD and BDD include acceptance testing

One common misconception is that TDD is what you do when you’re unit-testing, and BDD is what you do when you’re writing customer-facing acceptance tests. Here’s Dan North on that point:

TDD – as originally described – is also about the behaviour of entire systems.
Kent [Beck] specifically describes it as operating on multiple levels of abstraction, not just
“down in the code”. BDD is equally important in this space, because describing the
behaviour of systems is fractal: you can describe different granularities of behaviour
from the entire application right down to individual small components, classes or
functions.

Extreme Programming has always talked about writing acceptance tests, sometimes also called functional tests, to describe what the customer expects to be done at the end of an iteration.

So this is nothing new. What’s new is how we explain it, and therefore how successful teams end up being in making it work for them.

BDD describes TDD done well

When Dan was working as a coach teaching TDD, he found that it was easier to get people to understand the principles of TDD if he stopped using the word ‘test’:

My experiences as a coach told me people were missing the point, with all this talk
of unit tests, acceptance tests, functional tests, integration tests… Kent Beck’s
style of TDD is a very smart way to develop software, so I tried removing the word
“test” when I was coaching it, replacing it with things like behaviour, examples,
scenarios etc. The result was very encouraging: People seemed to “get” TDD much
quicker when I avoided referring to testing.

When Aslak and I wrote the Cucumber Book, I wrote this description of BDD:

BDD builds upon TDD by formalising the good habits of the best TDD practitioners.

That’s basically all there is to it. We want to re-explain TDD in a way that highlights the habits that successful TDD practitioners have been using for over a decade.

So what are those good habits?

Specifically, I think those good habits are:

  1. Working outside-in, starting from a business or organisational goal
  2. Using examples to clarify requirements
  3. Developing and using a ubiquitous language

Working outside-in seems obvious to habitual TDD practitioners, but many teams seem to limit themselves to doing this at the level of small units of code. Business-level black-box testing is still done manually, or automated as a check after the code has already been implemented.

This misses out on the major benefit of working outside-in, which is having the requirement challenged: if you need to explain to a computer how to check the requirement, you’ll need to be damn sure you understand it yourself. If you don’t (and you often don’t) it’s much cheaper to find that out before you write the code.

Examples have always been a great way to make sure you really understand a requirement. What BDD does is formalise this by encouraging you to use scenarios to describe behaviour. These examples provide the perfect bridge between the business-facing and technology-facing sides of a team: they’re just formal enough that you can get a computer to check them, but anyone on the team can read them and make sure they’re describing behaviour that they actually want.
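To make that idea a little more concrete, here is a minimal sketch in plain Ruby of examples that are readable by the whole team and checkable by a computer. The shipping rule, its threshold, and every name in the code are invented for illustration; none of it comes from Cucumber or from a real project, and a real team would more likely express these rows as scenarios.

```ruby
# Hypothetical business rule, invented for illustration only:
# orders of 50.00 or more ship free; smaller orders pay a flat 4.99.
# Amounts are held in cents to avoid floating-point rounding.
FREE_SHIPPING_THRESHOLD_CENTS = 5_000
FLAT_SHIPPING_CENTS = 499

def shipping_cost_cents(order_total_cents)
  order_total_cents >= FREE_SHIPPING_THRESHOLD_CENTS ? 0 : FLAT_SHIPPING_CENTS
end

# Each example is a row that anyone on the team can read and challenge.
EXAMPLES = [
  { description: "small order pays the flat rate", total: 1_000,  expected: 499 },
  { description: "just under the threshold",       total: 4_999,  expected: 499 },
  { description: "exactly on the threshold",       total: 5_000,  expected: 0 },
  { description: "well over the threshold",        total: 12_000, expected: 0 }
].freeze

# Run every example against the rule and report whether each one holds.
def check_examples(examples)
  examples.map do |example|
    actual = shipping_cost_cents(example[:total])
    { description: example[:description], passed: actual == example[:expected] }
  end
end

check_examples(EXAMPLES).each do |result|
  puts "#{result[:passed] ? 'PASS' : 'FAIL'}: #{result[:description]}"
end
```

The row worth arguing about is "exactly on the threshold": writing it down forces someone to decide whether an order of exactly 50.00 ships free, which is the kind of question that is much cheaper to settle before the code exists.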

The GOOS Book, written by two of the best TDD practitioners in the business, frequently highlights the importance of domain language in our programs. In software teams, communication is probably the biggest overhead you have, and you make that communication a lot harder when you allow different dialects of terminology to be used by different parts of the team. Developing and then sticking to a consistent language takes deliberate effort, but it’s something that the best TDD practitioners have long learned will give them a significant advantage.

My experience is that BDD’s emphasis on collaboration, and the use of business-readable, executable specifications, means that this shared language develops much more quickly. When everyone is involved in writing documentation that describes what the system should do, they all get a chance to learn the language of the domain together.

So BDD really isn’t all that different to TDD. What BDD adds is a clear emphasis on what it takes to make TDD succeed.

Is Cucumber just a scam?

David Heinemeier Hansson recently wrote on his blog:

Don’t use Cucumber unless you live in the magic kingdom of
non-programmers-writing-tests (and send me a bottle of fairy dust if you’re there!)

Well, good news, readers! The magic kingdom is real! I’ve been there! Look, I even have a bottle of fairy dust. I keep it right next to the teabags:

[Image: Fairy Dust]

Admittedly, that fairy dust is pretty good stuff, and I’ve been hitting it hard lately. Maybe it’s been making me hallucinate?

I decided I’d better check with some other people I know who’ve been to the magic kingdom too.

Lisa Clark told me a wonderful story about how her team use BDD:

I live in a magic kingdom and work in a castle (the Rackspace Castle). I’ve been using
Cucumber on a RESTful web service project since project inception over a year ago…
Currently our BDD sessions include the BA, QA, Dev Lead, and Developer. We will pull in
the PO or architects as needed.

Here’s how Lisa’s team have benefitted from using BDD:

We’ve found value in the BDD documentation process and obtaining a shared understanding of
what we’re building before it’s actually built. The added benefit of having executable
requirements and an automated functional test suite that’s ready when the code is ready is
icing.

Hear that, readers? Lisa doesn’t see the primary benefit of Cucumber as being the testing; it’s the shared understanding that the team have built from writing the tests together. In Lisa’s magic kingdom, non-programmers don’t sit about writing tests on their own; they collaborate. And look at the confidence that the whole team gets from that shared understanding:

The developers on my team have a strong level of confidence when delivering a
story that all scenarios have been coded and are working as we said they should be. The
QA knows up front exactly what we’re delivering with the story. I have confidence that
regardless of which developer owned the story, all expected scenarios are coded and
tested.

Of course there are other ways of getting that shared understanding and confidence. Working in small, cross-functional teams helps, and keeping the team together for a long time so that everyone becomes a domain expert is sadly under-appreciated by most big companies. Getting around a whiteboard or a set of design mock-ups to talk through a new feature is also invaluable, and some people find this is enough for them.

It really depends on the complexity of your domain: teams that work with complex, poorly understood business rules and requirements need all the tools they can get their hands on to manage that complexity. Many people I know have found this is the key benefit of using Cucumber: in a strange new domain, having a place to write down what you’re learning can really help you to stay sane.

I’ve never spoken to him about it, but my guess is that this is the reason DHH doesn’t see the need for Cucumber: he works on a team of good communicators who already have a wealth of domain expertise. In that context I might not use Cucumber either.

What I thought was most interesting about Lisa’s story was what happened when the team were under pressure and decided to throw off their BDD shackles:

We realized the value of our process when we bypassed it in order to quickly deliver a
number of features for a high profile effort. We absorbed a new BA, QA team, and Devs
that were unfamiliar with BDD, had tight timelines, and hoped to quickly knock out
features. These features have had a higher number of defects, did not meet their delivery
timelines, have a lack of automated testing (from both the developer and QA test fronts),
and general hesitation from developers in touching this code in fear of breaking
something. After our experience with these non-BDD implemented features the team (with
the support of management) has committed to full BDD for all new features.

Higher defects, missed deadlines, hesitation from developers to touch the code in case they break something… Does that sound familiar to you?

It certainly doesn’t sound like much fun. Pass the fairy dust.