Sandi Metz’s Practical Object-Oriented Design comes to London!

I’m delighted to announce that I’ll be joining Sandi Metz to teach two courses on object-oriented design this summer in London. We’re nicknaming it POODL.

You have two options:

  1. A two-day class running from July 3-4
  2. A more in-depth three-day class running from June 25-27

Sandi mentoring at a RailsGirls workshop

Sandi is not only one of the most experienced object thinkers I know, she’s also a wonderful teacher. If you’ve read her book, Practical Object-Oriented Design in Ruby, you’ll already know her knack for bringing scary-sounding design principles gently down to earth with simple, practical examples.

Not just for Rubyists

POODL is definitely not just for Rubyists. We’ll teach the examples in Ruby, but any programmer familiar with an Object-Oriented language will be absolutely fine. The challenging stuff will be learning the trade-offs between different design patterns, and how to refactor between them.

The feedback from this course in the US has been amazing, and I’m really looking forward to being a part of it over here in Europe.

Book your ticket now!



Bring your Product Owner to BDD Kickstart for free!

BDD is powerful stuff, and it’s much more powerful when your product owner understands the benefits.

We love having product owners at BDD Kickstart. The first day is all about the collaborative aspects of BDD where we learn how to break down and describe requirements using examples. We find that product owners really enjoy it.

That’s why we’re announcing a special promotion for our next public course in London. Use the discount code “BYPO” when booking your three-day ticket, and we’ll throw in a free ticket worth £450 that you can use to bring your product owner, project manager or scrum-master along to day one.

These tickets are limited, so book now before they sell out.




Costs and benefits of test automation

I think balance is important. Whenever I teach people about BDD or automated testing, we make a list of the costs and benefits of test automation.

The lists typically look something like this:


Benefits

  • thorough analysis of a requirement
  • confidence to refactor
  • quick feedback about defects
  • repeatable test
  • living / trustworthy documentation
  • frees up manual testers for more interesting exploratory testing


Costs

  • time spent learning how to write a test
  • time spent writing the test
  • waiting for the test to run
  • time spent maintaining the test
  • having a false sense of security when an automated test is passing

The benefits are great, but don’t underestimate the costs. If your team are in the early stages of adopting test automation, you’re going to invest a lot of time in learning how to do it well. You’ll make some mistakes and end up with tests that are hard to maintain.

Even once you’re proficient, it’s important for each test to justify its existence in your test suite. Does it provide enough of a benefit to justify the investment needed to write it, and the ongoing maintenance cost? Is there a way to bring down the ongoing cost, for example by making it faster?

I also find that listing costs and benefits helps to tackle skepticism. Having a balanced discussion makes space for everyone’s point of view.

I’m teaching a public BDD course in London, 4-6 December. If you’d like to take part you can sign up here:




Cucumber-Ruby 2.0 moves into master branch

After months of hard work, we’ve got Cucumber 2.0 into a state where it can run its own tests and (usually) give us useful feedback. We’ve just merged this code into the master branch.

There’s still a lot to do. The specs all pass, but only approximately 50 / 150 scenarios are passing. The 100 that fail are tagged out with @wip-new-core while we get them into a passing state.

The decision to move this code into master was taken because we’ve been getting pull requests from kind people fixing things on code that’s going to be deleted for the 2.0 release. Having the 2.0 code on the master branch should help avoid this confusion.

We’ll continue to release bugfixes as needed off of the 1.3.x-bugfix branch, but we’ll concentrate our efforts on getting the 2.0 code ready.

I’d love some more help with this. Particularly:

  1. Taking individual @wip scenarios and making them pass.
  2. Refactoring the adapter that bridges between the new report API and the old formatter API.

Helping with (1) could be as simple as taking an individual @wip scenario, diagnosing the root cause, and creating a PR with the @wip tag removed (so that the test is failing) and explaining what needs to be changed. Even if you don’t feel confident to make the change, just doing the work to turn a statistic into a meaningful task on our todo list would be really helpful.

Of course if you want to try and fix the code to make the scenario pass, that would be even better! We’re always happy to give you some free BDD coaching.




What is BDD and why should I care? (Video)

This is the pitch that I give right at the beginning of my BDD Kickstart classes to give everyone an overview of what BDD is, and why I think it matters.

In this video, I cover:

  • How BDD improves communication between developers and stakeholders
  • Why examples are so important in BDD
  • How BDD builds upon Test-Driven Development (TDD)
  • Why business stakeholders need to care about refactoring

If you’d like to learn more, there are still a few tickets left for the next public course in Barcelona on 11th September 2013.



How much do you refactor?

Refactoring is probably the main benefit of doing TDD. Without refactoring, your codebase degrades, accumulates technical debt, and eventually has to be thrown away and rewritten. But how much refactoring is enough? How do you know when to stop and get back to adding new features?

TDD loop (image credit: Nat Pryce)

I get asked this question a lot when I’m coaching people who are new to TDD. My answers in the past have been pretty woolly. Refactoring is something I do by feel. I rely on my experience and instincts to tell me when I’m satisfied with the design in the codebase and feel comfortable with adding more complexity again.

Some people rely heavily on metrics to guide their refactoring. I like the signals I get from metrics, alerting me to problems with a design that I might not have noticed, but I’ll never blindly follow their advice. I can’t imagine metrics ever replacing my design intuition.

So how can I give TDD newbies some clear advice to follow? The advice I’ve been giving them up to now has been this:

There are plenty of codebases that suffer from too little refactoring but not many that suffer from too much. If you’re not sure whether you’re doing enough refactoring, you’re probably not.

I think this is a good general rule, but I’d like something more concrete. So today I did some research.

Cucumber’s new Core

This summer my main coding project has been to re-write the guts of Cucumber. Steve Tooke and I have been pairing on a brand new gem, cucumber-core, that will become the inner hexagon of Cucumber v2.0. We’ve imported some code from the existing project, but the majority is brand new code. We use spikes sometimes, but all the code in the master branch has been written test-first. We generally make small, frequent commits and we’ve been refactoring as much as we can.

There are 160 commits in the codebase. How can I look back over those and work out which ones were refactoring commits?

Git log

My first thought was to use git log --dirstat, which shows where each commit has changed files. If the commit doesn’t change the tests, it must be a refactoring commit.

Of the 160 commits in the codebase, 58 of them don’t touch the specs. Because we drive all our changes from tests, I’m confident that each of these must be a refactoring commit. So based on this measure alone, at least 36% of all the commits in our codebase are refactorings.
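
To make that first measure concrete, here’s a rough sketch of how it could be automated. It is not the script I actually used: it swaps --dirstat for the easier-to-parse --name-only, and the "commit:" marker, the spec/ directory name, the method names and the sample log are all assumptions made for illustration.

```ruby
# Assumed command (hypothetical, chosen because its output is easy to parse):
#   git log --name-only --pretty=format:commit:%H
# Each commit then starts with a "commit:<sha>" line, followed by the paths it changed.

# Turn the raw log text into a list of { sha:, files: } hashes.
def parse_commits(log_output)
  commits = []
  log_output.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?("commit:")
      commits << { sha: line.sub("commit:", ""), files: [] }
    elsif commits.any?
      commits.last[:files] << line
    end
  end
  commits
end

# A commit "touches the specs" if any changed path lives under spec_dir.
def touches_specs?(commit, spec_dir = "spec/")
  commit[:files].any? { |path| path.start_with?(spec_dir) }
end

def commits_without_spec_changes(log_output)
  parse_commits(log_output).reject { |commit| touches_specs?(commit) }
end

# Made-up sample data, not real commits from cucumber-core.
SAMPLE_LOG = <<~LOG
  commit:aaa111
  lib/cucumber/core/compiler.rb

  commit:bbb222
  lib/cucumber/core/compiler.rb
  spec/cucumber/core/compiler_spec.rb

  commit:ccc333
  README.md
LOG

puts commits_without_spec_changes(SAMPLE_LOG).map { |c| c[:sha] }.inspect
# prints ["aaa111", "ccc333"]
```

Note that a documentation-only commit like the third one is counted too, so this measure picks up build and housekeeping commits as well as true refactorings.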

Sometimes though, refactorings (renaming something, for example) will legitimately need to change the tests too. How can we identify those commits?

Commit message

One obvious way is to look at the commit message. It turns out that a further 11 (or 7%) of the commits in our codebase contained the word ‘refactor’. Now we know that at least 43% of our commits are refactorings.
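
For anyone who wants to check the arithmetic so far, here is a small sketch. The counts come straight from the numbers above; the constant and method names are just ones I’ve made up for the example.

```ruby
TOTAL_COMMITS    = 160
NO_SPEC_CHANGES  = 58 # commits that don't touch the specs
MENTION_REFACTOR = 11 # further commits with 'refactor' in the message

# Percentage of the whole, rounded to the nearest integer.
def percent_of(part, whole)
  (100.0 * part / whole).round
end

puts percent_of(NO_SPEC_CHANGES, TOTAL_COMMITS)                    # prints 36
puts percent_of(MENTION_REFACTOR, TOTAL_COMMITS)                   # prints 7
puts percent_of(NO_SPEC_CHANGES + MENTION_REFACTOR, TOTAL_COMMITS) # prints 43
```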

This still didn’t feel like enough. My instinct is that most of our commits are refactorings.

Running tests

One other indication of a refactoring is that the commit doesn’t increase the number of tests. Sure, it’s possible that you change behaviour by swapping one test for another one, but this is pretty unlikely. In the main, adding new features will mean adding new tests.

So to measure this I extended my script to go back over each commit that hadn’t already been identified as a refactoring, check out the code and run the tests. I then did the same for the previous commit, and compared the results. All the tests had to pass, otherwise it didn’t count as a refactoring. If the number of passing tests was unchanged, I counted it as a refactoring.
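
The comparison rule itself is simple enough to sketch in a few lines. This is an illustration rather than the real script: TestRun and refactoring? are hypothetical names, the pass and fail counts are assumed to have been collected already by checking out each commit and running the specs, and I’m assuming that both the commit and its parent need a fully passing suite.

```ruby
# Result of running the test suite at one commit (hypothetical structure).
TestRun = Struct.new(:passed, :failed)

def all_passing?(run)
  run.failed.zero?
end

# A commit counts as a refactoring when the suite is green both before and
# after it, and the number of passing tests has not changed.
def refactoring?(parent_run, commit_run)
  all_passing?(parent_run) &&
    all_passing?(commit_run) &&
    parent_run.passed == commit_run.passed
end

puts refactoring?(TestRun.new(120, 0), TestRun.new(120, 0)) # prints true
puts refactoring?(TestRun.new(120, 0), TestRun.new(123, 0)) # prints false (new tests added)
puts refactoring?(TestRun.new(120, 0), TestRun.new(120, 1)) # prints false (a test is failing)
```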

Here are the results now:

Refactoring vs Feature Adding Commits

Wow. So according to this new rule, less than 25% of the commits to our codebase have added features. The rest have been either improving the design, or perhaps improvements to the build infrastructure. That feels about right from my memory of our work on the code, but it’s still quite amazing to see the chart.


It looks as though in this codebase, there are about three refactoring commits for every one that adds new behaviour.

There will be some errors in how I’ve collected the data, and I may have made some invalid assumptions about what does or does not constitute a refactoring commit. It’s also possible that this number is artificially high because this is a new codebase, but I’m not so sure about that. We know the Cucumber domain pretty well at this stage, but we are being extremely rigorous to pay down technical debt as soon as we spot it.

We have no commercial pressure on us, so we can take our time and do our best to ensure the design is ready before forcing it to absorb more complexity.

If you’re interested, here’s the script I used to analyse my git repo. I realise it’s a cliche to end your blog post with a question, but I’d love to hear how this figure of 3:1 compares to anything you can mine from your own codebases.

Update 28 July 2013: Corrected ratio from 4:1 to 3:1 – thanks Mike for pointing out my poor maths!



Death to sleeps!

When I run workshops to review and improve people’s automated tests, a common problem I see is the use of sleeps.

I have a simple rule about sleeps: I might use them to diagnose a race condition, but I never check them into source control.

This blog post will look at what it means to use sleeps, why people do it, why they shouldn’t, and what the alternatives are.


If you don’t have time to read this whole article, you can sum it up with this quote from Martin Fowler’s excellent essay on the subject:

Never use bare sleeps to wait for asynchronous responses: use a callback or polling.

Why sleep?

When two code paths run in parallel and then meet at a certain point, you have what’s called a race condition. For example, imagine you’re testing the AJAX behaviour of Google Search. Your test says something like this:

Given I am on the google homepage
When I type "matt" into the search box
Then I should see a list of results
And the wikipedia page for Matt Damon should be the top result

Notice that I didn’t hit Enter in the test, so the results we’re looking for in the two Then steps will be populated by asynchronous JavaScript calls. As soon as the tests have finished typing “matt” into the search box, we have a race on our hands: will the app be able to return and populate the results before the tests examine the page to see if the right results are there?

We don’t need this kind of excitement in automated tests. They need to be deterministic, and behave exactly the same way each time they’re run.

The easy route to achieve this is to handicap the tests so that they always lose. By adding a sleep into the test, we can give the app sufficient time to fetch the results, and everything is good.

Given I am on the google homepage
When I type "matt" into the search box
And I wait for 3 seconds
Then I should see a list of results
And the wikipedia page for Matt Damon should be the top result

Of course in practice you’d push this sleep down into step definitions, but you get the point.

So why is this a bad idea?

What’s wrong with sleeps?

Sleeps quickly add up. When you use sleeps, you normally have to pad out the delay to a large number of seconds to give you confidence that the test will pass reliably, even when the system is having a slow day.

This means that most of the time, your tests will be sleeping unnecessarily. The system has already got into the state you want, but the tests are hanging around for a fixed amount of time.

All this means you have to wait longer for feedback. Slow tests are boring tests, and boring tests are no fun to work with.

What can I do instead?

The goal is to minimise the time you waste waiting for the system to get into the right state. As soon as it reaches the desired state, you want to move on with the next step of your test. There are two ways to achieve that:

  1. Have the system send out events (which the tests can listen for) as soon as it’s done
  2. Poll the system regularly to see if it has reached the right state yet

Using events is great when you can. You don’t need to use some fancy AMQP setup though; this can be as simple as touching a known file on the filesystem which the tests are polling for. Anything to give a signal to the tests that the synchronisation point has been reached. Using events has the advantage that you waste absolutely no time – as soon as the system is ready, the tests are notified and they’re off again.

In many situations though, polling is a more pragmatic option. This does involve the use of sleeps, but only a very short one, in a loop where you poll for changes in the system. As soon as the system reaches the desired state, you break out of the loop and move on.

How Capybara can save you

Many people using Capybara for web automation don’t realise how sophisticated it is for solving this problem.

For example, if you ask Capybara to find an element, it will automatically poll the page if it can’t find the element right away:

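find('.results') # polls until this element appears, or the wait time runs out

If the element still hasn’t appeared when the wait time runs out, Capybara will raise an error. So your tests won’t get stuck forever. The wait time is configurable; I believe Capybara’s own default is two seconds, so check what your project has it set to.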

This also works with assertions on Capybara’s page object:

page.should have_css('.results')

Similarly, if you want to wait for something to disappear before moving on, you can tell Capybara like this:

page.should have_no_css('.loading')

The reason you need to use should have_no_css here, rather than should_not have_css, is that the have_no_css matcher will deliberately poll the page until the thing disappears. Think about what will happen if you use the have_css matcher instead, even with a negative assertion.
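
If you’d rather see it than think about it, here’s a toy model of the two kinds of query. To be clear, this is not Capybara’s implementation, and the behaviour of negated matchers may differ between Capybara versions; FakePage, has_loading? and has_no_loading? are names invented for this sketch, and it counts checks instead of waiting for real time to pass.

```ruby
# A toy page whose '.loading' spinner is visible for the first few checks only.
class FakePage
  def initialize(checks_until_gone)
    @checks_left = checks_until_gone
  end

  def loading_visible?
    visible = @checks_left > 0
    @checks_left -= 1 if visible
    visible
  end
end

MAX_CHECKS = 10 # stands in for the wait time

# Toy "has" query: polls until the element is present.
def has_loading?(page)
  MAX_CHECKS.times { return true if page.loading_visible? }
  false
end

# Toy "has no" query: polls until the element is absent.
def has_no_loading?(page)
  MAX_CHECKS.times { return true unless page.loading_visible? }
  false
end

# Negating the "has" query: the spinner is there on the very first check,
# so the query answers true straight away and the negated assertion fails.
puts !has_loading?(FakePage.new(3))   # prints false

# The "has no" query keeps polling until the spinner has gone.
puts has_no_loading?(FakePage.new(3)) # prints true
```

In this model the negated "has" query never gives the spinner a chance to disappear, which is exactly the race we were trying to avoid.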

A more generic polling loop

As Jonas explained, there used to be a wait_until method on Capybara’s API, but it was removed. It’s easy enough to roll your own, but you can also use a library like anticipate if you’d rather not reinvent the wheel.
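
If you do want to roll your own, a minimal sketch might look like the one below. It is neither Capybara’s old wait_until nor the API of the anticipate gem; poll_until, PollTimeout and the default timeout and interval are all names and values I’ve picked for illustration.

```ruby
class PollTimeout < StandardError; end

# Calls the block repeatedly until it returns something truthy, sleeping
# briefly between attempts. Raises PollTimeout if the deadline passes first.
def poll_until(timeout: 5, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise PollTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Usage: pretend the system becomes ready 0.2 seconds from now.
ready_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.2
poll_until(timeout: 2) { Process.clock_gettime(Process::CLOCK_MONOTONIC) >= ready_at }
puts "system is ready, moving on"
```

The sleep in the loop is short and bounded, so the test moves on within one interval of the system becoming ready, and fails loudly instead of hanging if it never does.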




Cucumber 1.3.1 released

Over the weekend we released Cucumber version 1.3.0. This was quickly replaced by 1.3.1 when we realised there was a bug in 1.3.0 that only appeared on Windows.

Along with masses of bugfixes, this release contains the early stages of some serious internal refactoring work planned for release in version 2.0. Although our tests continue to pass, it may be that we’ve changed untested behaviour so that’s why we’ve bumped the minor release number. We’re already aware of one minor bug that’s been introduced[1]. Please let us know if you find any other issues.

New Features

  • Faster features, thanks to in-process Aruba. (Aslak Hellesøy)
  • Add lib to default load path (#162 Steve Tooke)
  • Add snippet type to support different type of ruby snippets. (#390 Roel van Dijk)
  • Call nested steps using any language keywords (#433 Tomohiko Himura)


Bugfixes

  • Update WATIR example (#427 Luiz Guilherme D’Abruzzo Pereira)
  • Ensure that cucumber.yml is only parsed once (#416 Steve Tooke)
  • Improve rake task report on failure (#400 Andrey Vakarev)
  • Show details of nested steps in backtraces (#69 Steve Tooke)
  • Filter out control characters from CDATA section of JUnit formatter output. (#383 @teacup-on-rockingchair)
  • Fix problem with non-ascii characters in file path (#150 Oleg Sukhodolsky)
  • Fix problem loading ruby files in project root directory (#269 Steve Tooke)
  • Fix JsonPretty formatter problem (#197 Oleg Sukhodolsky)
  • Don’t display multi-line strings when --no-multiline is passed (#201 David Kowis)
  • Moved the profile information output to a callback of the formatter (#175 David Kowis)
  • Fix html formatter to not mark skipped/unimplemented steps as failed (#337 Oleg Sukhodolsky)
  • Allow duplication for format+output pair in command line (#434 David Lantos)
  • Better delegation to IO in Cucumber::Formatter::Interceptor::Pipe (#312 Oleg Sukhodolsky)





Please consider supporting my work on Cucumber through gittip

My first commit to Cucumber was in 2008. Since then I’ve poured countless hours into the project and the community around it, whether directly as commits to the code, or answering questions on this mailing list, or writing blog articles. I am independent, so those hours have all been done on my own time.

Why do I do that?

It’s a complicated question. My consulting business is built around BDD, so I have a vested interest in the success of the Cucumber project. It’s more than that though. I have a firm belief that the difference between a software project that’s fun to work on and one that’s miserable is the communication between people. I’ve seen how Cucumber can improve that communication, or at least make it clear to people when it needs to improve. It’s not a silver bullet, but I do genuinely think it can help to make a software team more enjoyable to work on. I also love getting to collaborate with so many of you on code that nobody else owns, that we can all enjoy.

I have a family now, and as anyone else with young kids will know, my time feels extremely precious. Yet I feel a responsibility to all of you to keep Cucumber-Ruby’s code healthy and full of the features you need.

Will you help me?

A couple of weeks ago Olaf Lewitz pointed me to this TED talk by Amanda Palmer about “The Art of Asking” and it occurred to me to ask all of you, if you appreciate the work I do on the Cucumber project, to consider making a regular donation to me on gittip. You can donate as little as 0.25 cents per week, but each contribution keeps me motivated and tells me you appreciate my work.





Cucumber 1.2.2 Released

This is a maintenance release, but marks a new period in Cucumber’s life as it was released by our new team member Oleg Sukhodolsky. Oleg has been doing a fantastic job since he joined the team a few weeks ago, closing tickets like a boss.

Here’s a summary of what’s in the release:

New Features

  • Ruby 2.0.0 support (#377 Matt Wynne & #357 @charliesome)
  • Capture duration value for json formatter (#329 Rick Beyer)
  • Added support for Hindi (hi), although some systems may need to install fonts which support the Devanagari script.
  • Obey program suffix when invoking bundler (#324 Eric Hodel)


Bugfixes

  • Fix class loading problems --format option had (#345, #346 @ksylvest)
  • Exit with failure status when interrupted (#299 @aaronjensen)
  • Cannot map table headers after table hashes is referenced (#275 @chrisbloom7 / Matt Wynne)
  • (before|after)_step aren’t called when scenario outline’s table is processed (#284 Oleg Sukhodolsky)
  • Raise exception when remote socket end disconnects using wire protocol (#348 @rdammkoehler)
  • Fix --dry-run option ignored when set via profile (#248 / #255 Igor Afonov)
  • More clear suggested ruby code for undefined steps (#328 / #331 @martco)
  • Fix exception in Html formatter with --expand mode and undefined steps (#336 Roberto Decurnex)
  • Fix Table.diff! problem with :surplus_row => false and interleaved surplus rows (#220)

