When I run workshops to review and improve people’s automated tests, a common problem I see is the use of sleeps.
I have a simple rule about sleeps: I might use them to diagnose a race condition, but I never check them into source control.
This blog post will look at what it means to use sleeps, why people do it, why they shouldn’t, and what the alternatives are.
TL;DR
If you don’t have time to read this whole article, you can sum it up with this quote from Martin Fowler’s excellent essay on the subject:
Never use bare sleeps to wait for asynchronous responses: use a callback or polling.
— MartinFowler.com
Why sleep?
When two code paths run in parallel and the outcome depends on which of them reaches a certain point first, you have what’s called a race condition. For example, imagine you’re testing the AJAX behaviour of Google Search. Your test says something like this:
Given I am on the google homepage
When I type "matt" into the search box
Then I should see a list of results
And the wikipedia page for Matt Damon should be the top result
Notice that I didn’t hit Enter in the test, so the results we’re looking for in the two Then steps will be populated by asynchronous JavaScript calls. As soon as the tests have finished typing “Matt” into the search box, we have a race on our hands: will the app be able to return and populate the results before the tests examine the page to see if the right results are there?
We don’t need this kind of excitement in automated tests. They need to be deterministic, and behave exactly the same way each time they’re run.
The easy route to achieve this is to handicap the tests so that they always lose. By adding a sleep into the test, we can give the app sufficient time to fetch the results, and everything is good.
Given I am on the google homepage
When I type "matt" into the search box
And I wait for 3 seconds
Then I should see a list of results
And the wikipedia page for Matt Damon should be the top result
Of course in practice you’d push this sleep down into step definitions, but you get the point.
So why is this a bad idea?
What’s wrong with sleeps?
Sleeps quickly add up. When you use sleeps, you normally have to pad out the delay to a large number of seconds to give you confidence that the test will pass reliably, even when the system is having a slow day.
This means that most of the time, your tests will be sleeping unnecessarily. The system has already got into the state you want, but the tests are hanging around for a fixed amount of time.
All this means you have to wait longer for feedback. Slow tests are boring tests, and boring tests are no fun to work with.
What can I do instead?
The goal is to minimise the time you waste waiting for the system to get into the right state. As soon as it reaches the desired state, you want to move on with the next step of your test. There are two ways to achieve that:
- Have the system send out events (which the tests can listen for) as soon as it’s done
- Poll the system regularly to see if it has reached the right state yet
Using events is great when you can. You don’t need to use some fancy AMQP setup though; this can be as simple as touching a known file on the filesystem which the tests are polling for. Anything to give a signal to the tests that the synchronisation point has been reached. Using events has the advantage that you waste absolutely no time – as soon as the system is ready, the tests are notified and they’re off again.
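To make the event idea concrete, here is a minimal in-process sketch. It assumes the system under test can be handed a queue to signal on; start_fake_system and wait_for_event are hypothetical names made up for this illustration, and the background thread merely stands in for a real application.

```ruby
require 'timeout'

# Hypothetical stand-in for the system under test: it does some work in the
# background, then pushes a :done event onto the queue it was given.
def start_fake_system(events, work_time: 0.1)
  Thread.new do
    sleep work_time   # simulates the asynchronous work
    events << :done   # signals that the synchronisation point was reached
  end
end

# Blocks until an event arrives. Raises Timeout::Error if nothing arrives
# within `timeout` seconds, so the test cannot hang forever.
def wait_for_event(events, timeout: 2)
  Timeout.timeout(timeout) { events.pop }
end

events = Queue.new
worker = start_fake_system(events)
event = wait_for_event(events) # returns as soon as the event is pushed
worker.join
```

The test thread wakes up the moment the event is pushed, so no time is spent waiting beyond what the work itself takes. The timeout is only a safety net for the case where the signal never arrives.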
In many situations though, polling is a more pragmatic option. This does involve the use of a sleep, but only a very short one, in a loop where you poll for changes in the system. As soon as the system reaches the desired state, you break out of the loop and move on.
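A polling loop of this kind might look roughly like the sketch below. The name poll_until and its default timeout and interval are assumptions chosen for illustration, not part of any library.

```ruby
# Hypothetical helper: calls the block repeatedly until it returns a truthy
# value, and raises if that has not happened within `timeout` seconds.
def poll_until(timeout: 5, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result # desired state reached: move on immediately

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout} seconds"
    end

    sleep interval # the only sleep: a very short one, inside the loop
  end
end
```

A step could then call something like poll_until { File.exist?('tmp/ready') }, where the file name is again just an example. The monotonic clock is used for the deadline so that changes to the system clock during a test run do not stretch or shorten the wait.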
How Capybara can save you
Many people using Capybara for web automation don’t realise how sophisticated it is for solving this problem.
For example, if you ask Capybara to find an element, it will automatically poll the page if it can’t find the element right away:
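find('.results') # will poll until this element appears or the wait time runs out
If the element still hasn’t appeared once Capybara’s maximum wait time is up (2 seconds by default, configurable through Capybara.default_max_wait_time in recent versions), Capybara will raise an error. So your tests won’t get stuck forever.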
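How long Capybara keeps polling is configurable. The fragment below is a configuration sketch that assumes a reasonably recent Capybara version (older releases named the global setting default_wait_time); the selector and the numbers are examples only.

```ruby
require 'capybara'

# Global maximum wait, in seconds, used by finders and matchers.
Capybara.default_max_wait_time = 5

# Inside a Capybara session (for example a feature spec), a single lookup
# can be given its own wait:
#   find('.results', wait: 10)
#
# Or a group of lookups can share a temporary wait:
#   Capybara.using_wait_time(10) { find('.results') }
```

Raising the limit only affects the worst case: a lookup still returns as soon as the element shows up, so a generous limit does not slow down a passing test the way a fixed sleep does.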
This also works with assertions on Capybara’s page object:
page.should have_css('.results')
Similarly, if you want to wait for something to disappear before moving on, you can tell Capybara like this:
page.should have_no_css('.loading')
The reason you need to use should have_no_css here, rather than should_not have_css, is that the have_no_css matcher is going to deliberately poll the page until the thing disappears. Think about what will happen if you use the have_css matcher instead, even with a negative assertion.
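To see the difference without a browser, here is a toy model. FakePage is a hypothetical class written for this illustration and is not Capybara's implementation; it only mimics the idea that has_css? polls until something is present, while has_no_css? polls until it is absent.

```ruby
# Hypothetical stand-in for a page whose ".loading" spinner disappears a
# short while after the page object is created.
class FakePage
  def initialize(spinner_visible_for:)
    @spinner_gone_at = now + spinner_visible_for
  end

  # Polls until the selector is present; gives up (false) after `wait` seconds.
  def has_css?(selector, wait: 1)
    poll(wait) { present?(selector) }
  end

  # Polls until the selector is absent; gives up (false) after `wait` seconds.
  def has_no_css?(selector, wait: 1)
    poll(wait) { !present?(selector) }
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Only the spinner exists in this toy page, and only for a limited time.
  def present?(selector)
    selector == '.loading' && now < @spinner_gone_at
  end

  def poll(wait)
    deadline = now + wait
    until yield
      return false if now >= deadline
      sleep 0.01
    end
    true
  end
end

page = FakePage.new(spinner_visible_for: 0.3)

# Negating the positive check: the spinner is still there, so has_css?
# succeeds straight away and the negation is false without any waiting.
naive_negation = !page.has_css?('.loading')

# The dedicated negative check keeps polling until the spinner has gone.
waited_for_gone = page.has_no_css?('.loading')
```

In this model naive_negation ends up false while waited_for_gone ends up true: a plain negation of the positive check reports failure while the spinner is still visible, whereas the negative check gives the page time to finish loading.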
A more generic polling loop
As Jonas explained, there used to be a wait_until method on Capybara’s API, but it was removed. It’s easy enough to roll your own, but you can also use a library like anticipate if you’d rather not reinvent the wheel.
If you’re using RSpec, Capybara’s matchers will handily make should have_no_css and should_not have_css functionally equivalent, which is nice. Of course, if you’re not using Capybara from RSpec then you’ll need to be more careful, exactly as you say.
Thanks for this great post. It greatly summarizes my experiences analyzing BDD stories I found. You might want to read my answer and how we solved the problem with Java: http://minds.coremedia.com/2013/06/03/death-to-sleeps-raise-of-conditions/