At the time I left Songkick, it took 1.5 hours to run all the cukes and rspec ‘unit’ tests on the big ball of Rails. We were already parallelising over a few in-house VMs at the time to make this manageable, but it still took 20 minutes or so to get feedback. After I left, the team worked around this by getting more slave nodes from EC2, and the build time went down to under 10 minutes.
Then guess what happened?
They added more features to the product, more tests for those features, and the build time went up again. So they added more test slave nodes. In the end, I think the total build time was something like 15 hours. 15 fucking hours! You’re hardly going to run all of that on your laptop before you check in.
The moral of this story: if you optimise your build, all you’ll do is mask the problem. You haven’t changed the trajectory of your project, you’ve just deferred the inevitable.
The way Songkick solved this took real courage. First, they started with heart-to-heart conversations with their stakeholders about removing rarely-used features from the product. Those features were baggage, and once the product team saw what it was costing them to carry that baggage, they were persuaded to remove them.
Then, with a slimmed-down feature set, they set about carving up their architecture, so that many of those slow end-to-end Cucumber scenarios became fast unit tests for simple, decoupled web service components. Now it takes them 15 seconds to run the tests on the main Rails app. That’s more like it!
So by all means, use tricks to optimise and speed up the feedback you get from your test suite. In the short term, it will definitely help. But realise that the real problem is your architecture: if your tests take too long, the code you’re testing has too many responsibilities. The sooner you start tackling this problem head-on, the sooner you can start enjoying the benefits.
Programmers experience precisely the same problem when they substitute stubs/mocks for objects in their system in order to speed up the tests.
I encourage programmers to “reinvest” their test speed profits to improve the design and reduce the number of dependencies. This increases cohesiveness and decreases the code’s dependency on its context.
So it is for the build: reinvest the build speedup profits to split the system into smaller, more cohesive modules, each of which builds in seconds or minutes. Thanks for sharing.
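To make the "reinvest" idea concrete, here is a minimal sketch in plain Ruby. Every name in it (CoupledReceipt, Receipt, the Struct stand-ins) is hypothetical and invented for illustration; none of it comes from a real codebase. The first class can only be tested by stubbing a chain of collaborators, while the second takes the two values it actually needs.

```ruby
# Hypothetical "before": the receipt reaches through order -> customer ->
# account -> plan, so a test has to stub every link in that chain.
class CoupledReceipt
  def initialize(order)
    @order = order
  end

  def total_cents
    percent = @order.customer.account.plan.discount_percent
    @order.subtotal_cents * (100 - percent) / 100
  end
end

# Hypothetical "after": the receipt depends only on two plain values,
# so no stubs are needed and the class no longer knows about its context.
class Receipt
  def initialize(subtotal_cents:, discount_percent:)
    @subtotal_cents = subtotal_cents
    @discount_percent = discount_percent
  end

  def total_cents
    @subtotal_cents * (100 - @discount_percent) / 100
  end
end

# Stand-in stubs, one per link in the chain the coupled version walks.
Plan = Struct.new(:discount_percent)
Account = Struct.new(:plan)
Customer = Struct.new(:account)
Order = Struct.new(:subtotal_cents, :customer)

order = Order.new(10_000, Customer.new(Account.new(Plan.new(10))))
raise "coupled total is wrong" unless CoupledReceipt.new(order).total_cents == 9_000

# The decoupled version is checked with plain values only.
receipt = Receipt.new(subtotal_cents: 10_000, discount_percent: 10)
raise "decoupled total is wrong" unless receipt.total_cents == 9_000
```

Both checks are fast, but only the second stays simple as the object graph around the order grows.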
I have a 20 min build that I’d like to be at least halved, so I’m interested. But surely this is just about throwing away tests, right? When do they test the complexity of the interaction between the web services – i.e. real production scenarios?
Thanks
Jem, the idea is that if you have clearly defined contracts between your services, you only need to test a small number of interactions. Then thoroughly test the behaviour of the individual services. Testing everything from the outside leads to combinatorial explosion, and ultimately, madness.
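One way to picture "clearly defined contracts" is a shared check that both the real service and the fake used in a consumer's unit tests must pass. The sketch below is plain Ruby under assumed names: VenueLookupContract, InMemoryVenueService, FakeVenueService and GigPage are all hypothetical, and the contract itself (a Hash with :id and :name, or nil) is made up for the example.

```ruby
# Hypothetical contract: find(id) returns a Hash with :id and :name,
# or nil when the id is unknown.
module VenueLookupContract
  def self.check(service)
    found = service.find(1)
    unless found.is_a?(Hash) && found.key?(:id) && found.key?(:name)
      raise "expected a Hash with :id and :name for a known id"
    end
    raise "expected nil for an unknown id" unless service.find(-1).nil?
    true
  end
end

# Stand-in for the real service (in practice this might wrap an HTTP call).
class InMemoryVenueService
  def initialize(venues)
    @venues = venues
  end

  def find(id)
    name = @venues[id]
    name && { id: id, name: name }
  end
end

# Fake used by consumers in their fast unit tests.
class FakeVenueService
  def find(id)
    id == 1 ? { id: 1, name: "Test Venue" } : nil
  end
end

# A consumer that only relies on what the contract promises.
class GigPage
  def initialize(venues)
    @venues = venues
  end

  def title(venue_id)
    venue = @venues.find(venue_id)
    venue ? "Gigs at #{venue[:name]}" : "Unknown venue"
  end
end

# A small number of contract checks, run against both implementations.
VenueLookupContract.check(InMemoryVenueService.new({ 1 => "Koko" }))
VenueLookupContract.check(FakeVenueService.new)

# The consumer is then tested thoroughly against the fake alone.
page = GigPage.new(FakeVenueService.new)
raise "wrong title" unless page.title(1) == "Gigs at Test Venue"
raise "wrong fallback" unless page.title(2) == "Unknown venue"
```

Because the fake and the real service pass the same contract check, the consumer's tests do not need to exercise the real service for every case.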
Good stuff, Matt.
To share another success story: I’ve been working on my current project for a year. This project is far more complex than any rails app I’ve ever worked on. I took what I had learned from Destroy All Software, Objects on Rails and GOOS and applied it to how we structured our code and how we tested it, and here’s where we’re at:
$ time bin/rspec spec/unit
Finished in 2.72 seconds (including 1 forced GC cycle(s), totalling 0.42926 seconds)
709 examples, 0 failures, 9 pending
Randomized with seed 2969
bin/rspec spec/unit --format progress 4.99s user 1.02s system 94% cpu 6.356 total
Running all of the unit tests takes 6.35 seconds, and more than half of that is spent loading all the code.
The clearly defined interfaces between the components of our system have enabled us to make some large scale changes, all while getting virtually instantaneous feedback from our unit tests.
I understand. Here’s a simplistic example.
I have a service that routes cars through traffic lights. I test that cars only go through lights when they are green.
I have a service that co-ordinates different traffic lights & test that intersecting lights are never green at the same time.
I can logically conclude that cars won’t crash in intersections … but I don’t test it, because that would be an integration test. Have I missed something important?
Well, I don’t know for sure. It doesn’t look like it at first glance, but if that was my system for real, I’m sure I’d have a few integration tests that covered some common scenarios, because the consequence of failure is so high. But I wouldn’t run those tests on every check-in of the code.
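For what it's worth, the traffic-light example can be sketched in a few lines of plain Ruby. The Light, Router and Coordinator classes are hypothetical stand-ins for the two services, assuming a single intersection with two directions. The first two checks are the unit-level ones described above; the last is the kind of single integration check that might run less often than every check-in.

```ruby
# Hypothetical model of a single light.
class Light
  attr_reader :color

  def initialize(color)
    @color = color
  end

  def green?
    color == :green
  end
end

# Stand-in for the routing service: a car may pass only on green.
class Router
  def may_pass?(light)
    light.green?
  end
end

# Stand-in for the co-ordinating service: giving one direction a green
# light forces every other direction to red.
class Coordinator
  DIRECTIONS = %i[north_south east_west].freeze

  def initialize
    @colors = DIRECTIONS.map { |d| [d, :red] }.to_h
  end

  def give_green_to(direction)
    raise ArgumentError, "unknown direction: #{direction}" unless DIRECTIONS.include?(direction)
    @colors = DIRECTIONS.map { |d| [d, d == direction ? :green : :red] }.to_h
  end

  def light(direction)
    Light.new(@colors.fetch(direction))
  end

  def green_count
    @colors.values.count(:green)
  end
end

# Unit check 1: cars only go through green lights.
router = Router.new
raise "green should let cars pass" unless router.may_pass?(Light.new(:green))
raise "red should stop cars" if router.may_pass?(Light.new(:red))

# Unit check 2: intersecting lights are never green at the same time.
coordinator = Coordinator.new
%i[north_south east_west north_south].each do |direction|
  coordinator.give_green_to(direction)
  raise "conflicting greens" if coordinator.green_count > 1
end

# One integration check: with both services wired together, exactly one
# direction lets cars through.
coordinator.give_green_to(:east_west)
passing = Coordinator::DIRECTIONS.select { |d| router.may_pass?(coordinator.light(d)) }
raise "expected only east_west to pass" unless passing == [:east_west]
```

The integration check only adds something the unit checks cannot: it confirms the two services agree on what a "green light" is.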
Having too many integration (read: long-running) tests may also be an indicator of too much system coupling. That’s why it makes sense that increasing the proportion of unit tests would also shorten the build.
So hypothetically, in your opinion, can build time be used as a proxy for coupling?