Outside-In vs Inside Out – Comparing TDD Approaches

At last month’s ScotRUG Brian Swan and I attempted to solve the TDD Avatars problem as a live recital in our chosen style. We each had 35 minutes.

The videos are here:

Brian’s Inside-Out TDD approach

Matt’s Outside-In approach

When Brian had walked us through his approach and solution at the previous month’s meeting, he’d built his solution as a Rails application, with web forms for filling out bookings and viewing receipts and so on.

When I came to start practicing and converted the use case from the TDD Avatars paper into a Cucumber feature, it quickly became clear that the value of the system I was building, at least as described by the use case, was to provide printed receipts to customers. I then started to think about the simplest way I could build a system to provide that value.

Here’s the feature I wrote:

Feature: Pay bill
 
  Background: Prices
    Given the following operations are available:
      | operation        | price |
      | routine check up | 10    |
      | shots            | 5     |
 
  Scenario: Dave Pays for Fluffy
    Given there is an owner Dave Atkins, let's call him "Dave"
    And Dave brings his pet named Fluffy into the clinic for the following operations:
      | routine check up |
      | shots            |
    When the veterinarian charges him for the visit
    And Dave pays cash
    Then Dave is given a receipt which looks like this:
      """
      Operations:
        $10 (routine check up)
        $5 (shots)
 
      Total to pay: $15
 
      Paid cash, received with thanks
 
      """

Notice that the scenario doesn’t talk about clicking particular buttons or filling in boxes on a form? I’ve used a higher-level declarative style to describe the behaviour I want. In my experience this helps in various ways:

  • more human-readable features
  • features that aren’t coupled to a particular user interface

If you watch the video, you’ll see that the first thing I did, working my way in from the step definitions, was to create a custom step definition DSL for my problem domain. Instead of using a generic DSL like Capybara’s fill_in, click_button etc, I created this one:

module VetsHelper
  def register_operation_price(operation, price)
  end
 
  def remember_owner(name, nickname)
  end
 
  def create_visit(owner_nickname, pet_name, operations)
  end
 
  def charge_for_visit
  end
 
  def pay_with(payment_type, nickname)
  end
 
  def receipt
    ""
  end
end

This is arguably unnecessary: my step definitions are already translating from English into Ruby, so why add this extra layer of indirection?

As I worked my way from the outside (the features) into the step definitions, I wasn’t ready to commit myself to how I was going to couple the tests to my new application. By defining this interface, I’ve deferred that commitment a little later. I’ve also given myself a clean view of all the behaviour the new application needs to support.

My first iteration implementation (the one in the video) of VetsHelper drives out a domain model directly from the methods in that module. If that was what we released to our user, they’d only be able to print receipts if they knew how to use an IRB prompt. That might seem ridiculous, but we’ve gone a long way to solving the problem, and we could probably spike a simple script that let them do it from the command-line without much risk.
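To make that first iteration concrete, here is a minimal sketch of how VetsHelper could delegate straight to a domain model. It is not the code from the video: the Practice and Visit classes, their methods and the exact receipt layout are assumptions made for illustration. Only the VetsHelper method names come from the module above.

```ruby
# Hypothetical domain model. Only the VetsHelper method names come from the
# post; Practice, Visit and the receipt layout are assumptions.
class Practice
  Visit = Struct.new(:owner, :pet_name, :operations, :payment_type)

  def initialize
    @prices = {}
  end

  def register_operation_price(operation, price)
    @prices[operation] = price
  end

  def create_visit(owner, pet_name, operations)
    @visit = Visit.new(owner, pet_name, operations)
  end

  # Work out what the owner owes for the current visit.
  def charge_for_visit
    @total = @visit.operations.sum { |operation| @prices.fetch(operation) }
  end

  def pay_with(payment_type)
    @visit.payment_type = payment_type
  end

  def receipt
    lines = ["Operations:"]
    @visit.operations.each do |operation|
      lines << "  $#{@prices.fetch(operation)} (#{operation})"
    end
    lines << "" << "Total to pay: $#{@total}"
    lines << "" << "Paid #{@visit.payment_type}, received with thanks"
    lines.join("\n")
  end
end

# First-iteration helper: every method talks directly to the model.
module VetsHelper
  def practice
    @practice ||= Practice.new
  end

  def owners
    @owners ||= {}
  end

  def register_operation_price(operation, price)
    practice.register_operation_price(operation, price)
  end

  def remember_owner(name, nickname)
    owners[nickname] = name
  end

  def create_visit(owner_nickname, pet_name, operations)
    practice.create_visit(owners.fetch(owner_nickname), pet_name, operations)
  end

  def charge_for_visit
    practice.charge_for_visit
  end

  def pay_with(payment_type, nickname)
    owners.fetch(nickname) # fail loudly if the scenario names an unknown owner
    practice.pay_with(payment_type)
  end

  def receipt
    practice.receipt
  end
end
```

Mixing the module into any object, for example with Object.new.extend(VetsHelper), is enough to drive it by hand from an IRB prompt.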

For our second iteration, we can talk to the customer about that command-line interface, then write a new implementation of VetsHelper, perhaps using some of Aruba’s DSL, which goes through that command-line interface instead of directly to the model. This is the beauty of using a declarative style together with your own domain-specific step definition DSL: it gives you the flexibility to swap in connections to the system that hit it at different levels, using exactly the same acceptance tests.

Did BDD Save Me Time?

When Brian and I were planning this month’s session, I showed him the code I’d written and he decided to do a comparable solution this time, without any UI, so that they were easy to compare. In fact, Brian’s solution looked much simpler, and was certainly quicker to write, because he didn’t have to spend any time writing the acceptance testing layers and he didn’t write any kind of entry-point Practice class. He just went straight into building the Appointment class.

A big difference between the solution we produced this month and the one that Brian had originally built was that we didn’t use Rails, and instead went for a much simpler solution that still provided some immediate value. I like to think that the idea for doing this came from the BDD approach I took—I’m pretty sure I remember the lightbulb going on as I typed out the feature—but we’ll never know now where this idea originated.

I noticed that Brian spent time testing getters on his classes, which I probably wouldn’t have done. I tend to try to avoid using them, except on value objects, and I rarely test the behaviour of value objects. I rely on my acceptance tests to tell me if they’re not working.

Focus and Design

Brian’s big take-away was the difference in our approaches when we needed a collaborator object. When I needed a collaborator for a class, I would just mock out the collaborator and carry on finishing off the class I was building, whereas he would leave the current class broken and go and build the other class first.

I find my (mock-based) approach gives me focus, and also means I can sketch out the design of the collaborator without having to commit myself to that design until I understand how it’s going to be used.
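As a rough illustration of that style, here is a sketch using a hand-rolled test double rather than a mocking library. The Visit class, the FakePriceList double and the price_of message are hypothetical names, not taken from either solution. The point is that the double lets Visit be finished before any real price list exists, and the messages sent to the double sketch out the collaborator’s interface.

```ruby
# Hypothetical class under construction. It needs a collaborator that can
# answer price_of(operation), but no real implementation exists yet.
class Visit
  def initialize(operations, price_list)
    @operations = operations
    @price_list = price_list
  end

  def total
    @operations.sum { |operation| @price_list.price_of(operation) }
  end
end

# Hand-rolled test double standing in for the unwritten collaborator.
# It returns canned prices and records the messages it receives.
class FakePriceList
  attr_reader :requested

  def initialize(prices)
    @prices = prices
    @requested = []
  end

  def price_of(operation)
    @requested << operation
    @prices.fetch(operation)
  end
end

price_list = FakePriceList.new({ "routine check up" => 10, "shots" => 5 })
visit = Visit.new(["routine check up", "shots"], price_list)

raise "wrong total" unless visit.total == 15
raise "unexpected messages" unless price_list.requested == ["routine check up", "shots"]
```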

I’m really happy with the design I ended up with. It’s hard to make much of a judgement in such a simple problem, but I’d be interested to hear your thoughts on how the two designs compare. Which one would you have preferred to add a new feature to?

Agile / Lean Software Development


Acceptance Tests Trump Unit Tests

At work, we have been practising something approximating Acceptance Test Driven Development now for several months. This means that pretty much every feature of the system that a user would expect to be there has an automated test to ensure that it really is.

It has given me a whole new perspective on the value of tests as artefacts produced by a project.

I made a pledge to myself when I started this new job in August that I would not (knowingly) check in a single line of code that wasn’t driven out by a failing test. At the time, I thought this would always mean a failing unit test, but I’m starting to see that this isn’t always necessary, or in fact even wise.

Don’t get me wrong. Unit testing is extremely important, and there’s no doubt that practising TDD helps you to write well-structured, low-defect code in a really satisfying manner. But I do feel like the extent to which TDD, at the level of unit testing alone, allows for subsequent changes to the behaviour of the code, has been oversold.

If you think you’re doing TDD, and you’re only writing unit tests, I think you’re doing it wrong.

As new requirements come in, the tensions influencing the design of the code shift. Refactoring eases these tensions, but by definition means that the design has to change. This almost certainly means that some, often significant, portion of the unit tests around that area of the code will have to change too.

I struggled with this for a long time. I had worked hard on those tests, for one thing, and was intuitively resistant to letting go of them. More than that, I knew that somewhere in there, they were testing behaviour that I wanted to preserve: if I threw them out, how would I know it still worked?

Yet those old unit tests were so coupled to the old design that I wanted to change…

[Picture: Gulliver]

In my mind, I have started to picture the tests we write to drive out a system like little strings, each one pulling at the code in a slightly different direction. The sum total of these tensions is, hopefully, the system we want right now.

While these strings are useful to make sure the code doesn’t fall loose and do something unexpected, they can sometimes mean that the code, like Gulliver in the picture above, is too restrained and inflexible to change.

The promise of writing automated tests up front is regression confidence: if every change to the system is covered by a test, then it’s impossible to accidentally reverse that change without being alerted by a failing test. Yet how often do unit tests really give us regression alerts, compared to the number of times they whinge and whine when we simply refactor the design without altering the behaviour at all? Worse still, how often do they fail to let us know when the mocks or stubs for one unit fail to accurately simulate the actual behaviour of that unit?

Enter acceptance tests.

By working at a higher level, acceptance tests give you a number of advantages over unit tests:

  • You get a much larger level of coverage per test
  • You get more space within which to refactor
  • You will test through layers to ensure they integrate correctly
  • They remain valuable even as underlying implementation technology changes

Admittedly, the larger level of coverage per test has a downside: when you get a regression failure, the signpost to the point of failure isn’t as clear. This is where unit tests come in: if you haven’t written any at all yet, you can use something like the Saff Squeeze to isolate the fault and cover it with a new test.

They’re also much slower to run, which can be important when you’re iterating quickly over changes to a specific part of the system.

To be clear, I’m not advocating that you stop unit testing altogether. I do feel there’s a better balance to strike, though, than forcing yourself to get 100% coverage from unit tests alone. They’re not always the most appropriate tool for the job.

To go back to the metaphor of the pulling strings, I think of acceptance tests as sturdy ropes, anchoring the system to the real world. While sometimes the little strings will need to be cut in order to facilitate a refactoring, the acceptance tests live on.

The main thing is to have the assurance that if you accidentally regress the behaviour of the system, something will let you know. As long as every change you make is driven out by some kind of automated test, be it at the system level or the unit level, I think you’re on the right track.

Agile / Lean Software Development


Come to CITCON

Some people think there is no conference for those of us who care about CI and testing, but oh yes there is.

As an avid reader of this blog, I know that you, like me, realise that continuous integration and testing are to software development what the spirit level and the plumb-line are to the construction industry: powerful tools that will one day be regarded as essential for any professional practitioner.

If you fancy meeting other like minds, come and join me at CITCON, the Continuous Integration and Testing Conference. What could be finer?

Agile / Lean Software Development


Story Driven Development – Just Another *DD?

Bryan Helmkamp, who maintains the handy little library Webrat, gave a talk recently at GoRuCo 2008 explaining his experience of using RSpec plain-text stories to build Ruby on Rails applications in a manner he calls ‘Story Driven Development’:

Before code is written, the team produces executable scenarios for a user story.


Agile / Lean Software Development


Behaviour-Driving Routes in Rails with RSpec

One thing that isn’t documented very well for RSpec is how to test your routes.

I came across an old post on the rspec mailing list which described a great way to do this:

describe TasksController, "routing" do

    it "should route POST request for /tasks to the 'create' action" do
        params_from(:post, "/tasks").should == {:controller => "tasks", :action => "create"}
    end

end

Very nice.

Agile / Lean Software Development


Awesome Acceptance Testing

My notes on Dan North and Joe Walnes’ session at Spa 2008.

Five artefacts:

  • Automation – the glue that binds the tests to the code
  • Vocabulary – the language that the tests are expressed in
  • Syntax – the technology that the tests are expressed in (C#, Java)
  • Intent – the actual scenario being tested
  • Harness – the thing that runs the tests and tells you if they passed

Four roles. People might fill more than one, or more than one person might be in a role:

  • Stakeholder
  • Analyst
  • Tester
  • Developer

Taking a requirement, the Stakeholder and the Analyst have a conversation:

  • what does that requirement mean?
  • how can we create a shared understanding?

Then the Analyst and the Tester have a conversation:

  • what is the scope (‘bigness’) of this requirement?
  • how will we know when we’re done?
  • => Scenarios (examples)

Tester then ‘monkeyfies’ the scenarios, using the following template:

Given … - assumptions, context in which the scenario occurs.

When … - user action, interaction with the system

Then … - expected outcome

e.g.

Given we have an account holder Joe
And their current account contains $100
And the interest rate is 10%
When Joe withdraws $120
Then Joe’s balance should be $-22

The tester and the developer sit down and write an automated test to implement each scenario.

You might chain these up, but you can always categorise test code into these three partitions. This really helps how you look at test code.
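For example, the Joe scenario above might be automated along these lines. This is only a sketch: the Account class is hypothetical, and it assumes the 10% interest is charged immediately on the overdrawn amount, which is one reading that yields the $-22 in the example. The comments mark the three partitions.

```ruby
# Hypothetical account, in whole dollars to keep the arithmetic simple.
# Assumption: interest is charged once, straight away, on whatever amount
# the account is overdrawn by after a withdrawal.
class Account
  attr_reader :balance

  def initialize(balance:, interest_rate_percent:)
    @balance = balance
    @interest_rate_percent = interest_rate_percent
  end

  def withdraw(amount)
    @balance -= amount
    @balance += @balance * @interest_rate_percent / 100 if @balance.negative?
  end
end

# Given: assumptions, the context in which the scenario occurs
joe = Account.new(balance: 100, interest_rate_percent: 10)

# When: the user's interaction with the system
joe.withdraw(120)

# Then: the expected outcome
raise "expected -22, got #{joe.balance}" unless joe.balance == -22
```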

Consistency Validation Between ‘Units’

See the Consumer Driven Contracts paper on Martin Fowler‘s website.

Tooling for Automation

Consider extending / creating the domain model to cover the application itself – the UI, the back end.

Loads of tools are available. Use whatever works and build on it.

Building a Vocabulary

Ubiquitous Language – Start with a shared language. It becomes ubiquitous when it appears everywhere – documents, code, databases, conversations.

You will use different vocabularies in different bounded contexts. A context might be your problem domain, testing domain, software domain, or the user interface domain.

Beware which roles understand you when you’re talking in a particular domain. Often terms will span domains.

e.g. NHibernateCustomerRepository

  • NHibernate = 3rd Party Provider Domain
  • Customer = Problem Domain
  • Repository = Software Domain

Make your tests tell a story – make it flow. Don’t hide away things in Setup methods that will make the test hard to read. If that means a little bit of duplication, so be it. ‘Damp not DRY’.

Syntax – Implementing Your Tests

  • write your own
      ◦ keep it simple. Don’t fart around writing too fancy a DSL. You’ll be surprised what testers / analysts / stakeholders will be prepared to read.
      ◦ great way to learn
  • JBehave2
      ◦ training wheels?
  • RSpec
      ◦ very nice.
      ◦ create templates for each given / when / then which you can plug together with parameter values into scenarios
  • Fit
  • Concordion
  • NBehave (Joe Ocampo)

Basically what you need is a way to assemble different permutations and combinations of Given / When / Then with different parameters to make different scenarios.
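A minimal sketch of that idea in plain Ruby, not modelled on any of the tools listed: a hypothetical StepRegistry maps patterns to blocks, and scenarios are just different sequences of matching lines with different parameter values. The class, the step wording and the account example are all made up for illustration.

```ruby
# Hypothetical mini-framework: steps are registered once as pattern + block,
# then plugged together into scenarios with different parameter values.
class StepRegistry
  def initialize
    @steps = []
  end

  def step(pattern, &block)
    @steps << [pattern, block]
  end

  # Runs each line of the scenario against the first step whose pattern
  # matches, passing the captured parameters to that step's block.
  def run(scenario)
    world = {}
    scenario.each_line do |line|
      text = line.strip.sub(/\A(Given|When|Then|And) /, "")
      next if text.empty?
      pattern, block = @steps.find { |candidate, _| candidate.match?(text) }
      raise ArgumentError, "no step matches: #{text}" unless block
      block.call(world, *pattern.match(text).captures)
    end
    world
  end
end

steps = StepRegistry.new
steps.step(/\Aan account containing \$(\d+)\z/) do |world, amount|
  world[:balance] = Integer(amount)
end
steps.step(/\A\$(\d+) is withdrawn\z/) do |world, amount|
  world[:balance] -= Integer(amount)
end
steps.step(/\Athe balance should be \$(-?\d+)\z/) do |world, expected|
  unless world[:balance] == Integer(expected)
    raise "expected #{expected}, got #{world[:balance]}"
  end
end

# Two scenarios assembled from the same three steps.
steps.run(<<~SCENARIO)
  Given an account containing $100
  When $30 is withdrawn
  Then the balance should be $70
SCENARIO

steps.run(<<~SCENARIO)
  Given an account containing $50
  When $80 is withdrawn
  Then the balance should be $-30
SCENARIO
```

A failing Then step or an unmatched line raises, so a scenario that runs to the end has passed.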

Expressing Intent

Think in terms of narrative, flow. Think in terms of bounded contexts, and who the audience (role) is for that context. Who will understand that vocabulary?

Make sure the intent is clear – that’s the main thing.

Harness

Do you want to hook into continuous integration build?

Which version of the code is it going to run against?

Keep the tests in two buckets:

  • in progress
  • done

Those in the ‘done’ bucket should always work; those in progress are allowed to fail until you make them pass.
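One way to picture that rule, as a sketch with made-up names (the Result struct and build_passes? are not from any particular harness): failures in the ‘done’ bucket break the build, while failures in the ‘in progress’ bucket are only reported.

```ruby
# Hypothetical result record: which bucket a test lives in and whether it passed.
Result = Struct.new(:name, :bucket, :passed)

# The build is green as long as nothing in the :done bucket has failed.
def build_passes?(results)
  results.none? { |result| result.bucket == :done && !result.passed }
end

results = [
  Result.new("pay bill in cash", :done, true),
  Result.new("pay bill by card", :in_progress, false) # allowed to fail for now
]

raise "in-progress failures should not break the build" unless build_passes?(results)
```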

Getting Started

Things you can do today.

  • Try it for your next requirement
  • Given When Then helps guide the tests
  • It’s a collaborative process – get people involved
  • Works for bug fixes
  • a bug is a scenario that you missed in the first place
  • use the tools you’re most comfortable with
  • doesn’t have to be perfect

Down The Line

What to aim for.

  • ALL requirements have acceptance criteria specified up front
  • helps with estimation
  • acceptance tests are automated where appropriate
  • just having thought about it helps – you may come back to automating it later.
  • Push button, available to all.
  • helps build trust with stakeholders

Advice

  • Automate pragmatically
  • Don’t try to automate what you can’t do manually
  • Testing is validating an outcome against intention
  • Non functional requirements
  • Plan for false positives
  • Quality is a variable
  • doesn’t mean you don’t go test first
  • doesn’t mean low quality code
  • does mean how complete is the solution? – how many scenarios / edge cases are you going to try and meet?

Summary

  • Have a shared understanding of done
  • There is no Golden Hammer
  • Be aware of the five aspects of test automation
  • Automation, Vocabulary, Syntax, Intent, Harness
  • Start simple, then you can benefit now

Agile / Lean Software Development


Integration Tests – Good or Evil?

As with most stupid questions like this, the answer is “neither”. There are times when integration tests really help, and there are times when they can be a pain in the neck.

I was prompted to write this post when a colleague pointed me towards this page on the behaviour-driven wiki, which mentions the disadvantages of integration tests, which usually involve some complex (and often slow to run) procedure to set up an expected state in the system your code is integrating with. This tight coupling with the external system reduces agility and makes the test code brittle.

The page does point out that “even in this state the code is often much more robust and a much better functional fit than code developed under more traditional methods based on large-scale up-front design”.

I agree with the principles in the article, and I believe BDD is a great way to think, but I do think as long as the integration tests are well-factored (and hence easy to change) then the problems highlighted don’t apply – you’re still going to be quick on your feet if requirements change.

The question is whether you’re going to spend more time fixing your unit tests than you would debugging the code – if you’re confident you can write it correctly first time and anybody needing to change it is highly unlikely to introduce bugs in the area you’re coding, it’s a waste of everybody’s time to write a unit test – the test just becomes baggage for the team to drag around.

Conversely, if there’s a risk that future changes could break what you’re coding, or you’re bored of hitting F5 in the browser to test some subtle tweak in a function way down deep in a subsystem, thinking of an imaginative way to write a lightweight unit test that isolates that function and proves that it works as you want it to, is probably going to save you some dull debugging time.

Uncategorized
