refactoring – Tea-Driven Development

Half-arsed agile

Matt — Mon, 12 Aug 2013 15:30:37 +0000

Transitioning to agile is hard. I don’t think enough people are honest about this.

The other week I went to see a team at a company. This team are incredibly typical of the teams I go to see at the moment. They’d adopted the basic agile practices from a ScrumMaster course, and then coasted along for a couple of years. Here’s the practices I saw in action:

having daily stand-up meetings
working in fixed-length iterations, called sprints
tracking their work as things called Stories
estimating their work using things called Story Points
trying to predict a release schedule using a thing called Velocity

This is where it started to fall down. This team have no handle on their velocity. It seems to vary wildly from sprint to sprint, and the elephant in the room is that it’s steadily dropping.

I see this a lot. Even where the velocity appears to be steady, it’s often because the team have been gradually inflating their estimates as time has gone on. They do this without noticing, because it genuinely is getting harder and harder to add the same amount of functionality to the code.

Why? Sadly, the easiest bits of agile are not enough on their own.

Keeping the code clean

Let’s have a look at what the team are not doing:

continuous integration, where everyone on the team knows the current status of the build
test-driven development
refactoring
pair programming

All of these practices are focussed on keeping the quality of the code high, keeping it malleable, and ultimately keeping your velocity under control in the long term.

Yet most agile teams, don’t do nearly enough of them. My view is that product owners should demand refactoring, just as the owner of a restaurant would demand their staff kept a clean kitchen. Most of the product owners I meet don’t know the meaning of the term, let alone the business benefit.

So why does this happen?

Reasons

Firstly, everyone on the team needs to understand what these practices are, and how they benefit the business. Without this buy-in from the whole team, it’s a rare developer who has the brass neck to invest enough time in writing tests and refactoring. Especially since these practices are largely invisible to anyone who doesn’t actually read the code.

On top of this, techniques like TDD do –despite the hype– slow a team down initially, as they climb the learning curve. It takes courage, investment, and a bit of faith to push your team up this learning curve. Many teams take one look at it and decide not to bother.

The dirty secret is that without these technical practices, your agile adoption is hollow. Sure, your short iterations and your velocity give you finer control over scope, but you’re still investing huge amounts of money having people write code that will ultimately have to be thrown away and re-written because it can no longer be maintained.

What to do

Like any investment decision, there’s a trade-off. Time and money spent helping developers learn skills like TDD and refactoring is time and money that could be spent paying them to build new features. If those features are urgently needed, then it may be the right choice in the short term to forgo the quality and knock them out quickly. If everyone is truly aware of the choice you’re making, and the consequences of it, I think there are situations where this is acceptable.

In my experience though, it’s far more common to see teams sleep-walking into this situation without having really attempted the alternative. If you recognise this as a problem on your team, take the time to explain to everyone what the business benefits of refactoring are. Ask them: would you throw a dinner party every night without doing the washing up?

Does the world need to hear this message? Vote for this article on Hacker News

How much do you refactor?

Matt — Wed, 24 Jul 2013 14:33:26 +0000

Refactoring is probably the main benefit of doing TDD. Without refactoring, your codebase degrades, accumulates technical debt, and eventually has to be thrown away and rewritten. But how much refactoring is enough? How do you know when to stop and get back to adding new features?

(image credit: Nat Pryce)

I get asked this question a lot when I’m coaching people who are new to TDD. My answers in the past have been pretty wooly. Refactoring is something I do by feel. I rely on my experience and instincts to tell me when I’m satisfied with the design in the codebase and feel comfortable with adding more complexity again.

Some people rely heavily on metrics to guide their refactoring. I like the signals I get from metrics, alerting me to problems with a design that I might not have noticed, but I’ll never blindly follow their advice. I can’t imagine metrics ever replacing my design intuition.

So how can I give TDD newbies some clear advice to follow? The advice I’ve been giving them up to now has been this:

There are plenty of codebases that suffer from too little refactoring but not many that suffer from too much. If you’re not sure whether you’re doing enough refactoring, you’re probably not.

I think this is a good general rule, but I’d like something more concrete. So today I did some research.

Cucumber’s new Core

This summer my main coding project has been to re-write the guts of Cucumber. Steve Tooke and I have been pairing on a brand new gem, cucumber-core that will become the inner hexagon of Cucumber v2.0. We’ve imported some code from the existing project, but the majority is brand new code. We use spikes sometimes, but all the code in the master branch has been written test-first. We generally make small, frequent, commits and we’ve been refactoring as much as we can.

There are 160 commits in the codebase. How can I look back over those and work out which ones were refactoring commits?

Git log

My first thought was to use git log --dirstat which shows where your commit has changed files. If the commit doesn’t change the tests, it must be a refactoring commit.

Of the 160 commits in the codebase, 58 of them don’t touch the specs. Because we drive all our changes from tests, I’m confident that each of these must be a refactoring commit. So based on this measure alone, at least 36% of all the commits in our codebase are refactorings.

Sometimes though, refactorings (renaming something, for example) will legitimately need to change the tests too. How can we identify those commits?

Commit message

One obvious way is to look at the commit message. It turns out that a further 11 (or 7%) of the commits in our codebase contained the word ‘refactor’. Now we know that at least 43% of our commits are refactorings.

This still didn’t feel like enough. My instinct is that most of our commits are refactorings.

Running tests

One other indication of a refactoring is that the commit doesn’t increase the number of tests. Sure, it’s possible that you change behaviour by swapping one test for another one, but this is pretty unlikely. In the main, adding new features will mean adding new tests.

So to measure this I extended my script to go back over each commit that hadn’t already been identified as a refactoring, check out the code and run the tests. I then did the same for the previous commit, and compared the results. All the tests had to pass, otherwise it didn’t count as a refactoring. If the number of passing tests was unchanged, I counted it as a refactoring.

Here are the results now:

Wow. So according to this new rule, less than 25% of the commits to our codebase have added features. The rest have been either improving the design, or perhaps improvements to the build infrastructure. That feels about right from my memory of our work on the code, but it’s still quite amazing to see the chart.

Conclusions

It looks as though in this codebase, there are about three refactoring commits for every one that adds new behaviour.

There will be some errors in how I’ve collected the data, and I may have made some invalid assumptions about what does or does not constitute a refactoring commit. It’s also possible that this number is artificially high because this is a new codebase, but I’m not so sure about that. We know the Cucumber domain pretty well at this stage, but we are being extremely rigorous to pay down technical debt as soon as we spot it.

We have no commercial pressure on us, so we can take our time and do our best to ensure the design is ready before forcing it to absorb more complexity.

If you’re interested, here’s the script I used to analyse my git repo. I realise it’s a cliche to end your blog post with a question, but I’d love to hear how this figure of 3:1 compares to anything you can mine from your own codebases.

Update 28 July 2013: Corrected ratio from 4:1 to 3:1 – thanks Mike for pointing out my poor maths!

A coding dojo story

Matt — Fri, 19 Apr 2013 08:42:19 +0000

It was 2008, and I was at the CITCON conference in Amsterdam. I’d only started going to conferences that year, and was feeling as intimidated as I was inspired by the depth of experience in the people I was meeting. It seemed like everyone at CITCON had written a book, their own mocking framework, or both.

I found myself in a session on refactoring legacy code. The session used a format that was new to me, and to most of the people in the room: a coding dojo.

Our objective, I think, was to take some very ugly, coupled code, add tests to it, and then refactor it into a better design. We had a room full of experts in TDD, refactoring, and code design. What could possibly go wrong?

One thing I learned in that session is the importance of the “no heckling on red” rule. I watched as Experienced Agile Consultant after Experienced Agile Consultant cracked under the pressure of criticism from the baying crowd of Other Experienced Agile Consultants. With so many egos in the room, everyone had an opinion about the right way to approach the problem, and nobody was shy of sharing his opinion. It was chaos!

We got almost nowhere. As each pair switched, the code lurched back and forth between different ideas for the direction it should take. When my turn came around, I tried to shut out the noise from the room, control my quivering fingers, and focus on what my pair was saying. We worked in small steps, inching towards a goal that was being ridiculed by the crowd as we worked.

The experience taught me how much coding dojo is about collaboration. The rules about when to critique code and when to stay quiet help to keep a coding dojo fun and satisfying, but they teach you bigger lessons about working with each other day to day.

Lessons From a Master

Matt — Sat, 22 Mar 2008 20:34:43 +0000

One of the several great things about working for my current client is that their high public profile means it’s reasonably easy to get interesting people to come and visit us from time to time.

Last week the mighty Martin Fowler dropped by to talk to us.

I was lucky enough to spend a few hours in Martin’s company, sharing lunch in the canteen with him and a few other senior technical bods, listening to his talk and Q&A, then as he visited the various teams, taking a look at how we do things.

His talk lasted just over an hour, and was delivered with great panache, especially considering it was entirely off the cuff. Beginning with his experiences in software design right through his career, he described key lessons learned, building up to an inspirational tub-thumping call to arms for the British media (and my client in particular) to continue to battle the ‘information wasteland’ coming out of the American commercial media. Fantastic stuff.

I made several notes during his talk and I thought I’d jot them down here.

A lot of good ideas in software get lost though the generations. Part of the responsibility for this lies with the old-timers who don’t take adequate steps to communicate their ideas and lessons they’ve learned to the younger generation. Martin sees it as his main aim nowadays: to look backward, spot patterns, and try to communicate those through his books and talks.

The best ideas and techniques span multiple languages and technologies. By moving back and forth between the C++ and SmallTalk communities in his early days as a programmer, Martin was able to cross-pollinate ideas between the two, and filter out the most powerful concepts – those which applied to both.

Irreversible decisions are the most dangerous kind for any organisation. Lean (and thence Agile) principles try to make as many decisions as possible reversible, and to defer for as long as possible those which appear not to be. Largely, this is achieved by moving rapidly in small, safe steps so that the path backwards is always clear. Thus a big and risky task becomes smaller and more approachable.

Don’t accept that something can’t be changed – just try to figure out how.

If something is awkward and painful (e.g. database refactoring) then do it as often as possible, until it becomes easy.

Collaboration with the customer is key and to that end, the creation of a ubiquitous language is key. Other tools that can help with this collaboration are UML diagrams (as sketched on a whiteboard), CRC cards and tests.

The most important skill for a software developer is to be able to collaborate with people. The traditional view that mathematical skills are the most important is becoming redundant.

When designing code, two main features make for a good design:

Lack of duplication
Clear reveal of intent

To enable clarity of intent, naming is extremely important.

When designing a system, one approach is to think of modules as things which keep secrets, then partition your design around the secrets that need to be kept.