In TDD in context #1: Keeping my job, I…

  • Started a job as a software developer after 5+ years away from programming
  • Implemented my first simple new feature in an unfamiliar domain and a legacy codebase
  • Passed code review by one of the software’s original authors
  • Delivered to production, only to discover (in the form of user complaints) that I’d caused several regressions
  • Scrambled to fix my dumb mistakes, while suddenly and vividly recalling the value of Test-Driven Development
  • Obtained management permission to test-drive from then on

Getting to testability

There were a couple of very basic end-to-end tests to show that the server could handle concurrent write requests without obviously always corrupting the database. But to avoid the screwup I’d just upscrewed, I’d have needed unit tests around the code I changed. Since they didn’t already exist, I’d have needed to write them; since that was too hard, I needed to make it easier somehow.

Working against me were my ignorance of the problem domain, my inexperience with rescuing legacy code, my lack of recent practice at programming in general and Perl in particular, my limited grasp of the full complexity of the environment, and the fact that this application had come to life as a hack-weekend proof-of-concept — one that had so convincingly succeeded that they hired a dedicated full-time developer (moi) to maintain and extend it.

Working in my favor were the existence of a couple of in-house (now open source) tools that provided network transport and authentication entirely outside our server daemon’s process space, the imminent in-house release of a major upgrade to the protocol library that promised to obviate the need for marshal-and-unmarshal concern-mixing boilerplate in applications, and the ready availability of the two in-house developers who’d built those tools and my application.

Tipping the balance further in my favor was that, arguably, doing that protocol library upgrade would help us meet a business need. A Unix command-line client had been part of the proof-of-concept code, but a web-based GUI for users from other platforms was inevitable. Before we’d build and maintain a second client, though, we wanted a dead simple client API. The protocol library upgrade would solve that problem too.

Sealing the deal, one of the in-house developers was my manager. Bill well understood the reasoning, agreed with it, and arranged for us to have plenty of slack in the schedule for “the SSP 2 port.”

After a month or two of careful, incremental changes with help and supervision from the local experts, we had less code, better structure, a vastly simpler client API, decent confidence that we hadn’t otherwise changed any behavior, and a handful of fast new functional tests that helped me understand common workflows through the system. We ran the new service alongside the old until we’d found and ported all the other clients in the wild to our new API, then retired the old service. The SSP 2 port was complete. But so what?

Getting under test

Now that we had the beginnings of a test suite, for each new feature, I knew what to do: TDD with one extra step. Before writing a new red test, I checked whether any tests covered the current behavior in the area of code I was likely to change. Almost always the answer was “no”, so I’d dig around until I understood enough to add some tests. Only then would I add a red test. (Or, sometimes, take one of the tests I’d just added and change its assertion.)
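To make that extra step concrete, here is a minimal sketch in Python (the real application was Perl, and none of this is its actual code). The function `summarize_usage`, its behavior, and the "over limit" feature are all hypothetical, invented purely for illustration. The first three tests pin down what the existing code already does; only the last one is the new red test, marked as an expected failure here so the sketch runs cleanly.

```python
import unittest


def summarize_usage(used_mb, limit_mb):
    """Hypothetical stand-in for a piece of untested legacy code."""
    if limit_mb == 0:
        return "unlimited"
    percent = used_mb * 100 // limit_mb
    return f"{percent}% of {limit_mb} MB"


class SummarizeUsageTest(unittest.TestCase):
    # The extra step: tests that describe current behavior, written
    # after digging around to learn what the code really does today.
    def test_reports_whole_percent_of_limit(self):
        self.assertEqual(summarize_usage(50, 200), "25% of 200 MB")

    def test_zero_limit_means_unlimited(self):
        self.assertEqual(summarize_usage(50, 0), "unlimited")

    def test_percent_is_rounded_down(self):
        self.assertEqual(summarize_usage(1, 3), "33% of 3 MB")

    # Only now the red test for the (hypothetical) new feature.
    # In practice you would let it fail, then change the code until
    # it passes; expectedFailure just keeps this sketch green.
    @unittest.expectedFailure
    def test_flags_usage_over_the_limit(self):
        self.assertEqual(
            summarize_usage(300, 200), "over limit (150% of 200 MB)"
        )
```

Saved to a file, this should run with `python -m unittest`. The point of the ordering is that when the red test finally goes green, the three older tests are what tell you nothing else moved.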

After about half a year, I noticed that the answer was almost always “yes”. We’d arrived at TDD as usual. I felt good about that. But so what?

Dubious business value

Just before a release, a strange feature request came down out of nowhere. It wasn’t a request at all, as we found out when we tried and failed to refuse it on the basis that it wouldn’t actually solve the stated problem (along with having lots of unspecified corner cases). So on release-day morning, Bill apologetically asked if there were any way we could implement the feature before cutting the release.

We had good test coverage around that area of the code, so I said “Yeah, I think so. I’ll write a spec covering all the corner cases I can think of. Come back in an hour and a half ready to review it.” When he came back, he read thoughtfully through the new tests and agreed: “I think this feature is dumb, but if we have to do it, then this is how it has to work.”

45 minutes later I had the new tests passing, the rest of the suite still green, the dumb feature committed to source control, and the release ready to deploy.

Genuine business value

We were right to feel skeptical about the feature: it never did solve the stated problem. A few years later, we removed it.

I was right to feel confident about having arrived at TDD as usual: it improved our internal collaboration, helped us arrive at early and precise agreement, and allowed us to move quickly and safely — not just for this silly feature, but for dozens of far more valuable ones.

And Bill was right to have told me to do my work however I saw fit. I vividly recall what he said when we shipped that feature:

“I may not have the discipline to write tests first, but I sure am glad you do.”

Me too.