Last week someone asked some Agile coaches for their thoughts on metrics. It occurred to me that I’ve got some feelings too.
Nope!
I distrust metrics. More accurately, I distrust what people — myself included — tend to do with metrics not only by default, but also in spite of any better intentions we might set out with. It doesn’t matter what we say a number is supposed to be for: people will decide (and re-decide) for themselves. Regardless of what we wish, expect, or can even observe, a metric is “for” whatever behaviors it produces. In other words:
The purpose of a system is what it does.
—Stafford Beer
Metrics are tools. Tools have affordances. Affordances influence behavior. Most metrics have untidy affordances.
Not on my watch!
When I’m asked for metrics, I find myself feeling oppositional. I may ask questions like “Which decisions will this help you make?”, but I’m not asking to clarify intent and explore the problem. I’m asking to be passive-aggressive, to be obstructive, to make myself a hurdle to be cleared before one more development team gets measured in one more useless, counterproductive, or spirit-crushing way. As I like to say:
Careful what you measure, because you’ll get it.
—Amitai Schleier
(I’m smart enough not to track how many times I quote myself.)
There are a few problems with my strategy.
Who needs my approval?
Nobody. I don’t have to agree that a given metric is a good idea for it to get measured and observed.
Who’s measuring whom?
My reaction is maybe a bit parental, triggered by the prospect that people with more power might use it to do harm to people with less. But sometimes teams have their own reasons to measure themselves, as an input to their own satisfaction with their own work. When that’s what’s happening, and we can make sure nobody else will ever get to see our numbers, I’m happy to join in clarifying intent and exploring the problem.
Who’s got a better idea?
Just because I think my idea is better doesn’t mean anyone else does.
For instance, maybe I get on my soapbox and claim that we’ve already got some metrics. What?!? Sure, we’ve already got some idea of how much it’s costing us to do stuff, how much risk we’re taking, how much value we’re delivering, and how much people feel like what we’re doing is worth continuing to do. Aren’t these the questions we’re motivated to answer better? And if so, how about before we pose new questions, let’s look for ways to get better answers?
Maybe that’s convincing, maybe it isn’t. If we’re gonna take on new metrics, I try to improve the likelihood that we’ll pay attention to them and act on what we notice.
Idea: metrics come with expiration dates
Just because I think adding a metric constitutes an experiment to see whether it produces net-desirable behaviors doesn’t mean anyone else does. Because I believe they’ll come around to my way of thinking once they see for themselves, I suggest a behavioral hack. It’s a rule:
Every new metric we add must be accompanied by an expiration date.
Could be two retrospectives from now, if we’re doing iterations. Could be three months from now, if we’re not comfortable with running experiments.
The expiration date is a means to an end. The ends are affordances to:
- Remember to observe, and
- Stop if we want.
By the time the expiration date rolls around, we might not remember why we started tracking the metric. That would be an observation worthy of reflection, but also an affordance to retain the metric out of inertial uncertainty. The expiration date provides a counter-affordance that it’s safe to drop it, because we’ve made that aspect of our original intent clear.
If we remember what else we originally intended, and/or it seems to be serving us well, we can choose a new date and renew our subscription. Or try a new variation.
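As a sketch of how the rule might be encoded, if a team wanted a nudge rather than relying on memory (the `Metric` class and `is_expired` helper here are hypothetical, not from any real tool):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Metric:
    """A metric we've agreed to try, with a built-in reminder to revisit it."""
    name: str
    intent: str   # why we started tracking it
    expires: date # the date we agreed to stop and reflect

    def is_expired(self, today: date) -> bool:
        return today >= self.expires

# Example: a metric adopted with a three-month expiration date.
cycle_time = Metric(
    name="cycle time",
    intent="see whether smaller batches actually ship faster",
    expires=date(2016, 9, 1),
)

# At each retrospective, an expired metric prompts a decision:
# renew with a new date, try a variation, or drop it.
if cycle_time.is_expired(date(2016, 9, 15)):
    print(f"'{cycle_time.name}' has expired: renew, vary, or drop?")
```

The point isn’t the tooling; it’s that recording the intent and the date together preserves the counter-affordance even after we’ve forgotten why we started.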
Conclusion
In short, I find most grasping for metrics to be a reliable metric for lack of understanding of human behavior, not only that of those who would be measured but that of those who would do the measuring.
If a higher-up wants a metric about a team, say, as an input to their judgment about whether the team’s work is satisfactory, oughtn’t there be some other way to tell?
And if I choose nearly any metric on someone else’s behalf, doesn’t that reveal my assumption that I know something about how they do their good work better than they do?
Or worse, that I prefer they nail the metric than do something as loose and floppy as “good work”?
Let’s try that again
New metric (expiration = next subhead, privacy = public): I’m 0 for 1 on satisfying conclusions to this post.
I’m hardly an expert on human behavior. If I were one, rather than being passive-aggressive and obstructive, I’d have a ready step to suggest to metrics-wanters, one that they’d likely find more desirable than metrics.
Instead I have to talk myself down from passo-aggro-obstructo, by which time they’ve chosen what they’ll observe and the ready step I can offer is limited to encouraging them to observe the effects of their observation.
Can you give me some better ideas?
Further reading
I’m much more comfortable with the latest conclusion, especially if you’ll give me some ideas. I’ll call it 1 for 2 and let the metric expire here.
My suggestion that metrics auto-expire was mentioned in Tim Ottinger’s recent post, Modern Agile Metrics: What Should We Measure?.
I’ve heard that Dave Nicolette’s book, Software Development Metrics, would be useful for me to read. Also How to Measure Anything. What else?
Updates:
Chris Freeman writes in:
You might add Out of the Crisis as a resource to your post, since Deming writes about how the folks on the floor are in control of their own improvements, how metrics are hard, and how surprising “normal” can be.
Doc Norton has a presentation called Agile Metrics - Velocity Is NOT The Goal.
Alan Dayley suggests over on the Google+ Agile group that “Every metric should be tied to a business or improvement goal”, and uses a modified Goal-Question-Metric model.
Mike Rogers points at Larry Maccherone’s Eight Dragons of Agile Metrics.
Troy Magennis offers a GitHub repo full of treasures.
Long ago in my career when I was doing large waterfall projects — 9, 12, 18 month release projects — we had to measure things or we couldn’t tell whether the project was in line with successful delivery, for whatever definition of “successful” we were working under. I have a whole list of metrics I’ve kept. They were mostly useful. And I did use the same organizing principle you mention: I didn’t measure a thing unless there was some concrete decision we would make with the result.
The great thing about the ways most places deliver software today is that we’ve broken things down into small chunks where you can usually tell just by some version of “looking at it” whether things are going well or not. And I never, ever want to go back to the old way. Because every last thing I measured ended up getting gamed. People naturally want the things they’re being measured on to turn out well for them. I even tried to measure without the teams knowing what I was measuring, and it still ended up influencing behavior.
I think your question, “What decisions will this metric help you make?” is the right one to ask, regardless of whether its motivation is gut-level resistance. It challenges the person wanting to put the metric in place to truly think through what they’re trying to accomplish. If they can get to the root of what they want, very often you can have a meaningful and productive discussion with them about that.
I’m wanting to +1 your whole rant, I’d like to nail it to the front doors, I’m thinking about a tattoo, but unsure where on my boss’s body it should go…
I have sometimes fantasized about asking the VP who wants a new metric whether it would be good for us to add one that measured their leadership of our group. I’ll call this metric Mean Time between Disruptions (MTD). MTD is calculated much like the old factory sign that said “it’s been 23 days since we killed someone at this factory”: let’s start counting (I suggest in weeks) the time between major disruptions to the team. For this basic metric we are looking at basic team-formation dynamics (you’re familiar with Tuckman’s Forming, Storming, Norming, Performing; you, Mr. VP, want the P word, but it comes three stages of development beyond the F word). So let’s start at the beginning and count weeks between Forming and Re-Forming: you know, like when you move a person on or off a team, when you move the team’s physical location, when you give the team a new objective.
The metrics I’ve seen range from MTD = 0 to 20 weeks for many teams I’ve worked with. And they say they desire persistent teams.
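The MTD arithmetic described above amounts to the mean gap between consecutive re-forming events. A minimal sketch, with a made-up disruption log for illustration:

```python
from datetime import date

def mean_weeks_between_disruptions(disruptions: list[date]) -> float:
    """Mean gap, in weeks, between consecutive team disruptions
    (re-forming events: membership changes, moves, new objectives)."""
    gaps = [
        (later - earlier).days / 7
        for earlier, later in zip(disruptions, disruptions[1:])
    ]
    return sum(gaps) / len(gaps)

# Hypothetical log of one team's disruptions:
log = [date(2016, 1, 4), date(2016, 2, 15), date(2016, 5, 2)]
print(mean_weeks_between_disruptions(log))  # gaps of 6 and 11 weeks -> 8.5
```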
A friend, Jay Packlick, has done a few talks and presentations around this concept, Metrics that Matter:
https://speakerdeck.com/improving/visualizing-agility-agile-metrics-that-matter
He has suggested that we plot each metric on a grid: Y axis, Potential for Value; X axis, Potential for Evil. When I started doing this, it became much clearer when to say no to a metric and when to say… well, OK, as long as we keep this idea true… and circle the area on the grid that we discussed.