Metrics for qualitative requirements

Just how should we handle qualitative requirements in system-design and enterprise-architecture? Should we, for example, reframe them into quantitative terms, as metrics – because it’s a lot easier to keep track of ‘measurable things’?

Over the past couple of days I’ve been having a great Twitter back-and-forth on this with Catherine Walker, with a bit of brief assistance from Dave Snowden and Sally Bean. The start-point was a Tweet by Catherine, quoting software-guru Tom Gilb at a conference on Lean Systems Development:

  • transageo: “Keep on drilling down and decomposing until measurability becomes obvious” @imtomgilb

Hmm, yes, very ‘Tom Gilb’, that… – in fact looking back at some old notes, I discovered that I’d had a fairly big argument with him at the Unicom-EA conference in September last year about exactly this point. His theme there was that we should describe every requirement in a quantitative or measurable form; my response then, and now, is that, yes, in principle we can do that, but whether we should do that is a very different question – and in many cases the answer would be a most definite ‘No’.

Part of the reason I say this is that whilst many system-designers would focus on the functional requirements, much of architecture is itself ‘non-functional’, in that it depends on so-called ‘non-functional’ or qualitative elements – and those qualitative requirements need to be kept intact, without being fragmented by overly-insistent analysis. Catherine Walker unintentionally highlighted what is perhaps the deeper problem here in another quote from Tom Gilb:

  • transageo: “Selling an engineering culture to people steeped in a craftsmanship culture” (paraphrasing @imtomgilb)

Sorry, but this is dear old John Zachman’s beloved metaphor of ‘engineering the enterprise’ all over again: valid enough when used exclusively within a software-engineering context, but completely the wrong metaphor in any human context, and hence in many (most?) qualitative-contexts too…

Okay, yes, I take the point about ‘a craftsmanship culture’, by which I presume he means a technical-culture with little or no understanding of, or discipline in, formal-rigour. But in essence we covered that in the previous post here: and in real-world practice the problem is often not too little engineering-style formal-rigour, but too much of it – or rather, applied too much to the exclusion of everything else. The problem, in fact, is usually the almost exact opposite of Tom Gilb’s phrasing above: the need to build awareness of craftsmanship and respect for the uncertainties of ‘It depends…‘ amongst people so steeped in engineering-culture that they can’t or won’t acknowledge that no amount of analysis can ‘solve’ inherently-irreducible complexity.

There’s also the point that Sally Bean posted in a Tweet shortly before all of this, in a quote from Dave Snowden:

  • Cybersal: “The problem with user requirement capture is that someone assumes there’s a requirement.” @snowded #cynhki

Who determines that something is ‘a requirement’, and that something else is not? That’s an extremely important question that gets too-easily skipped-over in the rush to reduce everything to quantitative measures – and it’s not a trivial question, either…

Anyway, I reTweeted Catherine’s quote of Tom Gilb’s “Keep on drilling down…” assertion, with an addendum that said something like “huh??? in a systems-context???”. (I’ve managed to lose my actual Tweet somehow…). Later that evening she came back on Twitter with the following clarification:

  • transageo: A requirement for a s/w project could be, eg ‘High Availability’. Drill down: availability = (eg) reliability + maintainability // Then “Maintainability” can be expressed as (eg) av. fault fix time. “Reliability” as (eg) av time between system crashes. // A fuzzy requirement becomes quantifiable. More: http://www.gilb.com/dl437 [PDF] (though slides a poor substitute for the speaker)
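
To make that drill-down concrete – a minimal worked sketch of my own, not taken from Gilb’s slides – the classic formula derives steady-state availability from mean time between failures (the reliability side) and mean time to repair (the maintainability side):

```python
# A minimal sketch (my own illustration, not from Gilb's slides) of how
# the 'High Availability' drill-down becomes quantifiable.

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# e.g. a crash on average every 720 hours (reliability), fixed on average
# within 2 hours (maintainability):
print(f"{availability(720, 2):.4%}")  # -> 99.7230%
```

Note, though, that the numbers are only proxies: two systems with identical availability figures can still feel utterly different to their users – which is exactly where the fuzziness refuses to go away.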

However, I was still very unhappy with the notion of “a fuzzy requirement becomes quantifiable”, for reasons as above, and also illustrated in a great quote the other day by Harold Jarche, posted by Mark Foden:

  • markwfoden: “One should never bring a knife to a gun-fight, nor a cookie-cutter to a complex adaptive system” @hjarche

The context of that quote was about the ‘cookie-cutter’ methods used by too many large-consultancies, but it’s the same core-problem as in this case: an over-reliance on analysis to the exclusion of everything else. Hence I followed up to Catherine’s comments with another Twitterstream-reply:

  • tetradian: agreed re how drill-down to metrics works, but is inherently fragile relative to whole-system level, the original ‘why’ // danger is that fragmentary metrics in whole-systems lead to misuse of eg. Six Sigma – ask eg. @snowded for advice/comment on this // key point is that fuzzy-requirement _can_ be ‘quantified’ but also _always_ remains fuzzy: Complex, not Complicated. // Simple drill-down analysis tends toward presenting Complicated-only world, eventually pretending Complex/unorder doesn’t exist // (I suggest to talk w @snowded b/c you know his work, and it fits well with this case re fuzzy-requirements in whole-systems)

As per those Tweets, I suggested that she contact Dave Snowden, in part because of that comment quoted by Sally Bean above, but even more because this is exactly the kind of context he works in. (We disagree in some areas of our respective work, but not in this one.) He was kind enough to send in a couple of comments overnight:

  • snowded: New Scientist: “extrinsic rewards destroy intrinsic motivation”  6S OK stable systems but damages innovation etc. // reductionist approach (drill down) may end up with you measuring wrong thing – hope that helps

It did help; and in the morning, Catherine and I returned in earnest to our back-and-forth:

  • transageo: Thanks. Understood re. risk of small system components achieving undue weight/important connections missed. // I risk finding complexity so appealing that I deny/forget that quantification ever enhances shared understanding
  • tetradian: I concur w @snowded on both points (extrinsic blocking intrinsic; risk of ‘measuring wrong thing’) – also break of link w whole // (classic example of fragmentation-by-misplaced-metric is ever-expanding breakdown of UK NHS – “death by targets” etc)
  • transageo: so @imtomgilb ideas good discipline for me.  I could def benefit from #CognitiveEdge education alongside, though 😉
  • tetradian: @transageo: “so @imtomgilb ideas good discipline” – yes, is good discipline: yet must also balance w discipline to maintain systems-as-whole

At this point, Catherine came up with a very important question:

  • transageo: Doesn’t that conflate metrics w/ targets? Measures should not have to become targets (tho agree they often, inappropriately, do)
  • tetradian: in the hands of most managers, _every_ metric becomes a ‘target’ – especially if behaviour/bonus etc are linked to it… // _every_ metric/quantification needs a huge “It Depends…” label attached to it! 🙂 #entarch #bizarch
  • transageo: Yes. @imtomgilb now talking about defining & quantifying “circs in which …(ie depends) “. Something will be missed.
  • tetradian: “Something will be missed.” – (or “_may_ be missed”?) – yes. Hence need for systematic balance of drill-down and whole-system.

The trap here is that most managers still believe too strongly in the old dictum that “if you can’t measure it, you can’t manage it” – and therefore that whatever we can measure becomes a ‘target’ that must be trimmed back and trimmed back relentlessly in the name of ‘efficiency’, whilst whatever we can’t measure is deemed not to exist at all. As too many organisations have found to their cost, that mistake usually has disastrous consequences – for which the only ‘solution’ offered is, yes, yet more mangled ‘management’… Or, as Catherine commented:

  • transageo: Or the system may meet reqs, missing greater needs, but material is in place to cover arses if needed
  • tetradian: yep: CYA-centric view of ‘requirements’ / metrics is _exactly_ the danger here… 🙁 – “operation successful but patient died”..

That last quote was all too popular amongst Victorian surgeons – a somewhat extreme case of missing the point…

Anyway, Catherine did a wrap-up at this point, and the conversation came to a close:

  • transageo: Yes, danger is clear 😉 I do think some degree of quantification can have value in underpinning shared understanding, though // Fully accept your point & @snowded’s that drill-down does not dissolve complexity, it must be embraced & absorbed
  • tetradian: yes, quantification definitely has value (if only as a focus-discipline) – it just needs to be kept in balance & context, is all

That’s probably the best summary here: “drill-down does not dissolve complexity, it must be embraced and absorbed”. It’s often very useful to apply a drill-down to qualitative-factors, as a way of creating a better understanding of them, and better agreement amongst stakeholders as to how they themselves interpret those factors: yet it’s essential also to keep the qualitative-factors as themselves, whole and complete in their own right.
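
As a rough sketch of what ‘keeping the qualitative-factors as themselves’ might look like in a requirements-model – my own illustration, with invented names, not anyone’s published method – each quantitative proxy stays explicitly linked back to the qualitative requirement it stands for, together with its ‘It depends…’ caveats, so that the original ‘why’ can never silently drop out of the model:

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    """A quantitative proxy produced by drill-down - never the whole story."""
    name: str
    target: float
    unit: str

@dataclass
class QualitativeRequirement:
    """The qualitative requirement stays intact as the anchor:
    metrics are attached to it, never substituted for it."""
    statement: str                      # the original 'why', kept whole
    caveats: str                        # the 'It depends...' label
    proxies: list[Metric] = field(default_factory=list)

high_availability = QualitativeRequirement(
    statement="The service must feel dependably available to its users",
    caveats="Depends on usage-patterns, failure-modes, user-tolerance...",
    proxies=[
        Metric("mean time between system crashes", 720.0, "hours"),
        Metric("mean fault-fix time", 2.0, "hours"),
    ],
)
```

The design-point is traceability in both directions: every number can be walked back to the statement and caveats that it serves, and the qualitative anchor cannot be dropped without visibly breaking the structure.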

To put it at perhaps its simplest, there’s a qualitative difference between quantitative-requirements and qualitative ones: the latter cannot and must not be reduced solely to some form of quantitative metric, else the quality that makes them ‘qualitative’ will itself be lost.

Over to you, anyway: your comments, perhaps?

12 Comments on “Metrics for qualitative requirements”

  1. Three comments supporting Gilb.
    1) He’s spent a long time going on about this stuff, and presumably there must be evidence of whether it works or not.
    2) At some point some bastard has to write the code and build the server, at which point it’s just 0s and 1s. They need to know how well to do it.
    3) My IT experience (mainly legacy banking/insurance estates) is of a lack of engineering rigour rather than too much. There is a fascination with functionality (what the system must do), a passing interest in availability and throughput, and no care at all about other quality requirements.

    • Yep, fully agreed. And yes, I’ve done my time writing acres of code, too, so I do know all too well that there are valid points there.

      And yet the blunt fact remains that putting quantitative figures on qualitative-factors is a darn dangerous thing to do, because people get hung up on the numbers and forget the qualitative-issue itself.

      I know I’m not going to win on this one, because Tom Gilb is, well, Tom Gilb. I’d just like to invite people to recognise that there’s a real risk there that often isn’t addressed, that’s all.

  2. Hi Tom

    It all seems to be a question of degree, and of where you sit on the spectrum from ‘total reductionism/objectivity’ to ‘no reductionism/subjectivity’.. and a function of the problem domains you work in.

    Problem domains such as software development clearly need objectivity – at some point someone needs to carve code, execute a test case etc. It’s a (relatively) controlled box with little room for ambiguity.. at least at the detail level.

    Same goes for availability and performance requirements, which were the examples given somewhere above – there is a tendency sometimes to use woolly language (‘generally available’) because thinking about it is too hard. This is where push back is necessary to drive for metrics.

    But even that has limits.. for example, say, for a website – “delight the customer” – one can subjectively (!) reduce that to response-time metrics, click-counts, font-consistency, etc.. but by taking that path exclusively there is a real danger of completely missing the point.. the wood for the trees, etc.

    Err, I think I just restated your position – so what I mean to say is … I agree!

    • @Mike: “there is a tendency sometimes to use woolly language (‘generally available’) because thinking about it is too hard.”

      Yes, exactly: this is the real problem here. Where I agree with Tom Gilb – and agree strongly with him – is that these things need to be thought about, and thought about carefully.

      @Mike: “This is where push back is necessary to drive for metrics.”

      Agreed, up to a point: some kind of quantification is usually necessary, especially at the code-implementation level. The trap is that the metrics themselves become a substitute for thinking: ‘hard numbers’ used as means to evade the even harder thinking about ‘requisite fuzziness’. If we stop at Tom Gilb’s “Keep on drilling down and decomposing until measurability becomes obvious”, we’ve only gone halfway towards what we actually need to do in order properly to understand and design for the requirement: we also need to link those measurable ‘hard numbers’ back to the original fuzziness to ensure that they make sense under all variances of the requirement-context.
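
      As a small illustration of that ‘linking back’ – a sketch of my own, with invented scenario values – rather than freezing a single ‘hard number’ up-front, we can test the candidate target against the spread of contexts that the requirement has to survive, and treat any failure as a prompt to rethink either the number or the context:

      ```python
      # A sketch (invented values) of checking a 'hard number' against the
      # variances of the requirement-context, rather than fixing it up front.

      scenarios = {
          "broadband, quiet period": 0.8,   # observed response-time, seconds
          "mobile, peak load":       4.5,
          "rural connection":        6.2,
      }

      candidate_target = 3.0  # seconds - the 'hard number' under review

      for context, observed in scenarios.items():
          verdict = "ok" if observed <= candidate_target else "rethink target or context"
          print(f"{context}: {observed}s -> {verdict}")
      ```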

      In a sense, it’s much like the challenge that architects and other consultants know so well, that the ‘hard-skills’ are (relatively) easy, whereas the ‘soft-skills’ are often really hard. Given a bit of thought and engineering-discipline, the ‘hard-numbers’ are actually quite easy; it’s working with the ‘soft-numbers’ – working with inherent-uncertainty, rather than pretending that it doesn’t exist – that’s really hard. That’s essentially the challenge that I’m exploring here.

      More on this in another blog-post, anyway.

      • “The trap is that the metrics themselves become a substitute for thinking: ‘hard numbers’ used as means to evade the even harder thinking about ‘requisite fuzziness’”

        Indeed…at some point, you should be able to quantify what is acceptable vs what is not. Too often, however, the numbers are demanded up front and pulled out of thin air without any real idea of what the number means (“should this take 3 seconds or 5?” “Dunno, read somewhere that everything should happen in 3.” “Go with that then.”).

        Premature quantification is as evil as premature optimization.

        • Hi Gene – apologies I’ve been a bit slow on replying on this one…

          In essence, strongly agree with all of this, especially with this comment of yours: “Premature quantification is as evil as premature optimization.” Yes indeed… a very good way to put it…

    • Thanks for that, Peter. And yep, already have it on my e-reader – bought it after talking with Tom Gilb last year, in fact. It’s useful as far as it goes, but will have to admit that I think it falls straight into the same trap as I’ve described in this post: an unwillingness to accept that some things should only be addressed in qualitative terms, because any attempt to describe them in quantitative terms will damage the quality itself.

  3. I don’t think Dave Snowden’s point: “The problem with user requirement capture is that someone assumes there’s a requirement” is one Tom Gilb ignores, or views as trivial. He spent most of Day 1 of a 3-day course talking about how to identify stakeholders of different types, and how to elicit their most important values and wishes.

    I questioned him about how he decides when to stop looking for new stakeholders and new values, and to start diving into design to meet the wishes he’s identified. He probably didn’t like the question much, but, interestingly, he gave an answer that wasn’t data-driven. While he likes “about 10 requirements”, he acknowledges that the right number depends on context, and that the point often is to make progress on the “top 10” so as to build up stakeholder goodwill. So it was encouraging to see contextual awareness, and some acknowledgement of complex human factors playing a part in his approach.

    • Thanks, Catherine – and yes, that is encouraging, because I didn’t get any kind of acknowledgement of ‘It depends…’ from him when we talked last year.

      The real point is that his main focus is (rightly) on the ‘How’, on how to get real code written properly to deal with as-concrete-as-possible business needs. My focus is perhaps too much on the ‘Why’, on how to make sure that the business-need is clear enough – including clarity on inherent-uncertainties – such that when people write code from those ‘requirements’, it actually does do something useful towards what does need to be done. They’re not the same concerns, and whilst it’s very important that they do connect up well with each other, they also can’t be assessed solely in each other’s terms. Hence some of the confusion.

      On “Dave Snowden’s point” – as I understand it, anyway, from my own experience with ‘requirements capture’ and the like – there’s a very important yet often-forgotten knowledge-issue here: the whole concept of ‘requirements-capture’ is riddled with all sorts of assumptions, such as (to use his example) what is or is not a ‘requirement’. Historically, Dave has done a lot of work on challenging and ‘surfacing’ those assumptions, with a lot more care and rigour than most people would as yet apply. From what you say above, Tom Gilb does address some of those issues – or more than he used to, at any rate – yet the fact that he still insists on converting every qualitative-requirement to some kind of quantitative-metric suggests that he still doesn’t as yet acknowledge the deeper challenges and doubts on that point. My interpretation alone, of course.

  4. Nice to see both Catherine and yourself reaching a consensus I can agree with.

    I agree with a lot of what Tom says and I believe the world needs Tom’s methods and Tom himself. We need Tom’s methods because they can bring new insights, they can tackle parts of “the problem”, they are another tool in the toolbox.

    However knowing when to use them, when to push them to the fore and when to hold back requires judgements and that requires thinking. Because, I believe, applying them in every situation creates exactly the kind of problems you describe.

    And the world needs Tom Gilb because too few people know these tools exist or give quantification of requirements much thought. “Must be quick” “Must be easy to use” are still common requirements.

    Tom, by being Tom and pushing these points to the extreme, causes people to sit up and think about these points and these tools.

    Now all that said, the one thing I find missing from your conversation is a direct reference to Goodhart’s Law, http://en.wikipedia.org/wiki/Goodhart's_law – “any metric used as a target will change the behaviour it measures” – you allude to it but it is not mentioned by name.

    My fear is: if Tom’s methods are overused (e.g. in complex environments, or when innovation is required) then the method, and the numbers, will undermine themselves.

    • Hi Allan

      Good points indeed – particularly about Goodhart’s Law (which, yes, I didn’t know as ‘Goodhart’s Law’).

      The one trouble I have here is that I don’t know which Tom you’re referring to in each comment… – and just to make things worse, we’re both ‘Tom G’ as well… Perhaps label us as ‘the sensible one’ (Tom Gilb) and ‘the crazy one’ (me)? 🙂
