Checklists and complexity

Re-reading Atul Gawande’s The Checklist Manifesto, to write a book-review for the current edition of the Journal of Enterprise Architecture, it struck me that the SCAN frame provides a useful means to understand and describe the relationship between checklists and complexity.

At first glance, checklists look simple – far too simple to be of any use in decision-making and, especially, action within contexts of high or extreme complexity. Yet as Gawande illustrates so well in that book, they definitely do work. So how? And why?

For that matter, how would we apply this in enterprise-architecture?

To me, the keys here are variety-weather and time-before-action.

Plain sailing in stormy weather

Again at first glance, checklists seem like a classic tool for command-and-control: do the action in robot-fashion whilst a supervisor ticks off the boxes. And it’s true that some checklists are intended for that kind of purpose: Six Sigma tends to (mis)use them in exactly that way, for example.

But that kind of method only works well when the variety in that context doesn’t vary (much) – which lack of variety in turn makes it amenable to notions of ‘control’. In other words, the context is, and always remains, extremely simple – or at least always remains on the ‘Simple’ side of the Inverse Einstein Test, the blue-coloured left-side of the SCAN frame:

By contrast, what we’re concerned with in most checklists is contexts that contain significant variety-weather in their variety – contexts that can sometimes switch in an instant from seemingly-simple to anything-but-simple… These are high-VUCA contexts (Volatility, Uncertainty, Complexity, Ambiguity), contexts in which we’re likely to find ourselves on the right-hand side of the SCAN frame – which can sometimes be stormy-weather indeed. As Gawande put it in The Checklist Manifesto, working in such contexts has become

the art of managing extreme complexity – and a test of whether such complexity can, in fact, be humanly mastered (p.19).

Gawande was talking about his own disciplines of surgery and medicine, but as he warns, “extreme complexity is the rule for almost everyone” (p.21). Professional work has always moved back and forth between certainty and uncertainty, the known and the not-known, but it’s getting harder to manage all the time:

One of the most common diagnoses, it turned out, was ‘Other.’ … The complexity is increasing so fast that even [those who maintain the pulldown category-lists in] the computers cannot keep up. (p.22)

So it seems to me that a significant part of our role as enterprise-architects should be about helping people to manage the complexity in their work-contexts – and preferably to help them to work with it, rather than pretend it doesn’t exist.

Or, to paraphrase Gawande again (p.182), we need to help our people understand where they should not improvise (left-side of the SCAN frame), and where they do need to improvise (right-side of the SCAN frame), in any context where the variety in that context is inherently uncertain – especially where time-to-action is compressed down to nothing and there’s no time to think:

The classic procedure-manuals and 200-page ‘best-practice guidelines’ are of little use there – or rather, they don’t get used, because there’s no time to use them. Yet we still need something to provide some kind of guidance there – otherwise we risk collapsing into the wrong kind of chaos. This is where checklists come into their own:

The checklist gets the dumb stuff out of the way, the routines your brain shouldn’t have to occupy itself with … and lets it rise above to focus on the hard stuff. (p.177)

In effect, checklists help to pull us back over to the left-side of the SCAN frame, providing a kind of anchor in the midst of uncertainty. Checklists help to reduce the cognitive load, by reminding us of concerns that must not be forgotten – concerns which, if they are forgotten, are likely to increase the potential variety (and hence potential risk) in the context. And by providing a clear, explicit structure and sequence, checklists also help to keep the panic at bay when the variety-weather suddenly turns stormy. To quote Gawande again, about what doesn’t work:

He tried the usual surgical approach to remedy this [lack of success] – yelling at everyone to get their act together. But they still had no saves. So he and a couple of his colleagues decided to try something new. They made a checklist. (p.45)

In short, checklists are useful. But what do they look like? How do they work? And how do we design good checklists for our own enterprise needs?

Shaping a checklist

There’s a lot going on in the overall context of a checklist:

the distinctions between sensemaking, decision-making and action (perhaps best typified by John Boyd’s OODA frame)
the variety in the context (the extent of potential variance or change)
the variety of that variety over time, or in different places or instances of that type of context (which determines the probable applicability or reliability of predefined work-instructions)
how sensemaking and decision-making methods need to change at different distances-from-action (for example, the distinction between ‘considered’ and real-time sensemaking)
how sensemaking and decision-making need to change with differing variety and different ‘controllability’ of the context

The SCAN frame can help a lot here, in making sense of all of this. First, the two different versions of the SCAN frame help to describe the distinctions between sensemaking (Boyd’s Observe and Orient) and decision-making (Boyd’s Decide, and then Act). Here’s the sensemaking version:

And the decision-making version:

Both versions have the same axes. The horizontal-axis represents the combination of modality and ‘controllability’ that underpins the Inverse Einstein test: whether doing the same thing leads (or ‘should’ lead) to the same results, or to different results – respectively left or right of the red line. The vertical-axis represents ‘distance-from-action’, the amount of time remaining before action must take place – with the horizontal dotted-line representing the variable boundary between ‘considered’ versus real-time sensemaking and decision-making. This gives us four quadrants:

simple real-time action (lower-left) – the classic domain of step-by-step work-instructions
considered analysis of known factors (upper-left) – the classic domain of hard-systems theory and formal change-one-factor-at-a-time scientific-method
considered experiment (upper-right) – the classic domain of soft-systems theory and pattern-exploration
non-simple real-time action (lower right) – the classic domain of improvisation and skill-in-action
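As a rough illustration only, the two axes and four quadrants can be sketched in a few lines of Python. The names here (`Quadrant`, `scan_quadrant`, `repeatable`, `time_available`, `realtime_boundary`) are invented for this sketch and are not part of SCAN itself. The boundary is passed in as a parameter rather than fixed, because it varies between people and contexts.

```python
from enum import Enum


class Quadrant(Enum):
    SIMPLE_REALTIME = "simple real-time action (lower-left)"
    CONSIDERED_ANALYSIS = "considered analysis of known factors (upper-left)"
    CONSIDERED_EXPERIMENT = "considered experiment (upper-right)"
    NOT_SIMPLE_REALTIME = "non-simple real-time action (lower-right)"


def scan_quadrant(repeatable: bool, time_available: float,
                  realtime_boundary: float) -> Quadrant:
    """Place a context in one of the four quadrants.

    repeatable: the Inverse Einstein Test, i.e. whether doing the same
        thing leads (or 'should' lead) to the same results.
    time_available: distance-from-action, in any consistent unit.
    realtime_boundary: hypothetical threshold below which there is no
        time for 'considered' work; an assumption of this sketch.
    """
    considered = time_available > realtime_boundary
    if repeatable:
        return Quadrant.CONSIDERED_ANALYSIS if considered else Quadrant.SIMPLE_REALTIME
    return Quadrant.CONSIDERED_EXPERIMENT if considered else Quadrant.NOT_SIMPLE_REALTIME
```

A real context will rarely give clean answers to either question, so treat this as a thinking-aid rather than a classifier.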

Iterative innovation tends to focus on the upper quadrants; real-time action necessarily on the lower quadrants.

(Note that the effective boundaries between the quadrants will vary over time, for different people, and in different contexts. In other words, the frame itself is another checklist: something seemingly simple, to summarise and guide action in something that is actually immensely complex.)

In reality, checklists move us around throughout all of this space – and that’s where this can get very confusing…

Rather than talking about this only in the abstract, let’s use a real checklist – in this case the Orange County Flight Center’s checklist for the Cessna 172 aircraft [PDF], as pointed to by the ‘Checklist Manifesto’ website:

[Image: Cessna 172 checklist, (c) OCFlightCenter]

(To make sense of this, you’ll probably need to download that checklist and keep it to hand here.)

As you’ll see, it consists of two pages, typically printed either side of a single laminated card:

the page printed in black is for routine checks – in other words, for considered decision-making
the page printed in red is for non-routine emergencies – in other words, for urgent real-time decision-making

Every item in the checklists has the same core structure:

the item to be checked (left side, in mixed-case)
the check itself (right side, in capitals)

Note, though, that there are several variations on this structure:

usually the item is some physical thing or setting or action, but sometimes it’s a keyword (‘Lights, Camera, Action’)
often the check is just a confirm (‘CHECK’), sometimes it’s a setting (‘Seatbelts: ADJUST’, ‘Master Switch: OFF’) or an action (‘OH Speaker-Handmike: TRY BOTH’)
keywords are used as mnemonics for essential actions (‘Lights / Camera / Action’ immediately before take-off and immediately after landing – in this case primarily signals for others rather than for self)
occasionally there’s an additional advisory comment (such as ‘FLY THE AIRPLANE’ – an important reminder when trying to deal with in-flight engine-failure!)
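For anyone who wants to hold checklists as data, the item structure described above could be sketched roughly as follows. This is a minimal sketch, not the flight-centre's own format: the class and field names (`ChecklistItem`, `CheckKind`, `is_keyword`, `advisory`) are assumptions made for illustration, and the only example items are the ones already quoted above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CheckKind(Enum):
    CONFIRM = "confirm"   # just 'CHECK'
    SETTING = "setting"   # put something into a given state
    ACTION = "action"     # do something


@dataclass(frozen=True)
class ChecklistItem:
    item: str                        # left side, mixed-case
    check: str                       # right side, shown in capitals
    kind: CheckKind = CheckKind.CONFIRM
    is_keyword: bool = False         # item is a mnemonic keyword, not a thing
    advisory: Optional[str] = None   # optional extra comment

    def render(self) -> str:
        """Format the item as a single 'Item: CHECK' line."""
        line = f"{self.item}: {self.check.upper()}"
        return f"{line} ({self.advisory})" if self.advisory else line


# Items quoted in the text above
examples = [
    ChecklistItem("Seatbelts", "adjust", CheckKind.SETTING),
    ChecklistItem("Master Switch", "off", CheckKind.SETTING),
    ChecklistItem("OH Speaker-Handmike", "try both", CheckKind.ACTION),
]
```

Keeping the item and the check as separate fields mirrors the left-side and right-side layout of the card, which makes it easy to render the same items in other forms.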

The routine checks (black checklists) are often longer, and usually in a ‘do-confirm-do-confirm’ step-by-step sequence. (The shortest checklists are those closest to the point of committed-action – especially take-off and landing – which makes sense precisely because they are closest to the point of real-time action.) Many of these checklists are prior to any action, and in effect can take as long as is needed – or call a halt to the overall activity.

The emergency checks (red checklists) are much shorter, much more focussed, and often fit more into a mould of ‘check this before you panic’ or ‘make sure this is as safe as possible’ or ‘try this, it might work’ (the last as in ‘Increase Airspeed: BLOW OUT FIRE’, for example). All of these checklists are for use during some kind of action, and each can be read and acted on literally within seconds.

Note how the purpose of each of these checks, whether routine or emergency, is to reduce the amount of uncertainty in the overall context, and, in effect, the amount of variety that we have to deal with – often quite a bit later on in the story. For example, a tyre-burst on take-off or landing is less likely to happen if we’ve properly checked the tyres during the pre-start ‘Exterior Inspection’ checklist: it’s not that the inspection-check prevents a tyre-burst as such, but that if we do find a worn tyre or partially-stuck brake, we wouldn’t attempt to take off until the issue had been resolved.

In SCAN terms, the routine-checklists sit mainly in the upper two quadrants, coming down into the lower quadrants at the point of action; the emergency-checklists sit mainly or exclusively in the lower two quadrants. (In reality, all checklists are framed in the form of a set of ‘belief’-type statements – lower-left quadrant – but that isn’t necessarily where they sit in a conceptual sense.)

In the emergency-checklists, the distance-from-action is compressed right down such that there’s little to no time available for analysis/assertion or experiment/use. Instead, there is only time for those kinds of sensemaking and decision-making that can support action in the ‘now’ – which forces us into those two domains of belief (certainty, ‘inside-out’) and faith (trust-in-the-moment, ‘outside-in’):

Note, though, that a checklist of this kind is radically different from a work-instruction:

— A step-by-step work-instruction aims to act as a substitute for skill. In principle, it should be usable with no training (as in a machine or an IT-application) or minimal training (in a so-called ‘manual process’) – the real-world is expected to conform exactly to the assumptions of the work-instruction. In effect, everything is assumed to sit solely on the Simple side of the Inverse-Einstein Test.

— By contrast, the checklist assumes that it is likely that the real-world will not conform to the assumptions of the check: the whole point is to make us aware of when we’re in a context which is actually on the Not-Simple side of the Inverse-Einstein Test. The checklist will usually also give us some advice as to how to bring it back to the Simple side again – rather than having to rely solely on skill, experience, luck and faith…

Another difference is that the classic work-instruction assumes little to no skill, and hence has to attempt to cover all possible eventualities. By contrast, it’s often noticeable how much is not in a checklist such as that for the Cessna-172 above, and how much skill and knowledge is actually implied. The purpose of a checklist is to reduce the cognitive load in high complexity, not as an attempt to pretend that the complexity does not exist.

Implications for enterprise-architecture

As Gawande demonstrates in his book, checklists provide a proven means to reduce risks and undesirable outcomes in inherently-complex contexts. They do not represent an attempt to ‘remove’ complexity, or even to reduce it, but do provide a key part of tactics to better manage it. The implication is therefore that well-designed checklists will have a very high value in most enterprise-architectures.

In SCAN terms, checklists are the direct counterpart of principles. Where principles guide decision-making whilst the experience of a complex context traverses through the Not-simple (‘faith’) side of the frame, checklists guide decision-making whilst the experience traverses through the Simple (‘belief’) side of the frame. Each checklist-item in a checklist represents an Assertion (a purported ‘truth’ to be tested), compressed down into real-time action; each principle represents guidance derived from experience and Use (a ‘value’ to act as a filter on the total context).

In the same sense that a principle is an actionable expression of a value, a checklist-item is an actionable and testable expression of an assertion or belief. The keyword here is actionable: it must lead to and/or test some form of action that is relevant and of value within the context.

Much as with principles, checklists will need considerable care in their development and description, if they are to be of practical use in the respective context. The outcome of that care is the difference between a good checklist and a bad checklist – one that is a genuine help or a potential hindrance at the point of action.

In developing a checklist, the focus should always be on simplicity – on reducing the cognitive load of managing complexity in real-time action. Almost always the aim will be to pare away to the barest minimum:

include items that assessment shows are sometimes or often forgotten or missed, especially if the fact of missing that step represents significant increased risk
exclude items or steps that are always remembered, and/or that are both rare and relatively insignificant in their consequences
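The include/exclude rule above can be sketched as a simple filter. Everything specific here is an assumption for illustration: the names (`CandidateStep`, `belongs_on_checklist`), the idea of scoring steps numerically, and both threshold values. The sketch also reads ‘rare’ as ‘rarely missed’, which is one interpretation. In practice the assessment would be a matter of observation and judgement rather than arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateStep:
    name: str
    miss_rate: float  # fraction of observed runs where the step was missed (0.0 to 1.0)
    severity: float   # consequence of missing it, 0.0 (trivial) to 1.0 (catastrophic)


def belongs_on_checklist(step: CandidateStep,
                         miss_threshold: float = 0.05,
                         severity_threshold: float = 0.5) -> bool:
    """Return True if the step earns a place on the checklist.

    Both thresholds are hypothetical defaults, not recommended values.
    """
    # Exclude: steps that are always remembered
    always_remembered = step.miss_rate == 0.0
    # Exclude: steps that are both rarely missed and minor in consequence
    rare_and_minor = (step.miss_rate < miss_threshold
                      and step.severity < severity_threshold)
    return not (always_remembered or rare_and_minor)
```

Note that a step which is rarely missed but severe in its consequences still stays on the list: it fails neither exclusion test.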

For guidance on development, drafting and validation of checklists, see Atul Gawande and Dan Boorman’s ‘Checklist for Checklists’. (For checklists in enterprise-architecture, this is, in effect, an exact analogue of the Open Group’s section in TOGAF 9 on how to develop and document principles in enterprise-architecture.) Note the warning at the end of that document, that “a checklist is not a teaching tool or an algorithm”: it is an adjunct to skill and experience, not a substitute for it.

Whilst designing a checklist, remember to take account of the context in which the checklist will be used – including the politics of that context, because in practice, the use of checklists will often necessarily challenge traditional hierarchies and suchlike. Business-processes and business-rules may need to be adapted to support the use of the checklist – such as in this operating-theatre example from Gawande’s book:

The new rule made it clear that if doctors didn’t follow every step [of the checklist], the nurses would have backup from the administration to intervene. (p.39)

Checklists also operate in tandem with enterprise vision, values and principles. The role of the latter is summarised well, for example, in Gawande’s description of WalMart’s response to Hurricane Katrina:

[WalMart CEO] Lee Scott issued a simple edict. “This company will respond to the level of this disaster. … A lot of you are going to have to make decisions above your level. Make the best decision you can with the information that’s available to you at the time, and above all, do the right thing.” (p.76)

In the high-VUCA context above, the overall ‘commander’s intent’ is clear, and likewise the explicit permission to act in accordance with personal judgement. What is not clear is where to start, and there is no guidance as to what to do. Providing that guidance is the real value that checklists can offer.

—-

Okay, will stop there for now – hope it’s useful, anyway. Any comments/suggestions, anyone?

2 Comments on “Checklists and complexity”

Peter Bakker says:

2 September 2012 at 8:00 am

Hi Tom,

I just want to add this quote from Boeing’s “The simple genius of checklists” pdf:

“I don’t think the future will hold significant change to the concept of checklists,” Mahr said. “Technology may change the methodology, but the principal remains the same. It is by far the best tool to contain errors.”

Tom G says:

10 September 2012 at 6:26 am

Peter – many thanks for the link to that Boeing piece – very useful.
