Rethinking the DIKW hierarchy

What are the relationships between data, information, knowledge and wisdom?

This is one of the classic challenges in the knowledge-management [KM] space. The usual way to describe those relationships is that it’s a stack, or a hierarchy, or a pyramid, or a linear-progression, always in the same sequence, with data at the ‘bottom’ and wisdom at the ‘top’:

data -> information -> knowledge -> wisdom

It looks straightforward, and it certainly aligns with the way many IT-folks think about the relationship. Or at least, they know that they manage the data, which somehow becomes ‘information’ somewhere in that blurry space called ‘the business’, after which it’s all sort of Somebody Else’s Problem.

The catch is that it doesn’t actually make sense in real-world KM practice. In fact, it causes so many problems for KM that, as David Gurteen reTweeted the other day:

RT @DavidGurteen: The DIKW Pyramid Must Die!  #KMWorld #km

It’s not a pyramid. It’s not a hierarchy. It’s not a stack. And it’s not a linear progression. All of that is pretty much obvious to just about everyone in the KM field now. Yet there is a relationship of some sort between them: that much is obvious, too. So what is it?

One option I’d suggest is that we can view them not as ‘layers’, but as distinct dimensions in a concept-space:

In this type of frame, those purported progressions, or layers, or whatever, would be seen as transitions or cross-linkages or regions or suchlike within that concept-space. A supposed ‘hierarchy’-relationship is therefore just a choice – a ‘way of seeing’ – rather than a purported ‘fact’.

If only as metaphor, there’s also a strong crosslink here to the tetradian dimensions, with clear (to me, anyway! 🙂 ) parallels between the respective dimension-sets:

I’ll say straight away that ‘Wisdom’ doesn’t as yet sound like the right term. Yet it does sort-of make sense if it’s viewed more as a plural – ‘wisdoms’ – in much the same sense that ‘Data’ is actually a plural noun. What I’m seeing in this is the kind of ‘wisdoms’ that are presented as prepackaged interpretations, or perhaps more as principles for interpretation: proverbs such as “Many hands make light work”, or “Too many cooks spoil the broth”.

The point is that those kinds of ‘wisdoms’ don’t actually make much sense on their own: they always need a context, and more, to anchor them into something more useful and meaningful. Meaning is the end-point that we’re actually after here.

So, given that proviso about ‘wisdoms’, we can expand somewhat on those dimensions – or perhaps reframe them would be a better way to put it:

  • Data: the content (often, or even usually, about something tangible) that underpins the meaning
  • Information: the context-frame or virtual schema within which content could be placed
  • Knowledge: the connections that link items and context in relations with each other
  • Wisdom: structures to give guidance on purpose or meaning, in terms of aim or aspiration

Or, in tetradian layout again:

Content, context, connections, purpose in tetradian layout

Which, yes, does give a very different slant on the definitional-meaning of each of those terms – which in its own way is probably not a good idea. Yet if we think more of the way in which those terms relate with each other, rather than only in terms of how they each stand on their own, then it does make more sense – or at least, more sense than the dreaded DIKW ‘stack’.

For example, we can have context-descriptors just on their own: that’s what an entity-descriptor in a metamodel is, or a database table-schema. But it’s literally empty without content, or data. And likewise data is just, well, data: it doesn’t tell us anything at all on its own – it’s just a bunch of numbers. But when we put content together with context, then we have something that genuinely informs – in other words, information. In that sense, information is a region of the concept-space.
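
To make that a little more concrete, here’s a minimal sketch in Python of content placed into a context-frame. Everything in it is an assumption for illustration only: the field-names in `schema`, the numbers in `raw_rows` and the helper `inform` are all hypothetical, not taken from any real system.

```python
# Context: a hypothetical table-schema. Field names only, no content.
schema = ("sensor_id", "load_kn", "reading")

# Content: bare numbers, with nothing to say what any of them mean.
raw_rows = [
    (1, 10.0, 152.0),
    (2, 10.0, 148.5),
]


def inform(schema, rows):
    """Place each row of content into the context-frame given by the schema."""
    return [dict(zip(schema, row)) for row in rows]


# Content plus context: each value now says what it is a value of.
records = inform(schema, raw_rows)

# Context with no content is still a valid frame, but there is nothing in it.
empty = inform(schema, [])
```

Neither `schema` nor `raw_rows` informs on its own; it’s only the combination in `records` that does, which is the sense in which information is a region rather than a layer.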

We can have connection-descriptors on their own: entity-relationships in a metamodel, for example, or table-relationships in a database-schema. Much the same applies to relationships between people. Again, it isn’t much use on its own: it tends to be abstract, the potential for something rather than a ‘something’ in its own right. A complete metamodel or database-schema – context plus connections – starts to look a bit more useful, but it’s still literally empty without the real-world data. And going the other direction, we can build connections between the proto-meanings of those ‘wisdoms’ – and hence see the clash between ‘many hands make light work’ versus ‘too many cooks spoil the broth’ – but again it’s still empty because it’s not yet anchored into anything ‘real’. So again, these are distinct regions within the overall concept-space – regions that, in this case, are not described by the DIKW ‘stack’.
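
As a rough sketch of ‘context plus connections, but no content’, the following uses Python’s standard `sqlite3` module to build a small in-memory schema. The `team` and `person` tables are hypothetical, chosen purely for illustration.

```python
import sqlite3

# A hypothetical two-table schema, held in memory only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT,
        team_id INTEGER REFERENCES team(id)
    );
""")

# Connections: which column of 'person' links to which table and column.
# PRAGMA foreign_key_list returns (id, seq, table, from, to, ...) per link.
links = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(person)")
]

# Content: none as yet, so the schema describes only a potential.
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Here `links` describes a relationship that could exist between a person and a team, yet `row_count` is still zero: the structure is complete, but it isn’t anchored to any real-world data.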

To take it one step further, a complete database-schema is not just the combination of context-dimension and connections-dimension, but the assumptions and pre-interpretations that underpin it – the dimension of proto-purpose or proto-meaning that’s too often merely implied or assumed, rather than actually explicit. Again, this fills out a broader region of the concept-space – and this region too is not described by the classic DIKW ‘stack’. In a sense, it’s ‘IKW’ without the ‘D’: to make it useful, and meaningful, we need to complete it with the D-dimension, the content-dimension, the anchor into real-world data.
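
One possible way to play with the idea of regions is to treat each dimension as a flag, and a region as whichever combination of flags is populated. This is only a sketch: the names `Dimension`, `COMPLETE`, `SCHEMA_REGION` and `missing_dimensions` are all assumptions made for illustration.

```python
from enum import Flag, auto


class Dimension(Flag):
    DATA = auto()         # content
    INFORMATION = auto()  # context
    KNOWLEDGE = auto()    # connections
    WISDOM = auto()       # purpose


# The fully populated region of the concept-space.
COMPLETE = (
    Dimension.DATA
    | Dimension.INFORMATION
    | Dimension.KNOWLEDGE
    | Dimension.WISDOM
)

# A complete database-schema with its assumptions: 'IKW' without the 'D'.
SCHEMA_REGION = Dimension.INFORMATION | Dimension.KNOWLEDGE | Dimension.WISDOM


def missing_dimensions(region):
    """Return the dimensions still needed to fill out the complete region."""
    return COMPLETE & ~region


# For the schema region, the only thing missing is the content-dimension.
still_needed = missing_dimensions(SCHEMA_REGION)
```

Note that nothing here imposes an order on the flags: any combination is a valid region, and the linear D-I-K-W sequence is just one route to `COMPLETE` among many.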

Or, to anchor this in a more human example of ‘knowledge’ and suchlike, we could run the whole thing backwards as follows:

  • the culture we inhabit presents us with a suite of aphorisms or proverbs or other ‘wisdoms’ about ‘how the world works’ (W-dimension)
  • we link those aphorisms together into a broader picture of how ideas and people ‘should’ relate, a kind of ‘social truth’ (KW-region)
  • we then link all of this into a conceptual framework – a worldview and/or paradigm – that gives us a more complete picture of ‘how the world works’ (IKW-region)
  • we use that conceptual framework to filter the raw-information (data) coming in from the real-world, to determine what is ‘true’ or ‘relevant’, and what is not (selected DIKW-region)

In other words, it’s not just about the dimensions themselves – it’s also how we move about in that concept-space, changing the focus or whatever within each dimension and region to restructure our own information, knowledge and wisdom.

If we can’t do that kind of restructure, we’re literally stuck with the worldviews that we have. In essence, that’s the case with most IT-systems: they don’t restructure their schemas on-the-fly. (Okay, yes, some do – which is actually the point here.) In turn, that’s one of the reasons why the cross-map to the original tetradian is useful, because the parallel between the ‘knowledge’/connections and ‘wisdom’/purpose dimensions respectively with the relational and aspirational dimensions gives some useful pointers as to how change on those respective dimensions actually takes place. That’s probably a discussion for yet another post, though.

And to bring it back to the original DIKW ‘stack’, or pyramid or whatever, in essence that supposed linear-sequence is just one way in which the concept-space could be populated, and it’s certainly not the only one. It’s almost certainly true that to achieve completeness of meaning, we do need to populate a complete region within the DIKW concept-space, but there are many different ways we can get there – and it’s always dynamic, not static, and almost certainly iterative, not linear.

Best leave it there for now, perhaps. All of this is just an idea, an alternate way to look at the ‘DIKW problem’. It’s in no way ‘the answer’ – it’s just a suggestion, an option, that’s all. And I’ll freely admit that the terminology is nowhere near right as yet: it needs a lot more work before it’d be robust enough for general use. But I hope it’s useful enough to play with in its present state? Over to you for comments and suggestions, anyway.

Posted in Enterprise architecture, Knowledge
3 comments on “Rethinking the DIKW hierarchy”
  1. Ric says:

    Hi Tom.

    This post is clearly thinking in process, so I am going to respond to your approach, not your conclusions.

    In a couple of weeks I will be giving a talk to EAs in Auckland on information architecture. My own thinking also includes a growing distaste for the DIKW ontology. But in addition I think many experts (business experts I should say) come to questions about information with a strongly reified concept.

    Information is difference. To be useful, difference has to be represented and communicated in a useful way. We give forms to difference – with lines on paper, with bits, with sound waves. We encode difference and then confuse the form we give to information with the information itself.

    When talking about information we constantly employ the metaphor of a ‘thing’. Information is in things, reacts with things, can be stored, and so on. But this is confusing the forms we give to information with information.

    DIKW tries to get past this reification – but fails because it uses the object-attribute ‘metaphysics’ of things.

    Over and above this the spatial metaphors that are leading your thinking are also unhelpful. Information is temporal, not spatial.

    (Poetically – information only ‘exists’ on the edge of a collapsing wave. The wonder of intelligence is how it manages, through symbols, to eternally dance from one wave to the next.)

    DIKW is a classic case of inappropriate metaphors leading to a violation of Occam’s razor. Entities and attributes from the metaphor’s subject are transferred to its object, which then need to be accounted for in any theory.

    In use, the distinctions between Data and Information in DIKW are effectively useless. It would be better to say data is frozen difference, and information is communicated difference. But the way we characterised information as data that is structured to be meaningful is stuck on the idea of a thing.

    In terms of what information really is, all applied forms that create difference are the same. It’s a basic category error. Information isn’t a higher or different form of difference that makes it essentially different to data.

    Ironically, humans are very good at giving form to differences that don’t matter, or should not: the whole concept of race is a good case in point.

    The idea of ‘rawness’ is often applied. Again – bad metaphors. Raw data – as if information is somehow ‘cooked’ data.

    DIKW is what I call a sterile model – like race. I am coming to the conclusion it is irredeemable.

    I certainly would caution against the urge to reconcile the contradictions, confusion and absurdities the model throws up.

    If information is difference, there are two things that can go wrong with giving it representative forms. We can give it insufficient form – the noise to signal problem. Or we can give it incorrect or unrepresentative forms. DIKW does too much of the latter. It can’t be fixed with a clearer signal.

    DIKW is a management theory – driven by the need to adequately control information. But in reality the best metaphor for the enterprise information problem is “Where’s Wally” or “Where’s Waldo” – I can never remember which.

    Instead of seeking a better way of finding Wally – they could make most of the headway they need by sticking a blue shirt on him.


    • Tom G says:

      Ric – “Information is difference” – that’s a very different way of looking at it, and a very good one. I’ll need to think a lot more about that.

      It’s way past my bedtime ( 😉 ) but I’ll toss in just one other comment, about ‘rawness’. The thinking behind that was my experience of dealing with all-too-raw data from an old aircraft-engineering test: all we had were columns of figures, with no identifying metadata, no indication as to which column related to which strain-gauge on the test-article, no conversion-factors, and so on. It was a nightmare to reconstruct the original metadata from what was known (in people’s memories, mostly) about the actual behaviour of the test-article. It’s in that sense that I think of ‘raw data’ literally as just numbers-without-context, and data when linked to a context (as per the metadata) as ‘information’. I probably have it all wrong, but that’s where that particular way of looking at the concept-space came from.

      (More later when sanity and sleep permit! 🙂 )

  2. Leonard Fehskens says:

    A few things.

    First, “difference” strikes me as a kind of relationship, so in Tom’s model it would be subsumed by the “knowledge” perspective.

    Second, epistemologists would argue that knowing something requires that we have both a justification for believing it and evidence of its truth. How we justify belief and how we judge evidence to be legitimate are not inherent in the data or information, they are externally imposed. For example, many members of the EA community “know” many things about EA to be true because they believe that some “respected authority” having said it thirty years ago is both an acceptable basis for believing it and legitimate evidence of its truth.

    For a long time I have had a hunch that information “emerges” from relationships between data, that knowledge emerges from relationships between information (which is how the epistemological context becomes part of the consideration), and that wisdom emerges from relationships between knowledge. However I have been too distracted by too many other concerns to follow up this line of thinking.

    Finally, data, information, knowledge and wisdom do not exist in isolation from human perception. As is our wont, we regularly try to pretend this is not the case, and that we can talk about such things without taking account of the role(s) of people. As such, I am concerned that the DIKW “hierarchy” (an oversimplification if ever there was one) ignores such issues as observation, inference, assertion, opinion, facts, representation, interpretation and understanding.
