Big-data, big-information, big-knowledge

For all the hype around big-data at present, is that all that we really need? I have my doubts…

Let’s take the classic ‘DIKW stack’: data, information, knowledge, wisdom. And then realise that it’s not a stack – it’s more like a set of near-orthogonal dimensions, perhaps best illustrated in a tetradian-type layout:

We can expand on – or reframe – those dimensions as follows:

  • data: the content that underpins the meaning
  • information: the context-frame or schema of metadata – ‘data about data’ – within which that content could be placed
  • knowledge: the connections that link items and context in relation with each other
  • wisdom: structures to give guidance on purpose or meaning of the data, metadata and interpretation

Or, in tetradian layout again:

[Figure: content, context, connections, purpose in tetradian layout]

Hence, to be blunt, big-data is another one of those themes that’s best described by Andrew McAfee’s infamous phrase of “it’s not not about the technology”…

To have the outcomes described by that term ‘big-data’, we’re going to need a lot of technology: there’s no doubt about that at all. There are huge data-storage issues; huge data-management issues; huge data-quality issues; and perhaps even more – though too often glossed-over – huge data-infrastructure issues too. No-one should doubt that we’re going to get nowhere unless those technology-level concerns are properly addressed, and properly resolved.

That’s big-data.

But if we were to stop there, we would be guilty of IT-centrism. Data – whether big, small, medium, or whatever – is literally meaningless on its own.

As a very first step, we need to link big-data to big-metadata, to give that data its context. Data-plus-metadata is probably what most IT-folks mean when they talk about ‘big-data’: but, if so, it would kinda help if they were a bit more accurate and honest about their terminology, ‘cos data-on-its-own is not the same as data-plus-metadata – there are some very different technical-challenges in handling the latter, for example.

Big-data plus big-metadata to give the data its key context is what we could reasonably describe as big-information.

But if we were to stop there, we would again almost certainly be guilty of IT-centrism. Information, no matter how big, is just information: it’s nothing more than that unless and until it’s put to use in some practical context.

Which means that we need the connections to link between the two types of context – the context of the data and its metadata, and the business-context in which it can be put to practical use.

Big-data plus big-metadata plus big-connections is something we could reasonably describe as big-knowledge.
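For the more code-minded, that layering can be sketched as a tiny data-structure. This is only an illustration, in Python, under my own assumptions: the names `Item`, `KnowledgeBase`, `connect` and `uses_for`, and the sample sensor-reading, are all hypothetical, and not taken from any real big-data toolset.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical record: content plus its context-frame."""
    value: float     # data: the raw content
    metadata: dict   # information: 'data about data' that frames the content


@dataclass
class KnowledgeBase:
    """Hypothetical store: items plus the connections to business-context."""
    items: dict = field(default_factory=dict)        # name -> Item
    connections: list = field(default_factory=list)  # (item name, business use)

    def add(self, name, value, **metadata):
        # Data-plus-metadata: what the post calls big-information.
        self.items[name] = Item(value, metadata)

    def connect(self, name, business_use):
        # Link an item to a practical use: the step towards big-knowledge.
        if name not in self.items:
            raise KeyError(name)
        self.connections.append((name, business_use))

    def uses_for(self, name):
        # All business uses connected to the named item.
        return [use for n, use in self.connections if n == name]


kb = KnowledgeBase()
kb.add("reading_1", 21.5, unit="celsius", sensor="warehouse-3")
kb.connect("reading_1", "cold-chain compliance check")
```

The bare `21.5` is the data; the `unit` and `sensor` entries are the metadata that make it information; the link to a business use is the connection. What none of this code can supply is whether that use actually matters to anyone.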

Yet practical use – and, beyond it, practical meaning – is not something that we can do with IT alone: it needs real people, with a people-oriented view of meaning.

But because big-data is something we can perhaps do with IT alone, using the term ‘big-data’ gives the easy illusion that the whole thing is about IT. Which it isn’t: IT is just the enabler.

True, we couldn’t do any of the big-whatever here without the IT: but the moment we think it’s primarily about the IT, we’d have lost the plot.

It’s not not about the technology; but it’s also not about the technology as such. If we ever get that wrong, we’d be in deep, deep trouble, perhaps even before we start…

So ‘big-data’ isn’t about big-data, as such: not really.

And ‘big-data’ isn’t about big-information as such, either: not really.

As a minimum, ‘big-data’ is actually about big-knowledge – the human uses for big-content, big-context and big-connections.

(Big-wisdom might be nice, too – but is probably too much to ask for, I guess…? 😐 🙂 )

3 Comments on “Big-data, big-information, big-knowledge”

  1. I would only add a quote I’ve used elsewhere from Claudia Perlich, Chief Scientist at Dstillery, via CMSWire: “The data does not lie. It just does not (always) mean what you think it does.”
    In her example it took a human (Claudia herself) to figure out what the data really meant. That insight could then be used to smarten up the analytics engine. Which is fine until the next time the data doesn’t mean what you think it does.
    Which is your point really, isn’t it?
