CIO Straight Talk - Issue 3 - 49
record, you’ve got to have a lot of intelligence about
how to interpret what that log record means.
I have to say that the big data vendors don’t
understand this. They vaguely understand that “Oh
yeah, there are some challenges with unstructured
data.” But in terms of actually understanding what
those are, the vendors are much more concerned with
how they interface with Hadoop, how they use
MongoDB, how they use HBase, how they use Hive.
They’re concerned with the technology of big data.
They just assume that once the organization gets its
hands on big data and the new technologies, it’s
going to automatically be able to understand how to
turn that big data into business value. To be honest
with you, the big data vendors just don’t get it.
The advice I would give companies considering
investing in big data is that once they’ve got a handle
on the technologies they should address the
challenges of context and interpretation in order to
unlock business value from unstructured data.
Could you give an example of unlocking business value from unstructured data?
I was talking to a major auto manufacturer just the
other day about doing warranty claims analysis. Do
you know how they do warranty claims analysis?
They have individuals who sit in this huge building
and look at a warranty claim and say, “Well, is this
company responsible for it? Are we going to honor
the claim? If we aren’t, why not?” They process over
eight million warranty claims a year. They do them
all manually because they can’t figure out how to
take the raw text and treat it in an analytical fashion. I
asked them, “When you finish processing the claim,
do you enter it in a database?” The answer was,
“Heavens, no. We have a hard enough time
processing eight million of these warranty claims
manually.” But a database built from warranty claims
could be incredibly valuable to automobile
manufacturers because it would relate directly to the
quality of the manufacturing process.
It’s not enough, however, to build a database of
warranty claims. You must understand the context for
all the data in the claim. If you fail to understand or
manage the context properly, you’re going to be
making incorrect decisions on how the warranty
claim is to be handled. And so the consumer is going
to be mad and even have grounds for a lawsuit.
Have you seen any examples of companies that seem to be somewhat ahead of the
The answer is no, but there’s a good reason. The
whole world of unstructured data and taking business
value out of it is brand new. It’s like the big data
vendors. They just assumed that since the data is
there, it could be done. They assumed that if one or
two companies have done it, then everybody can do
it. That’s simply not true.
It’s like going into a bear’s den in the month of
March and poking the bear with a stick. The bear is in
hibernation and very soon is going to come out of
hibernation and be hungry. The world and the
marketplace are just now waking up to the fact that
they’re going to have to deal with unstructured data,
and how you deal with it is not how you dealt with
the structured data of the past. The techniques and
approaches are completely different. Is that
understood by either the vendors or the marketplace
now? No. There are a few people who are awakening
to it, but it’s generally so early in the life of the
technology and the marketplace that the world is still
asleep, in a state of hibernation.
Are there parallels between people’s perception of big data today, including their ability to extract value from it, and people’s perception of data warehousing when you were first talking to them about this in the late
Actually, yes, very much so. In the early days of data
warehousing, people thought a data warehouse was
merely where you stack together a bunch of data like