About Data, Information, Knowledge Relationships and their representation

Loris G. Navoni

In the Information Era, understanding the proper meaning of words such as data, information and knowledge is essential to better understand the world we live in. Graphical support to the analysis allows an immediate understanding of the concepts presented. In some cases, the graphics should be enriched to better explain complexity and relationships. In this paper I present a few suggestions on the matter.

The Data, information, knowledge relationship

When information attracts someone’s attention, it becomes clear that it cannot be seen as an isolated element in the knowledge acquisition process: data are the basis for composing information, while knowledge of a specific topic comes from reprocessing information. This is often seen as a linear sequence from data to information, and from information to knowledge, hopefully reaching wisdom in the end.

However, the relationships are not always straightforward, as scientists know when they harvest data but cannot extract appropriate information from them because their semantic content is difficult to analyze.

The goal of this work is not to study and analyze the relationships between data, information and knowledge (that is beyond my competence) but to propose a more complete graphical representation of the complexity of the process that leads from data to knowledge.

The simplest linear relationship is summarized by the DIKW model (Data, Information, Knowledge, Wisdom)[i], which highlights the increasing semantic content of each term of the chain: information comes from harvesting and analyzing data, knowledge is a coherent collection of information, and lastly wisdom comes from the “corpus” of knowledge an individual collects during a lifetime.

The model is often represented graphically by a pyramid or through a chain.

While this model describes the sequence needed to develop knowledge from given data, it says nothing about the motivations that drove the data harvesting, nor about the level of accuracy of the knowledge about a given topic. The proposed model is a first attempt to integrate motivational components and a quantitative structure into the straight line of the knowledge process.

Technical premises

There is a boundless literature describing and explaining the topics covered in this work. A compendium can be found in Zins[ii]. Below are non-exhaustive definitions of data, information and knowledge.

Data

A datum is an atomic representation of a concept, an observed fact, a sign or a symbol, whether natural or artificial. It is the result of a more or less accurate observation that may or may not be prompted by an issue or a problem to be solved.

Data are the results of an objective categorization, measurement or comparison of what is observed. They may be meaningless to the actor who collects them.

They should be fixed on a medium so that they can be sent, managed and processed by a human being or an automated system.

Information

Information is the result of processing data to make them significant for the user. The process gives meaning to raw data; this implies that the symmetry for that phenomenon or subject is broken[iii]: something significant appears over the uniformity of the sea of data.

A definition of information[iv] has been adopted as an operational standard for analysis in several fields of information study. The General Definition of Information (GDI) says:

GDI) σ is an instance of information, understood as semantic content, if and only if:
GDI.1) σ consists of one or more data;
GDI.2) the data in σ are well-formed;
GDI.3) the well-formed data in σ are meaningful.

The GDI establishes the relationship of information with data, which are its basic components, and with meaning, which contributes to generating knowledge.

It also highlights that an appropriate syntax is needed to make data usable for composing information.
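The three GDI conditions can be read as a conjunction of checks. The following is only a minimal sketch under stated assumptions: the names `Datum`, `is_information`, `reading_syntax`, `reading_semantics` and `KNOWN_UNITS` are hypothetical and do not come from the GDI itself, and the toy syntax ("a number followed by a unit") is just one possible illustration of "well-formed".

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Datum:
    """A single datum: a symbol fixed on a medium (here, a string)."""
    symbol: str


def is_information(
    data: Sequence[Datum],
    well_formed: Callable[[Sequence[Datum]], bool],
    meaningful: Callable[[Sequence[Datum]], bool],
) -> bool:
    """Return True when the three GDI conditions hold for `data`."""
    has_data = len(data) >= 1                 # GDI.1: one or more data
    # GDI.2 (syntax) is checked before GDI.3 (meaning); `and` short-circuits.
    return has_data and well_formed(data) and meaningful(data)


# Hypothetical syntax (assumption): a reading is "<number> <unit>".
def reading_syntax(data: Sequence[Datum]) -> bool:
    if len(data) != 2:
        return False
    try:
        float(data[0].symbol)
    except ValueError:
        return False
    return data[1].symbol.isalpha()


# Hypothetical semantics (assumption): the unit must be known to the reader.
KNOWN_UNITS = {"km", "m", "s"}


def reading_semantics(data: Sequence[Datum]) -> bool:
    return data[1].symbol in KNOWN_UNITS


reading = [Datum("384400"), Datum("km")]
scrambled = [Datum("km"), Datum("384400")]
print(is_information(reading, reading_syntax, reading_semantics))
print(is_information(scrambled, reading_syntax, reading_semantics))
```

The same two data, put in the wrong order, fail the syntax check and therefore do not compose information in this toy model.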

Knowledge

Knowledge is the higher level of comprehension of the information harvested about a specific topic. It is an abstraction with respect to the “what”, the “where” and the “how”.

Knowledge is the level that permits the use of a general law rather than a collection of information; it is internalized by people and shaped by their experiences and perceptions.

Increasing knowledge means gaining information that enriches what one already knows about a topic. The process of gaining knowledge is incremental: one goes back and forth between data harvesting, information composition and knowledge creation until the desired level of knowledge is reached.

What drives the knowledge process is the interest one has in a specific topic.

Dependencies between these topics could be summarized by the following formalisms:

This also shows that knowledge is a collection of pieces of information that may or may not interact with each other (e.g. the rotation speed and the physical composition of a satellite are not related to each other, but both contribute to increasing the knowledge about that space object).

DIKW graphical representation

In the Knowledge Management domain, the problem of graphically representing the transition from data to information to knowledge found a first answer in the DIKW model, often represented as a pyramid or as a sequential process.

These diagrams, like many others, analyze the relationships between the four categories: data, information, knowledge and wisdom (the latter is not included in some cases, and it is not relevant to the thesis presented in this paper).

Figure 1: DIKW model graphical representations

They represent a simple relationship (quantitative in the case of the pyramid, dependency in the case of the DIKW hierarchy).

However, it is well known that the knowledge process is not always linear.

Knowledge acquisition by a human being grows through a back-and-forth mechanism. Draft knowledge should be verified and enriched, raw data must be refined, and information should be understood and put in the appropriate context. This is a continuous process (from time to time, knowledge should also be refreshed), and it could be meaningful to include it in the DIKW model by adding some backward arrows.

The arrows can be seen as the interest one may have in deepening or clarifying a topic. A high degree of knowledge, or wisdom, can only be achieved through a great number of loops.

Figure 2: the revised DIKW model

Introducing Information Degree

When you talk to a six-year-old kid about the speed of light, or when you have to calculate the approximate distance of a thunderstorm, you can consider that speed as instantaneous[1].

When the kid grows up, or when you calculate the transmission delay of a conversation between the Earth and the Moon, the speed of light can be set to a value close to the real one, i.e. three hundred thousand kilometers per second.

But when scientists need to evaluate the orbit or the distance of a space object, they must use the exact value, which is 299,792.458 km/s.

So we have at least three degrees of information for the speed of light:

1 Light speed is instantaneous (human perception without instruments)
2 Light speed is about three hundred thousand km/s (astronomical distance evaluation)
3 Light speed is 299,792.458 km/s (accurate calculation for spaceships)

Knowledge in this case is a process that sees all the information degrees as a whole pertaining to the same topic, and the holder of this knowledge is able to use each specific information degree properly.
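As an illustration only, the three degrees listed above can be put side by side in a few lines of Python. The rounded value used for degree 2 is 300,000 km/s. The names `LIGHT_SPEED_KM_S`, `light_travel_time_s` and `storm_distance_m` are hypothetical, the mean Earth-Moon distance of 384,400 km is an assumption added for the example, and the speed of sound is the value given in footnote [1].

```python
# Three degrees of information about the speed of light, in km/s.
LIGHT_SPEED_KM_S = {
    1: float("inf"),   # degree 1: treated as instantaneous
    2: 300_000.0,      # degree 2: rounded value
    3: 299_792.458,    # degree 3: exact value
}

EARTH_MOON_KM = 384_400.0   # assumption: mean Earth-Moon distance
SOUND_SPEED_M_S = 331.0     # speed of sound in air, as in footnote [1]


def light_travel_time_s(distance_km: float, degree: int) -> float:
    """Time light needs to cover `distance_km` at the chosen degree."""
    return distance_km / LIGHT_SPEED_KM_S[degree]


def storm_distance_m(seconds_between_flash_and_thunder: float) -> float:
    """Degree 1 in use: the flash is assumed to arrive instantly,
    so the whole delay is attributed to the sound of thunder."""
    return SOUND_SPEED_M_S * seconds_between_flash_and_thunder


for degree in (1, 2, 3):
    delay = light_travel_time_s(EARTH_MOON_KM, degree)
    print(f"degree {degree}: Earth-Moon delay = {delay:.5f} s")

print(f"thunder heard 3 s after the flash: {storm_distance_m(3):.0f} m away")
```

Degrees 2 and 3 differ by less than a millisecond over the Earth-Moon distance, which suggests why the rounded value is adequate for a conversation delay while an orbit calculation calls for the exact one.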

Introducing the definition of information degree also provides a quantification instrument that could be useful in knowledge analysis.

A further level of knowledge process representation

The revised DIKW model makes it possible to include interest in the knowledge process, but it says nothing about the progress of knowledge acquisition.

The loop one should go through to improve one's knowledge (data harvesting, information building and knowledge construction) can be drawn as a helical coil, in which each volute is a further step in deepening that knowledge.

The first volute will be the lowest learning level. When a higher information degree is needed to expand knowledge about a topic, other loops become necessary.
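The helical process can also be sketched as a simple loop. This is a toy model under a strong assumption that is not stated in the text: each complete volute raises the information degree by exactly one. The names `KnowledgeState`, `STEPS` and `acquire` are hypothetical.

```python
from dataclasses import dataclass, field

# The three steps that make up one volute of the coil.
STEPS = ("data harvesting", "information building", "knowledge construction")


@dataclass
class KnowledgeState:
    topic: str
    degree: int = 0                              # information degree reached so far
    history: list = field(default_factory=list)  # (loop number, step) pairs


def acquire(topic: str, required_degree: int) -> KnowledgeState:
    """Run volutes until the required information degree is reached.

    Assumption: one full volute raises the degree by exactly one.
    """
    state = KnowledgeState(topic)
    while state.degree < required_degree:
        loop_number = state.degree + 1
        for step in STEPS:
            state.history.append((loop_number, step))
        state.degree = loop_number
    return state


state = acquire("light speed", required_degree=3)
for loop_number, step in state.history:
    print(f"volute {loop_number}: {step}")
```

The loop stops as soon as the interest that drives it is satisfied, that is, when the degree required for the task at hand has been reached.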

Illustrating the knowledge acquisition process helps in understanding the complexity beneath it and, like the information degree, provides a tool for analyzing the relationship between data, information and knowledge.

Figure 3: Loops for knowledge acquisition process


[1] The distance of a storm can be calculated by counting the seconds elapsed between the sight of lightning and the sound of thunder. Treating the speed of light as almost instantaneous and taking the speed of sound in air as about 331 meters per second, you can estimate the distance by multiplying that speed by the elapsed seconds.


[i] Hey J., The Data, Information, Knowledge, Wisdom Chain: The Metaphorical Link, Dec. 2004

[ii] Zins C., Conceptual Approaches for Defining Data, Information, and Knowledge, Journal of the American Society for Information Science and Technology, Wiley, Jan. 2007

[iii] Colecchia N., Zaccardi F., La rottura della simmetria nella comunicazione visiva, F. Angeli, 2000

[iv] Floridi L., Information: A Very Short Introduction, Oxford University Press, 2010
