Greenbar printer paper
Jurassic Data Store

Data vs Knowledge

J. Braun
4 min read · Feb 12, 2020

In the long distant past, before most of medium.com’s readers were born (maybe even before their parents were born), there were rooms of gigantic machines that produced green bar. Green bar was not some hipster, sustainable, free range, vegan, fern bar serving gluten free, non-GMO, curated drinks. It was acres of 36" wide paper with endless amounts of dot matrix printed data.

Row upon row of numbers and codes that, as you might imagine, were totally incomprehensible to any but a few wonks who would stare at them for hours on end hoping that some magical information would pop out at them. The very fact that the paper had to be printed with alternating bands of color so that the reader would have some chance of making sense of it all is an indication of how cryptic it was.

And yet, green bar and its ranks of data were a huge advance over the books and ledgers that had previously aggregated this information. It might be incomprehensible but at least it was all there. As a person remarked to me in Home Depot one day, “It’s all here someplace, if only I could find it.” Some lowly minion in a dark and lonely corner could eventually discover that sales of widgets were up 2.8% in Zanzibar and down 13.9% in Bakersfield.

All the data was there but with absolutely no knowledge attached.

Fast forward to the spreadsheet and the data could be loaded and visualized so that the number for Zanzibar was in white and Bakersfield was in red. Not much knowledge to be sure but at least there was a clue to where to look for problems and answers.

A few years ago a friend and I were talking and he told me about a problem his group was having at work. There were two sets of numbers that had to be reconciled: production and accounting. Seven or eight people worked for days on end to make sure these numbers matched or, if they didn’t, to find out why. My friend explained that they were always behind in their work and the company was losing money because, when the deadline came, they always chose the lower of the two numbers for billing in order to keep the customer happy.

As he explained it, the problem seemed to be too much data and not enough knowledge. I suggested a solution that would reduce the problem and thought no more about it. A few days later he called and said his boss wanted to talk to me. I agreed on the condition that somebody buy me lunch. I never got lunch but I did get a 6-month contract to fix the problem.

Ask and learn.

I had a general outline of the problem but no understanding. I went to a couple of meetings with managers, smiled politely, and took notes that I knew would be useless other than identifying who would be supportive and who would be a roadblock. Then I went and sat with the people who were doing the work. They were the ones who understood the situation completely. They explained the problems they encountered with unreliable data feeds, dirty data, and just plain too much data.

They also explained how they dealt with this mess in order to make some sense of it, and the limits they worked within to get the best possible outcome. Luckily, everyone agreed that once this was cleaned up nobody would lose their job, as there was plenty of other work waiting to be done.

Long story short, the answer was to aggregate the data feeds into an intermediate database and run a brute force comparison. Everything that matched within allowable limits was ignored. Those instances that were exceptions were put into a queue for the people to investigate. The queue was on a network so that each worker could pick up the next exception and work on it.
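For the curious, here is a minimal sketch of that compare-and-queue idea in Python. It is not the code from the project. The record ids, the 1% relative tolerance, and the names `reconcile` and `ExceptionItem` are all assumptions made up for illustration, and an in-memory `deque` stands in for the networked queue.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExceptionItem:
    """One mismatch for a person to investigate (hypothetical structure)."""
    key: str
    production: Optional[float]
    accounting: Optional[float]
    reason: str


def reconcile(production, accounting, tolerance=0.01):
    """Brute-force comparison of two feeds, each a dict of record id -> amount.

    `tolerance` is an assumed relative limit (1% by default); the real
    allowable limits came from the people doing the work.
    """
    queue = deque()
    # Visit every record id that appears in either feed.
    for key in sorted(production.keys() | accounting.keys()):
        p = production.get(key)
        a = accounting.get(key)
        if p is None or a is None:
            queue.append(ExceptionItem(key, p, a, "missing from one feed"))
        elif abs(p - a) > tolerance * max(abs(p), abs(a)):
            queue.append(ExceptionItem(key, p, a, "difference exceeds tolerance"))
        # Anything else matches within limits and is ignored.
    return queue


# Made-up sample data, standing in for the intermediate database.
production = {"A": 100.0, "B": 200.0, "C": 50.0}
accounting = {"A": 100.5, "B": 230.0, "D": 75.0}

exceptions = reconcile(production, accounting)
for item in exceptions:
    print(item.key, item.reason)
```

A worker would take the next exception with `exceptions.popleft()`. In a real setup the queue would live in a shared database or message broker so that several people could pull from it at once.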

As we got experience with the workflow, the feedback was that certain exceptions were obvious and we could let the computer handle them. More feedback led to better filtering. Instead of massive spreadsheets, the user was presented with all the relevant information in one place. Where there was raw data before, now there was knowledge that had been assembled for action. What had taken 7 people a week or more to process now took only 3 people who worked with a 1-day latency. Best of all, the money recouped from accurate billing earned me a nice bonus.

I’m no genius but here are the lessons to be learned:
1. Ask the people who do the work, listen to them, and learn.
2. Be flexible in thinking. My first impressions were close but needed refinement.
3. Think of what people need, not what they say they want.
Henry Ford is often credited with saying, “If I had asked people they would have told me they wanted faster horses.”
4. Get your hands dirty and try out the solutions in the real world. Unit testing is fine to make sure the code doesn’t break but it doesn’t guarantee that the problem has been solved!

People are not computers; they want actionable knowledge, not mountains of data. That’s why computers were invented.

J. Braun

Software engineer who works hard in order to travel a lot. 17 languages and 5 continents and counting …