Getting Around Unstructured Data

Dite Gashi
Published in Decissio
4 min read · May 31, 2017

“The best vision is insight.” — Malcolm Forbes

Investor making sense of unstructured data

Businesses have always strived to use all available information to make the smartest decisions. The challenge lies in the fact that strategic insights are often hidden in unstructured bits of information, and success is often determined by the ability to discover them. Entering an age in which data is seen as the new oil sounds promising. For plenty of companies, however, handling data the right way is one more item on the pile of challenges they face. The development of insight-gathering tools and platforms has been fueled by organizations’ need to have the right information at the right time, a key element of their success. This matters because, according to the concept of garbage in, garbage out (GIGO), the quality of the output is determined by the quality of the input: if the data or its processing is flawed, decision-making might end up flawed as well.

How Does Data End Up Unstructured?

When a system is designed, the people behind it usually try to design it as well as they can. Nobody consciously designs for unstructured data, since that would defeat the purpose. The reality is that almost 80% of the information in the real world is unstructured, and businesses spend 60% of their time processing and cleaning the information they get ¹. Unstructured data comes in several forms with different characteristics. We can roughly split them into:

  • Non-textual unstructured data, which includes images, colors, sounds and shapes.
  • Textual unstructured data, which includes data found in emails, reports, documents, medical records and spreadsheets.

Textual unstructured data is almost everywhere and represents both a challenge and an opportunity for any organization that wants to use it for decision-making. Text makes sense to us humans, while machines prefer data that can be quantified. Even text-processing AI algorithms turn text into quantifiable metrics before they can derive any insights, whether that means scanning social media streams to detect real-time information such as rumors about a stock, or scanning communications such as emails to detect spam.
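To make the idea of turning text into quantifiable metrics concrete, here is a minimal sketch of one common approach, a bag-of-words count. It is an illustration only, not the method of any particular tool: the function names `tokenize` and `bag_of_words` and the two sample messages are made up for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Turn each document into a list of word counts over a shared vocabulary."""
    token_lists = [tokenize(doc) for doc in documents]
    # The vocabulary is every distinct word seen, in alphabetical order.
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical sample messages, invented for illustration.
messages = [
    "Rumor: the stock will rise",
    "Win a free prize, free entry",
]
vocabulary, vectors = bag_of_words(messages)

for message, vector in zip(messages, vectors):
    print(message, "->", vector)
```

Each message becomes a row of numbers that a spam filter or a trend detector could work with. Real systems usually go further, with weighting schemes or learned embeddings, but the first step is the same: text in, numbers out.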

Financial Sector and Data

It is no secret that the financial sector loves data. Looking closely at recent developments, it’s safe to say that the most straightforward examples of big data adding value are found in the financial industry. Investment decisions and risk management run on data, using machine learning tools to detect patterns and trends. Magda Ramanda, a consultant at Willis Towers Watson, claims that no industry relied more upon data last year than financial investment decision-making ². Less conventional data sources are becoming popular. Social media profiles, web browsing, loyalty cards and phone-location trackers can all help determine the riskiness of investments. In a trial, FICO, America’s main credit scorer, discovered that the words someone uses in their Facebook status could help predict their creditworthiness. This discovery led FICO to leapfrog into data science, giving it a competitive advantage.

The Sweet Middle Ground

For humans, unstructured data in free-text form is an engaging way to consume information and gather insights. Even a text such as this article is easier for us to process when it is equipped with interesting quotes, anecdotes and fun facts. The same free text is very confusing to the computers and machines that try to make sense of it. Computers prefer numbers and quantifiable data for gathering insights; at its core, every computational operation is broken down into 0s and 1s. It is therefore important to keep both kinds of users, humans and machines, in mind when designing data systems. The sweet middle ground does exist and is constantly shifting as humans grow more acquainted with technology and machines learn to deduce meaning from complex data.

There is no doubt that technology and data are evolving at an increasingly fast pace, and merely keeping track of these changes is becoming a whole new labor market. Information about competitors and market research are examples of data that is publicly available but comes unstructured; processing it is time-consuming, and raw data by itself is useless. So it’s safe to assume that we are not running short on data. What’s needed is an effective way to gather, structure and analyze it: a funnel that filters the information in a meaningful way and so becomes a tool for decision makers from the very beginning. Not having to fish for the right data in an ocean of information saves you time and costs, and adds value to every step you take towards your investment.
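As a rough sketch of what such a funnel could look like in code, the example below gathers a few free-text snippets, structures the ones that match a pattern into records, and then filters them. Everything in it is hypothetical: the snippets, the company names, the `FundingRecord` type and the "<Company> raised $<amount>M" pattern are assumptions chosen only to illustrate the gather, structure and analyze steps.

```python
import re
from dataclasses import dataclass


@dataclass
class FundingRecord:
    """Structured form of one funding mention (hypothetical schema)."""
    company: str
    amount_musd: float


# Assumed phrasing: "<Company Name> raised $<number>M".
PATTERN = re.compile(
    r"(?P<company>[A-Z][\w&]*(?: [A-Z][\w&]*)*) raised \$(?P<amount>\d+(?:\.\d+)?)M"
)


def structure(snippets):
    """Keep only snippets that match the pattern and convert them to records."""
    records = []
    for text in snippets:
        match = PATTERN.search(text)
        if match:
            records.append(FundingRecord(match["company"], float(match["amount"])))
    return records


def analyze(records, minimum_musd):
    """Return records at or above the threshold, largest amount first."""
    selected = [r for r in records if r.amount_musd >= minimum_musd]
    return sorted(selected, key=lambda r: r.amount_musd, reverse=True)


# Gather: invented snippets standing in for public, unstructured sources.
snippets = [
    "Acme Robotics raised $12.5M in a Series A round last week.",
    "Great weather at the conference today!",
    "Sources say Globex raised $3M from angel investors.",
]
records = structure(snippets)
shortlist = analyze(records, minimum_musd=5)
print(shortlist)
```

A single regular expression is far too brittle for real market research, where phrasing varies endlessly. The point of the sketch is the shape of the funnel: irrelevant text drops out early, and what reaches the decision maker is already structured.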

Dite Gashi
Decissio

Co-Founder at Decissio, Blocknify. MBA, coder, hacker, dApp builder, blockchain developer. Loves hiking and cold brew coffee!