Why is Data Integration So Hard?

Julia Geller
Published in Datalogue
4 min read · Aug 3, 2020

Today we think about data all the time. So much so that the phrase “data-driven” is now a core part of the corporate lexicon, synonymous with “good” or “well informed” and the opposite of, let’s say, “bad” or “speculative.” Data and data analytics are so prevalent today that it’s hard to imagine a time when they weren’t.

But that time existed, and not too long ago.

While companies have been collecting data for, arguably, hundreds of years, it was only in the 80s that storing large amounts of data became fiscally feasible, as the cost of hard drives started to plummet.

And it wasn’t until the 90s that the idea of Business Intelligence (BI) was solidified.

As with every new technology, mass adoption of data warehouses and BI tools did not happen overnight. That is to say that in the early 80s not every enterprise was diligently collecting and storing large amounts of data in structured ways — the early adopters were.

Similarly, on January 1st, 1990, not every employee of every large company went into work to discover a BI platform installed on their desktop computer. Mass adoption of new technologies, while it happens faster than ever today, still takes time.

Basically, until quite recently, enterprises didn’t know the value their data could bring them.

The enterprise of the very recent past grew and scaled without data in mind.

That means that when processes that generated or collected data were designed, nobody built the data platforms, systems and warehouses to go with them, at least not with standardized formats, schemas, credentials, etc.

What does that non-data-centric legacy leave the enterprise of today?

A mess.

A mess of disparate data systems housing data of disparate formats with different levels of accessibility, without a clear map of what data lives where.

This mess is compounded by some combination of, if not all of, these characteristics of the modern enterprise:

Growth via Mergers and Acquisitions

Most of the behemoth enterprises we know today got to be as big as they are through mergers and acquisitions. With each merger and each acquisition, companies acquire the data systems of their subsidiaries. In other words, they inherit the mess of the companies they acquire, and that mess multiplies in severity once it has to be merged with the pre-existing mess of the parent company.

If even the explanation sounds messy, it should come as no surprise. The data landscapes formed by these conditions are often a nightmare.

Contracts with various suppliers and vendors

Let’s add yet another source of complexity: external data sources. Big companies rely on many suppliers and vendors to fulfill customer orders and manufacturing requirements. The problem? Each vendor or supplier an enterprise interacts with generates data. How that data is generated (in what formats, etc.) varies from vendor to vendor.

Global/international physical and digital storefronts

Today’s society is global, and so is the modern enterprise. While an international presence is great for sales, brand equity and ultimately profit, it does little to mitigate the challenges of a complex data landscape.

Different data systems, now housing data in different languages and country-specific formats, become part of the equation (as if that equation wasn’t difficult enough to solve already).
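To make the country-specific format problem concrete, here is a minimal Python sketch. The `FORMATS` table and the `normalize_record` helper are hypothetical names invented for illustration, and the two conventions shown (US and German) are only a small sample of what a real system would have to handle.

```python
from datetime import date, datetime
from decimal import Decimal
from typing import Tuple

# Hypothetical per-country conventions, for illustration only.
FORMATS = {
    "US": {"date": "%m/%d/%Y", "thousands": ",", "decimal": "."},
    "DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ","},
}


def normalize_record(country: str, raw_date: str, raw_amount: str) -> Tuple[date, Decimal]:
    """Convert a country-formatted date and amount into canonical types."""
    fmt = FORMATS[country]
    parsed_date = datetime.strptime(raw_date, fmt["date"]).date()
    # Drop the thousands separator first, then switch the decimal mark to ".".
    cleaned = raw_amount.replace(fmt["thousands"], "").replace(fmt["decimal"], ".")
    return parsed_date, Decimal(cleaned)


# The same-looking date string means two different days.
us = normalize_record("US", "03/04/2020", "1,234.50")  # March 4, 2020
de = normalize_record("DE", "03.04.2020", "1.234,50")  # April 3, 2020
```

Note that nothing in the raw strings says which convention applies. The country has to travel with the data, which is exactly the kind of metadata that tends to go missing.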

Widespread use and adoption of IoT edge devices

Haven’t had enough? Since the late 90s and early 2000s, IoT devices have become more and more prevalent, and the amount of data generated by the enterprise has once again exploded.

However, not all IoT devices output data in the same format and structure. In fact, data outputs change not just from device vendor to device vendor, but from model to model within the same make as well. Yikes.
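As a rough illustration of what that looks like in practice, the Python sketch below maps two payload shapes onto one common record. Both device models, their field names and the `ADAPTERS` registry are assumptions made up for this example, not the output of any real device.

```python
from datetime import datetime, timezone


def from_model_a(payload: dict) -> dict:
    """Model A (hypothetical): Fahrenheit reading, Unix epoch seconds."""
    return {
        "celsius": round((payload["temp_f"] - 32) * 5 / 9, 2),
        "time": datetime.fromtimestamp(payload["ts"], tz=timezone.utc),
    }


def from_model_b(payload: dict) -> dict:
    """Model B (hypothetical): nested Celsius reading, ISO 8601 timestamp."""
    return {
        "celsius": float(payload["temperature"]["value"]),
        "time": datetime.fromisoformat(payload["timestamp"]),
    }


# One adapter per device model; every new model means another entry here.
ADAPTERS = {"model_a": from_model_a, "model_b": from_model_b}


def normalize(model: str, payload: dict) -> dict:
    """Route a raw payload to the adapter for its device model."""
    return ADAPTERS[model](payload)


reading_a = normalize("model_a", {"temp_f": 72.5, "ts": 1596412800})
reading_b = normalize(
    "model_b",
    {
        "temperature": {"value": 22.5, "unit": "C"},
        "timestamp": "2020-08-03T00:00:00+00:00",
    },
)
```

Two models are manageable by hand. The trouble starts when the registry needs hundreds of entries and the payloads change with each firmware update.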

Data security regulations and requirements

And finally, everyone’s favorite topic: data security. Regulations and frameworks like GDPR, HIPAA and HITRUST have made the already challenging task of data integration even more, well, challenging.

Under GDPR, for example, moving personal data out of the region is restricted, and access to that data must be controlled. More generally, these regulations make it difficult to integrate and work with your data safely, especially when you’re unsure where your sensitive data resides. All of this creates challenges in accessing, manipulating and moving data around.

In a perfect world, all these “compounding factors” we’ve laid out would be positives, not negatives.

Why?

Because most, if not all, are simply symptoms of having a massive amount of data at hand. Today, that’s worth more than gold if you could only harness it and get it into the hands of the right people safely and securely.

The good news is, you finally can. Why don’t you get started today and reduce the time and energy spent on data integration so that your analytics teams can do more, faster?

Not ready to get started? Read on for more information on what an enterprise-grade data integration solution must bring to the table to be successful.
