How to solve your data silos problem with AI*
Predictive analytics, bots, data science, robotics, artificial intelligence: advances in the application of data keep on coming. Mainstream newspaper headlines announcing breakthrough AI technologies come at us from all sides, making us believe that AGI (Artificial General Intelligence) is just around the corner and that it will solve all our business challenges. Don’t fall for it.
In fact, every CIO I meet tells me that they are excited about the potential of data analytics for their business. Yet when I ask them what they’ve tried so far, they usually admit defeat: they can’t get their hands on the data in the first place. It looks like data technology advancement is spoken of far more often than it’s practised.
Advances in AI can turn your data into actionable steps, but before you can run, you need to learn how to walk. Embracing data as a competitive advantage is a necessity for today’s business, so why is it so hard to get access to the data we need? The common enemy? Data silos.
What is a data silo?
First things first: what exactly is a data silo? Strictly speaking, a data silo is a repository of fixed data that remains under the control of a single department and is isolated from the rest of the organisation, which is a fancy way of saying ‘unshared data’. It can take many forms: files, emails and so on. The key is that it is potentially useful information (e.g. to other teams in your company) that remains unshared or hidden. Data silos are counter-productive and detrimental to your business goals. They are isolated islands of data, and they make it prohibitively costly to extract data and put it to other uses. Projects get rewritten from scratch because the inability to search through company know-how kills any attempt to build on what’s been done in the past. Sound familiar?
The good news: you can solve your data silo problem once you’ve realised you’ve got one. The bad news? It’s a fairly complex process.
The prep work
There is a cost to using data. Behind the glamour of powerful analytical insights is a backlog of tedious data preparation. Data scientists estimate that around 80% of the work involved in analytics goes into acquiring and preparing the data beforehand.
First, you can’t cleanly separate the data from its intended use. Depending on your desired application, you will most likely need to unify data formats and filter or otherwise manipulate the data accordingly.
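To make the format-unification step concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two departmental exports, their field names, the date formats and the shared schema are all invented, and a real pipeline would need validation and error handling on top.

```python
from datetime import datetime

# Hypothetical exports from two departments. Field names, date formats
# and values are invented for illustration only.
sales_rows = [
    {"Customer": "Acme Ltd", "OrderDate": "03/02/2021", "Total": "1,200.50"},
    {"Customer": "Bolt plc", "OrderDate": "17/02/2021", "Total": "80.00"},
]
support_rows = [
    {"client_name": "acme ltd", "opened": "2021-02-05", "cost": 40.0},
]


def normalise_sales(row):
    """Map a sales row onto the shared schema (dates arrive as day/month/year)."""
    return {
        "customer": row["Customer"].strip().lower(),
        "date": datetime.strptime(row["OrderDate"], "%d/%m/%Y").date(),
        "amount": float(row["Total"].replace(",", "")),
        "source": "sales",
    }


def normalise_support(row):
    """Map a support row onto the shared schema (dates arrive as ISO strings)."""
    return {
        "customer": row["client_name"].strip().lower(),
        "date": datetime.strptime(row["opened"], "%Y-%m-%d").date(),
        "amount": float(row["cost"]),
        "source": "support",
    }


# One list, one schema, regardless of where each record came from.
unified = [normalise_sales(r) for r in sales_rows]
unified += [normalise_support(r) for r in support_rows]

# Filter for the intended use: everything known about a single customer.
acme = [r for r in unified if r["customer"] == "acme ltd"]
```

Note that the shared schema is shaped by the question being asked (here, a per-customer view), which is exactly why the data cannot be prepared independently of its intended use.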
Second, data confers insight and advantage. Once you have harvested the low-hanging fruit (the easy-to-prepare data), the next level of insight is most likely going to demand far more work. Pursuing data that is harder to find and use naturally drives up the amount of time spent on preparation.
The main reasons for data silos
Software applications (so-called “legacy systems”) are written at one point in time, for a particular group in the company. With limited resources available, applications are optimised for their main function. The incentives of individual teams are unlikely to encourage data sharing as a primary requirement.
A company evolves along with its culture and leadership style. One that has grown through multiple generations of leaders, philosophies and acquisitions often ends up with multiple incompatible systems. Data duplication is common. Even if there are no political issues in integrating data, it is costly to reconcile and integrate sets of data that embody different approaches to important business concepts.
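A small sketch of why reconciliation is costly, assuming two hypothetical systems (a CRM and a billing system) that both hold customer records keyed by email address. The records, field names and status values are invented; the point is only that duplicates must first be found and then every disagreement resolved by someone who knows what each system means by a field.

```python
# Hypothetical customer records held by two systems, keyed by email.
# All names and values are invented for illustration only.
crm = {
    "ANNA@EXAMPLE.COM ": {"name": "Anna Kowalska", "status": "active"},
    "bob@example.com": {"name": "Bob Smith", "status": "active"},
}
billing = {
    "anna@example.com": {"name": "A. Kowalska", "status": "churned"},
    "carol@example.com": {"name": "Carol Jones", "status": "active"},
}


def by_clean_key(records):
    """Re-key records on a trimmed, lower-cased email address."""
    return {email.strip().lower(): rec for email, rec in records.items()}


crm_clean = by_clean_key(crm)
billing_clean = by_clean_key(billing)

# Records present in both systems are duplicates that need reconciling.
duplicates = sorted(crm_clean.keys() & billing_clean.keys())

# For each duplicate, collect the fields on which the two systems disagree
# as (crm_value, billing_value) pairs.
conflicts = {
    email: {
        field: (crm_clean[email][field], billing_clean[email].get(field))
        for field in crm_clean[email]
        if crm_clean[email][field] != billing_clean[email].get(field)
    }
    for email in duplicates
}
```

Here the code can detect that the two systems disagree about whether Anna is an active customer, but it cannot decide which one is right. That depends on how each team defines "active", which is a business question rather than a technical one.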
When a problem hasn’t yet been solved with technology, it usually means that it’s a people problem. So it is with knowledge sharing.
Knowledge is power, and groups within an organisation become suspicious of others wanting to use their hard-won data. Permissions and limits are often put in place by default to reduce any misuse, even accidental. Data isn’t a neutral entity: you must interpret it with knowledge of its history and context. This sense of proprietorship can act against the interests of the organisation as a whole.
Software vendors are among the first to know that access to data is power. There is no technological obstacle to letting customers export their data easily, yet when a vendor isn’t forced to offer that, most likely they won’t.
This is particularly dangerous with software-as-a-service applications, where the vendor wants to keep you within their cloud platform. Vendors have worked hard to create entire job functions (e.g. customer satisfaction managers) and career paths centred around their software. Any hint of a move away from that artificially created world could threaten the livelihood of a trained software professional.
Drowning in data lakes
Using data costs money. Executives are aware that, to move to higher-value uses and maintain a competitive edge, they need to lessen the impact of data silos on the business.
The end goal is usually a data-driven, proactive approach. Unfortunately, few companies have the luxury of building a suitable infrastructure from scratch, so companies must figure out a way to get there in an incremental way without interrupting business as usual.
Don’t fall for the industry’s new favourite buzzword, the “data lake”. I hate to break it to you, but things aren’t as beautifully simple as the image of clear water and mountain springs might suggest. You can’t just pour all your data into one system and expect an orderly miracle to result. Your business is unique, so a one-size-fits-all, off-the-shelf solution won’t cure all your ills. Care, planning and investment are required. Otherwise, you’re almost certain to end up with a data swamp, seething with liability and mismatched pieces of infrastructure.
There is another way, a better way.
Instead, look to identify high-value opportunities. Analyze your business needs, and choose a problem where data could provide a tangible benefit, perhaps in enhancing sales or preemptive incident response.
This is where Machine Learning and Natural Language Processing (broadly AI) can help boost your efforts.
Identify your opportunity for cost savings and/or profit growth. Draw in the data from around the organisation and invest in these use cases first. Tie the data integration to its application, so you get value early.
Then, move with the goal of integration in mind. Each progressive step should also build toward an integrated platform for your enterprise data. You don’t want to recreate a whole new set of silos, albeit with advanced capabilities.
Often, critical information is “locked away” in textual reports within data silos. Enterprise search tools can provide some access, but the issue with keyword search is that recall and precision are low, and the user has considerable work to do once the results are returned in order to gain any actionable insight. Platforms that use Natural Language Processing (NLP), like Untrite, can overcome this challenge by extracting structured facts from unstructured documents and textual data.
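To illustrate what low precision and recall look like in practice, here is a minimal sketch of a naive keyword search over a handful of report snippets. The documents, the relevance labels and the `keyword_search` helper are all invented for illustration; this is not how any particular enterprise search tool or NLP platform works.

```python
# Hypothetical report snippets. Ids and texts are invented for illustration.
documents = {
    1: "Pump failure traced to a worn seal; unit replaced.",
    2: "Quarterly sales review: no failure to meet targets.",
    3: "Compressor breakdown after bearing overheated.",
    4: "Valve malfunction caused an unplanned shutdown.",
    5: "Canteen menu updated for the summer.",
}

# Documents a human reviewer marked as describing an equipment fault.
relevant = {1, 3, 4}


def keyword_search(query, docs):
    """Return the ids of documents whose text contains the query word."""
    return {doc_id for doc_id, text in docs.items() if query.lower() in text.lower()}


retrieved = keyword_search("failure", documents)
hits = retrieved & relevant

# Precision: the share of returned results that are actually relevant.
precision = len(hits) / len(retrieved)
# Recall: the share of relevant documents that the search managed to find.
recall = len(hits) / len(relevant)
```

Searching for "failure" returns documents 1 and 2, so precision is 0.5 (the sales review is noise) and recall is one in three (the "breakdown" and "malfunction" reports are missed because they use different words). Closing that gap is the kind of problem language-aware tooling aims at.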
In today’s data-first economy, the ability to use the data you generate represents a real and essential competitive advantage. Done properly, this approach will get you to a future state of mature analytical competency and help you develop the experience and data infrastructure that unlock every next step.
*Truth be told, no advanced technology (even AI) will magically solve your data silos. AI alone is not the panacea for the silo problems that afflict large organisations. AI has its own set of issues and limitations, including limited training data that often takes heavy lifting to prepare, no assurance that the data will be understood, and flawed designs that lead to hidden biases transferred from humans to machines. And of course, the ultimate responsibility for eliminating organisational silos rests with us humans.
Still, many managers are already seeing AI’s promise to help create more connected, coordinated systems and information flow, both inside and outside the organisation.
In the workplaces of tomorrow, silos may be a thing of the past as the contextual information you need will be available right when you need it.
Untrite helps to unify information from different silos, automatically enriching it with context to derive business value.
Our software provides clarity and augments the know-how in data you already have. Using Machine Learning, it helps computers understand the complexity of human language in contracts, reports, emails and other documents. Organisations get a better view of the most relevant information in real time to make more informed, data-driven decisions.
Curious to see what we can do for you? Arrange a demo.