An Era of Trust: Using AI to Break Boundaries and Build Understanding

Tamara Nall
Published in Authority Magazine
Apr 30, 2018 · 14 min read
Stan Christiaens, Co-founder and CTO, Collibra

In an environment where we place so much emphasis on who we trust, it is not hard to fathom that we would need to place emphasis on what we trust as well. When data governance is involved, trust must be at an all-time high. Stan Christiaens and the dedicated team at Collibra are fully aware of just how important it is for the data we use each day to be not only “transparent and trustworthy” but also easy to understand. As the Co-founder and CTO of Collibra, Stan wants data to be user-friendly, with no boundaries standing in customers’ way. See how Stan and his team have taken rigid rules and replaced them with “open collaboration.”

Tamara: Can you share a story that inspired you to get involved in AI?

Stan: Since I was young, I’ve been fascinated by the magical notion of human-like intelligence in machines. My curiosity began with science fiction books, movies such as 2001: A Space Odyssey and The Terminator, and futuristic video games like Dune. It was at the University of Leuven (KU Leuven) in Belgium that I first dabbled in machine learning and received a master’s degree in AI.

Compared to today, AI was really a niche focus twenty years ago. Many of the techniques were (and still are) inspired by nature, such as neural networks, genetic algorithms and swarm intelligence. Of course, they all needed a lot of data and a lot of processing power. Today we have AI applications that need even more data and more processing power than back then.

AI has definitely gone through a few hype cycles and low points. The biggest was called the “AI Winter” in 1974, when researchers became concerned that enthusiasm for AI had spiraled out of control, and disappointment and budget cuts followed. I believe that there was a low point for AI sometime in the early nineties, too.

AI never really went away; it was applied everywhere, but more silently. It is definitely going through a golden age today.

Tamara: Describe your company and the AI/predictive analytics/data analytics products/services you offer.

Stan: Once a futuristic concept, AI now seems to be cropping up everywhere you turn, from virtual assistants that track personal information and conversations to smart, internet-connected technologies that monitor our home temperature and more. AI is also driving decisions in finance, manufacturing, medicine, and marketing. This transformative technology can even flag fraudulent credit card charges, predict internal failures along an assembly line, diagnose dangerous lung disease, and anticipate what we’re most likely to buy online.

AI is an interesting beast, and it feeds on data. As AI, machines and predictive analytics become more common in corporate enterprises and everyday life (global AI revenues are projected to grow to $89.8 billion by 2025), managing and tracking data takes on critical importance, along with further complexity.

First and foremost, for the true potential of AI to be realized, such as in decision making based on machine learning, business users need to know that the data being fed into the algorithms is the right data. It needs to be accurate and complete, and organized in a way that can be easily accessed from all sources. Additionally, because the majority of business transactions involve customers and personal data, enterprises bear further responsibility for protecting that data, as well as the outcomes of the AI algorithms. There’s also a growing ethical concern about the ways corporations (and even governments) use data and AI for their own commercial or political ends.

These and other business trends are driving the growth of the data governance category. Data governance supports processes, standards and policies that enable business users to easily find, understand and trust data, offering a solution to growing concerns about data management, tracking and consumer protection. As the leader in data governance and catalog software, Collibra helps organizations across the world gain competitive advantage by maximizing the value of their data across the enterprise. Collibra is the only solution purpose-built to address the gamut of data stewardship, governance, and management needs of the most complex, data-intensive industries. Our flexible and configurable cloud-based or on-premises solution puts people and processes first, automating data governance and management to quickly and securely deliver trusted data to data scientists and business users for analytics, AI and more.


Tamara: How do you see the AI/data analytics/predictive analysis industry evolving in the future?

Stan: Predictive analytics, AI, and machine learning have the potential to completely transform the way companies do business. But for business users to apply the predictions of an analytics model in their decision-making, they must be able to trust the data along with the algorithms themselves.

Imagine it as follows. Old-school methods of decision making were very manual: somebody had to get the data together, build a report (and the underlying data infrastructure), and present the report, and only then could a decision be made. With self-service BI, the volume has increased to the point that we now have a multitude of reports on which to base decisions. Still, the decision is only as good as the report, and the report only as good as the data that goes into it. If any ingredient is wrong, most likely your decision will be wrong as well. Yet because these approaches are still very manual, the number of potentially wrong decisions stays low. AI is automated decision making: there is machinery that literally makes thousands of decisions per second. So, if any of the ingredients are wrong, you are shooting yourself in the foot, quite often and very rapidly.
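
To make the arithmetic behind that point concrete, here is a minimal sketch. The decision rates and the 1% error share are made-up, illustrative numbers, not figures from Collibra or from this interview; only the "thousands of decisions per second" order of magnitude comes from the answer above.

```python
def wrong_decisions_per_hour(decisions_per_second: float, error_rate: float) -> float:
    """Expected number of wrong decisions in one hour.

    error_rate is the hypothetical share of decisions that go wrong
    because an ingredient (data, report, model) is flawed.
    """
    seconds_per_hour = 3600
    return decisions_per_second * seconds_per_hour * error_rate


# Hypothetical manual process: one decision per hour, 1% of them wrong.
manual = wrong_decisions_per_hour(decisions_per_second=1 / 3600, error_rate=0.01)

# Hypothetical automated process: 2,000 decisions per second, same 1% error share.
automated = wrong_decisions_per_hour(decisions_per_second=2000, error_rate=0.01)

print(f"manual: {manual:.2f} wrong decisions per hour")
print(f"automated: {automated:,.0f} wrong decisions per hour")
```

The error share is identical in both cases; the gap comes purely from the ratio of decision volumes, which is why a flaw that is tolerable in a manual process can become costly once the same decision is automated.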

Data governance helps AI developers be more successful by providing information about what the data means and how it can be used, and it enables them to feel confident in the knowledge and accuracy of the data. A good data governance platform can also provide critical information about who developed the model, who owns the model now, what data that model uses, and how trustworthy it is.

In the coming years, we will see new partnerships between development teams and data scientists emerge. Data policies established in a governed environment will provide users with a common language, helping everyone understand when and how the data was collected, trace its lineage, and assess whether the data is likely to produce unbiased and predicted results.

Governed AI won’t just be a differentiator. Access to the right data and insight into how that data informs the decisions reached by an organization’s AI models will soon be foundational business practices.

To take this confidence one step further, a data catalog integrated with data governance empowers an organization with quick and efficient data discovery, so data users spend less time searching for the trusted data they need to feed into AI applications or models, and devote more time to creating and refining the models. Similar to Amazon, a sophisticated data catalog allows business users to shop for and find trusted data in one central location, while also viewing the complete meaning, lineage, and relationships of the data. Through ML functionality, the catalog serves up relevant data based on previous searches; it makes specific recommendations for “data purchases,” much like Amazon does for frequent shoppers. The catalog provides a valuable service to business users and data scientists because it’s reliable, convenient, fast, and provides the trusted data they need for business analysis and decision making.

Additional catalog functionality links all sources of metadata (data sources, business applications, data lakes, data quality systems, data warehouses) into a responsive system. These connections enable changes to be detected and policies applied immediately, without manual steps. This ensures that reliable training data is fed into the AI model, resulting in the greater efficiency described earlier.

We still have a way to go to discover AI’s true capabilities for the enterprise, but data governance and a data catalog offer a strong foundation for ensuring the trust and integrity of the data for broader AI and ML efforts to come.

Tamara: What is the biggest challenge facing the industry today in your opinion?

Stan: While AI’s potential is exciting, it can also bring challenges. One of the biggest risks is the potential of negative data outcomes if the data input is inaccurate, and/or machine learning fails. For instance, AI can turn out adverse biases toward individuals and groups, as in the case of Google’s photo app which mislabeled a couple’s travel pictures resulting in racist auto-tagging. Data bias also has the potential to affect the kind of medical treatment we receive, whether we are approved for a loan, or how fairly we’re treated by the courts. This typically happens when an error is introduced into the training data, which then gets embedded in the model. Even worse, when the model creates new training data, it replicates — and indeed amplifies — the original bias. The results can be devastating.

The good news is that data scientists are thinking hard about how to eliminate data bias, but they need help. Finding the right data, understanding what it means, and trusting the integrity of the data sets will be an important first step, which can be achieved through data governance.

Data privacy is equally risky, both for consumers and for the companies that interact with consumer data through AI and predictive analytics applications. Data governance offers businesses confidence in data privacy and compliance, particularly with stricter regulations such as the European General Data Protection Regulation (GDPR) coming into effect in May 2018. Under these rules, organizations are accountable for the personal data they’ve collected through external sources or applications and for how they’re using that data. This is of critical importance for companies applying AI, since data privacy can easily be compromised: endlessly available data indicators can be parsed in myriad ways to reveal our past behaviors and predict our future actions. Just because an organization has obtained the data doesn’t mean it has the right to use that data.

Data governance not only offers a simple and direct way to ensure that organizations are using the right data, but also identifies data errors and quickly flags and resolves them to help maintain (and/or restore) the organization’s confidence.

Tamara: How do you see your products/services evolving going forward?

Stan: It’s easy to get caught up in the hype of AI, with all of its promise to transform people’s lives and work. But when it comes down to it, AI is just another technology that enables business processes to be more easily and quickly automated. The technology’s full capacity cannot be realized without data.

The foundation of AI, and specifically of ML for business advantage, comes down to well-understood, intensively curated, and trusted data. Ensuring trust in the data is best achieved through data governance.

Collibra is perfectly poised to help companies through the adoption of AI because we work with customers across systems and industries to help them realize the value of their data. Focusing on the business user and business process at hand, Collibra takes an enterprise-wide and systematic approach to handling data, orchestrating many users and groups across the organization to ensure the availability, usability, integrity and security of data. Collibra also helps organizations stay compliant with shifting industry regulations (e.g., GDPR, BCBS 239), and enables dramatically increased use of the data by individuals throughout the organization. Once people know they can find, understand, and trust the data, they will feel more inclined to apply the data to new AI applications for business advantage.

Tamara: What is your favorite AI movie and why?

Stan: Although there are more realistic and more recent ones out there, I would have to say The Terminator is my favorite movie related to AI. It does a wonderful job of looking forward and tackling the hypothetical fear of machines taking over the world, to the point of destruction (or the Apocalypse). I am fascinated by the powerful possibilities of AI in its full capacity, but also believe we should approach this technology with caution. If you truly “let it rip,” then you need to be prepared for the consequences of automation, which poses a threat to human existence: jobs, leadership and more.

Inspired by the book Superintelligence: Paths, Dangers, Strategies, Elon Musk has been warning business leaders and national governors about the potential threats of AI: “AI can become an existential threat for humans if not built properly.”

Per Nick Bostrom’s book, which explored the potentially dire challenges humans could face should AIs ever make the leap from Siri to Skynet, “If you give artificial intelligence an explicit goal — like maximizing the number of paper clips in the world — and that artificial intelligence has gotten smart enough to the point where it is capable of inventing its own super-technologies and building its own manufacturing plants, then, well, be careful what you wish for.”

I’m often a skeptic when it comes to technology hype, and AI is definitely in its hype cycle, a second or third time around. But with so many organizations investing in AI technologies, the hype this time around is becoming a reality. And there are many traps and uncertainties leaders need to be aware of before jumping into the AI abyss of no return.

Tamara: What type of advice would you give my readers about AI?

Stan: There’s no silver bullet when it comes to AI. Organizations need to be careful investing in this technology full-force with inflated expectations. It’s always best to test the AI waters first with specific scenarios where you know your data is sound and your application is clear. Pick areas of the business where AI can truly add value and where you know the business has made strides to ensure data management has been a priority — the data is well organized, easy to understand and trustworthy.

To be successful with AI, you need to ensure that you have the appropriate controls in place around your data. If you don’t, then the output delivered by the machine or the robot or the algorithm will be flawed and lead you down a path of faulty decisions.

Data is the food that fuels AI and what goes in will determine what comes out. Ensure the appropriate governance and controls are in place, both with regard to the data and how it is used in the AI machinery, as well as on the algorithm’s outcomes. If you fail to do so, then you run the risk of poor data driving even poorer insights.
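
As one illustration of what a control "around your data" could look like, here is a minimal sketch of a completeness check that gates data before it is released for model training. It is not Collibra's product or API; the class, the function names, the 95% threshold, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    total_rows: int
    complete_rows: int

    @property
    def completeness(self) -> float:
        # Share of rows in which every required field is present and non-empty.
        return self.complete_rows / self.total_rows if self.total_rows else 0.0


def check_completeness(rows: list[dict], required_fields: list[str]) -> QualityReport:
    """Count rows in which every required field has a non-empty value."""
    complete = sum(
        all(row.get(field) not in (None, "") for field in required_fields)
        for row in rows
    )
    return QualityReport(total_rows=len(rows), complete_rows=complete)


def approve_for_training(report: QualityReport, min_completeness: float = 0.95) -> bool:
    """Hypothetical policy: release data only if it meets the agreed threshold."""
    return report.completeness >= min_completeness


# Made-up sample records; the field names are illustrative only.
records = [
    {"customer_id": "c1", "country": "BE", "consent": True},
    {"customer_id": "c2", "country": "", "consent": True},
    {"customer_id": "c3", "country": "US", "consent": None},
    {"customer_id": "c4", "country": "FR", "consent": False},
]

report = check_completeness(records, ["customer_id", "country", "consent"])
print(report.completeness, approve_for_training(report))
```

A real governance process would cover far more than completeness (accuracy, lineage, usage rights, and checks on the algorithm's outcomes), but the pattern is the same: an agreed, explicit rule decides whether data may be fed into the model.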

Tamara: How does AI, particularly your product/service, bring goodness to the world? Can you explain how you help people?

Stan: The advancement of new technologies like AI brings even greater complexity for companies managing and monitoring their data. AI also brings greater risk to the digital consumer due to greater exposure of personal information, interests and habits, even whereabouts up to the last minute. This can cause a total invasion of user privacy as well as security and compliance risks for corporations. Provider transparency and data trust are becoming stronger imperatives for businesses to address if they want to succeed in the world of data and AI transformation.

The Collibra Data Governance and Data Catalog solution enables business users to quickly and securely access, understand and trust data for business intelligence, analytics, machine learning and more. Data governance is the set of processes, standards and policies upon which data owners within the organization agree to make the data usable enterprise-wide and ensure the complete accuracy of the data so that the data outcomes through analytics and AI are reliable and unbiased.

Data governance can serve as a gauge for organizations to determine data quality for GDPR compliance, data transparency and accurate results from machine outputs. I also believe that ethics will get their place in data governance, especially when applied to AI.

Tamara: What would be the funniest or most interesting story that occurred to you during your company’s evolution?

Stan: That is a tough question to answer, as there are so many interesting stories at Collibra. Working here is counted in Collibra years: every day is a week, every week a month and every month a year ☺. You’d have to shut me up once I begin. If I had to pick one story, I would go for a shared one that we currently see unfolding, but unfortunately, from my perspective, it is more a worrying story than a funny one: AI out of control.

A few years ago, I was devouring Warren Ellis’ Transmetropolitan. It tells the story of a cyberpunk society and a rebel journalist who takes on corrupt politics. His main nemesis is a president who goes by the nickname The Smiler and plows his way through society happily applying “everything is great” type propaganda, powered by advanced technology.

If you read the news today you can see similar things happening with data and AI at the core of it: government-sponsored social credit systems, technology giants that produce racist bots, social media platforms that sell your personal data, clustering-driven echo chambers, and individually targeted propaganda that influences societies.

What worries me is that AI techniques allow the creation of things like fake images and text, or even very realistic-looking video in which you can have anyone say anything you want. The data is out there in volume to achieve this, as well as to apply it to target people and influence them.

What gives me hope is that people seem to be waking up to data: journalists use it, analyze it and bring stories about it, and regulators are starting to understand the laws and regulations we need going forward. The Data Citizens, anyone who touches data to perform their job, are waking up.

This is an interesting story for a company like Collibra: it definitely helps drive business, and it also provides a huge, interesting challenge as to how we could help businesses and society even more. My hope is that we can make a small difference by helping people do data well and thus do well with data.

Tamara: What are the 3–5 things that most excite you about AI? Why? (industry specific)

Stan:

  • AI has the potential to improve productivity by 30% in some industries such as manufacturing, saving corporations labor and overhead costs.
  • In healthcare, ML offers the ability to ingest large volumes of data such as billions of medical records, cull through that data, and draw recommendations on potential diagnoses. This type of cognitive learning is revolutionary considering AI automates this process in record time and off-loads the burden of what is almost humanly impossible — to stay on top of the volume of data being produced.
  • AI as a new instrument to help artists in their creativity, whether in new ways of painting, making music or even creating new stories.
  • The development of AI and robots’ performance moving beyond manual jobs, such as factory assembly and household chores, to being able to “think” and perform analytical tasks once seen as requiring human judgment. Or better yet, helping humans become more intelligent in analysis for tasks such as medical research and treating disease.

Tamara: What are the 3–5 things that worry you about AI? Why? (industry specific)

Stan:

  • Businesses neglecting to place necessary controls and guidelines (as with Data Governance) around data applied to AI models, resulting in negative outcomes and biases that can be detrimental to businesses and consumers.
  • AI posing a threat to the job security of millions of workers worldwide.
  • Automatically doctored media (images, video and more) used as propaganda on a very wide scale. This is something we as individual human beings are currently not well equipped to handle, and our societal systems (laws, regulation, etc.) are playing catch-up.

Tamara: Name at least one thing related to AI that we can expect over the next three years.

Stan: Given the potential for artificial intelligence and the promise that it holds, it’s easy to see how many technology professionals view it as a magic wand that can solve all of their companies’ problems. I truly believe that AI will be successful but not immediately in very generic business applications, as many believe.

I believe we’ll see another slump, at least in the public eye: the dreams of an artificial general intelligence (AGI) will again take more time than people had hoped/feared, while at the same time specialized applications will definitely thrive, and thrive at scale (e.g., coordination of transportation systems, improved markets, disease identification, and more).

Again, what’s most important when it comes to AI advancements is investing in the data upfront to ensure it’s in good order and of the highest quality. Implementing an automated data governance solution is a great place to start.

Tamara Nall
Authority Magazine

CEO; Data analytics expert; Keynote speaker; Consultant; Founder of Nall-Edge (NE)