Don’t expect a data democracy if you don’t participate

Why sharing is the ONLY way towards a data democracy

Bill Su
Analytics for Humans
7 min read · Feb 22, 2018

“Data is created free, but everywhere it is in chains.”

- Jean-Jacques Rousseau (modified, of course)

Jean-Jacques Rousseau may have been gone for centuries, but his words still ring true today.

The term “Data Democracy” is becoming more and more popular in the analytics and business industry. Accordingly, more people are falling under the illusion that we are closer than ever to the dream of free data for all.

Over my time in the analytics industry, however, I’ve made two observations that have snapped me out of this illusion and shown me that in reality we are still extremely far away from the dream of “Data Democracy”.

First: most of the small and medium businesses we interact with on a daily basis barely use analytics in their weekly workflow. Most of them aspire to be data-driven organizations, but don’t have the time or talent to make that happen.

While adoption of data analytics and machine learning is accelerating greatly in small and medium sized businesses, most of these projects remain in the realm of business intelligence and data consolidation — rather than in advanced spaces like Artificial Intelligence.

Second: getting data from popular tools like Google Analytics is extremely difficult and time-consuming, requiring not only an understanding of data analytics, but also basic programming and data manipulation skills.

A key trait of a Data Democracy is that everyone can benefit from analytics. Currently, however, converting raw data into insights and business actions remains incredibly challenging for individuals who are not specially trained.

Today, we are going to examine a few of the major challenges preventing Data Democratization from occurring, and show why YOUR participation is most crucial in moving us towards the Data Democracy we all dream of.

The two central challenges of “Data Democracy”

In my opinion, our path towards democratizing data and data analytics faces two primary challenges.

The first challenge is what I call “Data Feudalism”.

Currently, most of our data are controlled by a privileged few.

Large companies such as Google, Facebook, and Apple collect an extremely large amount of the data we produce every day. They then use those data to deduce who we are and how to segment us for their clients’ products and services.

Don’t like it? Well, you can always opt out, but the vast majority of the connected internet relies on services from these massive companies. We, as smaller companies and ordinary users, are constantly at the mercy of those large organizations.

True, some of these companies (like Google) have started initiatives to help developers interpret and understand their data better. However, those initiatives are primarily designed to encourage repeat business and to optimize your messages on THEIR platform, rather than your own.

The easiest rebuttal here is that these companies are being generous in providing those tools for us to use in the first place. After all, we wouldn’t be able to access our data without them.

However, those data are generated by OUR businesses and OUR customers; blocking us from our own data would rightly be considered unjust and undemocratic.

Let’s pause for a second.

It is important to understand that there is very little that one can do about “Data Feudalism” as a private individual or small business. These are issues that span the entirety of the connected internet. Severing one’s own node is not only ineffective, it’s counterproductive to the cause of Data Democracy.

More effective is participating fully in the principles of Data Democracy. One example comes from the European Union, which has bowed to consumer pressure by adopting strict consumer protections and technical disclosure regulations meant to keep data free and fair for everyone.

Going further than this, however, means using one’s own data independently. Though it may seem that Data Feudalism prevents data from becoming available in the first place, most small businesses and individuals actually do have enough free data at their disposal to use for analytics.

This brings us to our next point — the external challenges that keep people from using their data fully and independently.

Despite the apparent abundance of data, businesses and individuals are prevented from taking control of their own analytics due to talent and cost issues.

The easiest solution for this is sharing (turns out your kindergarten teacher had a point, after all). If a consultant or a piece of AI software can see the data and implementation experiences of many companies and train its algorithms (or build its expertise) on top of them, it can offer everyone a relatively cheap analytical solution that removes the implementation barrier.

Unfortunately, most analytics and business intelligence software on the market does not build this sense of “sharing” into its system. Most products even place a hard “no sharing” clause in their privacy statements, preventing those mutually beneficial interactions from happening in the first place.

This lack of willingness and infrastructure to share is what I consider the second, and also the bigger, challenge to the process of Data Democratization.

As many of us may know, the essence of democracy is participation and sharing. It is about synthesizing everyone’s views and opinions so that we can create a system of governance that best fits the needs of each individual. But a democracy doesn’t function if the people do not participate.

The need for participation is particularly true for a Data Democracy.

If everyone in society agrees to informed usage and sharing of their data, then it would be possible to create an AI that people could use for free or at a very low price. But participation is key. Without it, the system would not have enough data to build its own internal algorithms.

Many business owners and individuals have justified concerns about data security, competitive intelligence, and privacy. However, the truth is that these concerns can be allayed with some simple steps.

Some institutions fear, for instance, that sharing their data for the development of AI knowledge bases risks handing over their information and internal practices to their competitors. It’s a valid theory, but easily countered by a few simple security measures.

Here’s an example: When building analytical algorithms, our hypothetical data collective would scrub the names and details of any company or individual joining the hive as part of an anonymization process.

That data then would be mingled with the data of hundreds of thousands of companies. At the end of the day, even the data scientists who built the original collective wouldn’t be able to tell whose data came from where.
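To make the idea concrete, here is a minimal sketch of that scrub-and-mingle step. The field names, the `scrub` and `pool` helpers, and the sample companies are all hypothetical and chosen only for illustration; this is not a description of any actual Humanlytics code. Keep in mind that simply dropping names is only the first layer of anonymization, and a real collective would likely need stronger safeguards on top of it.

```python
import random

# Hypothetical identifying fields; the article does not specify a schema.
IDENTIFYING_FIELDS = {"company_name", "contact_email", "website"}


def scrub(record):
    """Return a copy of the record without its identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def pool(datasets, seed=None):
    """Scrub every member's records, then mix them into one shuffled pool."""
    pooled = [scrub(r) for records in datasets.values() for r in records]
    random.Random(seed).shuffle(pooled)  # discard the original grouping order
    return pooled


# Made-up sample data from two imaginary members of the collective.
members = {
    "Acme Bakery": [
        {"company_name": "Acme Bakery", "contact_email": "a@example.com",
         "monthly_visits": 1200, "conversion_rate": 0.031},
    ],
    "Bolt Cycles": [
        {"company_name": "Bolt Cycles", "website": "bolt.example",
         "monthly_visits": 5400, "conversion_rate": 0.018},
    ],
}

collective = pool(members, seed=0)
```

After this runs, `collective` holds only the metric fields, with no label tying a record back to the business that contributed it.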

While it may seem that there are some risks in sharing your data, the benefits it brings are enormous.

Theoretically, that knowledge base will grant you access to knowledge of every business situation it has recorded, along with the outcomes of decisions made by tens if not hundreds of companies similar to yours in situations similar to your own.

You then can choose among different decisions based on their likely outcomes, or tread your own path and create something people have never done before.

This will add a lot more certainty to your business decisions, and help your business both succeed in terms of revenue and create true value for your customers and for society.

How are we going to promote sharing at Humanlytics?

The system I described in the last section is what we want to make happen at Humanlytics.

Our beta (which releases in about 2 months!) will be different than other analytics tools in one major way — it will ask you to agree to share your data in order to use the product — no exceptions.

Your data will be carefully protected and anonymized when used to construct our algorithm for AI-based features such as automated goal setting, automated data cleaning, and key metrics recommendations. At the core of all this will be sharing and interaction.

We believe in Data Democratization. We’re happy to tell you more about our policies, and how we believe that they will result in a product that helps you unlock the power of Predictive Analytics. But we believe the best way to convince you of that is to give you a chance to try our beta. Seeing is believing, and we want to help you believe.

A world of “Data Democracy” is a world worth creating and pursuing — and we’d like you to join us on one of the first steps towards that world.

This article was produced by Humanlytics. Looking for more content just like this? Check us out on Twitter and Medium, and join our Analytics for Humans Facebook community to discuss more ideas and topics like this!

CEO, Humanlytics. Bringing data analytics to everyone.