CDO or CIO or CINO in the age of Big Data

Joyeeta Das
Gyana Limited
Mar 20, 2017
Big data strategy in a non-tech organisation

Big data is a fuzzy term. Almost everyone claims to be doing “AI”, and there is a lot of FOMO going around. Macy’s and Sears are shutting down stores. There are seminars everywhere. How can you, as Chief Data Officer, Chief Innovation Officer or Chief Information Officer, do the right thing? It is a tough job, tougher than filming a hungry lion up close.

Problems

1. Data scientists: a rare skill, hard to hire and hard to keep (happy)

Data scientist was the hottest job in 2016, with base salaries of up to $250k per annum. But 37% to 65% of the work consists of the “annoying act of munging”: cleaning up messy datasets before any data science technique can be applied to them. And data scientists are annoyed about it, very. Even tech organisations are struggling to build an effective Big Data science strategy, so how will the non-tech sector compete?

1 million data scientists will be needed by 2018, and a deficit of 200k scientists will emerge by 2020.

2. DIY is still a myth — all tools require training or external help

Many self-service data preparation companies have come to the forefront. Yet, in reality, even “high ease of use” tools like Domo, Alteryx and Tableau still require (frustrating) training, even in 2017! Also, all of them tell you WHAT happened in your data, but none tells you WHY it happened. How is the decision maker truly empowered by data science?

3. Big data and Machine Learning do not always deliver despite the promise

Let us look at a sector that is really spending a lot on machine learning. Hedge funds like Aidyia are powered completely by AI. Machine learning hedge funds have outperformed both traditional quants and the average hedge fund since 2010, delivering annualised returns of 8.44% over the period until last year, compared with 4.27% for the average global hedge fund.

But Predata, backed by hedge fund manager Kyle Bass, erroneously concluded a week before the November 8, 2016 US election that Hillary Clinton would win!

So, what do these dichotomies mean, how do they point to the future and how do they impact your role?

Solutions

1. Self-Service Data Science tools are the key to success

“By 2019, 90% of organisations will have appointed a Chief Data Officer, out of which 50% would have failed” (Gartner).

Can you afford to make such an expensive mistake?

Let us say one goes to consultants like Nielsen and Ipsos, who specialise in retail consulting. Here is the catch: they’ll be working with millennials and Gen Z inside retail organisations. This is the Snapchat generation, and they have neither the interest nor the patience to work with consultants for 6 weeks, only to receive a static report that doesn’t integrate with their systems. Consultants won’t last very long with these guys.

2. Do search for true DIY

With Tableau, Domo, Qlik, ClearStory, Sisense and Alteryx, we can see a clear movement towards simple, visually appealing tools. Yet organisations in non-tech sectors are struggling to hire the tech-savvy employees or data scientists who will actually use them.

The question is: who will learn them, who will use them, and how will this knowledge become actionable across the board inside the organisation?

3. Multiple external sources and the aggregation of streaming + historic data create value

In 2015, many hedge funds mistakenly assumed that the rumour of food-borne illnesses was causing a sales decline at Chipotle outlets, because they were relying only on footfall traffic apps. However, upon cross-checking with credit card transactions, Matei Zatreanu of System2 found that, in fact, customers were ordering in because of the cold. Talk of “alternative data sets”!

Many companies are making this error: introducing a single data source, such as Twitter or hyperlocal weather, as the only measure for understanding a trend. This is dangerous and can blow up in your face. The real value is created when you introduce *at least* one more source.
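To make the cross-checking idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`pct_change`, `cross_check`), the 5% tolerance and all the numbers are made up for this example and do not come from any real footfall or card data. It compares the trend in each source and flags when the sources disagree, which is the cue to investigate before acting on any single one.

```python
from statistics import mean


def pct_change(series):
    """Percent change from the mean of the first half to the mean of the second half."""
    half = len(series) // 2
    before, after = mean(series[:half]), mean(series[half:])
    return (after - before) / before * 100.0


def cross_check(signals, tolerance=5.0):
    """Return each source's trend and whether all sources point the same way.

    `tolerance` (in percent) is an assumed threshold: changes smaller
    than this are treated as "flat".
    """
    trends = {name: pct_change(values) for name, values in signals.items()}
    directions = set()
    for change in trends.values():
        if change > tolerance:
            directions.add("up")
        elif change < -tolerance:
            directions.add("down")
        else:
            directions.add("flat")
    return trends, len(directions) == 1


# Hypothetical weekly figures, invented purely for illustration.
signals = {
    "footfall": [1000, 980, 990, 700, 690, 710],
    "card_transactions": [500, 510, 495, 490, 505, 500],
}

trends, agree = cross_check(signals)
for name, change in trends.items():
    print(f"{name}: {change:+.1f}%")
print("Sources agree" if agree else "Sources disagree: dig deeper before deciding")
```

In this toy data, footfall drops sharply while card transactions stay roughly flat, so the check reports a disagreement. A footfall-only view would have suggested a sales decline that the second source does not support.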

The power of three is a well-known principle. Big Data ROI is proportional to total insights (from all N data sets) / (total discovery costs * N). [Contact me if interested in the detailed mathematical derivation.] Basically, if you plot Big Data ROI against the number of independent data sets, N, the total knowledge gained increases exponentially with each additional independent data set, while the return on investment asymptotically approaches a finite limit as N approaches infinity. So, given a limited discovery investment (money or budget), a minimum of two data sets is needed, and three ensure some level of sufficiency.
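For readers who prefer symbols, the stated proportionality can be written as below. The notation is an assumption made here for readability, not the author's own: I(N) stands for the total insights gained from N independent data sets, and C for the total discovery costs. The article does not give the functional form of I(N), so this is only a restatement of the relation, not its derivation.

```latex
\mathrm{ROI}(N) \;\propto\; \frac{I(N)}{C \cdot N}
```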

This means that adding more data sets, both streaming and historic, along with the capability to process them all seamlessly on the same platform, will be key to truly transformative Big Data initiatives in non-tech organisations. “By 2019, 75% of solutions will include 10 or more exogenous sources,” says Gartner.

Conclusion

Reed Hastings of Netflix says “Gut is better than Big data”. It is the gut feel built on domain experience that helps us pick the truly good decisions out of all the combinations that a data science engine may throw up. You are better off training solid existing employees to use easy data science tools than training new data scientists in a new domain.

Thus emerges the trend of the “light quant”: someone who is already inside the organisation and is now empowered with data science. This is when you start winning.

73% of organisations have said that they will use or want to use Big Data in the next 18–24 months, but only 13% of them have actually taken steps towards this. These 13% will make all the difference, and you want to be in that category.

Your organisation generates data every day, so use it to make more money, cut losses and stay ahead in your strategy. How? Connect it to a platform like GyanaAI that effortlessly integrates with its own exogenous geo-located macro data sources, and then provides insights led by AI. It’s your own unique business advantage, driven by your own unique data.

You need Big Data and AI to monetise your own data and connect it to the outside world, but you need this to be super easy so that there is no headache of training and recruitment. You need this to thrive as well as to survive.

“In the midst of chaos, there is also opportunity”

Sun Tzu, The Art of War
