Advancement of Analytics as a Community Pt 1

An “open-source” movement in data & analytics — is it possible?

Tony Ko
State of Analytics
3 min read · Jun 1, 2016

--

Over a decade ago, I was frequently met with raised eyebrows and glazed-over eyes when I mentioned analytics or business intelligence. Today, I can’t walk across the street without hearing someone’s zealous explanation of their recent breakthrough in data analytics. It’s pretty amazing to see. At the same time, it’s unsettling that we still treat analytics as a competitive edge while organizations and industries solve exactly the same problems in silos. When data and analytics were not as prevalent as they are today, they were deemed a key component of an organization’s competitive edge. Today, that edge is marginal at best. Everyone has some form of personalization, segmentation, forecasting, etc. The more relevant question now is: Has the legacy view of “proprietary” analytics become an impediment to greater advancements?

To be clear, this topic is not about an open-sourced data warehouse platform or technology. It’s about an open-sourced collection — a crowdsourced library — of data and the analytics built on top of that data that spans organizations and industries. A community of contributors that share accomplishments as well as lessons learned for the sake of the greater good.

I won’t spend much time arguing that collaboration expedites progress. However, there would have to be a very compelling reason why an organization would share their proprietary data & analytics that have provided them with a competitive edge in their market / industry over past decades. It may even take a philosophical perspective on the overall purpose of the organization’s existence.

For example, if you’re a pharmaceutical company and your purpose is to provide personalized treatment options, it would behoove you to share (or trade) treatment and study results with others, creating more data points to feed into drug development and/or clinical trial enrollment. As another example, if you’re in financial services and a business objective is to provide secure financial services to your customers, you should be interested in fraud alerts on any of the credit cards that have a high probability of sitting next to your credit card in your customer’s wallet. These use cases are purposefully highly charged, littered with privacy and security warning flags, because they also shed light on how much impact this can have on someone’s life and sense of security.

I anticipate two reactions to this —

Camp #1: Sharing my proprietary data and analytics will expose my organization’s differentiator and threaten our competitive edge

Camp #2: Gaining access to other organizations’ data and analytics will enable us to advance the industry more quickly, leading to bigger & better solutions for the people we serve

For those in the first camp, your stance is understandable. There may be a “camp 1.5” where we find a happy medium that allows for collaborative advancement without compromising the core business model of many organizations.

For the sake of imagining what’s possible, let’s consider camp 2. Is it that far-fetched? Have we seen this evolution before? The answer is yes:

  • The open-source concept isn’t new — Linux. Red Hat. Ubuntu. WordPress. MySQL. Apache. Android.
  • Crowdsourcing data isn’t new — Wikipedia. Acxiom/Experian/DNB. Apple iTunes. Salesforce AppExchange.

Now that we know it’s possible, how do we mirror this evolution for data & analytics beyond one-off competitions such as Kaggle? How do we replicate the “copyleft” movement that Richard Stallman started in the software industry in the 1980s, breaking out of the proprietary data & analytics mold that’s holding organizations back from collaborating for more rapid advancements?

As I continue my exploration, the keys to success, which I intend to address in subsequent posts, are the following:

  • A manifesto
  • A strong community of talented & passionate contributors
  • A viable platform
  • A starting point — i.e. a targeted, achievable outcome that can be shared across organizations & industries
  • Persistence

To be continued…
