Data Science: Time to Change!

[This post was initially published on the Zelros blog]

Using their data efficiently is a complex process for companies: only 17% of them consider themselves mature when it comes to data analysis.

You have probably witnessed it: a majority of data projects remain at the “concept” stage and never lead to applications in production.

By application, we mean intelligent data software that is actively integrated into a business process or a service, and that transforms it. This goes beyond simple data visualizations or dashboards, which are sometimes (harshly) described as the Minitels of Big Data!

Difficulties in reaching production can come from several factors:

  • Data Scientists' findings remain trapped on their computers, because they are not designed to be shared in a way that is compatible with enterprise IT, or are not user-friendly enough
  • End users are not involved early enough in the project, so the question of business alignment is only raised at the end
  • The organization is unsuitable: silos between the “thinkers” and the “doers” are a source of frustration
  • and so on …

The traditional way of doing things

Kaggle, the internet, MOOCs, … teach us how to run a “Big Data” project: scoping, data gathering, cleaning, analysis (data visualization), modeling, reporting (dashboards), production.

This approach focuses first on the technical aspects, and only then leads to the application. It has been suitable over the last 3 years, while the barriers to entry were mainly technological.

But a breakthrough is happening: Data Science tools are now becoming mature and increasingly easy to use, by a growing number of people. Amazon, Google and Microsoft have launched their specialized data platforms, Spark 2 will be launched, and deep learning is going mainstream.

Let’s take an analogy: for a startup in the digital economy, in many cases it isn’t technology that makes the difference, but rather a well-designed interface or an exceptional customer experience. The same holds for data projects: usage, people and design are the critical success factors, no longer the technical aspects alone.

Towards a new approach

That’s why it’s time to change, and reverse the cycle of Data Science projects:

Instead of starting from the Data Lake to reach the application, start from the application and work back to the Data Lake.

In many cases, everybody benefits:

  • Business users can express their needs early in the project
  • IT teams can anticipate production, and prepare the suitable environment
  • Data Scientists finally know which precise problem they have to crack, and can concentrate their energy on it
  • Software developers are involved early in the application development, which favors agile increments and better engagement

This innovative approach, which breaks with the culture of the permanent POC, is at the center of the Zelros product value proposition. From day 1 of the project, a minimum viable Data Application is deployed: software that is still limited, but functional and open to refinement. These Data Applications help our customers give concrete shape to their projects and focus teams on a common goal that is reachable in a few weeks rather than several months.

Try this recipe in your next project; we would love to hear your feedback. What are the results in your context?