Continuous value-generation with DataOps

Rubén Acevedo · Published in Dev Environment · 5 min read · Dec 8, 2020

In this article we’ll look at the benefits of adopting DataOps in your business model and at how to use data analytics to successfully create business strategies and develop better products.

After this post you’ll be able to:

  1. Understand the concepts of DataOps and how it generates value continuously
  2. Know the DataOps Manifesto and its principles
  3. Recognize the ways to implement DataOps

What is DataOps?

DataOps can be described as a process-oriented methodology for acquiring, storing, processing, quality-assuring, and delivering relevant information to the end user in an ongoing and reliable way. It is the analytics capability to continuously generate insights for an organization, aimed at creating business strategies or developing better products that improve the company’s results.

DataOps emerged after the growth in DevOps popularity a few years ago, when a similar perspective on managing the delivery of data analytics services arose, integrating data science with agile methodologies.

At that time, the DataOps Manifesto was created, borrowing part of the philosophy of the Agile Manifesto to set the guidelines for implementing the methodology.

The DataOps Manifesto

The Manifesto was created by many individuals and organizations that support DataOps. These are its values:

Individuals and interactions over processes and tools

Working analytics over comprehensive documentation

Customer collaboration over contract negotiation

Experimentation, iteration, and feedback over extensive upfront design

Cross-functional ownership of operations over siloed responsibilities

DataOps Principles

1. Continually satisfy your customer:

Our highest priority is to satisfy the customer through the early and continuous delivery of valuable analytic insights from a couple of minutes to weeks.

2. Value working analytics:

We believe the primary measure of data analytics performance is the degree to which insightful analytics are delivered, incorporating accurate data, atop robust frameworks and systems.

3. Embrace change:

We welcome evolving customer needs, and in fact, we embrace them to generate competitive advantage. We believe that the most efficient, effective, and agile method of communication with customers is face-to-face conversation.

4. It’s a team sport:

Analytic teams will always have a variety of roles, skills, favorite tools, and titles.

5. Daily interactions:

Customers, analytic teams, and operations must work together daily throughout the project.

6. Self-organize:

We believe that the best analytic insight, algorithms, architectures, requirements, and designs emerge from self-organizing teams.

7. Reduce heroism:

As the pace and breadth of need for analytic insights ever increases, we believe analytic teams should strive to reduce heroism and create sustainable and scalable data analytic teams and processes.

8. Reflect:

Analytic teams should fine-tune their operational performance by self-reflecting, at regular intervals, on feedback provided by their customers, themselves, and operational statistics.

9. Analytics is code:

Analytic teams use a variety of individual tools to access, integrate, model, and visualize data. Fundamentally, each of these tools generates code and configuration which describes the actions taken upon data to deliver insight.
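To make the "analytics is code" idea concrete, here is a minimal sketch of an analytic step written as plain, version-controllable Python instead of living inside an ad-hoc tool. The function name, fields, and data are all hypothetical:

```python
# A hypothetical analytic step expressed as code: the transformation logic
# can be reviewed, tested, and versioned like any other software artifact.

def average_order_value(orders):
    """Average amount across completed orders; 0.0 if there are none."""
    completed = [o for o in orders if o["status"] == "completed"]
    if not completed:
        return 0.0
    return sum(o["amount"] for o in completed) / len(completed)

orders = [
    {"status": "completed", "amount": 120.0},
    {"status": "completed", "amount": 80.0},
    {"status": "cancelled", "amount": 999.0},
]
print(average_order_value(orders))  # → 100.0
```

Because the logic is a function in a repository, a code review or an automated test can catch a defect before it ever reaches a dashboard.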

10. Orchestrate:

The beginning-to-end orchestration of data, tools, code, environments, and the analytic teams’ work is a key driver of analytic success.
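Orchestration can be sketched as running tasks in dependency order. The example below uses Python’s standard-library `graphlib` rather than any particular orchestration tool; the task names and bodies are illustrative only:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline stages; in practice each would do real work.
def ingest():  return "raw data"
def clean():   return "clean data"
def report():  return "insight"

tasks = {"ingest": ingest, "clean": clean, "report": report}

# Each task maps to the set of tasks it depends on.
dag = {"ingest": set(), "clean": {"ingest"}, "report": {"clean"}}

order = list(TopologicalSorter(dag).static_order())
results = {name: tasks[name]() for name in order}
print(order)  # tasks execute in dependency order: ingest, clean, report
```

Real orchestrators add scheduling, retries, and monitoring on top, but the core idea is the same: the whole pipeline, not just each step, is defined and automated.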

11. Make it reproducible:

Reproducible results are required and therefore we version everything: data, low-level hardware and software configurations, and the code and configuration specific to each tool in the toolchain.
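One way to approach "version everything" is to record a run manifest alongside each result: a content hash of the inputs plus the exact software versions used. The format and fields below are a sketch, not a standard:

```python
import hashlib
import json
import sys

# Hypothetical input data; in practice this would be read from storage.
data = b"customer_id,amount\n1,120.0\n2,80.0\n"

# A minimal run manifest tying a result back to its exact inputs and tools.
manifest = {
    "input_sha256": hashlib.sha256(data).hexdigest(),
    "python_version": sys.version.split()[0],
    "pipeline_code_version": "git-commit-or-tag-here",  # placeholder value
}
print(json.dumps(manifest, indent=2))
```

If the same hash, code version, and configuration produce a different result later, something outside the versioned inputs has changed, and that is exactly what this principle is meant to surface.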

12. Disposable environments:

We believe it is important to minimize the cost for analytic team members to experiment by giving them easy to create, isolated, safe, and disposable technical environments that reflect their production environment.

13. Simplicity:

We believe that continuous attention to technical excellence and good design enhances agility; likewise simplicity — the art of maximizing the amount of work not done — is essential.

14. Analytics is manufacturing:

Analytic pipelines are analogous to lean manufacturing lines. We believe a fundamental concept of DataOps is a focus on process-thinking aimed at achieving continuous efficiencies in the manufacture of analytic insight.

15. Quality is paramount:

Analytic pipelines should be built with a foundation capable of automated detection of abnormalities (jidoka) and security issues in code, configuration, and data, and should provide continuous feedback to operators for error avoidance (poka yoke).
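A minimal "stop the line" (jidoka-style) check might validate each record before it moves downstream and halt on the first defect instead of letting bad data propagate silently. The field names and rules here are made up for the example:

```python
# Validate a record against simple rules; return a list of problems found.
def validate(record):
    errors = []
    if record.get("amount") is None or record["amount"] < 0:
        errors.append("amount must be a non-negative number")
    if not record.get("customer_id"):
        errors.append("customer_id is required")
    return errors

batch = [
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "", "amount": -5.0},  # deliberately defective record
]

for record in batch:
    problems = validate(record)
    if problems:
        print("defect detected, stopping the pipeline:", problems)
        break
```

Pairing checks like these with clear feedback to operators is the poka-yoke half of the principle: the goal is preventing errors, not just reporting them.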

16. Monitor quality and performance:

Our goal is to have performance, security and quality measures that are monitored continuously to detect unexpected variation and generate operational statistics.
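Detecting "unexpected variation" is classic statistical process control. A bare-bones version, sketched below with invented baseline numbers, flags any new measurement more than three standard deviations from the historical mean:

```python
import statistics

# Hypothetical baseline of a pipeline metric (e.g., run time in seconds).
baseline = [102, 98, 101, 99, 100, 103, 97, 100]
mean = statistics.mean(baseline)
sigma = statistics.stdev(baseline)

def in_control(value):
    """True if the measurement is within mean ± 3 standard deviations."""
    return abs(value - mean) <= 3 * sigma

print(in_control(101))  # a typical measurement → True
print(in_control(160))  # unexpected variation → False
```

The same pattern applies to row counts, null rates, or latency: monitor continuously, alert only when a measure drifts outside its expected band.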

17. Reuse:

We believe a foundational aspect of analytic insight manufacturing efficiency is to avoid the repetition of previous work by the individual or team.

18. Improve cycle times:

We should strive to minimize the time and effort to turn a customer need into an analytic idea, create it in development, release it as a repeatable production process, and finally refactor and reuse that product.

Implementing DataOps

A DataOps pipeline aims at continuous value generation inside an organization, delivering the insights needed to build business strategies.

So, what we can understand is that the work of data scientists and related roles must be continuous, creating a data management cycle that treats and interprets new data gathered by the company.

Applying that concept requires practices similar to agile methodologies, so that quality assurance of the service becomes a pattern across all projects.

We can draw a few conclusions from this:

  1. Understanding the business model and the needs of the end user is extremely important, so that data research and management stay focused on what makes sense for the organization.
  2. A data-gathering system is necessary so that data analysts can continuously analyze data and create high-value insights.
  3. The data science team must be self-organized, so that decision-making always favors high-quality analysis. This follows the same self-organization principle as the Scrum framework.
  4. Teamwork and qualified long-term (stable) teams are necessary so that business knowledge can always be refined to its best.
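The continuous cycle described above can be sketched as a loop of gather, analyze, and deliver stages feeding into the next iteration. Stage names and data are illustrative:

```python
# Hypothetical continuous cycle: gather new data, analyze it, deliver the
# insight, then repeat on the next batch of data.

def gather():
    return [100.0, 80.0, 120.0]  # stand-in for newly collected data

def analyze(data):
    return sum(data) / len(data)

def deliver(insight):
    print(f"average value this cycle: {insight}")

for cycle in range(3):  # in production this loop would be a schedule
    deliver(analyze(gather()))
```

In a real implementation the loop body would be the orchestrated pipeline and the loop itself a scheduler, but the point stands: value generation is a cycle, not a one-off project.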

Conclusion

This is the 101 of DataOps, which will play an important role in companies that wish to manage their data and collect its benefits. Every day we can see that data management is becoming a life-or-death matter for companies, and the results of a well-implemented DataOps practice are huge compared with companies that don’t treat their data as well as they could.

And you, what are your expectations and thoughts about data management and DataOps?

References:

https://www.datascience.com/blog/what-is-dataops

https://www.dataopsmanifesto.org/

https://medium.com/data-ops/dataops-in-7-steps-f72ff2b37812
