DataOps and the DataOps Manifesto

Editor’s note: Chris is speaking at ODSC East 2019! Check out his talk this April 30-May 3 — “The DataOps Manifesto” and many more!


Why drive to the store to buy a stapler when you can order it from Amazon, which will deliver it to your doorstep in two days for around the same price? The list of goods and services that consumers purchase online seems to grow each year: movies, travel, software, taxi rides, lodging, food, IT resources, and more. Mike Jaconi calls this the on-demand economy. In his Business Insider article, he defines the on-demand economy as the “economic activity created by technology companies that fulfill consumer demand via the immediate provisioning of goods and services.” Immediate provisioning is not always instantaneous, but it is fast, and it will keep getting faster. Companies will be under tremendous pressure to be agile and lean. Those that can adapt will become the pacesetters and the market leaders.

The market leaders will succeed because they have better information about customers, trends, and markets. This comes down to data science and analytics. If you examine the most successful companies, data analytics is at the heart of their success. Companies will sink or soar with the quality and flexibility of their data science.



Imagine working at a company that has the industry’s best data scientists. The data analytics team would respond to requests for changes with previously unimaginable speed. The automated data analytics pipeline would incorporate changes rapidly and continuously. A robust test suite would ensure system quality. There would be no more enterprise-critical IT emergencies at odd hours. Instead of fighting fires and being consumed with system maintenance, the data analytics team would be free to focus on higher-value activities that give the company more insight into the marketplace. The data science and analytics team would not be relegated to some dark corner; it would sit at the center of the company’s strategy, advancing its most important objectives.

We call this new approach to data analytics DataOps. As we explained in our previous blog posts, DataOps is built upon a foundation that includes Agile software development, DevOps, and statistical process control (SPC). Agile development and DevOps have enabled IT and software development organizations to advance from performing one release about every twelve months in the 1980s to releasing code many times per hour today. SPC guards against failures, controls quality, and provides real-time alerts when data metrics drift outside defined limits. DataOps is a rapid-response, flexible, and robust data-analytics capability that can keep up with the creativity of internal stakeholders and users.
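To make the SPC idea concrete, here is a minimal sketch of a control-limit check on a pipeline metric. It assumes the common convention of limits set at the baseline mean plus or minus three sample standard deviations; the article does not prescribe a specific rule. The function names and the row-count figures are hypothetical and exist only for illustration.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) limits: baseline mean +/- sigmas * sample std dev."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(values, limits):
    """Return (index, value) pairs for values that fall outside the limits."""
    lower, upper = limits
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical row counts from past runs of one pipeline stage (made-up data).
baseline_row_counts = [1000, 1010, 990, 1005, 995, 1002, 998]
limits = control_limits(baseline_row_counts)

# Hypothetical row counts from new runs; the third one has drifted.
new_row_counts = [1001, 997, 1250, 1003]
alerts = out_of_control(new_row_counts, limits)

for index, value in alerts:
    print(f"ALERT: run {index} row count {value} is outside "
          f"[{limits[0]:.1f}, {limits[1]:.1f}]")
```

In a real pipeline the same kind of check would run automatically at each stage on whatever metrics matter there (row counts, null rates, totals), and an alert would be raised before the drifted data reaches users.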

DataOps is an analytic development method that emphasizes communication, collaboration, integration, automation, measurement, and cooperation among data scientists, analysts, data/ETL (extract, transform, load) engineers, information technology (IT), and quality assurance/governance. The method acknowledges the interdependence of the entire end-to-end analytic process. It aims to help organizations rapidly produce insight, turn that insight into operational tools, and continuously improve analytic operations and performance. It enables everyone involved in the analytic process to follow the values laid out in the DataOps Manifesto.


DataOps doesn’t require you to throw away your existing tools and start from scratch. With DataOps, you can keep the tools you currently use and love. DataOps views the data-analytics pipeline as a process and, as such, focuses on making the entire process run faster and with higher quality, rather than optimizing the productivity of any single individual or tool.

DataOps can help data-analytics teams create and publish new analytics to users more quickly. DataOps spans the entire analytic process, from data acquisition to insight delivery. Its goal is to deliver more insight and better analysis while also being faster, cheaper, and higher in quality.

