How to do data quality with DataOps

Leveraging data tests and safe environments makes data quality an everyday activity for everyone who touches your data

Ryan Gross
Towards Data Science
9 min readDec 2, 2019

Image source: Pixabay

The Cost of Bad Data

The costs of poor data quality are so high that many have trouble believing the stats. Gartner estimated that the average organization takes a $15M hit from poor data quality every year. For some organizations, it can even be fatal. I’m often reminded of a story told by my Data Science Innovation Summit co-presenter, Dan Enthoven from Domino Data Labs, about a high-frequency trading firm, Knight Capital, which deployed a faulty update to its trading algorithm without testing its effect. Within a day, the firm had automated away nearly all of its capital and had to orchestrate an emergency sale to another firm.

He also speaks of a credit card company that failed to validate the FICO® Credit Score field from a third-party provider. When the company later switched providers, the new provider indicated “no credit” with an illegal value of 999 (850 is the highest legal value). Because there was no data quality check in place, the company’s automated approvals algorithm started approving these applications with huge credit limits, leading to major losses.
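A simple range check on the incoming field would have caught the sentinel value before it ever reached the approvals algorithm. Here is a minimal sketch in Python; the field names, the quarantine logic, and the record layout are illustrative assumptions, not the company’s actual pipeline (the 300–850 bounds are the legal FICO score range):

```python
# Hypothetical data quality check for an incoming credit-score field.
# Field names ("id", "fico_score") are assumptions for illustration.

def is_valid_fico_score(score):
    """Return True only if score is a legal FICO value (300-850 inclusive)."""
    return isinstance(score, int) and 300 <= score <= 850

def screen_applications(applications):
    """Split applications into clean rows and quarantined rows.

    Quarantined rows (e.g. a provider's 999 "no credit" sentinel) are held
    back for human review instead of flowing into automated approvals.
    """
    clean, quarantined = [], []
    for app in applications:
        if is_valid_fico_score(app["fico_score"]):
            clean.append(app)
        else:
            quarantined.append(app)
    return clean, quarantined

apps = [
    {"id": 1, "fico_score": 720},
    {"id": 2, "fico_score": 999},  # new provider's "no credit" sentinel
    {"id": 3, "fico_score": 810},
]
clean, quarantined = screen_applications(apps)
```

The key design choice is that out-of-range rows are quarantined rather than silently dropped or passed through, so a provider switch that changes sentinel conventions surfaces as a review queue instead of a flood of bad approvals.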

DataOps ensures Data Quality is…
