Show them the data; they’ll tell you when it’s wrong

Lak Lakshmanan
Jun 28 · 2 min read

How often do you build data pipelines to ingest some data, but are not quite sure you are processing the data correctly? Are the outliers you are clipping really a result of malfunctioning equipment? Is the timestamp really in UTC? Is this field populated only if the customer accepts the order?

If you are diligent, you will ask a stakeholder these questions at the time you are building the pipeline. But what about the questions you didn’t know you had to ask? What if the answer changes next month?

One of the best ways to get many eyes, especially the eyes of domain experts, continually on your data pipeline is to build a visual representation of the data flowing through it. By this, I don’t mean the engineering metrics, such as the amount of data flowing through, the number of errors, or the number of connections. You should build a visual representation of the business data flowing through.

Use real-time dashboards to get more eyes on your data pipeline

Build a dashboard of what’s meaningful about your data to your stakeholders. For example, show the number of times a particular piece of equipment malfunctioned in the past, and whether it is malfunctioning now. To figure this out, use the number of outliers that you are clipping in your data pipeline. Build the dashboard, share it widely, and wait.
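As a rough illustration of the idea, here is a minimal Python sketch of a clipping step that also keeps the per-equipment outlier counts a dashboard could display. Everything named here (`Reading`, `clip_and_summarize`, the `pump-1` identifiers, and the 0 to 100 valid range) is a hypothetical stand-in, not something from a specific pipeline; the real bounds would come from your domain experts.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sensor reading (hypothetical schema for illustration)."""
    equipment_id: str
    value: float


def clip_and_summarize(readings, low, high):
    """Clip out-of-range values and summarize outliers per equipment.

    `readings` is assumed to be in time order, so the last reading seen
    for each equipment tells us whether it looks like it is malfunctioning now.
    """
    cleaned = []
    summary = {}
    for r in readings:
        is_outlier = not (low <= r.value <= high)
        row = summary.setdefault(
            r.equipment_id,
            {"readings": 0, "outliers": 0, "outlier_now": False},
        )
        row["readings"] += 1
        row["outliers"] += int(is_outlier)
        row["outlier_now"] = is_outlier  # overwritten by each later reading
        # The clipped value continues down the pipeline as before.
        cleaned.append(Reading(r.equipment_id, min(max(r.value, low), high)))
    return cleaned, summary


if __name__ == "__main__":
    sample = [
        Reading("pump-1", 50.0),
        Reading("pump-1", 150.0),  # above the assumed valid range
        Reading("pump-2", -5.0),   # below the assumed valid range
        Reading("pump-2", 20.0),
    ]
    cleaned, summary = clip_and_summarize(sample, low=0.0, high=100.0)
    for equipment_id, row in summary.items():
        print(equipment_id, row)
```

The `summary` rows are what you would write to the table behind the dashboard: the counts describe the business event (equipment apparently malfunctioning), not the health of the pipeline itself.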

Real-time dashboards are catnip. People are drawn to them. The day that those outlier values are produced for some reason other than malfunctioning equipment, someone will call you and let you know.

This works only if the dashboard in question is web-based or integrated into the everyday systems that decision-makers use. There are free dashboard tools like Data Studio, and tools like Tableau and Looker have free tiers. Learn how to use them to spread your semantic burden.

[For more tips on building data science pipelines, read my O’Reilly Media book Data Science on the Google Cloud Platform.]

97 Things

Tap into the wisdom of experts to learn what every great practitioner should know, no matter what technology or techniques you use. With the 97 short and extremely useful tips, you’ll expand your skills by adopting new approaches while learning best practices.

Written by Lak Lakshmanan, Professional Services @ Google

