Inside the Architecture Powering Data Quality Management at Uber
Data Quality Monitor implements novel statistical methods for anomaly detection and quality management in large data infrastructures.
Data quality management is one of the most frequently overlooked aspects of machine learning workflows. Small inconsistencies or missing values can have a drastic negative impact on the training of machine learning models. In any medium to large organization, the proliferation of disparate data sources makes quality control a tremendous challenge. In the case of Uber, the transportation giant relies on thousands of data sources to power machine learning processes that ensure missing critical…