Easy, not Simple: Diagnosing Data Quality Issues
No data science program will succeed if the quality issues in its source data aren’t addressed. Anyone who says their source data has no quality issues hasn’t looked at it hard enough, or hasn’t talked enough with business users or data warehouse analysts. Every source system will have endemic quality issues; it is the duty of the data professional to address them in a meaningful way.
Many organizations struggle even to define what the issues are, because it seems like an overly simple exercise in Who, What, When, and Where, but it’s easy, not…