Build quality into your data warehouse from the start

More businesses — even small to medium ones — are finding they need a data warehouse to manage their business intelligence functions. And as they learn about the challenges of building these systems, they also learn that building in quality with rigorous testing is the best foundation for success.

So when should you begin to think about testing? The answer is simple — at the beginning of the project. Quality must be baked into the data warehouse or users will quickly lose faith in the business intelligence produced. It then becomes very difficult to get people back on board.

Who should be involved with testing?

The right team is essential to success:

  • Business analysts gather and document requirements;
  • QA testers develop and execute test plans and test scripts;
  • Infrastructure experts set up test environments;
  • Developers perform unit tests of their deliverables;
  • DBAs test for performance and stress;
  • Business users perform functional tests, including user acceptance tests (UAT).

Trust, but verify

Even wise IT managers who follow the old Russian proverb "trust, but verify" need to stay vigilant, because the rush to get a data warehouse into service can push effective testing aside.

On the other hand, it is unrealistic to test every single condition; that bar is too high to meet. In addition, requirements can turn out to be unattainable when tested against production data; business rules can turn out to be false or far more complex than originally thought; and data warehousing applications continue to evolve with changing requirements. To balance risk against cost, we choose tests that:

  • Minimize bottlenecks;
  • Verify source data, which must be as clean, complete and filtered as possible. Otherwise, discrepancies and failures accumulate and snowball into bigger problems;
  • Verify the ETL process, as it is the main component of a data warehouse;
  • Analyze and verify reports, since these are the ultimate purpose of a data warehouse and its main decision-making tool.
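
To make the source data and ETL checks above a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a prescribed method: the function names check_source_rows and reconcile, the field names order_id, customer and amount, and the sample rows are all hypothetical. The first function looks for missing values and duplicate keys in the source rows; the second compares what was extracted with what was loaded, by key and by a summed amount.

```python
from collections import Counter


def check_source_rows(rows, key, required_fields):
    """Return a list of human-readable problems found in the source rows."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: missing {field}")
    # Uniqueness: the business key must not repeat.
    counts = Counter(row.get(key) for row in rows)
    for value, n in counts.items():
        if value is not None and n > 1:
            problems.append(f"duplicate {key}: {value}")
    return problems


def reconcile(source_rows, target_rows, key, amount_field):
    """Compare source and loaded rows; return a dict of discrepancies (empty if none)."""
    diffs = {}
    source_keys = {r[key] for r in source_rows}
    target_keys = {r[key] for r in target_rows}
    # Keys that were extracted but never arrived in the warehouse.
    if source_keys - target_keys:
        diffs["missing_in_target"] = sorted(source_keys - target_keys)
    # Keys in the warehouse that have no counterpart in the source.
    if target_keys - source_keys:
        diffs["unexpected_in_target"] = sorted(target_keys - source_keys)
    # A control total catches rows that were dropped or altered in flight.
    source_total = sum(r[amount_field] for r in source_rows)
    target_total = sum(r[amount_field] for r in target_rows)
    if source_total != target_total:
        diffs["amount_total"] = (source_total, target_total)
    return diffs


# Made-up sample data for illustration only.
source = [
    {"order_id": 1, "customer": "A", "amount": 100},
    {"order_id": 2, "customer": "", "amount": 250},
    {"order_id": 2, "customer": "C", "amount": 75},
]
loaded = [
    {"order_id": 1, "customer": "A", "amount": 100},
]

source_problems = check_source_rows(source, "order_id", ["customer", "amount"])
load_diffs = reconcile(source, loaded, "order_id", "amount")
print(source_problems)
print(load_diffs)
```

In a real project, checks like these would normally run against the actual source systems and warehouse tables rather than in-memory lists, and the rules would come from the documented business requirements.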