Principles & best practices for an acceptable maturity level in your data infrastructure
Sep 17, 2019
Data-related technologies are evolving fast, with more tools, more architectural approaches, and a growing data ecosystem. We need to go back to basics, stop for a while, and assess our data maturity level so that we can correct possible deviations and blueprint a solid data strategy.
Tips for an acceptable maturity level
- Every data item should have one defined master source (Single Source of Truth)
- This single source can be mirrored to versions of truth, but each of these must have a reliable way of staying in sync with the source
- Versions of truth are read-only
- All data movement and mappings from the source are traceable and documented
- Data must be available to end-users unless appropriate restrictive access is specified under data protection regulations
- Every entity object in the operational and information management systems must contain a unique identifier
- Data capture requirements and acceptance testing are built into our data development life-cycle.
- Treat data as an asset
- Every data item should have an owner
- Every data item should have a retention schedule
- All services and processes should emit events for their actions, conforming to the eventing standards
- Shared data entities and attributes must be identical in domain, value, and definition
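Several of these tips can be checked mechanically. The following is a minimal sketch, assuming a hypothetical catalog entry (`DataItem`) that records an identifier, a master source, an owner, and a retention schedule per data item. The class, field names, and messages are illustrative and not taken from any particular catalog tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataItem:
    """Hypothetical catalog entry for one data item."""
    item_id: str                    # unique identifier
    master_source: Optional[str]    # the single source of truth
    owner: Optional[str]            # accountable person or team
    retention_days: Optional[int]   # retention schedule, in days


def check_item(item: DataItem) -> List[str]:
    """Return the tips this item violates (empty list if compliant)."""
    problems = []
    if not item.item_id:
        problems.append("missing unique identifier")
    if not item.master_source:
        problems.append("no master source defined")
    if not item.owner:
        problems.append("no owner assigned")
    if item.retention_days is None or item.retention_days <= 0:
        problems.append("no retention schedule")
    return problems


def find_duplicate_ids(items: List[DataItem]) -> List[str]:
    """Return identifiers that appear more than once, sorted."""
    seen, dupes = set(), set()
    for item in items:
        if item.item_id in seen:
            dupes.add(item.item_id)
        seen.add(item.item_id)
    return sorted(dupes)
```

Running `check_item` over every catalog entry gives a rough, repeatable signal of how far the current estate is from these tips.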
Best Practices
for data ACQUISITION
- Delineate batch feeds from message-based data and from real-time needs.
- Build a potential unified data bus triggered by events.
- Build traceability of the source attributes for error reconciliation.
- Standardize the data structures used as input to the data integration layer.
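As an illustration of the last two points, here is a minimal sketch of a standardized input structure that keeps a trace back to the source attributes. The names `StandardRecord` and `standardize`, and the mapping format, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class StandardRecord:
    """Hypothetical standardized input for the data integration layer."""
    source_system: str
    values: Dict[str, Any]   # standard attribute -> value
    trace: Dict[str, str]    # standard attribute -> "system.source_attribute"


def standardize(raw: Dict[str, Any], source_system: str,
                mapping: Dict[str, str]) -> StandardRecord:
    """Rename source attributes to standard names and record where each came from."""
    values: Dict[str, Any] = {}
    trace: Dict[str, str] = {}
    for source_attr, standard_attr in mapping.items():
        if source_attr in raw:
            values[standard_attr] = raw[source_attr]
            trace[standard_attr] = f"{source_system}.{source_attr}"
    return StandardRecord(source_system, values, trace)
```

When a value fails validation downstream, the `trace` entry points to the exact source attribute to reconcile against.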
for data PROCESSING
- Avoid data overload
- Follow a layered approach: separate persistence from processing
- Run rules-based job scheduling
- Create service-enabled, event-driven, real-time data processing using microservices
- Use the process-one / persist-many approach
- Increase the frequency of data loads to near-real-time
- Ensure that data distribution to each downstream application stays insulated from the others
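One possible reading of the process-one / persist-many approach is that each record is processed exactly once and the result is then written to every persistence target. The sketch below follows that reading. The `process` step and the sink functions are hypothetical stand-ins for real processing logic and real stores.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Sink = Callable[[Record], None]


def process(record: Record) -> Record:
    """Illustrative processing step: runs once per record."""
    return {**record, "amount_cents": round(float(record["amount"]) * 100)}


def process_once_persist_many(records: Iterable[Record],
                              sinks: List[Sink]) -> int:
    """Process each record once, then hand a copy to every sink."""
    count = 0
    for record in records:
        result = process(record)
        for sink in sinks:
            # Each sink gets its own copy, so one consumer cannot
            # mutate what another consumer receives.
            sink(dict(result))
        count += 1
    return count
```

Keeping processing separate from the sinks also matches the layered approach above: adding a new persistence target does not touch the processing code.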
Data governance foundations
The overall data strategy must sit on top of the enterprise-wide data governance policies and toolset.
- Ensure data security and compliance
- Integrate with the enterprise data cataloging solution and enhanced metadata services, through data governance and stewardship
- Facilitate data lineage from top to bottom to be able to track cascading impact due to business data consumption patterns
- Use data discovery as a mechanism to keep track of the available data and the current usage
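To make the lineage point concrete, here is a minimal sketch that tracks cascading impact. It assumes lineage is available as a simple mapping from each dataset to the datasets that consume it directly. The dataset names in the usage example are made up, and real lineage would normally come from the catalog or metadata service.

```python
from collections import deque
from typing import Dict, List


def downstream_impact(lineage: Dict[str, List[str]], start: str) -> List[str]:
    """Return every dataset reachable downstream of `start`, sorted by name."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Usage with hypothetical dataset names:
lineage = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["dw.dim_customer", "ml.churn_features"],
    "dw.dim_customer": ["bi.sales_dashboard"],
}
impacted = downstream_impact(lineage, "crm.customers")
```

The same traversal, run from any dataset, answers the question "what breaks if this changes?" before the change is made.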