Principles & best practices for an acceptable maturity level in your data infrastructure

Hermes Romero
2 min read · Sep 17, 2019


Data-related technologies are evolving fast these days. With more tools, more architectural approaches, and a growing data ecosystem, we need to go back to basics, stop for a while, and assess our data maturity level to correct possible deviations and to blueprint a solid data strategy.

Tips for an acceptable maturity level

  • Every data item should have one defined master source (Single Source of Truth)
  • This single source can be mirrored into versions of truth, but these must be kept reliably in sync with the source
  • Versions of truth are read-only
  • All data movement and mappings from the source are traceable and documented
  • Data must be available to end-users unless appropriate restrictive access is specified under data protection regulations
  • Every entity object in the operational and information management systems must contain a unique identifier
  • Data capture requirements and acceptance testing are built into our data development life-cycle.
  • Treat data as an asset
  • Every data item should have an owner
  • Every data item should have a retention schedule
  • All services and processes should emit events for their actions in conformance with the eventing standards (see the sketch after this list)
  • Shared data entity/attributes must be identical in the domain, value, and definition
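To make the eventing tip above concrete, here is a minimal sketch of what a shared event envelope could look like. The field names and the stand-in publish call are my own assumptions for illustration, not a prescribed standard.

```python
# A minimal sketch of a shared "eventing standard": every service wraps its
# actions in the same envelope before publishing. Field names and the print()
# stand-in for a real message-bus publish are illustrative assumptions.
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class ActionEvent:
    source_system: str          # which service/process produced the event
    entity_id: str              # unique identifier of the affected entity
    action: str                 # e.g. "created", "updated", "deleted"
    payload: dict               # the data that changed
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def emit(event: ActionEvent) -> str:
    """Serialize the event; in practice this would go to a message bus."""
    message = json.dumps(asdict(event))
    print(message)              # stand-in for a real publish call
    return message


emit(ActionEvent("crm", "customer-42", "updated", {"email": "new@example.com"}))
```

Because every service uses the same envelope, downstream consumers and the traceability requirements above can rely on a predictable structure.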

Best Practices

for data ACQUISITION

  • Delineate batch feeds versus message-based data versus real-time needs.
  • Build toward a unified data bus triggered by events.
  • Build traceability of source attributes for error reconciliation.
  • Standardize data structures as input to the data integration layer (see the sketch after this list).
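As a sketch of the acquisition ideas above, the snippet below normalizes raw source fields into one standardized structure while keeping a pointer back to the source attribute, so errors can be reconciled against the source. The record and mapping names are illustrative assumptions, not part of the article.

```python
# A minimal sketch, assuming a simple in-process flow: batch, message-based,
# and near-real-time feeds are all normalized into one standardized record
# that keeps a pointer back to the source attribute for error reconciliation.
from dataclasses import dataclass
from typing import Any


@dataclass
class StandardRecord:
    entity: str          # logical entity, e.g. "customer"
    attribute: str       # attribute name in the integration layer
    value: Any           # normalized value
    source_system: str   # where the value came from
    source_field: str    # original field name, for error reconciliation


def normalize(raw: dict, source_system: str, mapping: dict) -> list[StandardRecord]:
    """Map raw source fields to standardized attributes, keeping traceability."""
    records = []
    for source_field, (entity, attribute) in mapping.items():
        if source_field in raw:
            records.append(
                StandardRecord(entity, attribute, raw[source_field],
                               source_system, source_field)
            )
    return records


mapping = {"cust_nm": ("customer", "name"), "cust_mail": ("customer", "email")}
print(normalize({"cust_nm": "Ada", "cust_mail": "ada@example.com"}, "crm", mapping))
```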

for data PROCESSING

  • Avoid data overload
  • Follow a layered approach: separate persistence from processing
  • Run rules-based job scheduling
  • Create service-enabled, event-driven real-time data processing using microservices
  • Use the process-one / persist-many approach (see the sketch after this list)
  • Increase the frequency of data loads to near-real-time
  • Ensure that data distributions to downstream applications stay insulated from each other
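The process-one / persist-many item above can be sketched as follows: each record is transformed exactly once and the single result is fanned out to every downstream sink, which also keeps the sinks insulated from each other. The sink functions here are illustrative stand-ins.

```python
# A minimal sketch of process-one / persist-many: transform a record exactly
# once, then write the single result to every downstream sink. The sinks are
# illustrative stand-ins for real stores (warehouse, search index, ...).
from typing import Callable


def to_warehouse(record: dict) -> None:
    print(f"warehouse <- {record}")


def to_search_index(record: dict) -> None:
    print(f"search index <- {record}")


def process_once_persist_many(record: dict,
                              transform: Callable[[dict], dict],
                              sinks: list[Callable[[dict], None]]) -> None:
    processed = transform(record)          # processing happens exactly once
    for persist in sinks:                  # each sink only consumes the result
        persist(processed)


process_once_persist_many(
    {"customer_id": "42", "name": " Ada "},
    transform=lambda r: {**r, "name": r["name"].strip()},
    sinks=[to_warehouse, to_search_index],
)
```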

Data governance foundations

The overall data strategy must sit on top of enterprise-wide data governance policies and the supporting tool set.

  • Ensure data security and compliance
  • Integrate with the enterprise data cataloging solution and enhanced metadata services through data governance and stewardship
  • Facilitate data lineage from top to bottom to track the cascading impact of business data consumption patterns (see the sketch after this list)
  • Use data discovery as a mechanism to keep track of the available data and its current usage
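As a sketch of top-to-bottom lineage, the snippet below stores lineage as source-to-derived edges and walks them to find the cascading downstream impact of a change. The dataset names are made up for illustration.

```python
# A minimal sketch of lineage-based impact analysis: lineage is stored as
# "source -> derived" edges, and the cascading downstream impact of a change
# is found by walking those edges. Dataset names are illustrative.
from collections import defaultdict


def downstream_impact(edges: list[tuple[str, str]], changed: str) -> set[str]:
    """Return every dataset that directly or indirectly depends on `changed`."""
    graph = defaultdict(list)
    for source, derived in edges:
        graph[source].append(derived)

    impacted, stack = set(), [changed]
    while stack:
        node = stack.pop()
        for child in graph[node]:
            if child not in impacted:
                impacted.add(child)
                stack.append(child)
    return impacted


lineage = [
    ("crm.customers", "dwh.dim_customer"),
    ("dwh.dim_customer", "reporting.churn_dashboard"),
    ("dwh.dim_customer", "ml.churn_features"),
]
print(downstream_impact(lineage, "crm.customers"))
```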
