In search of a practical data maturity model for California

Joy Bonaguro
The State of CalData
Dec 1, 2022

We want a simple diagnostic tool for data maturity. Learn more and give us feedback.

Why we want a data maturity model

We want a tool to help departments diagnose where to make improvements in how they manage and use data. We want this tool to describe data issues from the business side so departments can self-diagnose regardless of data or technical literacy.

We also want to build our data services and curriculums around our data maturity model.

Limits of data maturity models we’ve seen

We haven’t done an exhaustive analysis of every model out there. But a few limitations bubbled up from what we have seen:

  • Proprietary models. We don’t want a data maturity model that costs money or is proprietary. Given our goals, we need something we can share freely and iterate on in the open as we learn.
  • Wrong audience with lots of jargon. Many models are designed for technology teams rather than the business or program side. We think the business and program side needs to drive data maturity, and we want our model to reflect that. An analogy we like: IT helps maintain the pipes (backend systems), but the business is responsible for what flows through the pipes (program data).
  • False linearity. Many data maturity models present maturity as a linear set of steps, assuming an orderly progression: all data efforts should move toward a single optimal level through one route. In practice, we think different teams need different approaches.

Our current data maturity model

Our working data maturity model is structured in two parts:

  1. Step 1: Get your data house in order. This part of our model does assume a linear progression. We want all of our departments to level up to “data on demand”.
  2. Step 2: Use data in decision-making. This part presents a menu of options for using data. It doesn’t assume that one method of data-informed decision-making is the best and ultimate destination; instead, it’s about the best fit for the problem at hand. It also allows for ongoing data development and exploratory analysis.

Wrapped around this are other key governance elements like security, privacy, data sharing, data standards, etc. We will bake those into our services, guidance, standards, and curriculum.

Many data maturity models assume a linear set of steps leading to an optimal use of data. We think it’s more complicated and that there is no single data destination.

Below are more details on these steps:

In Step 1, we have three levels:

  1. Level 1: Data Void. At this level, departments struggle to answer basic questions about their programs and services. In reality, we find most departments are at least at Level 2.
  2. Level 2: Data Fire Drills. At this level, departments can answer basic or even ad hoc questions about their programs and services, but it often requires a lot of manual effort and can be exhausting. Sometimes the data scramble leads to inconsistent metrics over time, which can reduce trust in the data. We see a good number of programs operating at this level.
  3. Level 3: Data on Demand. One of our goals is to get all departments to this level, so that core reporting needs and processes are automated and at the ready. This frees up analyst time to work on novel analyses or new reports.

In Step 2, we have a menu of options for using data in decision-making, including:

  • Performance management
  • Evaluation & experiments
  • Advanced analytics, including data science and machine learning
  • Ongoing data development and exploration, to feed new questions and needs

Not all programs and services will use each menu item. Instead, we want to cultivate a shared understanding of when to use which approach, then pair that with training and guidance.

We want our data maturity model to serve as a quick diagnostic tool, written in plain language that the business and program side understands.

Example of the model in practice

A common diagnosis we see in the public sector is departments or programs that are stuck at Level 2: Data Fire Drills. This can look like the following:

  1. Many departments have statutory obligations to produce annual or quarterly reports.
  2. The first step of the report is some type of data dump from the source system. The IT department will drop an Excel file in a folder or email it to those responsible for the report.
  3. The analyst in charge of the report manually recreates charts and tables from prior reports, usually in Excel.
  4. Once the charts and tables are complete, the analyst pastes them into a Word document and adds text analysis and commentary.
  5. The end result is often a lengthy PDF that was created manually, sometimes with a substantial time lag in the data.

We call this the Sisyphean reporting cycle. It is time-consuming and manual. This pattern sets the stage for data fire drills:

  • When the report is published, or when a new issue or topic surfaces, the department may be asked new or additional questions that the current report doesn’t address. For example, maybe they want the data by a new geography, demographic, or other grouping.
  • Since the reporting process relies on manual, one-time data extracts, the department scrambles to answer the new questions.

At CalData, we want to design services that help departments get to Level 3 and then become smart users of the various data methods in Step 2.

For departments at Level 2, we’ve piloted a service called the “Analytics Accelerator”. This service supercharges department reporting by helping teams use modern tools to automate data feeds and reporting. It moves them from data dumps to data pipelines, and from manual tables and charts to auto-updated interactive dashboards.
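To make the jump from data dumps to data pipelines concrete, here is a minimal sketch of the kind of automated extract a scheduled job can run in place of the manual cycle above. This is illustrative only, not the Accelerator’s actual stack: the connection string, the service_requests table, and the output file are all hypothetical stand-ins.

```python
# A minimal sketch of an automated reporting feed, assuming the source
# system is reachable over SQL. All names here (connection string,
# "service_requests" table, output path) are hypothetical placeholders.
import pandas as pd
from sqlalchemy import create_engine


def refresh_report_extract() -> None:
    """Pull the latest program data and publish a tidy extract that a
    dashboard or scheduled report can read directly."""
    engine = create_engine("postgresql://user:password@host/dbname")  # hypothetical

    # A single query replaces the manual "data dump" step.
    monthly_counts = pd.read_sql(
        """
        SELECT region,
               request_type,
               date_trunc('month', opened_at) AS month,
               count(*) AS requests
        FROM service_requests
        GROUP BY 1, 2, 3
        """,
        engine,
    )

    # Writing one clean extract replaces the copy/paste into Excel;
    # a dashboard pointed at this file (or, better, a database view)
    # refreshes itself whenever the job reruns.
    monthly_counts.to_csv("report_extract.csv", index=False)


if __name__ == "__main__":
    # In practice this runs on a scheduler (cron, Airflow, etc.)
    # rather than being kicked off by hand each reporting cycle.
    refresh_report_extract()
```

Once a scheduler owns this step, analyst time shifts from rebuilding the same tables each cycle to interpreting them.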

This service is based on a successful program incubated at DataSF. So far we’ve piloted it with two departments and will offer it as a service through the data and innovation fund.

Send us your thoughts and feedback

Part of why we are sharing this is to get feedback. Please let us know your thoughts or if you think there are other models that we should review and consider.

Also, can you identify your department’s place in this model? We want this to be a useful diagnostic tool. Tell us if it’s failing that test.
