Unmaking the Nonprofit Data Mess

Genevieve Smith | GV Advisory
Published in The Startup
6 min read · Jan 26, 2021


How organizations can take a mission-driven approach to improving, maintaining, and leveraging their data ecosystems.

DISCLAIMER: This isn’t a piece about big data or data-as-commodity. This is an article about how data exist alongside people, places, and things, and what that means for our communities, organizations, and policies.

Data-driven, data-informed, data-enabled, data-centric, data data data. Data. Data are everywhere. We’ve heard.

Where are those data? How are they collected, analyzed, shared, and stored? Why are they collected, analyzed, shared, and stored? Who owns them? This conversation can teeter into territory that is overly academic, existential, or hopeless-feeling, so I’m going to keep it somewhat manageable: what do these questions mean for mission-driven and values-aligned organizations?

I think about this in terms of data ecosystems. Data ecosystems exist at many levels (organizational, governmental, social, etc.) and are composed of:

- Data (structured, unstructured, and semi-structured)

- Systems (social and technological)

- The interactions between all of the above

Mission-driven organizations depend on data — whether stored in an enterprise-wide CRM, shadow spreadsheets on employees’ laptops, or post-it notes compiled after a community workshop or annual planning session. These data support grant proposals and reporting criteria, compliance reporting, program design, research, organizational learning, donor relationship management, and more. When these data are disorganized, inaccurate, or missing, the rest of the ecosystem suffers.

The most common refrains I hear are, “we need more data,” “we need more tech,” or “we need some data consultants.” These aren’t always wrong, but they’re rarely right — missing, messy, or unwieldy data are usually the tip of an organizational iceberg.

When our organizations declare that they need a data or tech solution, we must explore the issues we’re trying to solve. Are they truly data and tech issues? Usually, organizational capacity, culture, and strategy are tangled up together at the root of these issues. Understanding data and tech quality is important, yes, but so is the softer stuff — the who, how, and why of data and operations.

There are many issues at the root of what we’ll call ‘messy data’ in mission-driven organizations. Unintentional scale wreaks havoc: what was a volunteer organization thirty years ago now has thousands of employees who require access to a customized grantmaking system built and maintained by a three-employee company that’s on the verge of going out of business. Workaround cultures multiply once-manageable issues: though an organization may have one set of common issues, a lack of clarity, communication, and resources leads staff to invent workarounds. What might have been solved with a few standard solutions has spawned 15, none of which are compatible. The list goes on.

Enter the Data Ecosystem Diagnostic. AUTHOR’S NOTE: I am currently accepting suggestions of snappy names for practical frameworks. The Data Ecosystem Diagnostic, or DED, explores an organizational data ecosystem through three lenses:

1. A data lifecycle (acquire > clean > use/reuse > publish/share > retain/destroy)

2. Mission via a simplified theory of change

3. Systems overlay: structures (tools) | attitudes (people & culture) | transactions (processes)

None of these layers are completely separate — as in any ecosystem, everything is connected. To grasp the interconnected nature of this framework, imagine small, multidirectional arrows at every level. Overwhelming? Perhaps. Hairy? Yes. Manageable when viewed piece-by-piece? Absolutely.

GV Advisory’s Data Ecosystem Diagnostic Frame (simplified)

Initially, all data are relevant to a DED. From operations and finance to marketing and development to program monitoring and evaluation, we want to look at everything. For practicality’s sake, this is the light-touch phase: it’s a surface-level assessment of an organization’s data flows, lifecycles, and data health across departments. The DED’s more granular assessment begins with the area of the organization invested in the mission work — usually it’s programs or fundraising and development. These departments have swaths of survey data, interview notes, census data, and other data that may indicate contribution to social impact.

The first layer — the data lifecycle layer — of the DED categorizes how these data are acquired, cleaned, used and reused, published and/or shared, and retained or destroyed. These categories range from passive to active, automated to manual, ad-hoc to systematized. This layer usually highlights some gaps on the road from policy to practice — the organization may have written a process out for staff to collect and store interview notes, but staff may still rely on their notebooks and post-its when conducting evaluations. This layer also uncovers inconsistencies across teams. One team may conduct analyses using a set of assumptions and parameters that are at odds with the team next door.
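As a rough illustration of how lifecycle findings might be recorded, here is a minimal Python sketch. It is not part of the DED itself: the `Observation` record, its field names, and the example data are all hypothetical, and the two checks simply mirror the two patterns described above (policy-versus-practice gaps and cross-team inconsistencies).

```python
from dataclasses import dataclass

# Lifecycle stages as named in the article.
STAGES = ("acquire", "clean", "use/reuse", "publish/share", "retain/destroy")


@dataclass(frozen=True)
class Observation:
    """One team's handling of one dataset at one lifecycle stage (hypothetical record)."""
    dataset: str
    team: str
    stage: str
    policy: str    # what the written process says should happen
    practice: str  # what staff report actually doing

    def __post_init__(self):
        if self.stage not in STAGES:
            raise ValueError(f"unknown lifecycle stage: {self.stage!r}")


def policy_practice_gaps(observations):
    """Return observations where documented policy and actual practice differ."""
    return [o for o in observations if o.policy != o.practice]


def cross_team_inconsistencies(observations):
    """Map (dataset, stage) to the distinct practices found, where teams disagree."""
    seen = {}
    for o in observations:
        seen.setdefault((o.dataset, o.stage), set()).add(o.practice)
    return {key: sorted(practices) for key, practices in seen.items() if len(practices) > 1}


# Made-up example data, for illustration only.
observations = [
    Observation("interview notes", "evaluation", "acquire",
                policy="shared template", practice="paper notebooks"),
    Observation("survey data", "programs", "use/reuse",
                policy="exclude partial responses", practice="exclude partial responses"),
    Observation("survey data", "development", "use/reuse",
                policy="exclude partial responses", practice="include partial responses"),
]

gaps = policy_practice_gaps(observations)
inconsistencies = cross_team_inconsistencies(observations)
```

In practice a shared spreadsheet would serve the same purpose. What matters is that each observation records both the written policy and the actual practice, so the gaps can be listed rather than guessed at.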

Once we’ve got a picture of the strengths and weaknesses in the data lifecycle, it’s on to layer two: the mission layer. This layer, usually built on a simplified theory of change and its assumptions (Inputs > Activities > Outputs > Outcomes), helps us outline and affirm the “why?” of an organization’s data. Essentially, this layer asks: how does our current data landscape help us understand whether we’re delivering on our mission or not?

This is where I usually ask clients if they’re asking their data questions they can answer. There’s often a breakdown in an organization’s data landscape at the data collection and analysis phases of the data lifecycle when mapped to the organization’s activities and outputs. In plain English: often, the data collected can’t answer an organization’s questions. It’s not unlike asking a friend who is a known hockey fanatic with no interest in baseball, “Who won the 1986 World Series?” and getting angry with that friend when they don’t know the answer (go Mets). Yeah, it’s all sports, but we must account for nuance and expertise. If you’re ever making decisions about data collection and analysis, always ask: can these data answer our questions? If not, you’ll need different data, different questions, or both.
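The "can these data answer our questions?" check can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the questions, the field names, and the idea of listing the fields each question needs are invented examples, not a prescribed DED method.

```python
def unanswerable_questions(questions, collected_fields):
    """Return each question that cannot be answered, with the fields it is missing.

    `questions` maps a question to the data fields needed to answer it;
    `collected_fields` lists the fields the organization actually collects.
    """
    collected = set(collected_fields)
    missing_by_question = {}
    for question, needed in questions.items():
        missing = sorted(set(needed) - collected)
        if missing:
            missing_by_question[question] = missing
    return missing_by_question


# Hypothetical questions and fields, for illustration only.
questions = {
    "Did participants' income change after the program?":
        ["participant_id", "income_before", "income_after"],
    "How many workshops did we run this year?":
        ["workshop_date"],
}
collected_fields = ["participant_id", "income_before", "workshop_date"]

missing = unanswerable_questions(questions, collected_fields)
```

Here the income question comes back flagged because `income_after` was never collected. That points to the choice described above: collect different data, ask a different question, or both.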

The first two layers — data lifecycle and mission — of the DED represent the core of the data ecosystem. We’ve got a good picture of where and how data are living, and we’re exploring the “why” in a way that aligns with the mission.

The third layer — the systems layer — is my favorite layer. Informed by Omidyar’s Systems Practice, it uses a Structural-Attitudinal-Transactional (SAT) systems lens. In an organizational context,

‘Structural’ refers to the social, natural, and built environment in which people work. Are staff remote or onsite? Do the organization’s software platforms meet its needs? Structures = Tools

‘Attitudinal’ refers to sociocultural factors that influence individual and group behavior — how do staff work together across departments and hierarchy? Attitudes = Culture

‘Transactional’ refers to the processes and interactions among formal and informal leaders as they navigate their work — are processes effective? Are they codified? Are they relevant? Transactions = Processes

These drivers have upstream causes and downstream effects, all of which influence the data lifecycle and how data support or hinder an organization’s mission alignment and data capacity.

Interactions between the first, second, and third layers all inform the fourth: tactical and strategic action to strengthen and leverage organizational data ecosystems. This may feel big. It is. I think of this as a capacity-building tool, though: we’ve got our issues out on the table, so let’s look at them in the context of our organizational priorities and decide which lever to pull. This may mean that we focus on one or two issues with data acquisition that impact our understanding of programmatic outcomes. Once we’re confident in our adjustments, we’ll revisit the DED and see what’s next.

A guiding principle of this approach is to keep tasks manageable. Data use in the social sector is a nebulous (and at times overwhelming) issue, and the DED helps trace data use and capacity issues to their root causes in order to highlight high-impact, sustainable solutions inside and outside of organizations.

Genevieve Smith is a social change expert, an organizational behavior and strategy consultant, speaker and facilitator. She runs GV Advisory, where she works with corporate leaders, INGOs, and investors to align their actions and their values. Genevieve is available for project-based work, coaching, and advisory services. Learn more about her work here or get in touch: genevieve@gv-advisory.com


