How do we measure systems change?

UNDP Strategic Innovation
12 min read · Sep 19, 2023


By Søren Vester Haldrup

A bit like the blind men and the elephant

UNDP has set up an M&E Sandbox to nurture and learn from new ways of doing monitoring and evaluation (M&E) that are coherent with the complex nature of the challenges facing the world today. Our hunch is that portfolios are a vehicle that can effect systems change through a connected set of interventions (see here for more info about our work, and have a look at our portfolio primer). So our selfish interest in the M&E Sandbox is to surface new practices and learn from others who are reimagining how to understand change in fast-moving contexts and across a connected and evolving set of interventions. In the Sandbox we prioritize action and practical insights ‘from the front lines’ about how to do it, rather than abstract theory and general principles (though those are important too!). In that spirit, we seek to explore how M&E needs to be different when we work on complex problems in uncertain and rapidly changing contexts, rather than the more ‘linear’, control-focused, and projectized M&E that tends to dominate today.

We convene a series of participatory sessions as part of the M&E Sandbox. In each session we collectively explore a theme in depth, inviting practitioners to speak about their experience testing new ways of doing M&E that help them navigate complexity. You can read digests and watch recordings of our previous Sandbox sessions here: using M&E as a vehicle for learning, measurement and complexity, and progress tracking and reporting. Do also consult our overview piece on innovative M&E initiatives and resources.

In our most recent Sandbox session we explored the challenge of measuring systems change. This blog post provides a summary of the discussion and includes the recording, an overview of questions and answers from the discussion, as well as the many resources shared during the session.

If this post has sparked your interest, I recommend that you watch the full recording right here:

An increasing number of organizations, movements, and people aim to contribute to systems change (or systems transformation). In UNDP we are approaching this by adopting a portfolio approach to understanding and effecting change on complex issues. But what should we look at to know if a system is changing? What types of data can we rely on? Who should be part of this process? How do we ensure that we measure change in ways that capture the lived experiences of real people out there (be they smallholder farmers or urban youth)?

We explored these questions with a great panel of speakers during our recent M&E Sandbox session. The panelists were: Gaitano Simiyu from the Alliance for a Green Revolution in Africa (AGRA), Randall Blair from Mathematica, Roopa Roshan Sahoo from the Poverty and Human Development Monitoring Agency (PHDMA), and Marisa de Andrade from the University of Edinburgh.

The session brought out a number of important themes. Three stand out to me (I unpack these in the concluding section):

  1. Be conscious and inclusive when deciding what to measure in a system
  2. Combine various types of (quantitative and qualitative) data to measure change in a more holistic way
  3. Remember to use new insights about system change in decision-making

Tracking change in Africa’s food systems

Gaitano Simiyu (AGRA) and Randall Blair (Mathematica) kicked us off with an outline of how AGRA seeks to measure food systems change: by tracking change along 10 different dimensions (see visual below). For each of these dimensions, the team has selected a set of key indicators that they track on a regular basis. Overall, this gives the team a sort of ‘heatmap’ overview of what the system looks like and where change is (and isn’t) happening. The framework incorporates thinking on systems change (such as the Water of Systems Change framework and the COM-B model of behaviour change) as well as on agriculture and food systems. In designing the framework, the team sought to build on existing indices and scorecards rather than design new ones.

AGRA’s framework for measuring systems change
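To make the ‘heatmap’ idea concrete, here is a minimal sketch of how per-dimension indicator scores could be rolled up into an at-a-glance overview. All dimension names, indicator names, scores, and band cutoffs below are illustrative assumptions, not AGRA’s actual framework or data:

```python
# Hypothetical sketch: each dimension of the food system has a few
# indicators scored 0-100; we average them per dimension and assign a
# qualitative band, giving a heatmap-style overview of where change
# is (and isn't) happening. Names and numbers are made up.

indicator_scores = {
    "Policy environment": {"policy_index": 62, "regulatory_reform": 48},
    "Market access": {"smallholder_sales": 22, "trade_volume": 30},
    "Resilience": {"climate_adaptation": 72, "food_security": 80},
}

def dimension_heatmap(scores, bands=((67, "high"), (34, "medium"), (0, "low"))):
    """Average each dimension's indicators and assign a qualitative band."""
    overview = {}
    for dimension, indicators in scores.items():
        avg = sum(indicators.values()) / len(indicators)
        band = next(label for cutoff, label in bands if avg >= cutoff)
        overview[dimension] = (round(avg, 1), band)
    return overview

for dim, (avg, band) in dimension_heatmap(indicator_scores).items():
    print(f"{dim:20s} {avg:5.1f}  {band}")
```

In practice the roll-up would sit behind a dashboard (PowerBI or a custom build, as discussed below) rather than a script, but the logic is the same: many indicators, a few dimensions, one readable picture.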

Questions and Answers — AGRA and Mathematica

Question from Sandbox community: How did you work together to create those goals? And on which platform are you imagining building this dashboard?

Answer: The AGRA M&E unit first organized and workshopped corporate-level goals with the business line teams (sustainable farming, inclusive markets and trade, etc.) and AGRA leadership. Once these were well-documented and agreed-upon, the goals were cascaded to the country level through another round of consultation with country teams. This became a somewhat iterative process, as conversations with country teams sometimes revealed some new information that led to some changes to AGRA’s corporate and business line goals.

We have not yet settled on a platform for the dashboard. PowerBI has been discussed as an option that would allow for timely internal updates as new data come in. We are also discussing using an external vendor who can build a customized platform. A customized platform would likely be more visually attractive and user-friendly — but could be more difficult to update.

Question from Sandbox community: The AGRA team mentioned they kept the ToC and the systems change framework connected. Can they elaborate on how they did this in practice? How are the two connected?

Answer: With AGRA, it was easy to keep the ToC and systems change framework one and the same because AGRA has always focused on systems change through some key levers — policy, large-scale public programming, etc. AGRA’s ToC, when distilled, shows (1) which AGRA investments and activities are planned, (2) how those will lead to catalytic changes in systems (increased investment, changed policies, practices, capacity, etc.), and (3) how these catalytic changes will result in healthier, better functioning systems and improved stakeholder outcomes. The systems change literature generally focuses on (2) — but the ToC needs to make the causal connections between (1), (2), and (3). So in a way the ToC is the full vision for how AGRA will meet its ultimate goals — and any systems change thinking needs to be an explicit part of that vision. I’d recommend using ‘The Water of Systems Change’ by Kania et al. to explore whether any given ToC is truly oriented toward systems change.

Question from Sandbox community: Both presentations so far have talked about the role of these frameworks in performance monitoring/management. How are they addressing the challenge that data collected for performance management is significantly gamed/corrupted?

Answer: Yes, grantee-reported monitoring data can be biased toward ‘good news’. AGRA validates grantee-reported outputs — such as people trained — and has moved away from asking grantees to report more difficult-to-measure outcomes (like increased investment or sales linked to AGRA’s work). We cannot expect grantees to estimate difficult counterfactuals — that is our job as researchers. Instead, we should support grantees in reporting outputs and some easier-to-measure outcomes in a valid, consistent way — and we should verify some of what they report. The hope is that the increased measurement support (positive incentives) and validation (negative incentives) will result in more valid reports of programmatic progress and results.

Question from Sandbox community: Randall’s presentation shared a framework dashboard with goals that looked essentially like targets. To what degree do these need to be revisited as the dynamic contexts/systems in which the intervention is implemented change, and to what degree do donors/senior leadership have an appetite for such changes?

Answer: Agreed that systems change is often not linear, and rarely proceeds along the timelines we initially envisioned. This results in the need to refresh and revise targets on a regular basis. We’ve seen appetite from AGRA and its donors to refresh targets annually, and all stakeholders seem to respect the need for these refreshes. There isn’t much appetite for quarterly refreshes, from what we’ve seen, as they would require almost continual dialogue about targets. Most stakeholders seem to favor annual or midterm refreshes (every 2.5 years for a 5-year strategy) at most.

People-centric monitoring of poverty and human development

Next up, Roopa and her team described how PHDMA is adopting a more narrative-based approach to measuring change in poverty and human development in the Indian state of Odisha. This approach is based on the notion that quantitative metrics alone often fail to capture the subtleties of impact, while qualitative measurement and stories can help uncover deeper shifts and the attitudinal changes essential for lasting transformation. Stories, they noted, have the power to evoke emotions, empathy, and a sense of connection. They allow us to step into the shoes of characters and understand their feelings, motivations, and struggles: “Prosperity is not in numbers; it is in the stories that we see and will tell.”

PHDMA seeks to measure change as felt by everyday citizens on the ground. This happens through three human-centered design treatment labs (see visual below).

PHDMA’s Human Centered Design Treatment Labs

Questions and Answers — PHDMA

Question from Sandbox community: Would it be possible for the PHDMA team to share some resources about this more qualitative evidence? I am curious to see what it looks like and the type of information it includes.

Answer: Please visit our website to see our way of looking at qualitative M&E: http://phdma.odisha.gov.in/

Measuring humanity in complex systems

Marisa from the Centre for Creative-Relational Inquiry and Binks Hub at the University of Edinburgh provided an additional perspective on how we may begin to capture the unmeasurable aspects of people’s lived experience in a system. Marisa presented work from Scotland focused on surfacing the felt experiences of people facing intergenerational trauma, displacement, and social injustice in local communities. This included the use of sensory gardens and activities such as turning waste into art as mechanisms to help people express their own lived experiences. These efforts, Marisa explained, allowed stakeholders in the Scottish Highlands to see a system (or problem) through different lenses — similar to looking through a crystal where different angles give different perspectives. However, work still remains to build ‘creative literacy for policymakers’ — helping them develop the structures and capabilities required to turn new perspectives and appreciation of communities’ lived experiences into policy and action.

Using sensory gardens and art to help traumatized communities express their lived experiences in a system

Questions and Answers — Centre for Creative-Relational Inquiry

Question from Sandbox community: Very interesting way to consider measurement! Curious whether this methodology can be transposed onto programs and projects that were designed in a more linear way? Or is that a prerequisite?

Answer: The initiatives that I initially applied this methodology to were very much grounded in linearity. In fact, the first iteration of this framework was incredibly (far too!) reductionist and became self-defeating — follow my thinking here and here. I then became much more comfortable with embracing non-linearity in complex systems — when it comes to wicked issues that are all entangled, where does something start and where does it end? That’s when the magic happened. Through this lens, I was able to ‘measure’ something completely different by questioning and troubling my own relationship with linear thinking.

Question from Sandbox community: Noting how far these questions typically are from where ‘establishment’ institutions are, it leaves me wondering about the challenge of bridging between ideas on “other ways of knowing” etc and institutional demands for more reductionist forms of data.

Answer: I hear you. I’ve been on — and building — this ‘data-evidence-research-practice-policy’ bridge for the last 10 years or so. There was a lot more resistance at first, but there’s genuine movement and momentum now, especially since the pandemic, when everything we believed to be real, true and consistent was flipped on its head. Much of the work I do now is asking policymakers, practitioners and researchers to (at least try to) question what we know, how we know it, whose knowledge matters and different ways of being in the world. Sometimes this is framed as activism in the academy and in the world. At other times, it’s very practically about creative literacy for policymakers and understanding and measuring change in local communities. Change happens one person at a time.

Question from Sandbox community: Do you have a publication on this?

Answer: Recently I wrote these ideas down in Public Health, Humanities and Magical Realism: A Creative-Relational Approach to Researching Human Experience (Routledge), which was joint winner of the 2023 International Congress of Qualitative Inquiry Book Award. We’re also currently publishing these ideas through REALITIES in Health Disparities: Researching Evidence-based Alternatives in Living, Imaginative, Traumatised, Integrated, Embodied Systems, a UK Research and Innovation (UKRI) consortium hub award to tackle health inequalities through community assets such as parks and arts spaces. Watch this space.

Key takeaways

I walked away from this session with a range of new insights and ideas. Three stand out:

  • Be conscious and inclusive when deciding what to measure in a system: you can look at many different things to get a sense of whether change is happening, and you can articulate these things alone or with a broader group of stakeholders. What you choose to look at reflects (conscious or implicit) decisions about what is important (and not important) in a system and what desirable change looks like. For instance, do we focus primarily on policies, laws and resource flows, or do we also look at power dynamics, mindsets and lived experiences among society’s most vulnerable?
  • Combine quantitative and qualitative data for more holistic measurement: different types of data provide different types of insights. Numbers and statistics can capture some aspects of a problem, while qualitative information provides very different insights and nuance. One is not better than the other, but we need to use both if we want to capture and understand systemic change in a more holistic way.
  • Don’t forget to use insights about a system in decision-making: it is easy to lose focus on why we measure something. At times, we may get lost in the details of how and what to measure while forgetting to ask what we are learning from the measurement and what it means for what we do next. Furthermore, we may end up measuring change in a system primarily for accountability (reporting) purposes rather than to learn. To avoid this, we should build decision points into any systems change measurement effort and always ask ourselves: “What is this telling us about what is going on in the system, and what does this mean for what we do next?”

Additional Resources

Here’s a list of resources shared during and after the session:

If you would like to join the M&E Sandbox and receive invites for upcoming events, please reach out to contact.sandbox@undp.org.

A bit more about the speakers:


UNDP Strategic Innovation

We are pioneering new ways of doing development that build countries’ capacity to deliver change at scale.