Now we are all measuring impact — but is anything changing?

Griffith Centre for Systems Innovation · Published in Good Shift · 13 min read · Jun 17, 2024

Impact measurement is everywhere — in social change work, business, government — and yet it can feel like the more we hear about the importance of measuring impact, the less seems to actually change!

Over the last five years here at GCSI we have been involved in many projects where ‘impact measurement’ has been core to the work. Sometimes we have also played a role in measuring impact inside projects or initiatives, or helped to structure measurement and learning approaches alongside the work. In this blog we reflect on what we have learned in the process, and share how we are framing the landscape that is emerging in the impact measurement territory.

Measuring Impact — A Critical Starting Point

Every culture and society has a connection to measurement in some form — through narrative and story, country and landscape, navigation. Yet much of what has informed the measurement of change has a far narrower cultural story — one grounded in Western philosophical and mathematical traditions. And so much of the narrative underpinning impact measurement, as it is commonly understood today, is focused on the utility and benefit of measuring ‘progress’, with overtones of moral imperative — as in, are you progressing towards the ‘right’ destination?

As ‘measuring impact’ is now such a pervasive activity, we think it is important to unpack the ways in which it is actually the product of particular cultural and historical paradigms, even while it professes to offer an objective assessment. There is a need for caution, critical reflection and decolonising practices when it comes to measuring impact (see, for example, Joyce, 2020, Impact Measurement: A cautionary tale). This is a necessary starting point.

Measuring Impact has a History which is Shaping Current Convergences

Whilst the history of measurement is long and rich, in the context of caution and critical reflection we would like to highlight three historical threads that have shaped the landscape of modern impact measurement. These threads overlap but have been largely separated by disciplinary and contextual boundaries. Understanding the narratives inherent in these threads can also offer some insight into questions we should be asking ourselves, however we are using impact measurement at this moment in time.

Thread One — How Measurement became Core to Solving Problems

The first moment we’d like to highlight sits at the end of a revolution and transition, and in the heartland of colonial expansion: London in the 1840s and 1850s, just as the Industrial Revolution was transforming landscapes in England and Europe, and in the lands experiencing new waves of colonisation through population expansion. John Snow challenged the beliefs and practices of the time by collecting and analysing data, measuring the spread of cholera, and isolating the patterns and connections between those who had succumbed to the disease. His work eventually led to significant shifts in public health, infrastructure and social policy across London and then more broadly (noting that this actually took a long time to eventuate!).

Source: based on Tulchinsky, 2018

We start with this story because ‘data’ and ‘measurement’ have become embedded in how we think about creating change and solving problems. A field such as epidemiology helps us to remember the very real and powerful role that data can play in saving lives and shaping society, policy, practices and people’s beliefs and behaviour. However, it pays to remember that the ‘data’ itself did not generate the change — there were still powerful norms, ideologies and narratives that had to be challenged in order to effect change. This highlights that data and measurement are not enough in and of themselves to ‘solve problems’!

It is also worth noting the nature of the ‘problem’ here — it was complicated and certainly involved ‘unknowns’. However, it was the type of problem that could be addressed with rigorous and careful research and analysis, by an expert who could help establish clear cause-and-effect relationships. It was not a complex, multifaceted problem in the way that issues like multigenerational disadvantage, or the climate crisis, manifest. This is important because the narrative of ‘data + measurement = solutions’ has become embedded in political, policy and funding decision-making — and yet in complex contexts it is just not that simple!

Some of the questions this story surfaces, and that we should be asking ourselves, are:

  • How is the collection of data and evidence, the measurement of impact, shaping our current beliefs, institutions and practices?
  • What are the frontiers of impact that are reshaping the world at this moment in history?
  • How are we shaping data and measurement in relation to the context and the type of ‘problem’ we are responding to?

Thread Two: How ‘Value’ came to be Embedded in Impact Measurement

So much of the impact measurement landscape focuses on how we ‘value’ and create trade-offs by establishing numerical ‘rankings’ of factors — and then often linking them to financial proxies to establish ‘value’.

Cost-benefit analysis is one of the core methods that has featured as a way to compare the potential or perceived impacts of activity. Developed in the context of military and public works on two continents (in Europe and the US), cost-benefit analysis grew out of a desire for comparable, rational, politically justifiable measures that could demonstrate the ‘value’ of undertaking large-scale infrastructure work such as building roads, canals and railways.

The history of this method is a fascinating journey into how much we have come to trust mathematical measures to assess costs and benefits and to reduce risks — particularly when much research now demonstrates their limitations:

“Forecasters, policymakers, and scholars tend to assume that cost-benefit forecasts are more or less accurate, when in fact they are highly inaccurate and biased, at an overwhelmingly high level of statistical significance” (Flyvbjerg and Bester, 2021).
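
To make concrete what a cost-benefit calculation actually does, and how sensitive its verdict is to the forecast bias Flyvbjerg and Bester describe, here is a minimal sketch in Python. All figures (discount rate, cashflows, overrun and shortfall percentages) are hypothetical, chosen purely for illustration — real appraisals are far more elaborate, but the verdict’s dependence on the input forecasts is the same.

```python
# A minimal, illustrative cost-benefit sketch (all figures hypothetical).
# A project is typically deemed "worth doing" when its benefit-cost ratio
# (BCR) exceeds 1, i.e. discounted benefits outweigh discounted costs.

def present_value(cashflows, rate):
    """Discount a sequence of annual cashflows back to today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows, start=1))

rate = 0.05                            # assumed discount rate
forecast_costs = [100, 50]             # $m over two build years, as estimated
forecast_benefits = [20] * 20          # $m per year over a 20-year life

bcr_forecast = (present_value(forecast_benefits, rate)
                / present_value(forecast_costs, rate))

# Apply the kind of bias the research describes: costs overrun, benefits
# fall short (the percentages here are purely illustrative).
actual_costs = [c * 1.4 for c in forecast_costs]        # 40% cost overrun
actual_benefits = [b * 0.8 for b in forecast_benefits]  # 20% benefit shortfall

bcr_actual = (present_value(actual_benefits, rate)
              / present_value(actual_costs, rate))

print(f"Forecast BCR: {bcr_forecast:.2f}")  # ~1.77 -> looks well worth doing
print(f"Actual BCR:   {bcr_actual:.2f}")    # ~1.01 -> barely breaks even
```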

Mathematical measures of impact have also developed out of other ‘risk’-based fields such as insurance — which began processes of “valuing lives” based on statistics, and contributed to numerical foundations for assessing risk that were often built on inequity (see the brilliant analysis by Dan Bouk, 2015).

The lessons from the exponential growth in the use of such methods should prompt us to ask questions such as:

  • What assumptions and biases are built into the methods, and what does that mean for what should be considered in applying them to decision-making?
  • How do numerical methods and the monetisation of measurement create a narrative of certainty and security when many of the contexts we are working in are complex, shifting and unsettled; and what should that alert us to in adopting such methods?

Thread Three: How Predicting and Evaluating Impacts became Separated

Impact Assessment and Evaluation evolved from different starting points, but interestingly share some overlapping narratives focused on how we determine the worth or value of an activity or intervention. The origins of ‘evaluation’ sit across two connected streams (Alkin and King, 2016). One is linked to testing — the start of quantitative assessment in the context of schools in the 18th Century, measuring the performance of individuals in terms of their academic achievement (Hogan, 2007). The other relates to the use of social research methods in the assessment of broader, collective endeavours such as projects, programs and initiatives. For example, during the Great Depression in the UK, when the public expected government to address record high levels of unemployment, there was a significant push to evaluate whether the social programs that were funded actually delivered impacts (Alkin and King, 2016). The professional field of evaluation emerged in the 1960s and 1970s (Madaus et al, 1983).

Impact Assessment started as a way to determine both the potential value and the consequences of planned interventions, so that decision-makers and the people affected could understand and participate in responding (see Jacquet, 2014). While it started in the environmental field, it has since grown to encompass assessment of potential social and cultural impacts. The history of Environmental Impact Assessments is relatively brief — some historians claim it started with forestry projects in the US in the 1970s, while others take the history back into mining and extractive industries in the 1960s, often citing the 1967 example of copper mining in Puerto Rico as the starting point of a movement towards EIA measurement (see Mayda, 1993). Though EIAs are now often embedded in legal and policy frameworks, nationally and in more regional jurisdictions, questions remain about whether they actually protect people, places or the planet (see, for example, Singh et al, 2020).

The thread that ultimately connects them centres on how we define and determine impact. Both evaluation and impact assessment, in the main, focus on relatively small units of activity — in the form of projects, programs or initiatives. Whilst it may appear rational to assume that such projects can achieve outputs, and perhaps outcomes, IMPACT is more associated with cumulative actions — where the whole is greater than the sum of its parts. So the questions that should be drawn from both these fields relate to how we build coherent narratives of impact across parts. In other words, we may have thousands of project-based assessments and evaluations but still have no real understanding of any change, or of the cumulative impact of all these projects. If that is the case, how do we really know that any of the activities are generating positive momentum in the desired direction of travel? And in the case of impact assessments, if every one of the 500 industry projects that we assess has completed an environmental impact assessment, but none references the assessments of the other 499, then how could we really hope to understand the environmental or social impacts potentially created across a particular region or sector?

Why is Impact Measurement so Ubiquitous at this Moment in Time?

We are working in a critical moment — the increasing recognition of climate and biodiversity crises, the growing challenges to peace and democracy, and the increasingly evident and grotesque expansion of inequality are unsettling, and generate a heightened sense of uncertainty.

While many proponents of impact measurement argue that such methods aim to ‘reduce risk’, it may be more accurate to suggest that they are an aid to navigating uncertainty (for a compelling analysis of the difference see Vaughn Tan’s work — https://vaughntan.org/). The more uncertain things become, the more the push for methods that help us generate a sense of certainty grows — and we think this should also make us a little cautious about the efficacy of those methods. Impact measurement is proliferating across industries, sectors and disciplines — as we seek to understand the real differences being made, reach ambitious goals for transformation and ask questions about what we could do better to address what needs to change. In the diagram below we have collected together some of the imperatives behind the growth of impact measurement at this critical moment.

Making Sense of the Impact Measurement Soup: A Matrix as a Starting Point

Increasingly the landscape of Impact Measurement is crowded, dynamic and contains a diversity of frameworks and approaches — which can mean we end up feeling like we’re looking at alphabet soup.

As we’ve traversed this landscape we’ve tried to make sense of it in various ways, and have begun to explore a matrix to represent the constellation of frameworks, approaches and models we’ve encountered in the process. As shown below, the matrix has two axes:

The horizontal axis provides a “time” delineation, dividing the left and right sides between retrospective (ex post) and prospective (ex ante) approaches to measuring impact.

More specifically, the retrospective quadrants include approaches, frameworks and models that ask about events in the past (What impact did we have?), while the prospective quadrants include approaches that ask about the possible future (What impact will we have?).

The vertical axis provides a “purpose” delineation, dividing the upper and lower halves between Management and Evaluation.

The top-level Management quadrants focus on methods that count quantifiable data (e.g. time, dollars, widgets). These frameworks tend to measure the outputs of activities and interventions; they ask what happened, or what could happen, and rely significantly on quantitative data.

The bottom-level Evaluation quadrants include a range of approaches that look at a broader set of questions beyond counting, such as: What changed, and why? What were, or might be, the interrelationships between changes? They tend to draw on a mixture of quantitative and qualitative data to create a more cohesive understanding of changes that occurred, are occurring, or could occur.
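
For readers who find it easier to see the structure laid out, below is a minimal sketch of the matrix as a simple data structure. It uses only the distinctions described above — the quadrant questions are paraphrased from the surrounding paragraphs — and we have deliberately not placed specific frameworks into quadrants, since those placements are exactly the discussion we are inviting.

```python
# A minimal sketch of the two-axis matrix described above. The quadrant
# questions are paraphrased from the text; no frameworks are placed here.

from enum import Enum

class Time(Enum):
    RETROSPECTIVE = "ex post"   # asks about the past: what impact did we have?
    PROSPECTIVE = "ex ante"     # asks about the future: what impact will we have?

class Purpose(Enum):
    MANAGEMENT = "management"   # counting outputs; mostly quantitative data
    EVALUATION = "evaluation"   # what changed and why; mixed methods

QUADRANTS = {
    (Purpose.MANAGEMENT, Time.RETROSPECTIVE): "What happened? (counted outputs)",
    (Purpose.MANAGEMENT, Time.PROSPECTIVE):   "What could happen? (forecast outputs)",
    (Purpose.EVALUATION, Time.RETROSPECTIVE): "What changed, and why?",
    (Purpose.EVALUATION, Time.PROSPECTIVE):   "What might change, and why?",
}

for (purpose, time), question in QUADRANTS.items():
    print(f"{purpose.value:>10} / {time.value:<7}: {question}")
```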

A word of warning: As with all frameworks, this matrix is a “construct” — a way for us to engage in sense-making and to critically discuss how impact measurement is being undertaken in our current context. We are sharing this as a starting point for a broader discussion. We welcome feedback, reflections, and challenges around how we have represented different approaches — we are not seeking a ‘true representation’, but rather, a starting point for dialogue about how all the methods that now abound are connected, entangled and constructed.

We hope this will enable us not only to make sense of the methods, but also to engage in some rigorous discussions about the purpose of impact measurement, and about the challenges we should be raising as it becomes accepted as the norm in any ‘change’-oriented work.

Measuring Impact in Complexity and Uncertainty

There are growing calls to ‘act’ for change — and equally loud calls for acting in ways that will actually result in progress, or real shifts towards particular outcomes: sustainability, regenerative futures, justice, or equity, for example. And this is the conundrum that ultimately confronts all impact measurement approaches purporting to help address these challenges.

The central question is ‘how will we know’ if anything is changing? But this is also where things get hard and messy in complex contexts. In the complex, entangled realities that characterise all of the above challenges, we need to be wary of methods that privilege ‘causation’, certainty and targets (for a brilliant analysis see Snowden: https://thecynefin.co/the-banality-of-measurement/). On the other hand, we should also question methods that complexify everything, so that the hard and real conversations about whether and how things are changing become opaque and accessible only to a few.

In complex contexts we need methods that help us navigate forward in a direction — but, importantly, that don’t fixate on that direction or confuse it with a destination (again, see Snowden: https://thecynefin.co/start-a-journey-with-a-sense-of-direction/). That raises questions for most of the methodologies in the matrix that focus on targets, or that boil things down to singular notions of ‘value’. These create the illusion of certainty, but may not actually help us figure out whether we are moving in a direction that is good for people, places or the planet.

In complex contexts we should also explore questions about the granularity of what we are measuring, and what that means for whether our actions can plausibly have the kinds of impacts we are seeking. It has become too common for impact measurement to focus on the granularity of individual projects, companies or initiatives, with the expectation that measurement at these levels will somehow add up to overall impact across much broader fields such as sectors, industries and even systems. Measurement does not generate magic.

We have used this broader exploration of impact measurement to reflect on our own individual journeys in this work, and have mapped where we think our work has predominantly sat, and where we find ourselves drawn to or working towards. As we work in complex contexts we are increasingly moving towards a focus on learning and experimentation (trying, testing, learning) as a way to figure out next steps in the direction of travel — navigational learning. Within this, our hypothesis is that monitoring ‘lead’ data, with rapid-cycle integration of learning, is a much more useful focus. We try as much as possible to avoid the delusion that we can ‘hit targets’ or set up perfect logic frames as we grapple with organising ourselves within and across tangled networks and practices.

We are interested in how others are framing ‘impact measurement’ in the context of complex challenges and systems shifting work, and what you are seeing that is helping with sensemaking in this increasingly crowded space. We offer these thoughts in the spirit of dialogue — so please feel free to share thoughts and ideas!

Contributors to this post and graphics:
Prof Ingrid Burkett and A/Prof Joanne McNeill

References and Sources:

Abelson, P. (2022). Cost-Benefit Analysis: Then and Now. TTPI Working Paper 6/2022. Available at SSRN: https://ssrn.com/abstract=4080682 or http://dx.doi.org/10.2139/ssrn.4080682

Alkin, M. and King, J. (2016). The Historical Development of Evaluation Use. American Journal of Evaluation, 37(4). doi:10.1177/1098214016665164

Bouk, D. (2015). How Our Days Became Numbered: Risk and the Rise of the Statistical Individual. University of Chicago Press.

Flyvbjerg, B. and Bester, D. (2021). The Cost-Benefit Fallacy: Why Cost-Benefit Analysis Is Broken and How to Fix It. Journal of Benefit-Cost Analysis, October, pp. 1–25. doi:10.1017/bca.2021.9. Available at SSRN: https://ssrn.com/abstract=3918328

Harford, T. (2021, Oct 9). The hidden costs of cost-benefit analysis: The Undercover Economist. Financial Times.

Hogan, L. (2007). The Historical Development of Program Evaluation: Exploring Past and Present. Online Journal for Workforce Education and Development, 2(4), Article 5. Available at: https://opensiuc.lib.siu.edu/ojwed/vol2/iss4/5

Jacquet, J. (2014). A Short History of Social Impact Assessment. doi:10.13140/RG.2.1.1470.5686

Jiang, W. and Marggraf, R. (2021). The origin of cost–benefit analysis: a comparative view of France and the United States. Cost Effectiveness and Resource Allocation, 19, 74. https://doi.org/10.1186/s12962-021-00330-3

Joyce, M. (2020). Impact Measurement: A cautionary tale. Medium. Available at: https://tinyurl.com/5pndpf98

Madaus, G., Stufflebeam, D. and Scriven, M.S. (1983). Program Evaluation. In: Evaluation Models. Evaluation in Education and Human Services, vol 6. Springer, Dordrecht.

Mayda, J. (1993). Historical roots of EIA? Impact Assessment Bulletin, 11(4), 411–415.

Niţă, A., Fineran, S. and Rozylowicz, L. (2022). Researchers' perspective on the main strengths and weaknesses of Environmental Impact Assessment (EIA) procedures. Environmental Impact Assessment Review, 92, 106690. doi:10.1016/j.eiar.2021.106690

Singh, G., Lerner, J., Mach, M., et al. (2020). Scientific shortcomings in environmental impact statements internationally. People and Nature, 2, 369–379. https://doi.org/10.1002/pan3.10081

Tulchinsky, T.H. (2018). John Snow, Cholera, the Broad Street Pump; Waterborne Diseases Then and Now. In: Case Studies in Public Health, pp. 77–99. doi:10.1016/B978-0-12-804571-8.00017-2. PMCID: PMC7150208

Griffith Centre for Systems Innovation · Good Shift
Griffith University's Centre for Systems Innovation exists to accelerate transitions to regenerative and distributive futures through systems innovation.