Are you really ready for developmental evaluation? You may have to get out of your own way.

Jul 2, 2019

Tanya Beer, Center for Evaluation Innovation

Developmental evaluation requires changing the way evaluation commissioners and evaluators work together. If it feels like business as usual, you’re probably not doing it.

On May 31, 2019, Community Science sponsored the webinar “Developmental Evaluation: Rewards, Perils, and Anxieties.” A foundation officer (Kristi Ward, Bush Foundation) and two evaluators (Kien Lee, Community Science and Tanya Beer, Center for Evaluation Innovation) discussed what developmental evaluation is intended to be and the realities of practicing it. This article is based on Tanya Beer’s introductory comments.

First, let me clarify that this is not an introduction to developmental evaluation (DE). I am assuming you’ve done some thinking and reading, and perhaps even working, on developmental evaluation. [If not, start here and then come back and read this before you commission or conduct a developmental evaluation.]

It’s time for a candid conversation about how the conventional routines of both funders and evaluators are getting in the way of high-value DE.

I work for a nonprofit that focuses on improving the value and practice of evaluation in philanthropy, where evaluation commissioners have a high degree of freedom to experiment with both their social change strategies and their evaluation approaches. [I suspect, however, that many of the same tensions and lessons below also will apply to public sector DE work, where less flexibility for strategic and evaluative experimentation can make routines and practices even harder to budge.]

These reflections come from my experiences as a developmental evaluator for foundations, and from the Evaluation Roundtable, a network of evaluation leaders from over 100 foundations in the US and Canada. We regularly interview these leaders about their experiences with evaluation, so some of our observations about what gets in the way of meaningful DE engagements come from them.

Let’s start with some level-setting on the definition of developmental evaluation.

I’m drawing on a definition offered by Michael Quinn Patton, who literally wrote the book on developmental evaluation. Patton says:

“Developmental evaluation supports innovation development to guide adaptation to emergent and dynamic realities in complex environments by collecting and analyzing real-time data in ways that lead to informed and ongoing decision making as part of the design, development, and implementation process.”

It’s a meaty definition. Yet we’ve observed that significant aspects of it are either routinely ignored or swamped by competing evaluation demands and mindsets. As a result, we have a hunch that DE has entered the zone of jargon that — just like “learning” or “collaboration” or “systems change” — has come to represent a wide and muddy range of concepts and behaviors. So it’s worth giving the definition some real attention.

First, as Patton regularly reminds folks in his writing and workshops on DE, the primary distinction between DE and other evaluation approaches is a distinction in purpose.

**Developmental evaluation is done for the purpose of supporting innovation development.**

What does this really mean? Is “supporting innovation development” something distinct from applying lessons learned via evaluation to one’s program or strategy? An evaluator colleague and I have had long debates about whether DE is a distinct approach or just good utilization-focused practice. After all, you can learn from and use any kind of evaluation.

We’ve heard many funders say they’re looking for a developmental evaluation, but when we probe, we discover they’re using the term “developmental evaluation” as a stand-in for “learning.” What they really want is an evaluation they find useful for learning, regardless of whether they are working on innovative strategies that are: 1) under development (rather than fully-baked) and 2) expected to adapt in response to an emergent and dynamic context.

We’ve also seen many evaluators pitch DE by peppering proposals with phrases like “real-time learning” or “complexity,” but then they do an evaluation that looks and feels no different from a conventional formative evaluation. They lay their questions and methods out at the start and schedule a predictable supply of “lessons learned” deliverables.

So is there a difference between evaluation that supports learning in general and evaluation that supports innovation development? I think there is.

Our field’s hazy distinction around this is problematic because it makes it difficult for commissioners who really do want DE to find evaluators who operate with a clear understanding of how DE is distinct from formative evaluation or just learning in general — and vice versa. Getting clarity on this distinction requires a quick look at the origins of evaluation practice.

Our conventions for commissioning and engaging in evaluation still tend to assume linear, programmatic conceptions of change, even when we talk a good game about innovation, complexity, and emergence.

The discipline of evaluation was born in a technocratic era of models, projects, and programs, where the nature of change work was about developing a set of components that, if implemented well, would reliably produce the same outcomes across time, locations, and contexts. (I recognize this is reductive and covered elsewhere more thoroughly, but bear with me…)

Now let’s differentiate this program-based mindset from innovation, using the definition offered by Michael Quinn Patton, Kate McKegg, and Nan Wehipeihana in their book Developmental Evaluation Exemplars:

“Innovation as used here is a broad framing that includes creating new approaches to intractable problems, adapting programs to changing conditions, applying effective principles to new contexts (scaling innovation), catalyzing systems change, and improvising rapid responses in crisis conditions.”

This kind of innovation occurs in dynamic conditions where all kinds of factors interact in ways that we cannot fully predict or, of course, control. And despite our best efforts to draw increasingly complicated theories of change that predict it all, innovative strategies must by their very nature emerge and be adapted along the way. Both the pathway to change and even the outcomes themselves are likely to shift as actors negotiate their way through sticky systems dynamics.

This makes the second important part of Michael’s developmental evaluation definition worth pulling out:

“…to guide adaptation to emergent and dynamic realities in complex environments.”

Supporting innovation development requires that evaluators bring data and facilitate sensemaking that offer innovators insight about existing and emergent dynamics and how they might shift their innovations in response.

The language many use to *talk* about their work, which aligns with the definition of DE (“Change is complex! Systems are dynamic! We have to learn our way into solutions!”), is often at odds with the way the work actually is designed (“We must have a detailed theory of change before we can design an evaluation. We must plan out the indicators and outcomes and methods and report at regular intervals over the next five years. We must deliver lessons about progress or impact at predetermined moments as determined by our contractual agreement.”)

So even though evaluators and strategists know the work is complex and uncertain, the way it is carried out does not always reflect this understanding. This can leave strategists feeling that DE failed to help them innovate and adapt in powerful ways, and the evaluator feeling like the learning they offered didn’t matter to strategists’ decisions.

The funder isn’t really innovating and the evaluator isn’t really helping to support it. Everyone is dissatisfied. So what’s the problem?

Some of the most basic and taken-for-granted practices of evaluators and evaluation commissioners are at odds with innovation in complex systems.

These practices are codified into our evaluation commissioning, designing, contracting, and budgeting routines, and even into evaluation firms’ basic business models. They create tough-to-negotiate tensions with DE.

Tension 1: Our routines around setting pre-determined evaluation questions, methods, timelines, and deliverables do not fit with the uncertainty and unpredictability of innovation in systems.

I have seen many evaluation commissioners put out a DE request-for-proposals with all of the evaluation questions already listed, along with a request for applicants to detail their full design, methods, and deliverables.

Additionally, evaluators and evaluation commissioners can find themselves tied in knots at the outset of a DE when evaluators, who likely were trained in a program-based evaluation mindset, ask “What’s your theory of change and how clear is it?”

For innovations in complex systems, the theory of change cannot be (or should not be) fully fleshed out at the start. If it’s actually innovation, the best a DE evaluation commissioner can say is, “We have some initial hypotheses about where to start and what we want our end game to be. But we won’t know how this will unfold until we get into the field and see how people respond to it.” This creates a whole different situation for evaluators who are accustomed to crafting full evaluation designs at the outset.

If you start your DE with pre-set questions and a fully developed theory of change, you already have missed the point. You have focused your lens too narrowly and assumed predictability when what we need is the space for questions to emerge and for deploying methods that are responsive to unanticipated questions and decisions.

Just as developmental evaluators need design flexibility, they need budget and scheduling flexibility since they can’t chart out entirely how the work will look (if they have, they’re not really offering DE). Strategists will need the support of a developmental evaluator with a lot of intensity in some phases of the work and not others. When new roadblocks or challenges arise, evaluators have to collect data rapidly that they were not anticipating. Our traditional contracts with fixed deliverables and timelines generally don’t allow for this kind of flexibility. And evaluation consultants’ staffing models struggle to accommodate it.

Tension 2: With DE, the need for real-time data is in tension with conventional conceptions of methodological rigor, requiring a constant negotiation of trade-offs between level of certainty and speed.

If strategists actually are approaching their work as innovation in complex systems (again, often they’re not when they say they are), DE questions will emerge that we can’t anticipate in advance. Let’s say we tried something on the ground and encountered unanticipated resistance and we don’t understand why. We have to identify quickly what fast-paced methods we can use to get the data that helps us see our way through the dilemma.

Evaluation commissioners and contractors really struggle with DE when they are uncomfortable with small samples and other challenges that limit confidence in the conclusions we’re drawing. The amount of time it takes to collect data and do robust analysis means evaluators often miss decision-making opportunities. For DE to work, evaluation commissioners and consultants have to decide together how much certainty is needed versus how quickly they need the information and what trade-offs they’re willing to make, all the while being honest with themselves about the potential biases and blind spots that information may have.

Tension 3: Conventional contract accountability mechanisms, such as payments triggered when pre-set deliverables are provided, are in tension with the naturally uneven flow of innovation and DE.

Our current evaluation contracting models are transactional. They purchase products or knowledge at predetermined points in time. We hold evaluators accountable for the deliverables they produce. This hamstrings the evaluator’s ability to say, “I don’t think what we thought would be useful six months ago is what we need right now.”

With developmental evaluation, the work is far more consultative, meaning we’re in it together, deciding together what’s needed at what point in time. A lot of time is spent doing sensemaking — not in the production of data and deliverables, but in the actual exchange between strategists and evaluators to say, “What does this mean? What are we seeing? What are the patterns? Where should we head from here?”

The accountability mechanism we build into DE contracts needs to look different so that we’re not locking evaluators into a specific set of actions and timelines. Instead, we need to think about accountability more like we would for a capacity-building technical assistance provider or a strategy consultant, where contractual accountability is based on the expectation that you’re “buying” time and quality thought partnership over the long run.

Tension 4: The inevitable moment when external evaluation users such as foundation boards and leaders ask “what’s our impact” is in tension with the focus and fundamental purpose of DE.

Given DE’s purpose, the primary users are strategists themselves. I have been in many situations where evaluation commissioners say they want DE, and then halfway through, their board asks for conclusive information about outcomes or impact. We are charged with responding to that need even though our DE lens hasn’t been focused on drawing conclusions about impact. We’re caught in the middle of competing demands without a clear sense of how to proceed.

We’ll always have to help boards and other evaluation users think differently about their accountability roles with innovation and emergent strategies. We have to find ways of setting expectations about what a developmental evaluation is and isn’t, and to manage the inevitable questions that will come up about outcomes and impact.

Given these tensions, what kind of disposition do evaluation commissioners need to engage effectively in developmental evaluation?

Everyone has that one friend whose house you would NEVER show up at unexpectedly. The one who breaks into a sweat when you add a fifth person to the four-person dinner reservation at the last minute. Sometimes I suspect there are a disproportionate number of those folks in the evaluation field. (And thank goodness that’s the case.) But this also means that not every evaluator, nor every funder, is cut out for DE, regardless of their historical experience with other kinds of evaluation or even their commitment to “learning.” DE seems to take a certain disposition and orientation towards the unknown.

If you are in an organization that is uncomfortable with ambiguity, or that does not allow for significant shifts and adaptations along the way, you may not be able to use DE in a way that makes it worth the effort. You might get great feedback from DE, but you’re too constrained to use it because strategy changes require a long sequence of approvals.

You also need to truly value continuous learning in ways that you can point to operationally in your everyday work. Do you have sufficient resources and staff bandwidth to engage in ongoing inquiry? Are program strategists hungry for information and open to the idea that their assumptions may not be on the mark? Will evaluators have a hard time getting on program staff calendars? (In our experience, this is a chronic problem at foundations.) Are you ready to have an evaluator with you regularly at the strategy table listening for questions, dilemmas, and issues that are emerging?

A general interest in learning does not mean that the organization is a good fit for developmental evaluation. We can learn from any kind of evaluation, but the degree to which we’re actually innovating, adapting, and being responsive to push-back from the system around us is the thing that makes our work ripe for DE.

Evaluation commissioners have to be transparent and candid with their consultants about the degree to which these conditions are in place. Commissioners and consultants have to talk through how dynamics and mindsets within the foundation might affect the work, and foundation evaluation staff have to serve as internal advocates for getting the developmental evaluator the time and flexibility they need.

Developmental evaluators also have some dispositional requirements.

Developmental evaluators, too, have to be comfortable with uncertainty and ambiguity. I like being analytical. I like laying things out in sequence and thinking thoroughly and deeply into the distance to clarify assumptions and theories of change. (In fact, I’m not a big fan of people showing up at my house unexpectedly!) Our first instinct is to force our clients to clarify their theory of change so that we can evaluate against it. But with innovation, sometimes strategists only have a big vision, a general sense of what new ways of working or new approaches are needed, and a set of starting hypotheses for how to make it happen. We evaluators need to be able to sit with that uncertainty, to sense when it’s time to push for clarity and when (and about what) it’s okay for innovators to be unsure.

Developmental evaluators also need tolerance for conflict and disagreement. The evaluator needs to be comfortable pushing back on assumptions and surfacing conflicts in thinking. Evaluators often aren’t accustomed to saying, “I think you’re thinking about this wrong. Let’s go test it.” On the other side, for those commissioning developmental evaluations, you have to create and protect that space for the evaluator. Foundations tell us they want evaluators to push back on their thinking, but evaluators tell us that foundations are not often open to that in practice.

Developmental evaluators need to be comfortable participating in strategy discussions. We’ve all heard that the developmental evaluator functions like a “critical friend.” We underestimate how uncomfortable that role can be for both the commissioner and the evaluator.

This role asks us to move far away from our traditional outsider stance, distanced from strategic discussions and decisions. Developmental evaluators need to hear strategy discussions and bring in evaluative information to help people consider in the moment, “How are you thinking about this? What assumptions are you making? How is it unfolding? Given what we know, what might we anticipate about how this will be received?” This role can be uncomfortable for evaluators who feel like they’re walking a fine line between being a strategist and being an evaluator.

Developmental evaluation can create anxieties and disappointments for evaluation commissioners and evaluators who apply conventional evaluation routines to an evaluation endeavor that is fundamentally different.

DE requires different ways of doing evaluation business on both sides of the evaluation contract. Applying conventional ways of evaluating to unconventional ways of strategizing will lead to anxieties, disappointments, and ultimately wasted evaluation potential.

We’ve learned — often painfully — a few basic tips about how to get out of your own way from the very beginning of a DE:

Tips for evaluation commissioners

Design Requests for Proposals differently. When creating an RFP for developmental evaluation, don’t lay out all of your evaluation questions and ask applicants for a full description of the evaluation design, methods, and deliverables. Instead, identify a few early-stage questions that are currently puzzling you, and ask applicants to suggest some ideas for how they might bring relevant data to the table. Focus the bulk of the RFP on understanding how the applicant envisions working with the team over time to identify emerging questions, decide when data collection is necessary, and feed critical information back to the team for joint sensemaking in a timely way. How does she/he understand innovation and systems? What has she/he learned about what it takes to navigate organizational dynamics to really integrate DE into the strategic decision-making process?
Allow for flexible budgeting. Anticipate higher budgets than other types of evaluations, as evaluators need to be much more present at the strategy table in an ongoing way. Don’t ask for budgeting detail beyond a rough initial outline (evaluators can only give line-item detail when they know exactly which methods will be used and when). Instead, consider setting aside an amount to be expended quarterly or semi-annually, with the evaluator and you together deciding how those resources should be allocated to particular evaluative approaches or methods as the work progresses and questions emerge. Whatever you do, don’t cut the budget for time together in meetings (usually the first thing to go when budgets are being trimmed). Time for the evaluator to listen for hidden or conflicting assumptions and emerging questions, as well as to help the team make sense of what they’re seeing and grapple with what the data might mean for next steps, is an indispensable aspect of DE.
Increase time and access. Plan for your evaluator to be present at the strategy table far more frequently than for other kinds of evaluation, and even more often during the first several months of the engagement as they build a deep understanding of the strategy team’s aims and thinking. Invite the evaluator to operate as a core member of the strategy team, albeit with a distinct role. Do not reserve separate end-of-meeting time slots for presentations on evaluation, but rather invite evaluators to bring data and tough questions to the table throughout strategy discussions, grantee meetings, etc., so that strategists don’t fall into old patterns of thinking about evaluation as reflecting on the work rather than informing it. Eliminate gatekeeping between evaluators and strategists, grantees, participants/beneficiaries, and other stakeholders. If your evaluation staff person is the main contact for the DE team, you’re already limiting the potential for DE to serve the real information needs of strategists.
Redesign contract terms. Rather than pinning contract accountability and payment to specific predetermined deliverables, consider designing a contract that triggers payment based on a regular performance review where the program team, the evaluator, and the commissioner’s contract manager review the array of evaluative work that has been conducted over the past 6–12 months and have a candid conversation about 1) whether expectations and agreements are being upheld by both sides, and 2) what developments in the innovation the evaluation has supported (see “Keep track of developments” in the tips for evaluation consultants below). Note: This is a significant and risky shift for evaluation firms, since the sudden loss of a contract that was expected to last much longer can be devastating to a business, particularly a small firm. Consider building in a “probationary” or ramp-down period if the parties decide the work isn’t adding sufficient value. Otherwise it’s too risky for the evaluator’s ability to maintain necessary cash flow to pay their staff.
Plan for the impact question. Anticipate that your board will ask questions about the impact of the systems innovation (probably prematurely) and protect the developmental evaluator from this kind of mid-way switch-up in purpose, questions, user, and design. Remind the board as often as possible about the nature of complex systems innovation and what to expect from developmental evaluation. Consider asking the evaluator to commit some of their resources to track emerging outcomes in preparation for the board, while keeping a significant portion of the budget open for developmental questions (what Patton calls “patch evaluation design”). Alternately, consider engaging a separate evaluator at a different point in time to do a retrospective evaluation that will determine the impact of the innovation on any outcomes that begin to appear.

Tips for evaluation consultants

Push back on commissioners. Look for — and push for — all of the above with evaluation commissioners. Don’t be afraid to say “I don’t think you really want DE” when commissioners are asking for fully designed approaches, line-item budgets, conclusions about impact, and/or are signaling that your access to program strategists will be limited.
Experiment with flexible staffing. Consider how your staffing models could be restructured to allow for rapid deployment of unexpected methods, e.g., a floating pool of “rapid response” staff with good methods chops and cross-training, a bank of outside subcontractors at the ready, etc. (though it will be critical to maintain the core team at the strategy table throughout the engagement). If your relationship with the commissioner is solid, consider asking for a consistent payment each quarter, even if the work is more intense in some and less intense in others, tracking so that it all comes out even at, say, the end of the year. (Frankly, we haven’t quite solved this challenge either, so let us know what you come up with!)
Allow for unknowns and “I have no idea.” Check your habit of asking for fully fleshed-out theories of change at the outset. Instead of getting frustrated with the lack of a clearly articulated theory or anticipated outcomes on the part of the strategists, consider how data collection and other evaluative work can help people explore what options are possible moving forward in shorter, testable increments. In other words, how can your evaluative support help them work their way towards a theory of change through more rapid-cycle testing and feedback? For example, if innovators don’t know what will come of action X but they have a hunch it’s important, design data collection to help capture the range of outcomes that emerge from action X to determine whether that pathway shows promise.
Stop doing midterm and final reports. Instead of packaging and delivering evaluation findings at regular pre-planned intervals, get accustomed to lighter-burden, rapid delivery of insights when the strategy question is actually on the table. This may take the form of a short learning brief focused on an individual question, a quick deck or visualization, a real-time participatory data collection and sensemaking meeting, or even just a phone call to say to strategists, “Here’s what we’re seeing in the data. What do you think its implications are?” Be liberated from the final report.
Keep track of developments. In our experience, it’s difficult for people to see how their own thinking and strategy change over time. As we come to understand more about the system we’re working to change, we forget what we used to believe and assume. It can be useful for developmental evaluators to keep a log of key developments in the innovation, as well as the team’s rationale for its changes in strategy. This makes visible the strategists’ deepening understanding of how to make the changes they want to see, and can help you illustrate where and how DE contributed to that growth.

Above all, just like innovative strategists, developmental evaluators and commissioners need to approach their work together as an innovative enterprise…testing, learning, and adapting the evaluative approach as we go.

Tanya Beer is associate director of the Center for Evaluation Innovation and co-director of the Evaluation Roundtable. Email: tbeer@evaluationinnovation.org