Why Doesn’t The Agile Community Practice Empiricism?

How we can improve our profession with more reliance on objective evidence, and less on personal opinions

Christiaan Verwijs

Published in The Liberators, Jul 26, 2021

You can also listen to this post as a podcast episode.

Try this. Open your personal feed on LinkedIn and find the first post that makes claims about Agile or Scrum. You can also check the most recent article on your favorite Agile blog, podcast, or YouTube channel. Now check if the person making the claim also backs it up with evidence that is not based on personal experience, personal preferences, or their own interests.

I sampled 50 recent posts from popular professional platforms (e.g. Serious Scrum, Scrum.org, Agile Alliance, and many others) for this post. I specifically looked for posts that made strong claims about Scrum, Agile or specific practices. For example, one post argued that SAFe was completely ineffective and merely marketing. Another post claimed that estimation of work produces less learning and takes more time. I also read a post that stated that Scrum Masters can’t also be developers while another argued against JIRA for Scrum teams.

If you’ve been part of the Agile community for a while, you’ll probably be nodding in agreement with these statements. I certainly did. Some of these statements have become self-evident truths among practitioners.

However …

Of the 50 posts I sampled, only 3 provided any evidence to support their claims. One referred to opinions by other “thought leaders”, which is highly subjective. Another referred exclusively to the personal experiences of the content creator. Only one post referenced an industry report by VersionOne. This is pretty remarkable when you consider that these claims have clear implications for the workplace if they’re followed by readers — which I’m sure is the aim of the authors. But is it really justified and ethically responsible for content creators to make such claims without being clear about their evidence?

My point here is not that these content creators did anything wrong individually. They merely illustrate a systemic issue in our field — and one that I am guilty of as well. We often make bold claims and suggest recommendations that are not backed up by equally strong evidence. Or worse, no evidence is offered at all. I believe that this is a serious problem in our profession.

Because what happens when seemingly self-evident truths are actually refuted by the facts? What if we find no meaningful difference between organizations that use SAFe and organizations that use other scaling methods? What if SAFe actually turns out to be more effective, at least for certain kinds of organizations? What if we found that having Scrum Masters who also act as Product Owners doesn’t meaningfully affect effectiveness? What if they do even better? What if teams that estimate their work are actually more effective than teams that don’t, once we measure this? Or what if it turns out to be more nuanced, where estimation works better in some contexts but not in others? How do we know that our strong claims don’t actually cause financial or personal harm?

Intermission: The Evidence Isn’t Clear-Cut

As it turns out, the evidence for the questions I raised above is not as clear-cut as the strong claims suggest. For example, Usman, Weidt & Britto (2015) offer a more nuanced view of the benefits and use of different estimation techniques in Agile environments. Putta, Paasivaara & Lassenius (2018) reviewed existing research on SAFe in 52 organizations and summarized both its challenges and its benefits. I was unable to find research that investigates the challenges of Scrum Masters who also act as Product Owners. But I have personal experience with instances where it worked quite well and instances where it didn’t. So I wouldn’t be surprised if a more objective investigation of this question suggested the same.

My point here is not to conclude that “estimation is great”, “SAFe is a good idea after all” and “Scrum Masters can also be Product Owners”. My point is that actual systematic attempts to investigate these issues with an empirical approach seem to offer a more nuanced perspective on these questions.

Why Don’t We Practice Empiricism On Our Claims?

What is revealing here is that the Agile community is supposedly all about “empiricism”. And while the kind of empiricism that happens in Agile and Scrum is arguably more about learning from experience, should we not also apply empiricism to the claims we make about what is true? Wikipedia defines empiricism as:

“Empiricism […] emphasizes evidence, especially as discovered in experiments. It is a fundamental part of the scientific method that all hypotheses and theories must be tested against observations of the natural world rather than resting solely on a priori reasoning, intuition, or revelation.”

Empiricism allows us to develop reliable knowledge about the world and to understand the consequences of actions. Reliable knowledge requires reliable evidence that is proportional to the claim. Even though gut feeling, intuitions, and personal preferences certainly have their value, they are not reliable sources of evidence. The astronomer Carl Sagan summarized this as “extraordinary claims require extraordinary evidence”. Sagan also urged us to be skeptical of claims that lack proportional evidence and to challenge those who make them. This moves discussions away from personal opinion and toward the quality of the evidence. These are far more productive discussions to have as they tend to encourage nuanced opinions. More importantly, the presence of objective evidence also makes our field more professional as it gives us better arguments to convince skeptics.

What Makes Good Evidence?

Good evidence is free from personal bias and is gathered and analyzed through unbiased methods. This means that other people can gather the same data (observations, cases, numbers) and use the same methods to reach the same conclusions. This is called “replication” in science, and it is the gold standard for how reliable knowledge is generated. I can’t possibly do justice to questions surrounding this topic in this post, ranging from the nature of knowledge (epistemology) to the nature of being (ontology), the limits of the scientific method (philosophy of science), and how conclusions can be drawn from data (statistics, inductive methods). I also don’t think these questions are important for the purpose of this post.

Because I don’t expect content creators to adhere to this gold standard. I also don’t expect practitioners in the field to do so. It is exceptionally difficult to gather strong objective evidence to support similarly strong claims. It also takes specialized skills to analyze data and reach sound conclusions. But instead of focusing on what we can’t do, here is what I think we can do:

A Call To Action

As content creators, we should be clear about the kind of evidence we use to support our claims. When we claim that “SAFe doesn’t work”, we have to be explicit about the evidence we use to support that claim. The evidence could be based on the teams you have personal experience with. It could be based on your own scientific research in hundreds of organizations. Or it could be based on research done by others. This allows for a more fruitful discussion about the quality of the evidence rather than the opinions.
We have to recognize the quality of the evidence compared to the strength of the claim. You can’t responsibly claim that “JIRA destroys transparency in Scrum teams” based on the handful of Scrum teams that you’ve worked with. Such a small sample isn’t representative of all Scrum teams. And your experience is probably biased by your preferences or your desire to create catchy content. You can always say: “Based on my experience with five Scrum teams I believe that [ statement ]”. If you’re writing an opinion piece, make it exceptionally clear that it is your opinion and not necessarily the truth.
We should be explicit about our own biases. We should consider and be clear about what may have biased our claims and the evidence we use. It is often easy to find only evidence that supports your claim (confirmation bias) or to extrapolate a few observations to everyone (hasty generalization). We may also have a stake in some opinions because a catchy and strong opinion draws in more readers.
We should connect the professional world with the academic world where we can. There is a surprising amount of useful academic research going on in organizations that practice Agile. A simple search on Google Scholar yields dozens of academic publications about Product Ownership, risk management in Agile environments, scaling efforts, and coding practices. Unfortunately, these appear to be separate worlds with few cross-overs. Let us connect with them where we can.
We should recognize that hearsay is not evidence. What someone else says is not evidence in its own right, regardless of how “true” it may sound or how much authority they seem to have. This is related to “truthiness”: the idea that something sounds very true even if it flies in the face of the facts. There is a lot of “truthiness” and groupthink on social media. This brings us to the final point.
We should be skeptical of strong claims. We should be highly skeptical of the claims made and the evidence presented to support those claims. If the creator isn’t specific about it, challenge them to present it. If the quality of the evidence doesn’t support the claim, challenge the creator.

Closing Words

I will be the first to admit that my content doesn’t follow these principles nearly as often as it should. You can challenge me on that.

My point with this article is not to argue that all professional content should present very strong (objective) evidence. There is always a place for opinionated conversations and column-like discussion pieces. But are we truly a professional community when most of the content seems to be of this nature? My point is that we should seek out objective evidence to support (or reject) our claims and beliefs where we can. We should also be more transparent about what kind of evidence we use to make our claims. And we should challenge strong claims that are made without strong evidence — especially from highly visible thought-leaders, gurus, and influencers. This group, in particular, bears a significant professional responsibility to present their evidence transparently.

I dream of an Agile community where practitioners work together to clarify beliefs they have and then investigate evidence — link to existing research, collect their own data, do case studies — to either confirm or reject those beliefs based on what they find. Others can replicate this or build on it. Since we’re not scientists, we can build bridges with academics to do the science stuff with us. I believe that such a process would offer more nuance and provide a stronger grounding for our work with clients. Wouldn’t this be a wonderful way to improve our profession? Wouldn’t that be a great way to apply actual empiricism to our work?

Limitations of this post and its conclusions

What is my evidence?: I sampled 50 blog posts from well-known industry platforms to reach my conclusion that we rarely use evidence in our professional community to support strong claims. I inferred the rest of my claim from this evidence. It would be useful to repeat this process with a larger sample, and to include any post rather than only posts that make strong claims.
How might I be biased?: I am trying to bring more scientific insights into my writing, and contribute to science with my own research. I do this for the reasons outlined in this post, but am also biased towards it.
How can we falsify my point?: My argument fails if we can establish that most content creators support the claims in their posts with evidence of sufficient quality.
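
As a rough illustration of why a larger sample would help, here is a minimal sketch that puts an uncertainty range around the "3 out of 50 posts" figure from this post. It uses the Wilson score interval, a standard textbook method that is swapped in here purely for illustration and is not the method used for this post. The function name `wilson_interval` is hypothetical, and the calculation assumes the posts were drawn at random, which the sample in this post was not.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%).

    Assumption: the n observations are an independent random sample.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")

    p_hat = successes / n                      # observed proportion
    denom = 1 + z**2 / n                       # shrinks the estimate toward 0.5
    centre = (p_hat + z**2 / (2 * n)) / denom  # adjusted midpoint
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    )
    return centre - half_width, centre + half_width


# The figure from this post: 3 of 50 sampled posts offered any evidence.
low, high = wilson_interval(3, 50)
print(f"Plausible share of posts with evidence: {low:.1%} to {high:.1%}")
```

Calling the function with the same proportion but a larger `n` (for example `wilson_interval(30, 500)`) shows how the range narrows as the sample grows, which is the practical argument for repeating the exercise with more posts.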