Why Science Is Essential To Professionalize Our Community

Christiaan Verwijs
Published in The Liberators
10 min read · May 20, 2024


A while ago, I appealed to the Agile community to rely more on scientific research and less on personal opinions. I’m worried that our community’s overreliance on authority arguments, biased personal experience, hearsay, and wishful thinking is part of why Agile is in decline. Our profession is actively damaged by those who make strong claims without sufficient evidence or sell solutions without proof that they work. How do we know that what we’re peddling to teams and customers isn’t snake oil? The bar for what we consider “true” in our profession is quite low.

I’m not alone in this appeal. I also see it in the work of people like Nico Thümler, Karen Eilers, Luxshan Ratnaravi, Takeo Imai, and Joseph Pelrine. Since my appeal, I’ve worked to bring more scientific research into our profession, through my own research projects and a series of blog posts that summarize scientific knowledge about a range of Agile-related topics. At the same time, there has been a surprising amount of pushback from our community. Many people have also approached me personally, asking how this affects their work.

In this post, I will respond to the most common criticisms and questions about incorporating more scientific research into our field and clarify what my appeal doesn’t mean.

Together with Prof. Daniel Russo, Ph.D., I published a large-scale, peer-reviewed study about what makes Scrum teams more effective. We worked hard to make it easily digestible for the professional community. The publication itself is more technical and dry but worth a read.

Why is scientific research important to me?

The Agile profession is relatively young and vibrant. It is also deeply idealistic. A cursory look at LinkedIn or news aggregators like Serious Scrum shows many deeply held beliefs about what is and isn’t right. Most of us are in this game because we feel strongly about lofty purposes like “delivering value,” “creating human workplaces,” “enabling empiricism,” and so on. We also have a lot of beliefs about what is required to do so. For some, this is answered with specific frameworks (like Scrum, Kanban, XP) or the dismissal of others (SAFe).

It also manifests in beliefs about how teams should be formed (diverse versus homogeneous, fluid versus stable, technical versus non-technical), what roles are needed, what specific practices teams should use, and what cultural traits must be established in organizations. In other words, there are many things that each of us believes to be true about how teams and organizations should work. However, as the many lively debates on social media show, there is also a lot of disagreement about what is true. Some argue fervently that product managers and product owners are the same; others don’t. Some argue for estimation; others are vehemently against it. And so on. Such questions are empirical, so they can be robustly answered with sufficient data, yet this is rarely attempted in the onslaught of opinions.

“There are a lot of things that each of us believes to be true about how teams and organizations should work. However, as the many lively debates on social media show, it is also clearly true that there is a lot of disagreement.”

Scientific research is important to me because it is the best systematic approach we have developed to determine what is true in the natural world. When practiced properly, it is the least prone to biases and the most likely to approximate how things actually work in the world around us. As a professional, I feel it is my ethical responsibility to practice what works and discard what demonstrably doesn’t. I also feel it is my ethical responsibility to back up claims with evidence proportional to the strength of the claim. If I make a generalized claim like “All Scrum Masters need technical skills,” I need to bring evidence from a large sample of Scrum Masters with and without technical skills and correlate those skills with the effectiveness of their teams.

If that evidence doesn’t exist or I’m not aware of it, I should dial back the strength of my claim to be proportional to the evidence: “Based on my experience with X teams, I’ve noticed that Scrum Masters with technical skills seem to have more effective teams.” This is clearly where scientific research comes in. As the best method to determine what is true, we should primarily use scientific evidence to back up our claims about certain practices and do’s and don’ts. We can also use such evidence to challenge claims it doesn’t support.
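To make this concrete, here is a minimal sketch (in Python, with entirely made-up numbers) of what testing such a claim against a larger sample might look like. The survey data, sample sizes, and the 1–7 effectiveness scale are all assumptions for illustration, not the method behind any actual study:

```python
# Illustrative only: compare team-effectiveness scores for teams whose
# Scrum Master does or does not have technical skills, using hypothetical data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

# Hypothetical effectiveness scores (1-7 scale) from a survey of 250 teams each.
technical_sm = rng.normal(loc=5.4, scale=0.8, size=250)
non_technical_sm = rng.normal(loc=5.2, scale=0.8, size=250)

# Welch's t-test: is the observed difference larger than sampling noise?
t_stat, p_value = stats.ttest_ind(technical_sm, non_technical_sm, equal_var=False)

# Cohen's d: is the difference *meaningful*, or merely detectable in a large sample?
pooled_sd = np.sqrt((technical_sm.var(ddof=1) + non_technical_sm.var(ddof=1)) / 2)
cohens_d = (technical_sm.mean() - non_technical_sm.mean()) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```

Even a toy exercise like this makes explicit what a bold, generalized claim quietly assumes: a representative sample, a defensible measure of effectiveness, and an effect large enough to matter.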

Is there value in individual experiences?

Many Scrum Masters, Agile Coaches, and others share their personal experiences on blogs and social media. Does my appeal imply that those experiences are useless in uncovering what truly works (or not)?

Not at all. There is always space for individual experience. The scientific method is built on the notion that the experiences we derive through our senses should be the primary source of knowledge. Our individual experiences in our work with teams and in organizations can inform us how things might work. For example, you may experience that large teams are less effective than smaller ones. You may also observe that pair programming leads to better code.

The problem with individual experience is that it is prone to many biases. Our brains are wired to remember the instances that confirm our beliefs and ignore those that don’t (confirmation bias). If I believe that all teams need to be as stable as possible, I’m more likely to remember the instances where this was true. Furthermore, selection bias can easily limit me to cases that match my beliefs. For example, I’ve grown into Agile in small organizations, and I long believed that Agile works better there than in large organizations. However, my sample is highly biased, and more objective data doesn’t seem to support my belief. A more extreme version is the “N=1 bias,” where we extrapolate a single experience to cover all cases. This is typical of many “best practices” posts in our industry. It leads some people to discard estimation entirely because it doesn’t work in their organization, or to argue that “product ownership” and “product management” must be discrete fields because that was the case in their company.

“Individual experience can inform us about potential patterns. But we can’t generalize those patterns with confidence to other settings.”

In summary, our individual experiences can certainly inform us about potential patterns (e.g., “SAFe wreaked havoc in my organization” or “Pair programming improved our code quality”). But those observations can’t be turned into generalized claims that are true for everyone without stronger evidence. Generalization is only feasible when we can pool many people's experiences from a representative sample and confirm that the pattern is also present there. If this is the case, we can confidently make our claim.

This is essentially what many scientific fields aim to do; they identify potential patterns from individual experiences (qualitative research) and test if those can be generalized to the population (quantitative research). This is why we can say with quite some confidence that teams learn more effectively when there is psychological safety (Edmondson, 1999). We also know that teams that release more frequently tend to have more satisfied stakeholders (Verwijs & Russo, 2023). And so on.
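As a toy illustration of that qualitative-to-quantitative step, here is how one could quantify the uncertainty around a pooled pattern. All numbers here are invented for the example; they are not taken from any of the studies cited above:

```python
# Illustrative only: suppose we survey a representative sample of 400 teams
# and 280 report that releasing more frequently improved stakeholder
# satisfaction. How much confidence does that sample justify?
import math

n = 400    # sample size (hypothetical)
k = 280    # teams reporting the pattern (hypothetical)
p = k / n  # observed proportion

# 95% confidence interval via the normal approximation.
se = math.sqrt(p * (1 - p) / n)
low, high = p - 1.96 * se, p + 1.96 * se

print(f"Observed: {p:.0%}, 95% CI: [{low:.0%}, {high:.0%}]")
```

With 400 responses, the interval is reasonably tight; with the N=1 of a single personal anecdote, it spans nearly the whole range, which is precisely why individual experience alone can’t carry a generalized claim.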

Should we only consider evidence from experiments?

Another line of reasoning is that people agree that scientific evidence is useful but then raise the bar so high that nothing qualifies. In effect, this lets them make all sorts of claims without grounding them in scientific evidence, because collecting such evidence is supposedly impossible anyway.

One example is when people argue that only evidence derived through double-blind, controlled experiments is acceptable. These are studies where the researchers modify one variable while keeping everything else equal (“controlled”) and then measure certain outcomes. If the results change because of this modification, the change can be attributed with high certainty to the modified variable. Ideally, neither the participants of the study nor those performing it know what is being tested, making it “double-blind.” While it is true that such studies offer a high degree of confidence, it is unreasonable to demand this standard for all scientific evidence in all domains. It often doesn’t even make sense. Let me explain.

Experimental studies are viable in scenarios where the researchers can keep all other variables equal. This is possible in some branches of the natural sciences, like physics, chemistry, and medicine. In most other fields, however, experimental designs are impossible, infeasible, or unethical. A historian can’t modify something in the past to see how it affects the present. An astrophysicist can’t make one star explode while keeping an identical one intact to see how it interacts with an identical planetary system. A developmental psychologist can’t raise the same child twice in two different settings to see how that affects development. An organizational researcher can’t force everything in an organization to remain the same while implementing a certain methodology.

For all their strengths, experimental designs are not suited to every research question, nor possible in every context. The system under study is often too complex to control all potential variables, as in the social sciences, climate science, and ecology, where a thousand variables can influence what happens (culture, personality, personal beliefs, events, etc.). There are also ethical reasons not to use experiments: we can’t expose teams to extreme stress or psychological unsafety just to see how they react. Instead, scientists develop evidence with various methodologies, including observational studies, theoretical models, case studies, meta-analyses, and more, to comprehensively understand a phenomenon. Each method has its strengths and limitations, and the choice depends on the nature of the research question and the available resources.

Instead, we should weigh the evidence from diverse methodologies (qualitative, quantitative, survey-based, correlational, and so on) to inform how true a claim is. Academic communities regularly publish meta-analyses and literature reviews that weigh the scientific evidence around topics like psychological safety, effective leadership styles, team stability, and scaling frameworks. You can find those on Google Scholar (select “Review articles” as a filter).

In summary, the argument that we can only rely on evidence from experimental, double-blind studies is unreasonable. It would effectively discard most scientific evidence and disqualify the scientific method altogether for most domains. Instead, we should inform ourselves about the scientific consensus around relevant areas and use that to decide which claims are grounded in evidence and which are just snake oil.

“We should inform ourselves about the scientific consensus around relevant areas and use that to decide which claims are grounded in evidence and which are just snake oil.”

What does this mean for your practice?

With my appeal, I do not expect everyone to engage in scientific research themselves. While I applaud more community-driven research, performing proper scientific investigations is a profession in its own right.

It's useful to look at how more mature professions handle this. For example, psychologists have non-commercial institutions like the APA that educate, train, and license psychologists. They also fund research projects and inform their members about scientific studies. Similarly, engineers have the IEEE, architects have the AIA, and project managers have the PMI. Such professional institutes exist to protect their professions. In our profession, the Agile Alliance comes closest to what these institutes do. Other institutes in our industry are for-profit, like Scrum.org, Scrum Alliance, Scaled Agile Inc., and others. While they may invest in research, their priority lies in selling commercial services rather than improving the profession.

“Since we don’t have strong non-profit institutions in our profession like the APA, IEEE, PMI, or AIA, the responsibility to protect our profession is more up to the professionals themselves.”

Since we don’t have strong non-profit institutions like the APA, IEEE, PMI, or AIA to provide independent guidance, protecting our profession is more up to the professionals themselves. Here is what you can do:

  1. Don’t make bold claims to customers or on social media without presenting strong evidence that supports them. A bold claim is something like “SAFe is the worst thing that happened to Agile.” You can still share this as an opinion but demarcate it as such (“In my experience …”). Anything less is unprofessional, in my view.
  2. To assess the strength of the evidence for a claim, consult the scientific consensus around it. Google Scholar is a great starting point. I recommend searching for fairly recent review articles (say, from the last two decades). I wrote much more about how to do this here. Academic articles aren’t always easy to read, but the abstract, introduction, and conclusion usually help. I’m trying to summarize the scientific consensus around several Agile-related topics, like the optimal size of teams, the benefits of pair programming, whether Scrum works better in small organizations, the effects of scaling on team effectiveness, the effects of stress and motivation, and so on.
  3. You can contribute to scientific research by connecting with academics like Daniel Russo, Nils Brede Moe, Maria Paasivaara, and Margaret-Anne Storey. They are often looking for participants or participating organizations for their studies.
  4. You can gently challenge others who make bold claims on social media to present their evidence. If their evidence is based solely on their own experience, the hearsay of others, or their business interests, we can disregard the claim until actual evidence is provided. If we do more of this as a community, we can continue to professionalize our field.

Closing Words

In this post, I responded to some of the comments made when I appealed to our community to rely more on scientific research and less on personal opinions. I wanted to highlight the continued need for more scientific research in our field. It will bring more professionalism and allow us to deliver more value to the teams and organizations we work with (and less snake oil). This won’t be easy, and the lack of independent non-profit institutes in this area doesn’t make it easier. But we must start moving in this direction as a community. I outlined some things you can do in your practice to support this.

Check out patreon.com/liberators to support us.

