A group photo of every workshop attendee on a staircase at the Dagstuhl castle.
The official Dagstuhl group photo.

Dagstuhl trip report: theories of programming

Amy J. Ko
Bits and Behavior

--

About four years ago, my friend and colleague Thomas LaToza came to me with an observation: we spend a lot of time in CS talking about ways to make programming easier, better, less error prone, more powerful, and even more just, but rarely do we ever try to explain any of these challenges. Why is programming hard? Why is it slow? Why is it error prone? Why is it powerful? How does it do harm? These why and how questions are so fundamental to understanding programming as a human and social activity, and likely instrumental in reshaping it thoughtfully, and yet CS, as a discipline, often sets those questions aside in favor of how-to questions, offering a flurry of new tools, methods, and techniques. Lots of tools, no theory.

Thomas wanted to talk about this gap in our discipline, and pitched the idea of a Dagstuhl workshop to bring together areas of CS most concerned with programming as a human activity (software engineering, programming languages, human-computer interaction, and computing education) to try to understand the challenges of developing theories of programming and using them to shape our activities of innovation. I joined, along with Dag Sjøberg, David Shepherd, and Anita Sarma (who later had to withdraw) to try to organize a week at Dagstuhl to have these conversations.

Then, of course, the pandemic happened. Our seminar was postponed, then canceled, then resurrected and rescheduled, then postponed again. It wasn’t until the beginning of 2022 that we felt like we had a stable date of June 6–10, and we began finalizing invites for the official Dagstuhl Theories of Programming seminar. Fortunately, interest was strong (even though interest in traveling was not), and we had 28 participants from across CS commit to join.

Our workshop was only 3.5 days because of a German holiday, and so we planned a relatively dense four days of activities:

  • Tuesday: welcome, what is theory, describing theories, critiquing theories
  • Wednesday: brainstorming unexplained programming phenomena, sketching theories, getting feedback on theories, and refining theories
  • Thursday: presenting theory sketches, discussing ways of sharing theories, and skeptically examining whether developing theories of programming is really worth the time
  • Friday: reflecting on takeaways and departure

In addition to those content sessions, we also had the usual breakfast, coffee, lunch, cake, dinner, and social time scheduled into all Dagstuhl workshops. We also reserved each afternoon session for walks, hiking, biking, conversation, games, and other play. As part of this agenda, the organizers also held to a few principles: 1) no sessions full of talks, and 2) no session format should happen more than once. We wanted to keep participants engaged and mix up modalities.

Here’s a rough overview of what happened each day and what we learned:

Thomas prepares to kick off the workshop with a slide titled “What is theory?”
Thomas sets the figurative stage on the literal stage.

Tuesday: Understanding Theories

Our first session was a simple welcome and networking session; Thomas gave the pitch I summarized at the beginning of this post and then we covered the arc of the week. We then spent about an hour in a rapid series of 7-minute phases of “find 1–2 people you don’t know or haven’t talked to in a long while and have a conversation about what brings you to the workshop!” I personally enjoyed reconnecting with many colleagues from the past, and also meeting many new ones that we’d invited to the workshop but I didn’t know. By the end, I think most of the 28 attendees were able to meet most of the people they did not know.

In the second session, we asked, “What is theory?” This is a dangerous question to ask at a retreat: it risked derailing any meaningful progress, and getting us trapped in an epistemological hole. And so we framed the session as the one and only time where we would go deep on definitions, epistemology, and disagreements, and that this was everyone’s chance to surface all of the tensions so that they could hold them in the back of their minds as we engaged more concretely with creating, critiquing, and sharing theories. The organizers kicked off the session by giving their own 5-minute hot takes on what theory is; we intentionally created our talks in a series to build upon each other’s points, and show our own disagreements. We then invited the participants to form small groups and share their own theories of theories, then share them back to the larger group. My primary insight from this session was that CS is fairly “lumpy” in its understanding of epistemology: most are die-hard positivists, but the specific group we invited were questioning and dabbling in interpretivism. Furthermore, most viewed theory as a practical explanation to help reason about complex cognitive, social, and organizational questions about programming, and had a basic appreciation for its utility in intervention.

A lecture hall of 28 people listening to a speaker.
Attendees ask questions about the theory template.

In the next session, we talked about how to express theories. This was of particular concern for the organizers for two reasons: 1) we were going to ask participants to write some theories on day two, and 2) more generally, there aren’t really that many great examples of how to describe theories in a semi-structured way. I had worked on a theory template to help with this, and used the session to share it with participants, and we broke into five groups to practice trying to express existing theories using the template. The template, which is publicly accessible, had several basic components:

  • A short name to help people refer to the theoretical explanation in conversation
  • A summary of the theory
  • Contributors to the theory
  • A description of the phenomena being explained
  • A collection of prior work (including existing theories) that helps explain the phenomena
  • Concepts in the theory, such as variables, processes, people, agents, structures, contexts, or other details necessary for accounting for some phenomena
  • Relationships and mechanisms that offer a causal explanation of how the concepts interact to explain the phenomena, with a prompt for concrete examples
  • Example hypotheses that the theory might predict
  • Example studies that might test example hypotheses
  • Corollaries that follow from the theory
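
For readers who think in code, here is a minimal sketch of how those sections might be captured as a data structure. This is only my illustration, not the template itself: the class name, field names, and the `missing_sections` helper are all hypothetical, and the real template is a prose document rather than a schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class TheorySketch:
    """Hypothetical container mirroring the template's sections (names are assumptions)."""
    name: str = ""                                        # short name for conversation
    summary: str = ""                                     # summary of the theory
    contributors: list[str] = field(default_factory=list)
    phenomena: str = ""                                   # what is being explained
    prior_work: list[str] = field(default_factory=list)
    concepts: list[str] = field(default_factory=list)
    mechanisms: list[str] = field(default_factory=list)   # relationships and causal mechanisms
    hypotheses: list[str] = field(default_factory=list)
    studies: list[str] = field(default_factory=list)
    corollaries: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty, in template order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# A draft with only the first two sections filled in (placeholder text, not a real theory).
draft = TheorySketch(name="Placeholder theory", summary="One-sentence summary goes here.")
print(draft.missing_sections())
```

Something like `missing_sections` could help a group see at a glance which parts of a draft still need attention, though in practice we did all of this in shared documents.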

The five groups tried to explain 1) questions that developers ask about code, 2) program comprehension as fact finding, 3) the theory of “leaky abstractions”, 4) theories of information hiding, and 5) theories of programming instruction. We broke off into different rooms in the castle to try to express those theories with the template, and used what we learned to revise the template into the form summarized above.

After some cake and coffee, we then had members of each group rotate to another group to read their draft theory and offer feedback on both the template and how the theory was expressed. This served to help capture issues with the template, but also issues with describing theories in general. We noticed, for example, that a large part of what makes a theory powerful is having a few simple concepts and words to capture those concepts; efficiency of communication seems to be essential in enabling theories to be practical tools for thinking. We also noticed that audience was quite important: knowing who would be reading and using a description was central in deciding how much nuance to provide.

Michael Coblenz, Slim Lim, me, and Justin Lubin playing Root at a table with a green tablecloth.
Michael Coblenz, Slim Lim, me, and Justin Lubin playing Root.

After a long day of talking about and describing theories, we all took a break to adventure outside before dinner. Some went running, some went for walks, and some did “synchronized napping,” recharging before dinner. After dinner, I opened a board game I’d been wanting to play called Root, which turned out to be an incredibly complex asymmetrical strategy game about factions of woodland creatures, but also a great case study in computational thinking. Some of us, overwhelmed but curious, committed to trying to play a full game sometime during the week.

Wednesday: Creating Theories

After breakfast, our first session of Wednesday was a group brainstorm of “unexplained phenomena” in programming: anything that anyone felt was a known phenomenon that held important mysteries to understand analytically and empirically. We made a big spreadsheet where everyone could chaotically contribute and ended up with more than 90 ideas. Some of my favorites (as phrased) were:

  • People make mistakes when programming (inserting bugs).
  • Data-driven programming vs “logic-driven” programming
  • theory of what “bit-rot” is and why it exists
  • theory of edge cases (what “unexpected” means)
  • separating programming and capitalism (what would a socialist/ communist/ anarchist programming language look like?)
  • non-English PL and compatibility with English PL
  • Ephemerality in programming (SW that runs a few times vs runs for 50+ yrs)
  • Effect of climate change, infrastructure collapse on programming (assumptions, constraints)
  • human infrastructure (e.g. unwritten knowledge, hidden labor) for maintaining SW infrastracture
  • Debugging into existence is sometimes effective
  • How rubber ducking affects debugging outcomes
  • The effect of being trained to write code with pencil and paper on learning outcomes

We created some columns in the spreadsheet for people to express interest and converged toward seven phenomena:

  • Programming with concrete examples (example use/test cases, example data vs reading documentation/code)
  • Novices become experts / Programmers gain skills in different ways and at different speeds / The effects of expertise on handling problems (psychic debugging, schemas for error messages, etc.)
  • Using static (and dynamic?) analysis to guide problem solving
  • How types influence programming
  • Human infrastructure of software infrastructure (turnover, tacit knowledge, data labor)
  • Programming styles (opportunistic / systematic) and upfront vs. deferred understanding / Neurodiversity and programming
  • Data-intensive programming vs. logic-driven programming

After a morning coffee break, we all ventured into the castle into our chosen groups and began theorizing. I chose the “programming styles” group and worked closely with Lutz Prechelt and Jeff Stylos on why people seem to have such different high-level approaches to programming. Jeff speculated that some people are just different and that these differences can either create or remove friction with languages, APIs, and tools depending on who they are designed for. I suggested that perhaps those differences are due to neurodiversity: ADHD and autism, for example, could be a powerful force in shaping how systematic someone is. Lutz did a wonderful job challenging these ideas and helping shape them. We ended up with a theory of Neurofriction, which claims that observed differences in programmer behavior can be explained by social and cognitive diversity and the interactions between that diversity and the neurotype preferences encoded into language, API, and tool design.

A panoramic view of the seminar room with 28 people having paired conversations about theories.
Feedback speed dating in the seminar room.

After the group work sessions, we came back as a group and had each individual team member rotate in a massive speed dating protocol with other team members and share the theory, get feedback, and capture it for the group to process. This was a wild event, with 14 pairs in conversation, rotating every 10 minutes for 90 minutes, generating a massive amount of feedback and further insight. I found it fascinating to see just how much everyone already had theories in their minds to explain these phenomena, but had never tried to write them down in a structured, explicit way.

After cake and coffee, we returned to our groups, used our feedback, and tried to polish our theories into something a researcher or programmer might comprehend, and began preparing a presentation for the following morning session to “pitch” our theory to the broader group. We then broke off again for physical and social activities before dinner, and then social time after dinner over wine, cheese, ice cream, games, and conversation.

Thursday

The third and final full day was all about sharing theories and reflecting on their value. We had two sessions of presentations, to give ample time for every team to share their theory in detail and discuss its claims.

Jon Bell talks about their theory of debugging with a slide titled “Our scope”
Jon Bell talks about their theory of debugging.

The debugging group presented an explanation of debugging as something inherently scientific, proposing a search space, a hypothesis space, strategies that developers apply to navigate the space, and the role of tools in accelerating (and sometimes decelerating) this work. As someone who began her career thinking about debugging, this resonated strongly, but also helped me see that there are numerous unanswered questions about what strategies exist, why developers choose particular ones, which ones are effective, and how tools do or don’t support them. I wish I had had this theory during my dissertation work!

The types group presented a fascinating theory about the role of types in programming, postulating that writing types is about creating an ontology of constraints that, when sufficiently expressive, help programmers ensure consistency with an ontology, and evolve that ontology when it is insufficiently expressive. With this theory, they hypothesized that whether types are helpful has much to do with how expressive the type system is, how well it can model the domain they are trying to express, and how well type information is used to find inconsistencies. This theory strongly resonated with my own experiences with type systems, but also made me wonder about the barriers that requiring an ontology to be expressed at all might pose to someone not used to thinking about a domain in such rigid terms.

Benji Xie talks about a theory of data programming, with a slide titled “Concepts: Computational Notebooks”
Benji Xie talks about a theory of data programming.

The data programming group argued that, as an activity, data programming is far more about data narratives than program construction, as many data-driven programs are only executed once, and that this has many implications for software engineering, which is broadly concerned with maintenance and evolution rather than storytelling. I was so excited to see a theory that grappled with how a program is used in the world.

Jun presents their “Theory of Code Examples” Google Doc
Jun Kato talks about a theory of example code.

The examples group presented a theory that argued that examples are fundamentally about contextualizing an abstract API, allowing developers to tinker with an example to see an API in concrete use; they struggled a bit to define what exactly “contextualizing” was, but observed that even something as simple as the specific value chosen in an example (such as the number 0 instead of 1) could have dramatic impacts on the comprehensibility and utility of an example.

The tools and expertise group presents a set of hypotheses during a discussion on a stage, listening patiently.
The tools and expertise group presents a set of hypotheses during a discussion.

The tools and expertise group argued that programmers make decisions based on beliefs, that most beliefs about code and requirements come from information seeking, and so tools are fundamentally about helping with search, and tool use is fundamentally about issues such as trust and credibility. This much grander claim, closely related to the debugging group’s theory, suggested that tools are really about building trust in information.

Emma writes “Learning Effects from Code Analysis” on the chalkboard.
Emma Söderberg uses the chalkboard to draw.

The code analysis group argued that analysis tools are fundamentally about changing a developer’s knowledge about an implementation and its domain and that surprise is the primary driving value of analysis tools. But they also observed that this surprise is contextual and changes in beliefs are about learning, and so there was much to say about the relationship between surprise and learning.

My neurofriction group described the theory I summarized above, but did so through a sketch in which I represented two different APIs and Lutz and Jeff represented two users with very different neurotypes; we illustrated the pairwise friction that can arise when there is a mismatch. (I couldn’t find a photo!)

After cake and coffee, we then posed one final question: is all of this effort worth it? We formed small groups to reflect on when theories might not be worth creating, and challenged them to communicate in some way that did not use academic conventions of speeches, presentations, or documents. We had four lively sessions.

A complex diagram describing a flow chart of when to use theory.
When to use theory? It depends.

One group created an intricate diagram of pros and cons of theories, observing that issues of maturity, generalizability, reduction, and relevance could all mean that a theory isn’t a good tool for thought.

Six illustrations depicting abstract contexts in which theory might be problematic.
Good luck decoding these visual puns.

One group created a collection of illustrations that comically depicted the potential failures of theory, weighing down thought with old ideas, creating barriers to interdisciplinary communication, excluding people through the weight of theory, or scaring people away from a field.

My group did a humorous improv skit depicting a researcher answering questions after a talk about theory, and a fourth created an epic bulleted list of challenges of using theory in CS fields that are so a-theoretical, at least about human and social behavior.

Warring factions of animals in the woods on a cardboard board game.
We finally figure out Root. (At least part of it).

Tired and saturated with ideas, we spent the rest of the day walking, hiking, biking, and exploring outside, having a tasty final dinner, and then in conversation all evening. My group of Root players dug in and successfully played our first game of Root, finding it to be a fascinating, challenging, and surprisingly deep game.

Thomas LaToza standing on a stage.
Thomas closes out the workshop, soliciting reflections.

Friday

By Friday, everyone was fairly exhausted and some were heading out to return home, and so we spent the last morning session reflecting on what we had all learned about theories of programming in a large group. The insights were deep, including:

  • As a community, we don’t yet have the infrastructure or skills for creating and evolving theories. We need a lot more opportunities like this Dagstuhl to learn and grow capacity and “muscles” for doing theory work. We don’t even have graduate courses that talk about research methods, let alone theory.
  • To our surprise, everyone found deep relevance and utility in thinking theoretically about programming. There is promise here, even if it’s unrealized.
  • Our publication venues are generally hostile to theory. There is great opportunity to bring together the many communities concerned with programming—such as the VL/HCC, CHASE, Programming, and PLATEAU conferences and workshops—and begin to create a context for doing this theory work together across different areas of CS.
  • Everyone appreciated the time to talk about theory explicitly; for many, it was always in the background of their careers, but never a subject of conversation.
  • There are many debates to be had (or rehashed, from other fields) about where theories should come from: data, subjective insights, other fields?
  • Everyone in all of these communities is so nice and wonderful! Stereotypes be damned: we are a social bunch, especially after two years of being apart.

Personally, I’m quite optimistic about theory, especially in HCI and computing education, where it already has a strong foothold, but even in software engineering and programming languages, where there is definitely a critical mass of people who see its value and necessity. There’s certainly hard work ahead to make more space and capacity for it in our scholarly discourses and doctoral education, but I think everyone wants to do the work, where they can. I’m excited to have so many new and old friends to do it with, and look forward to making it the new norm!

--

Professor, University of Washington iSchool (she/her). Code, learning, design, justice. Trans, queer, parent, and lover of learning.