The tail end of a Dagstuhl break.

Dagstuhl trip report: learning and teaching programming language semantics

Andy J. Ko

The first programming language I learned was Texas Instruments BASIC. It was a simple procedural language, with a few control flow constructs, basic conditionals, variables, and a big library of mathematical functions. It had an incredibly basic syntax, with every program basically a list of instructions, with a few GOTO statements to get around. It wasn’t nearly as expressive as even our simplest modern languages: TI-BASIC now makes JavaScript look as complex as C++. But it was more than expressive enough for me as a novice programmer: I learned to make animations, text adventures, and interactive games like Tetris and Breakout.

From a learning perspective, the simplicity of TI-BASIC was its greatest strength. After reading the calculator manual’s chapter on the language many times, I had enough of a grasp of how TI-BASIC programs executed that I could understand how the programs I wrote would be executed by my pocket computer. That chapter, and all of its prose, examples, tables, and diagrams, is something the field of computing education research has long called a notional machine. A notional machine can be thought of as some form of pedagogical instruction that conveys the set of rules that govern the behavior of a computational machine (in programming language terminology, its semantics). This is in contrast to the rules themselves, and also different from a learner’s mental model of the semantics, which they acquire from the notional machine’s instruction.

This basic idea of a notional machine as a form of pedagogy for teaching programming language semantics reveals a range of interesting challenges around programming language learning and programming language design. For example:

  • What affects the learnability of a programming language’s semantics?
  • Are notional machines necessary to support learning, or can learners just learn the semantics through causal inference?
  • What are the consequences of notional machines imperfectly conveying programming language semantics on learning, productivity, and software defects?
  • What are the most effective ways of teaching a notional machine?
  • Can a notional machine be equally effective at explaining a programming language’s semantics, or do they vary like most forms of instructional design based on prior knowledge, culture, and other factors?

Organizers Shriram Krishnamurthi, Mark Guzdial, Jan Vahrenhold, and Juha Sorva brought together more than 45 faculty and doctoral students for 5 days at Schloss Dagstuhl to discuss these questions, generate more questions, and envision some new theories about programming language learning.

Paul Denny gives his 80 second lightning talk.


The beginning of any Dagstuhl is often one of wonder and uncertainty: what have the organizers planned? For the newcomers, what happens at a Dagstuhl? Who are all of these people I’ll be spending time with for a week? The organizers’ strategy was to spend the first two days doing a mix of theoretical primers, social time, and poster sessions, building the relationships amongst the participants.

The organizers began with a series of 80-second lightning talks, just to get to know each other and our collective interests. And the interests were broad: language semantics, misconceptions, error messages, debugging, strategies, intelligent tutors, philosophy, representations, algorithm comprehension, programming environments, and machine learning literacy. The attendees spanned computing education, programming languages, software engineering, HCI, and cognitive science. And because computing education often has many “pivoters”, there were many other areas of CS represented, including parallel computing, architecture, databases, and visualization.

Ben du Boulay, who coined the phrase “notional machines” in the 1980s.

Ben du Boulay on notional machines

After the introductions, Ben du Boulay gave a historical overview of his idea of a notional machine and the ideas it was based on. He started off with an overview of Papert’s ideas of programming as a conceptual framework for teaching mathematics. He discussed Gerald Weinberg’s book The Psychology of Computer Programming, which covered the social aspects of programming in professional contexts, including the notion of program understanding. He then discussed Alan Kay’s Dynabook, his vision for educational computers of the future, which presaged the iPad. He presented Thomas Green’s series of studies about the error-proneness of notations of programming languages. He discussed Pea & Kurland’s 1984 work invalidating Papert’s claimed cognitive effects of learning programming.

He tied all of these early ideas together by posing some big questions that emerged from this work. What are programs for? How does one do programming? How do programs work? How do I design programs to accomplish my goals? These big questions set the stage for a lot of the work that came next across many fields, including Ben’s paper that coined the phrase “notional machines.” Miller studied non-programmers, Soloway studied programming plans, Hoc studied algorithm design, Gould studied debugging amongst experts and novices, and many studied novice errors. All of this work on the psychology of programming, published across the 1970s and 1980s, shaped my early doctoral work in the 2000s.

Ben then turned to his work on notional machines. He defined a notional machine as “a way of explaining how a machine works.” This is different from the mental representation that a learner builds; it is instead an approach to instruction that results in a mental model that is hopefully consistent with a language’s semantics. This might be pictures of a machine, stories about a machine, traces of machine behavior, representations, metaphors, or any other form of instruction that helps someone understand a machine’s behavior. My favorite line: “What you’re doing is telling the best lie that you can” about how a machine behaves.

Ben also argued that there are notional machines about every layer of computation: There are notional machines about programming languages, but there are also notional machines about operating systems, compilers, editors, and other things in the ecosystem of a language. Learners need to comprehend all of these things, and teaching a notional machine about all of these things can therefore help.

The group was ready to dive deep into the meaty issues of language learning:

  • How do learner conceptions interfere with language semantics?
  • Are there studies comparing different types of notional machines?
  • Should we design languages for learnability, or just get better at teaching languages?
  • Should we teach notional machines for actual machines, or for abstract representations of machines?
  • Should notional machines be task, domain, and population-specific? Must they be in order to be effective?
  • How do you sequence learning? For novices, do you start with a full language, part of it, a toy language?

These set the stage for the rest of the seminar.

Lunch at Dagstuhl is always a great way to bring random people together, in part because the Dagstuhl staff literally randomize who sits next to each other. On Monday I got to sit next to Christoph Becker, who was at Dagstuhl alone on a writing retreat. He studies human values, and how to incorporate thinking about human values into software project planning. We had a fascinating conversation at lunch about the challenges of bringing critical reflection about values to computing in academia, and how K-12 education might be an effective Trojan horse for these ideas.

Tom Ball shares what he’s learned about Cognitively Guided Instruction

Cognitively guided instruction

After lunch, Tom Ball (Microsoft Research) kicked off a series of tutorials to give the participants common ground for the rest of the week. He gave a tutorial about Cognitively Guided Instruction, which comes from math education research (studying how learners solve problems like “One bag has 6 marbles, another bag has 2, how many marbles are there total?”). The general idea behind CGI is to help students understand why the computational strategies they use work the way they do (not just learning a manual algorithm for adding, but learning why this manual adding algorithm works). Students generate and apply their own strategies and learn to identify themselves as mathematical thinkers. Students engage in these processes socially, in groups, and with the help of a teacher. The whole approach, which has 30 years of research behind it, is essentially a constructionist take on primary math knowledge, helping learners converge towards more effective strategies for solving math problems.

Tom built a few bridges between this and computing education. One was pedagogical: in computing, what if we talked about different strategies for writing programs, and had learners engage in developing strategies socially to solve programming problems? Another was an analogy between the semantics of mathematical notations and the semantics of programming languages: if we can use CGI to help learners problem solve around math notations, maybe engaging learners in reasoning socially about their interpretations of programming language notations may also help.

Colleen Lewis explains knowledge in pieces

Knowledge in pieces

Next Colleen Lewis gave a tutorial on Andy DiSessa’s work on Knowledge in Pieces (KiP). One of the central ideas is that students don’t have a single mental model of things; there are variations over time and context that result in students reasoning differently about the same ideas. For example, students might entertain that the world is flat while walking across it, but that it’s round when thinking about space. Both are reasonable theories of the shape of earth. Another key idea is that knowledge is like a “bag”: learning involves consistently using the “right” knowledge from this bag. Context determines when learners bring knowledge to bear. Another idea is a p-prim (phenomenological primitive), which is some basic unit of intuition about phenomena in the world — the physical embodied world. Research in KiP involves mapping out the concepts that people build out of p-prims, and the learning trajectories we should design to productively develop learner understanding. The research is therefore inherently iterative.

These ideas have implications for computing education. First, there are many p-prims that learners might bring to the learning of computing, constraining, shaping, and influencing what is easy and hard to learn. For example, learners might have intuitions about sequence from the physical world that shape their reasoning about sequences in computing. These primitive beliefs are something a teacher might have to work around over time in order to produce conceptual change.

Informal poster sessions allowed for quick sharing

Interleaved throughout the rest of Monday was a series of poster sessions in which attendees shared their recent research most related to notional machines. Like most of a week at Dagstuhl, the focus was less on formal presentations, and more on deep dives into particular ideas related to the topic of the week. People gave posters about program tracing, manual sketches about program execution, theories about program complexity, studies about the pronunciation of programs, and learning technologies to support assessment and practice on program execution. I talked about my lab’s recent work on program tracing, programming tutors, and programming problem solving. By the end of the day, most attendees had met each other and learned about each other’s expertise, positioning everyone to start working through the big questions raised throughout the day.

Mmm, beef curry.

However, before that deep dive was an equally important part of any Dagstuhl: a healthy dose of social time. Dinner is an excellent time to meet colleagues from all over the world. I had a fascinating conversation about different university systems, different cultures of CS teaching, and the varying politics of admissions. I spent the rest of the evening chatting with attendees about things far from the focus of the workshop, including games, politics, parenting, digital archiving, and some of the ridiculous status signaling in German and Austrian titles.


After a lovely breakfast deconstructing the failures of the U.S. public education system, the second day continued with tutorials and one more poster session.

Katie elegantly mapped several theories onto one framework

Structure, behavior, function

Katie Cunningham gave a short tutorial talk on the Structure Behavior Function framework, which Herb Simon first began to develop, and others built on. The general idea of the framework is that the structure is what the system is made of, the behavior is how the system works, and the function is why the system does what it does. For example, in an aquarium, the glass container is the structure, the filter cleaning the water is the behavior, and the function is to keep the fish alive. Katie mapped these ideas to programs by talking about the programming plans underlying an implementation as the structure of code: how is the code implemented to achieve its behavior and function? She mapped program execution to behavior. For function, Katie talked about the requirements of code; what is its purpose in the world?

She further extended this mapping to talk about debugging as analyzing behavior, plan composition as relating behavior and function, and program tracing as mapping structure to behavior. The general idea was to use this framework as a set of explanatory concepts for analyzing learner behavior, but also to unify some of the theories the field has contributed.

One of the discussions that arose from this was the opportunity to link the framework’s terminology to the terminology in software engineering such as requirements, specifications, and program comprehension. The field of software engineering has done a lot of work to define these ideas, and they may be more precise ways of discussing the activities in learning.

Robert Goldstone gives an epic survey of psychological factors

The psychology of programming

Continuing day 2 of tutorials, Robert Goldstone (Indiana University) gave an overview of the 40-year (but sparse) history of the psychology of programming. He began with a summary of misconceptions when students are learning to program computers; he used a taxonomy that included syntactic errors, conceptual errors, and “strategic” errors, which referred to missing parts of programs that introduced defects. But there are also broader phenomena at play, including fragile knowledge, cognitive load, natural language intrusion, limited working memory, memory schema “misconstruals”, perceptual failures, transfer failures, and individual differences. All of these factors suggest an immense and perhaps overwhelming complexity to understanding learning to program.

Robert had an interesting account of how it is even possible that humans can learn to program. We leverage our evolutionarily endowed systems (language, spatial reasoning). We adapt our environment to better fit our capabilities. And we acquire knowledge that uses the environment and our basic cognitive abilities to enact skills.

Robert’s work leveraged this theory to create an environment called Graspable Math, which uses math notation, but makes it direct manipulation in a spatial sense, dragging and dropping things, while enforcing rules of commutativity, associativity, etc. He showed some really cool demos about how to make mathematical reasoning interactive, spatial, and multi-representational. He reflected on how IDEs might provide similar interactivity in comprehending program execution, drawing links to Bret Victor’s popular demonstrations.

Robert ended with a catalog of pedagogical recommendations from Cognitive Science, including defining the roles of variables, providing worked examples, simple problems, practice, explicit training for transfer, and peer instruction. In a sense, he was restating the work of the computing education community in cognitive science terms.

Ben Shapiro broadens our views to the sociocultural

Activity theory

The next tutorial was by Ben Shapiro (CU Boulder). He shared perspectives on activity theory. The general premise of his talk was that we need to think about cognitivist views, but also sociocultural views. He framed cognitivism as an epistemological prison. He traced the history of activity theory from Vygotsky through Rogoff. There is no singular activity theory, nor is it a predictive theory; it’s an analytic framework centered on practice. One way of thinking about activity theory is as understanding the unity of consciousness and activity, arguing that cognition is only understandable in situ in its mediated, embodied contexts. Another claim of activity theory is that all activity, including learning, is social and cultural. The general claim is that the individual as a unit of analysis is insufficient; we need units of groups, classrooms, and contexts.

The implication of this is that everything in curriculum, programming language design, assessments, etc. is socially negotiated and situated. As Herb Simon said, CS is a science of the artificial; we don’t inherit it from nature, we design it, and it is therefore sociocultural and sociopolitical. Ben’s general argument was that we can’t understand learning about computing without understanding these designed, sociocultural and sociopolitical contexts. And therefore, just looking at semantics will be insufficient, and in fact can’t even explain the recent results we are discovering in our field. He then went on to argue how activity theory, and broader systems-level views of learning, better explain our results, in particular by interrogating the practices that learners are using to write their programs. He concluded that we can’t evaluate a programming language for learning without accounting for the culture and practices around it.

Joe went into teacher mode to motivate the value of semantics.

After lunch, Joe Politz (UCSD) gave a basic tutorial about programming language semantics. However, rather than taking a mathematical view of semantics, he gave us a practical, programming-centric introduction by talking about refactoring and autocomplete scenarios in which we need to use semantics to reason about the soundness and completeness of these features. The general insight that Joe shared was that reasoning formally about a programming language’s semantics is not only valuable, but also critical for understanding program behavior. While the tutorial was fun, I think it ultimately missed the mark of teaching semantics.

However, a debate about the phrase notional machine eventually made it interesting. There was a fascinating debate about whether examples of algebraic expressions can be used as a notional machine to explain the parts of programming languages that are algebraic. It was a really good set of examples to clarify the meaning of a notional machine.

The diagram that David Weintrop drew


After Joe’s semantics tutorial, we broke into four breakout groups to distill everything we’d learned into some clearer definitions about notional machines, generate research questions that convey what we don’t know about notional machines, and then sketch ideas for how to answer one or two of these questions.

My group settled on a fairly simple definition of notional machines:

  • The semantics of a programming language encode the abstract rules that govern the behavior of a program written in a programming language. (Though these semantics aren’t necessarily the same as the semantics of the actual runtime that might execute a program.)
  • A notional machine is a pedagogical answer to the question “How do programs execute in this language?” or “How does this program execute?” These might be natural language explanations, visualizations, diagrams, or other representations that answer this question in a way that produces learning about semantics. Notional machines are not perfect analogs of the semantics; they may be simplifications.
  • The result of encountering a notional machine should be a mental model in a learner’s mind. This may not perfectly reflect the notional machine conveyed.

My group generated over 30 research questions about notional machines, ranging from really basic descriptive questions such as “What kind of notional machines do teachers currently use?” or “What kinds of notional machines do students learn from teachers, from compilers, from IDEs, and from each other?” to more explanatory questions like “What makes a notional machine effective for learning?” or “How do dependencies on cultural knowledge embedded in notional machines mediate learning?” Our charge was to develop research methods to answer one or two of these questions and present them in the morning.

Joe continued his tutorial on PL semantics after dinner.


One of the interesting points that came up at my dinner table was the way in which learners seem highly prone to beliefs about semantics that aren’t part of a language’s semantics. For example, they might see the name of an identifier, use that to infer the behavior of the program, and therefore assume semantics of the program that do not exist. Therefore, the role of a notional machine is as much to explain what semantic rules don’t exist in the language as to explain what rules do exist.
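To make the identifier example concrete, here is a small hypothetical Python sketch (the function name, its body, and the data are all invented for illustration, not taken from the dinner conversation). The name suggests one behavior, but nothing in the language's rules connects a name to what the body does.

```python
# Hypothetical example: the name promises a sorted copy, but Python's
# semantics only follow the body, never the identifier.
def sorted_copy(items):
    # Reverses the list in place and returns the very same list object.
    items.reverse()
    return items


scores = [3, 1, 2]
result = sorted_copy(scores)

print(result)            # [2, 1, 3]: reversed, not sorted
print(result is scores)  # True: not a copy either
```

A learner who reads only the name would predict `[1, 2, 3]` and an untouched `scores`; a notional machine that states "names carry no behavior" rules that inference out.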

More on semantics

Just after dinner, most of the participants returned to class, resuming Joe’s lecture on semantics. For this second half, he dove deep into formal (i.e., mathematical) semantics of programming languages. This included grammars and how they encode syntax, and evaluation rules and how they encode semantics. Because I have had some training in this already (and have been immersed in these perspectives by hanging out with software engineering researchers), it was fascinating to hear questions from a large group of computer scientists who hadn’t had rigorous training in programming languages, but nevertheless have been studying programming language learning. What became clear was that while formal semantics are unambiguous about many things, there are also a huge number of unstated rules that are inherited from other areas of math and computing. For example, one of the interesting hidden concepts was the notion of a “value” as something that you can’t evaluate; it’s atomic. That’s a good example of a concept that is essential in programming languages but often never taught when teaching programming. Similarly, there were a lot of principles that the participants were surprised and sometimes confused by, such as notions of soundness and completeness; from the learner-centered perspective of many attendees, soundness was a nice-to-have for edge cases of language use, whereas to a semanticist, soundness is essential.
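A minimal sketch of these ideas in Python, assuming a toy arithmetic language invented for this illustration (it is not the language Joe used in the lecture): the class definitions play the role of the grammar, `step` encodes the evaluation rules, and `is_value` captures the idea that a value is something that cannot be evaluated any further.

```python
from dataclasses import dataclass
from typing import Union

# Grammar (syntax) of a hypothetical toy language:
#   expr ::= Num(n) | Add(expr, expr) | Mul(expr, expr)


@dataclass(frozen=True)
class Num:
    n: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add, Mul]


def is_value(e: Expr) -> bool:
    # A value is atomic: no evaluation rule applies to it.
    return isinstance(e, Num)


def step(e: Expr) -> Expr:
    """Apply exactly one evaluation rule (left operand first)."""
    if is_value(e):
        raise ValueError("values do not step")
    if not is_value(e.left):
        return type(e)(step(e.left), e.right)
    if not is_value(e.right):
        return type(e)(e.left, step(e.right))
    if isinstance(e, Add):
        return Num(e.left.n + e.right.n)
    return Num(e.left.n * e.right.n)


def evaluate(e: Expr) -> list:
    """Return every intermediate expression, ending in a value."""
    trace = [e]
    while not is_value(e):
        e = step(e)
        trace.append(e)
    return trace


# 1 + (2 * 3) steps to 1 + 6, then to the value 7.
trace = evaluate(Add(Num(1), Mul(Num(2), Num(3))))
for expression in trace:
    print(expression)
```

Even this tiny sketch leans on unstated rules: left-to-right order is a choice made in `step`, and the meaning of `+` and `*` on integers is simply inherited from Python.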


Our third day blended a bit more building of common ground, a bit more envisioning, and a half-day break, where some people hiked, some people biked, some people went to the town of Trier, and some went to an iron-smelting UNESCO world heritage site.

Geoffrey Herman summarizes the group’s Gremlin metaphor of notional machines

Breakout reports

Our first session Wednesday morning was to share thoughts from our Tuesday afternoon breakouts.

  • One group defined notional machines as a model between the source code and the actual machine, like a model of gravity that’s adequate within certain bounds, and useful to certain groups. Within this definition, notional machines can be stories and representations. This provoked questions like “What notional machines do instructors use and why?”
  • Another group, led by my student Greg Nelson, focused more on mental models. They asked how to elicit and change learners’ mental models. They generated ideas like giving learners code, and having them make a video explaining how the code works, surfacing their mental models. They wondered how to validate assessments of mental models, whether mental models transfer between languages, how practice and feedback mediate the effect of notional machines on mental models, and whether to tell learners that notional machines are a simplification of semantics.
  • A third group wondered a lot about “lies” as well: how much should they simplify semantics, how much should they talk about these simplifications, and what choices do instructors make within this space? However, unlike the other groups, their questions were very teacher-centric. One reaction to this idea was that very few teachers actually know the “truth” about programming language semantics.

Sally Fincher draws a picture of semantic waves, placing notional machines at the bottom of the wave.

One idea that arose during all of these discussions, shared by Sally Fincher, was the notion of a “semantic wave.” This is the notion that concepts are hard, they require some simplification to reach “down” to where learners are, and then ultimately, you have to bring learners back “up” to what makes the concept hard. So the notion of a lie, from this perspective, is an essential act. Similarly, there was some question about whether there was even a “truth”; every idea about programming languages is a simplification at some level. The group largely agreed to move past the notion of a lie and instead use the word “model.”

Shriram tries to refine what he views as a notional machine at a lower level of granularity.

Examples of notional machines

This session probably should have come first in the week, but better late than never. Shriram broke down three ideas using the idea of model-view-controller architectures:

  • Explanations of semantics have a model of what’s happening in the machine, such as what is stored in memory.
  • Explanations have a view of that model of what is shown about the machine.
  • Explanations have a notion of the rules that govern the machine. Shriram argued that the rules are the notional machine.

Here are some examples of these rules:

  • Python Tutor provides no notional machine because it doesn’t explain the rules governing Python program execution. It does have a visualization of the machine’s content and behavior, but not the rules that shape its behavior.
  • Greg Nelson’s PLTutor does have a notional machine, because it does everything Python Tutor does, but also explains the rules that govern program execution.
  • Mark Guzdial showed several examples explaining the rules governing Smalltalk message passing, most of which were natural language sets of rules.
  • Shriram showed examples from How to Design Programs, which included a series of “rules of evaluation”, some of which deferred to algebraic expressions, but also function evaluation rules and conditional evaluation rules in prose, with code examples to teach the rules.

Given that all of the examples above involved natural language, one interesting question was whether formal semantics (not natural language, but mathematical) are a notional machine. Shriram argued that they were, because they themselves were models of an actual computational machine’s behavior.

Brainstorming notional machines

In the last session before lunch, the organizers engaged us in an activity to generate a variety of notional machines to explain the behavior of some concrete programs. To do this, they gave us a program and three different outputs, governed by three different possible semantics. We then generated notional machines to distinguish between these three different semantics.

This activity resulted in an interesting taxonomy of different types of notional machines:

  • Some notional machines leverage metaphor, implying semantic rules by leveraging prior knowledge about rules learners might have already encountered. For example, when we were looking at a program that had a dictionary data structure, some groups used metaphors of phone books to explain lookup behavior.
  • Some notional machines articulated mechanistic rules, similar to formal semantics. For example, in the dictionary example, a rule might be “the system reads the string between the brackets and then finds the matching string in the dictionary.”
  • Some notional machines expressed condition action rules for learners, such as “when you see an error like this, this is what you need to do to fix it.” These don’t really explain the rules governing program execution, but they do give learners a chance of inferring them.
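The mechanistic rule from the dictionary example can be sketched as a small Python function that narrates each step. This is a hypothetical illustration: `lookup_with_trace` and the phone book data are invented here, and the linear scan is a deliberate simplification in the "best lie" spirit, since real Python dictionaries use hashing rather than checking keys one by one.

```python
def lookup_with_trace(dictionary, key):
    """Look up `key`, recording each step of the simplified rule."""
    steps = [f"read the string between the brackets: {key!r}"]
    # Simplification: walk the keys in order instead of hashing.
    for candidate in dictionary:
        if candidate == key:
            value = dictionary[candidate]
            steps.append(f"{candidate!r} matches; the result is {value!r}")
            return value, steps
        steps.append(f"{candidate!r} does not match; keep looking")
    # Real Python would raise KeyError; this sketch returns None so the
    # narrated steps stay available to the caller.
    steps.append("no key matches; Python would raise KeyError here")
    return None, steps


phone_book = {"Ada": "555-0100", "Grace": "555-0199"}
value, steps = lookup_with_trace(phone_book, "Grace")

print(value)  # 555-0199
for line in steps:
    print(line)
```

The metaphor version (a phone book) and the mechanistic version (this trace) describe the same behavior; they differ in how much prior knowledge they assume and how much of the rule they state outright.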


After lunch, the group split up to adventure around western Germany. Some stayed at Dagstuhl and walked, some ran, some went on a hike. Some went to the nearby old Roman town of Trier, and some went to Völklingen Ironworks, a UNESCO world heritage site. I’d been to Ironworks twice before, but I found it architecturally fascinating both times, so I decided to go again and practice some black and white photography.

The Ironworks is a now abandoned iron and steel smelting plant that was used in the world wars to support German manufacturing. Its smoke and steam output was so severe at peak operation that the Allies never noticed it, and so it was preserved, unlike all of the other iron and steel plants in Germany. It’s a monument to the industrial era of Germany and to the 20th century world wars.

I also found it to be a fascinating example of a machine in the context of our workshop. Like a programming language, the factory was governed by a range of chemical and human rules, and only if these were followed did the factory successfully produce iron. But to learn these processes, we needed a tour guide. The explanations the tour guide gave were a form of notional machine.

Somehow, I was alone at the beginning of dinner, so I snapped a sad selfie. My randomly assigned table mates eventually arrived.

Dinner and evening

My dinner was full of fascinating conversations at the intersection of assessment and knowledge transfer. Most people don’t know that one of the most robust findings in learning sciences is that when we learn something, we don’t automatically generalize that knowledge to other settings. In this sense, knowledge is highly contextual, and if we want someone to use it in other contexts, we often have to explicitly teach how to transfer that knowledge to those specific contexts. There are certainly people who learn to do this transfer independently (academia is full of them), but it is by no means automatic; it is a highly effortful process.

Our table discussed some of the profound implications of this for computing education. For example:

  • If what we teach is in a classroom, and how we assess that knowledge is in a classroom, but the point of the learning was for knowledge to be applied in the world, does any of the classroom knowledge actually transfer?
  • What is the “context” of learning programming? Is it really a classroom, or is it an IDE, or a rectangular layout of code, or just the text of code itself? And if knowledge is sensitive to these media, does practice and assessment on paper generalize to IDEs?
  • If a critical part of the context of programming in industry is the social context of programming, why do we try so hard to eliminate that context from learning contexts?

All of these larger observations have important implications for notional machines. How we use them to teach programming language semantics may be highly dependent on context.

The breakout group on notional machines for non-programming language machines


We spent most of Thursday in breakout groups tackling the many specific issues that we raised. The breakouts included:

  • Notional machines for things other than programming languages
  • Notional machines for Python and Scratch
  • Categorizing notional machines and the representations they use
  • Understanding the utility of notional machines for learning
  • Designing the instructional content of notional machines

I chose the first group.

Some examples of formalisms beyond programming languages that also need notional machines.

Notional machines beyond programming languages

While the primary phenomenon of the workshop was understanding programming languages, with a strong bias toward imperative and functional languages, the world of computing is full of formalisms beyond programming languages that may also require some form of explanation to be learned. Our group brainstormed many:

  • Machine learning. The semantics of machine learning are quite different from PL semantics. How do we explain them and how do those explanations relate to explainability and interpretability discourses?
  • APIs/frameworks/libraries. One way to think of these is as programming languages with semantics but without syntax. Model view controller frameworks, React, callbacks, and other black boxed worlds all require some understanding of the semantics of these software architectures.
  • Distributed programs have their own unique semantics around coordinating independent machines.
  • Human-in-the-loop systems, including things like Mechanical Turk, Task Rabbit, and other forms of human computation where there’s some formalism, but also some component of human decision making and computation embedded in this formalism.
  • Spreadsheets and reactive languages, which have constraints, circular dependencies, and other invisible aspects of computation.
  • Game mechanics, which specify rules governing human decisions in a formal social context.
  • Proofs, which have their own semantics but often have no guidance.

As the world embraces formalisms (e.g., the GDPR granting a right to explanation of software behavior), notional machines for all of these computational media are of increasing importance.

We broke off into smaller groups to explore APIs, proofs, and distributed systems and found a few key insights:

  • Notional machines can use passive and active voice; these often indicate who is doing the work, the programmer or the computer.
  • Figuring out the triggering event that demands a notional machine is key to understanding both audience and context.

Our group met further and dug into JavaScript and its relation to the DOM, as a fascinating example of complex semantics. We arrived at a few further really interesting insights:

  • It’s often critical to take a language and envision a sequence of sublanguages in order to simplify the introduction of semantics.
  • Even really simple examples in JavaScript are really complex. For example, the code “black” is full of complexities about global objects, properties, special properties that affect the DOM, the lack of a need to define properties, CSS. This reveals the complexity of sequencing sublanguages of JavaScript, since even simple examples involve so much semantics.
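
To make the second point concrete, here is a minimal sketch in TypeScript of how much semantics hides in a one-line assignment. It is an illustration under stated assumptions, not the example our group worked through: `FakeStyle`, `page`, and `repaints` are hypothetical stand-ins for the browser’s real DOM objects, invented so the code runs outside a browser, and `page` is typed as `any` to mimic plain JavaScript’s dynamic property rules.

```typescript
// Hypothetical stand-ins for the browser's DOM so the sketch runs outside a browser.
// All names here (FakeStyle, page, repaints) are invented for illustration.
const repaints: string[] = [];

class FakeStyle {
  private current = "";

  // A "special" property: assigning to it does more than store a value.
  set backgroundColor(value: string) {
    this.current = value;
    repaints.push(value); // stands in for the browser repainting the page
  }

  get backgroundColor(): string {
    return this.current;
  }
}

// Typed as `any` on purpose, to mimic plain JavaScript's dynamic property rules.
const page: any = { body: { style: new FakeStyle() } };

// Look up `body` on `page`, then `style` on `body`, then assign.
// The setter runs, so the assignment has a visible side effect.
page.body.style.backgroundColor = "black";

// A misspelled property is not an error. JavaScript silently creates a new,
// ordinary property on the object, and nothing is repainted.
page.body.style.backgroundColour = "red";

console.log(repaints); // contains only "black"
console.log(page.body.style.backgroundColor); // the value stored by the setter
console.log(page.body.style.backgroundColour); // the silently created property
```

Even this toy version needs property lookup chains, setters with side effects, and silently created properties to explain two assignments. A real browser adds more, since `document` is itself found through the global object and the style change is then interpreted by CSS.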

A Dagstuhl photo in front of the chapel is a longstanding Dagstuhl tradition. I snapped a selfie before the official shot.

Other breakouts

After lunch, I learned about the activities of the four other breakout groups:

  • One group explored the role of metacognition about semantics during program writing, and how to teach semantics in a way that ensures that learners simulate program behavior as they write. They hypothesized that simulation is the moment of maximum impact for a notional machine; students may bring a notional machine explanation to bear at moments of confusion about semantics. To explore this theory, they designed a study around the question “what events embedded in programming trigger students to trace code behavior?”, in which learners do a series of tasks in pairs so that tracing behavior can be observed.
  • Another group explored the dimensions along which notional machines vary, such as form, concept, semantics, learning goal, motivation for the student, how long it takes to teach, granularity, applicability, and audience. The group met further and distilled these dimensions, and tested them against several notional machines for different aspects of programming languages.
  • A third group talked about specific notional machines for Python and Scratch. They talked about Greg Wilson’s 14 rules of Python, and deconstructed the ways that the rules have different resolutions of detail and layered references to the semantics of other languages. In their second breakout, they dug deeper into Python and tried to design a learning progression of notional machines for Python, which involved adding a succession of semantic rules.
  • The other group talked about instructional design as it relates to notional machines. They generated guidelines for integrating notional machines, practice, and instruction together. The guidelines included things like “use diverse representations”, “bridge different representations”, “leverage human endowments such as gestures, audio, video”, “sustain engagement”, “mixing practice and instruction”, “adapt to social context”.

The leaders of each group did some exceptional and rapid synthesis of 2 hours of thinking, and then another whole rapid synthesis of 2 more hours of thinking. What an exciting and insightful day!

Ben Shapiro structures our thinking about JavaScript and the DOM.

Just before dinner on Thursday, Mark Guzdial surprised me by leading the group in a round of happy birthday. I turned 39, kicking off the last year of my thirties. Thanks Mark and everyone for singing for me!

When Mark asked me to get us a ukulele, I had no clue why he asked *me* to get it. It made sense in hindsight!


The last day, which was a half day, began with three questions, to stimulate planning for the future:

  • Imagine all schools teach CS, but we have to use new languages, and have 5 years to figure it out. What do we design?
  • Imagine that Java is outlawed. How do we recover in CS education?
  • Imagine the world becomes Racket, Scala, and Haskell. What should the undergraduate computing education community do?

In general, these three scenarios provoked us to think about the languages (and tools and curricula) that we think would be ideal to support various types of learning.

Our large group engaged in deep thought.

My breakout group discussed many aspects of these scenarios:

  • It’s not enough to think about languages. One must also think about the tool and API ecosystem to support it.
  • There are many kinds of possible language semantics; what semantics do we think are important and how would we sequence them?
  • If we made programming language learning trivial, a lot of these questions about which language wouldn’t matter; people would just learn many languages.
  • The discussion revealed that many of the implicit learning objectives in CS curricula are about programming and program design, not programming languages.
  • For some learning objectives, such as self-efficacy and programming problem solving, choice of programming language may or may not have a large impact.
  • Part of considering the language is also considering the language community and culture. This determines authenticity, how students see authenticity, and how they perceive cultures of computing.
  • Many learners may only ever learn one language. That’s in tension with ideas of having multiple language learning progressions.
  • Our local education policies and learning objectives likely define requirements for our programming languages, which demands different programming languages.

The goal of the session was simply to have lots of discussion, but I walked away from it feeling that the ways programming languages are situated in the world are exceptionally diverse.

Greg Nelson talks about the importance of systems thinking in all of our reflection on computing education.

Closing thoughts

Like a lot of Dagstuhl seminars, this one involved a few critical discussions about definitions, a broad ideation of research opportunities, and a lot of incredibly valuable networking. Here is how I’d synthesize what I learned this week:

  • Programming language semantics are hard to understand, especially in their formal form.
  • Teaching semantics requires finding ways of explaining them that connect to students’ prior knowledge, and that also motivate students to want to learn semantics.
  • There are many strategies for teaching semantics, including choosing languages with simpler semantics, developing learning progressions tied to sublanguages, and not only explaining, but assessing learners’ understanding of semantics.
  • Understanding programming language semantics is far from adequate for learning to program; we need to design ecosystems of tools, documentation, and communities for anyone to create anything meaningful with programming languages.
  • Programming languages are culturally-situated, value-laden artifacts, and therefore we are likely to need very different languages in different learning contexts.
  • We know almost nothing about programming language knowledge transfer, but computer scientists appear to strongly believe that transfer is trivial.
  • There are many media that can embody explanations of semantics, including books, interactive learning technologies, and even just teachers’ verbal articulations. We know little about the affordances these different media offer.

Thank you Dagstuhl for creating the space for generating these insights!

The original Dagstuhl castle after a morning of light rain.

Bits and Behavior

This is the blog for the Code & Cognition lab, directed by professor Andy Ko, Ph.D. at the University of Washington. Here we reflect on what software is, what effects it's having on the world, and our role as public intellectuals in helping civilization make sense of code.

Andy J. Ko

Written by

Associate Professor @UW_iSchool, Chief Scientist+Co-Founder @answerdash. Parent, feminist, scientist, teacher, inventor, programmer, human.
