Teaching a strategy for reading code

Clickbait title: You won’t believe how 5 minutes can improve your coding skills!

Summary: Spending 5–10 minutes teaching a strategy to read code can lead to improved reading performance, helping prevent low-performers from becoming overwhelmed and giving up! I presented this work at SIGCSE 2018. Links to resources (teaching resources, papers, slides) are at the end of this post.

Say I’m teaching 4th grade math students how to solve problems with different operations, like this:

2 x (3 + 6)

I’ve taught them the individual operations of addition, subtraction, multiplication, division. So I give them practice problems that have a mix of these operations. And my hard-working students surprisingly(?) struggle with them! What happened?

In this case, I didn’t explicitly teach them the strategy for solving these problems. That is, I didn’t actually teach them the order of operations (also known as order of precedence), such as by introducing PEMDAS. I instead required my students to (perhaps blindly) make up their own strategies as they solved problems.

Requiring students to construct their own strategies as they practice applying a skill can result in unproductive struggle. Instead, we can explicitly teach a strategy and properly equip students to learn more from their practice.

The example above sounds a bit absurd, but that’s what’s happening in a lot of programming courses today. So let’s explicitly teach problem-solving strategies to improve how people learn to read code!

Reading code, simulating its execution in your mind, and predicting its output is known as code tracing. This tracing skill is critical for programming, but novices struggle to do it! This is in part because they are still developing their knowledge of code constructs (if statements, while loops, etc.), but also because they’re trying to figure out the right strategies to solve the problem. This can lead to unnecessary difficulties, such as novices focusing on memorizing variable values as they update or trying to translate the code to English (a more advanced skill). For novices to get the most out of their practice, we should provide scaffolding so they can focus on learning the skill of reading code without becoming overwhelmed.

Here’s my claim: Providing explicit instruction in a strategy that encourages line-by-line tracing and updating an external representation to keep track of variable values helps novices focus on conceptualizing how code executes and improves their tracing ability. Let’s describe this strategy:

The strategy consists of 2 parts: a description of the steps to solve a code tracing question and a description of memory tables that novices can use to keep track of variable values. The steps consist of understanding the question, finding where the code begins executing, and then tracing the code one line at a time. The full instructions are below:

By being encouraged to trace line-by-line, novices have a general approach for solving any code tracing problem.
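
To make the three steps concrete, here is a small made-up tracing question written in Python. It is not one of the study's problems, and the function name and values are only illustrative. The comments show what a line-by-line trace would note at each step.

```python
# Step 1: understand the question. Hypothetical prompt:
# "What does this program print?"

def sum_up_to(n):
    total = 0                  # total starts at 0
    for i in range(1, n + 1):  # with n = 3, i takes the values 1, 2, 3
        total = total + i      # total becomes 1, then 3, then 6
    return total               # the call returns 6


# Step 2: find where execution begins. The def above only defines the
# function; the first line that does any work is the call below.
# Step 3: trace one line at a time, following the call into sum_up_to.
result = sum_up_to(3)
print(result)  # prints 6
```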

Sometimes, novices must update a memory table. A novice creates a new memory table each time a method is called. When a variable is declared in that method, they fill in a new row in the memory table. When a variable is updated, they find that row in the table, cross out the previous value and write in the new one. After a method finishes executing, they cross out the entire table.

By using memory tables to keep track of variables, novices don’t have to waste “brain power” remembering variable values!

[Figure: An example of a memory table after a variable swap operation.]
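
For readers who prefer code to paper, the memory table rules described above can be sketched as a small Python class. This is only an illustration under my own assumptions: the class name `MemoryTable`, its methods, and the `~value~` notation for a crossed-out value are all hypothetical, and the swap values are made up rather than taken from the study materials.

```python
class MemoryTable:
    """A paper-style memory table for one method call (hypothetical sketch)."""

    def __init__(self, method_name):
        # A new table is created each time a method is called.
        self.method_name = method_name
        self.rows = {}           # variable name -> list of values, oldest first
        self.finished = False

    def declare(self, name, value):
        # Declaring a variable fills in a new row.
        self.rows[name] = [value]

    def update(self, name, value):
        # Updating a variable keeps the old value (crossed out) and adds the new one.
        self.rows[name].append(value)

    def current(self, name):
        # The current value is the last one written in the row.
        return self.rows[name][-1]

    def finish(self):
        # When the method finishes executing, the whole table is crossed out.
        self.finished = True

    def render(self):
        header = self.method_name + (" (crossed out)" if self.finished else "")
        lines = [header]
        for name, values in self.rows.items():
            crossed = "".join(f"~{v}~ " for v in values[:-1])
            lines.append(f"{name}: {crossed}{values[-1]}")
        return "\n".join(lines)


# Tracing a swap of two variables, one line at a time:
#   a = 1; b = 2; temp = a; a = b; b = temp
table = MemoryTable("main")
table.declare("a", 1)
table.declare("b", 2)
table.declare("temp", table.current("a"))
table.update("a", table.current("b"))
table.update("b", table.current("temp"))
print(table.render())
# main
# a: ~1~ 2
# b: ~2~ 1
# temp: 1
```

Keeping every old value in the row, rather than overwriting it, mirrors crossing out on paper: the history of each variable stays visible, which is what makes the intermediate steps of a trace inspectable afterwards.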

So that’s the strategy! It fits neatly on 2 pieces of paper and we spent only 5–10 minutes teaching it to novices (college undergraduates in their first computer science class).

To evaluate the strategy, I recruited 24 college students who were 5 weeks into their first computer science course. Working with them individually, I taught half of them the strategy. I then had all participants work through the same 6 code tracing problems, saying their thoughts aloud as they worked. By doing this, I was able to answer two questions: 1) does teaching this strategy improve tracing performance? 2) how does this strategy change students’ thought processes when solving tracing questions? Let’s see the results!

We see that participants who learned the strategy performed better and with less variability than their classmates who did not learn the strategy. This is true for the 6 tracing questions we asked them to solve for the study, but also for their course midterm which they took 3–6 days after they participated in the study (see figures below)! With this, we can answer our first question and say that teaching a strategy does improve code tracing performance.

PERFORMANCE ON STUDY QUESTIONS: Participants who learned the strategy performed on avg. 15% better and with 46% less variability on the code tracing questions in the study! (p<0.05)
PERFORMANCE ON MIDTERM: Participants who learned the strategy performed on avg. 7% better and with 42% less variability on the course midterm.

To understand participants’ thought processes, I analyzed the think-aloud data from the groups of high and low performers in both conditions. The details of the analysis are in my SIGCSE 2018 paper (linked at the end of this post), but the most interesting finding is how the strategy supported low-performers. In the control group, the 2 lowest performers deviated from line-by-line tracing and didn’t write down variable values. This resulted in them becoming overwhelmed and giving up on many problems without being able to produce a solution. Contrast this with the low performers in the strategy condition, who typically (but not always) followed the strategy: they usually traced line-by-line and updated memory tables to track variable values. From this, we concluded that the strategy helped low-performers make progress and not give up.

So, we find evidence that by spending a few minutes (and without needing a computer), we can improve novices’ ability to trace code. This could be because the strategy supported making incremental progress with line-by-line tracing and helped novices focus on what the code is doing by offloading the cognitive load of remembering variable values onto memory tables.

The implication for instructors is this: Provide explicit instructions to equip novices with strategies to solve problems! I use memory tables as an instructional tool, but I know that they have also been used for assessment.

The implications for research are numerous! I’m investigating how we can use representations such as memory tables to capture intermediate steps in problem solving to better understand thought processes and misconceptions. I’ve also investigated how to translate memory tables to an online domain. This is ongoing research, so let’s collaborate!

Below are resources to help you teach the strategy or extend the study, as well as a link to the paper and the slides. I reiterate that this is very much still ongoing research for me, so please reach out (comment, email, tweet) if you are a teacher looking to bring this strategy to your class, a researcher looking to extend this work, or anyone else. You should especially reach out if you are a curious learner! I look forward to hearing from you.