The 2018 ACM International Computing Education Research Conference — Espoo, Finland
At this moment in my academic career, I’m most interested in helping to build community. I want to do this by making newcomers to my academic communities feel welcome, included, encouraged, and valued. Attending conferences like ICER, and helping plan them, is a huge part of doing this.
However, there’s a tension between community building and being a scholar. Community building, to a large degree, means avoiding conflict, as conflict can breed distrust, fear, and division, none of which are helpful to community building. And yet, the job of a public intellectual isn’t to avoid conflict; it’s to seek it out and work through it, no matter how uncomfortable it is.
As a junior faculty member, I mostly avoided conflict, saving the community building for when I was more senior. Since receiving tenure, however, I’ve decided that I’m not just safe in engaging in intellectual conflict, but obligated to do so when I believe it serves discovery and progress. That’s what tenure is for. And that’s led me to do many things in my community that make me (and often other people) uncomfortable: blogging about important but delicate issues such as diversity, publishing controversial arguments with my students that question the norms and values of our scientific practices, and creating resources that serve future members of our community but rankle some of my senior colleagues’ sense of scholarly and community identity.
I’d say at least half of the controversy I’ve created in the past year has come from poorly executing these efforts: I’ve communicated poorly, spoken out of turn, made some people feel excluded, overlooked important voices, and made rushed, poorly planned decisions. As grueling as all this failure has been, I hope I’m learning, and in the process I hope that I’m also constructively questioning my community’s practices, beliefs, and norms in the name of progress.
Because of all of this conflict, I approached this year’s ICER conference with some trepidation. I’ve received emails from many of my community’s most senior people expressing criticism of and disappointment with my actions. I’ve tested norms about confidentiality that have made some people very uncomfortable. And yet, I’ve received ten times as much positive feedback from students, newcomers to computing education research, teachers, and people in industry, thanking me for helping them, including them, and giving their concerns about our community a voice. Having so much positive feedback from the broader community bolsters my confidence, but that only goes so far when senior colleagues I respect are so forcefully negative about my behavior. When all of this tension moves from computer-mediated communication to face-to-face interaction, what happens?
For the most part, it turns out, nothing. Email is a terrible medium for expressing emotion, and yet in academia it is the de facto standard for scholarly dialog. I saw most of the colleagues I’ve had conflict with at the reception before the conference, and while the substance of their critiques was all there, I sensed little personal animosity. That’s not to say there wasn’t conflict—the intellectual conflict is still there, which I think is desirable and appropriate in scholarship, but our ability to discuss it on a foundation of interpersonal trust is still there too. And this is no guarantee: academia is full of people and communities who do not like or trust each other. It takes an inclusive, welcoming community like computing education to build resilience against distrust.
The doctoral consortium
Much of this resilience to conflict begins with setting cultural norms. That is how I spent most of the Sunday before the conference, helping to run the SIGCSE Doctoral Consortium with my excellent colleague Jan Vahrenhold and five other colleagues who served as mentors and facilitators (Aman Yadav, Anna Eckerdal, Brian Dorn, Kate Sanders, and David Weintrop). We spent the day with 19 wonderful students from all over the world, some of whom were only months away from graduation, others of whom had just started. Most were relative newcomers to the computing education research community, and so it was an excellent chance to discuss not only the (varying) norms and standards of our community’s research, but also our community’s scholarly communication. Throughout the day, we practiced elevator pitches and poster presentations, and discussed the deeper epistemological, theoretical, and methodological issues in students’ work in smaller groups with mentors. We ended the day with a career panel, discussing the many types of jobs that people get beyond traditional tenure-track research positions. Each of these sessions was active, low-fidelity, and highly interpersonal.
One of the most striking pieces of feedback was how welcomed, supported, and warmly received students felt. We certainly aimed to create a safe, inclusive space, and had some explicit strategies for doing this (reassuring failure, modeling mistakes, validating risks, etc.). But I suspect that much of why students responded this way is that the people in our community are just welcoming, supportive people. It’s hard to create a process that achieves this without people who can manifest it.
Kicking off the conference, the chairs shared the usual good news: the conference is growing, the number of publications is growing, and the community is more international than ever. The rapid growth of CS education practice is leading to a proportional growth in CS education research. Part of reacting to this growth is ensuring that the more than 50% of attendees who were newcomers felt welcome; we applauded first-time attendees, distributed senior members of the community across the room, introduced ourselves, encouraged newcomers to move tables throughout the day, and continued the tradition of table conversations about each paper to ensure dialogue throughout the community. I’m proud of my undergraduate student Harrison Kwik for communicating feedback about his first-time experience last year; his willingness to speak up improved the experience for everyone this year.
Measurement and theory
One of the first sessions was about the nitty-gritty details of how we do the science of computing education. Jan Vahrenhold and Laura Toma presented a case study of labs in an algorithms class. Students often take algorithms to prepare for job interviews, not because they’re interested in algorithms. Last year they contributed a validated instrument for algorithms self-efficacy; this year they wanted to replicate that validation, but also try measuring cognitive load and emotional responses during the course. The paper contributed one of the first case studies of experiences in an algorithms class and a set of instruments for analyzing experiences in programming labs. In a way, it’s a nice package of measurements for investigating experiences in any algorithms course.
The next talk was by Antti-Juhani Kaijanaho and Ville Tirronen, on growth and fixed mindset in two advanced CS courses with poor pass rates and grades. They wondered whether mindset explained these negative outcomes, but found that, despite their modeling and prior work suggesting the importance of mindset, it didn’t account for any of the variance in grades. There are many possible explanations for this; the paper didn’t settle on any, but it does call into question the role of mindset in advanced studies.
Rodrigo Duran, one of the doctoral consortium students, presented a paper on the relationship between program complexity and cognitive complexity. He positioned program complexity as related to cognitive load, defining a way to infer program complexity from concrete programs. Their model is fundamentally about the plans that programs follow and the path learners have to follow to comprehend the plan. They built upon Soloway’s notion of a programming plan and devised a way to extract a plan from a program. They also have an interesting idea of modeling the focus of attention on a program, and the requirement of processing plans simultaneously. It’s early days for this theoretical account of program comprehension in learning; they haven’t validated it yet, or used it to help instruction or assessment designers choose which programs to use, but it’s a compelling direction for future work.
In the session before lunch, my student Greg Nelson presented our argument on how we should and should not use theory in computing education research. The core of the argument is that theory is a tool for design; it helps us make predictions and interpret how people interact with designs. However, using it as a barrier to reporting innovations with demonstrated benefits, even if those benefits are inconsistent with theory, is problematic. Greg did an excellent job delivering his argument visually.
Of course, the argument was controversial to many. Some of the founders of the conference expressed concern about rigor. Others lamented our use of confidential reviews to substantiate part of our argument. Some pointed out that we missed critical prior work on theory. Others praised us for opening up and furthering an important longstanding debate in the learning sciences and education research about the role of theory.
The program chairs organized a panel after the talk, where Sally Fincher, Colleen Lewis, and Ari Korhonen presented their positions on the role of theory and moderated a discussion. Sally shared a pragmatic view of theory, arguing that it helps us to know what we’re looking at and what we’re looking for, helping us to explain and generalize observations. Colleen talked about the important question of how we pick our research questions, both as individuals and as a community, and how we need broader forums for making these decisions beyond peer review (such as the conference’s tradition of a works-in-progress workshop after the conference to help develop attendees’ research questions). Ari talked about not knowing anything about computing education when he started, and his trajectory of learning about theory.
The questions that followed from the audience covered a range of topics. Mark Guzdial was surprised that few on the panel had addressed the role of design, which was the core purpose of our paper, and pointed out that this year’s proceedings are actually full of great progress on CS-specific theories to support design, such as Rodrigo’s paper above about program complexity. Others disagreed, wanting to define our community as explicitly and only about explanation, arguing that design belongs elsewhere, or isn’t research at all. Others like Lauren Margulieux shared a more blended perspective (like mine and Greg’s), talking about how much she enjoys having a community that cares about both. She advocated for building more bridges between these perspectives. Sally described these bridges as intellectual trading zones (I missed the origin of this idea; perhaps if she’s reading this, she can remind me). David Weintrop pointed out that the learning sciences have had similar conversations about design, pointing to Andy diSessa’s paper on ontological innovations. Brian Dorn pointed out that many of us build these bridges in ourselves, wearing different hats as scientists, instructional designers, and discipline-based education researchers.
The discussions that ensued for the rest of the day were enlightening. I heard fears that lowering barriers to publishing designs would erode rigor. Some discussed the lack of literacy about design. Some viewed Greg’s paper as an epic response to a paper of ours being rejected (not realizing that we were discussing a paper that we were not involved in, just using it as a case study). I had interesting debates about epistemological pluralism and the resistance among education and learning sciences researchers to design contributions without theoretical framing or evaluation.
Ultimately, the community valued the discussion, and collectively nominated it for the John Henry best paper award at the conference, for tackling big, audacious, impossible problems. Congrats to Greg for the recognition of his careful thinking, and dedication to improving our scholarly discourse!
Problem solving and feedback
After a morning of discussions about heady topics of theory and measurement, the afternoon began with discussions of programming problem solving. Raymond Pettit began by presenting a paper on metacognitive difficulties that learners face. He built upon my student Dastyni Loksa’s work on problem solving processes, investigating difficulties that students had knowing where they were in their process. The most interesting data came from one-on-one interviews with students about their processes, revealing an astounding diversity of process and process struggle.
The next paper, by Ph.D. student John Wrenn at Brown, investigated harm in automated grading assessments. The core observation, analytical and empirical, was that tests under-specify correctness, which leads to misclassifications of program correctness, which may in turn harm learning and self-efficacy. The implications are straightforward: unless you’re validating your assessment, you can’t be confident that it isn’t doing great harm. However, the study didn’t analyze students’ experiences of harm (e.g., reductions in self-efficacy, erosion of growth mindset, reduced learning), so some of the harm is speculative.
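The core observation is easy to demonstrate. Here’s a minimal sketch (my own example, not from the paper; the function and test cases are illustrative) of how an under-specified test suite can misclassify a wrong program as correct:

```python
def biggest(a, b, c):
    """A student's buggy attempt at the largest of three: it ignores c."""
    return a if a > b else b

# An under-specified autograder: no test case where c is the unique maximum.
tests = [((3, 1, 2), 3), ((1, 5, 4), 5), ((2, 2, 1), 2)]
passed = all(biggest(*args) == expected for args, expected in tests)
print(passed)            # True: the suite accepts a wrong program

# A single additional input exposes the bug:
print(biggest(1, 2, 9))  # 2, not 9
```

Validating the test suite itself, for instance by checking that it rejects known-wrong implementations, is one way to guard against this kind of harm.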
Barbara Ericson presented the final part of her dissertation work, on the effect of adaptive Parsons problems on learning. Parsons problems present lines of code in a jumbled order; the learner’s task is to order them correctly to achieve a particular program requirement. Barbara was exploring how to adapt Parsons problems to prior knowledge, and what effect this might have on learning. The adaptation was twofold: making a problem easier when there is evidence of a learner struggling, and using evidence of struggle to simplify future problems. In a between-subjects study, Barb found that both Parsons groups finished faster than the non-Parsons control group, but found no significant difference in learning gains by condition. One reason for this may be that learners made little use of the within-problem adaptation features, relying more on the between-problem adaptations.
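For readers unfamiliar with the format, here’s a minimal sketch of a (non-adaptive) Parsons problem; the program and names are illustrative, not from Barbara’s tools:

```python
# The learner sees these lines in a jumbled order (indentation preserved).
jumbled = [
    "    total += x",
    "total = 0",
    "for x in numbers:",
    "print(total)",
]

# The target program, as the instructor wrote it.
target = "total = 0\nfor x in numbers:\n    total += x\nprint(total)"

def check(ordering):
    """Reassemble the learner's ordering of the jumbled lines and grade it."""
    return "\n".join(jumbled[i] for i in ordering) == target

print(check([1, 2, 0, 3]))  # True: the correct ordering
print(check([0, 1, 2, 3]))  # False: lines out of order
```

An adaptive version might respond to repeated wrong orderings by removing distractor lines or pre-placing some lines, which is the kind of within-problem adaptation Barbara studied.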
CS teacher learning
The final session of the first day covered several different lenses on teacher learning. The first talk was by my incoming Ph.D. student Alannah Oleson. She presented her undergraduate honors thesis work on the pedagogical content knowledge that CS and information science faculty need to teach undergraduates about inclusive, universal software design. Her work is notable for a few reasons: it used action research and it investigated teaching about inclusion, both of which are rare in computing education research. Alannah did an incredible job presenting her work! I’m so excited to have her start at UW.
The next paper was presented by Aleata Hubbard from WestEd, a non-profit research agency. It concerned teacher PCK for CS, investigating instructional practices for learning. They did a case study with three math teachers who were participating in a two-year professional development program. One of the most striking findings was that teachers who engaged in reflective practice found reflection more valuable than the professional development itself. Another finding was that there was an inherent tension between promoting student learning and acquiring content knowledge about CS simultaneously. To me, this calls into question the viability of in-service professional development, and professional development in general. It makes me want to invest even more effort in building pre-service programs that are robust, transformational experiences with time for teachers to develop and learn.
I had dinner with a large group of Ph.D. students in Helsinki. To my surprise, many of them were interested in learning more about how to make change in academia. I admitted that my experience was limited, but that I had learned a lot in the past few years as I’ve taken on more leadership positions. One of the key points I shared was that change is slow, and so making change requires patience, persistence, and planning. I encouraged them to work through leaders in a position to make change, while also learning from those leaders, so they could position themselves as leaders later in their careers.
We also talked about the tradeoffs of abandoning institutions that are hard to change and creating your own institutions. I talked about how it’s just as hard if not harder to create new communities, but it gives you more freedom to express your values through new norms. It also results in fragmentation, which is often problematic for scholarship.
Kirsti Lonka’s keynote on learning in Finland
The conference keynote Tuesday morning was Kirsti Lonka from the University of Helsinki. She came to speak on the role of education in Finland. Her view of the history of Finland was that education transformed a poor country into a modern one, but that many of the older practices of education have begun to alienate newer generations. She views much of the current educational reform effort in Finland as integrating digital skills with art, sports, and outdoor activities, engaging youth with computing through the outdoor world. She described this as phenomenon-based learning. One example we discussed at our table was the surface tension of water and how it relates to swimming and diving. In this view, computing is just one of many phenomena relevant to youth’s lives. It was a great example of how to integrate computing in a holistic way, thinking about people’s home lives and not just their work lives.
Miranda Parker presented work on relationships between spatial skills and programming. She built on prior work showing that spatial skills can be trained to improve learning, but that there are differential gains based on socioeconomic status. Miranda’s work more deeply explored this mediating variable. Her hypothesis was that socioeconomic status leads to more practice with spatial skills, which then affects CS learning. Miranda tested this hypothesis with a survey and three structural equation models. Because the work was exploratory, it suggests but does not confirm that there is a relationship between socioeconomic status and spatial ability, which then mediates CS achievement. This was, of course, only for the high-SES higher education learners in her sample.
The second talk was also about spatial skills, presented by Jack Parkinson. The big question he investigated was why there is a relationship between spatial skills and CS achievement. Most spatial skills are characterized as either mental rotation of a structure or mental transformation of a structure. Jack also reviewed the literature on which disciplines have a strong evidence base for a correlational or causal relationship between spatial skills and STEM learning (physics and engineering have some causal links; for CS and other STEM subjects the evidence is correlational only). Jack’s theory is that spatial visualization skills are tied to visual models of program structure and notional machines, and that speed of closure and perception is related to beacons, landmarks, cues, and programming plans. Jack’s first evaluation of this theory was to look at the development of spatial skills across multiple years of learning; the trend was mostly increasing over time, suggesting that continued study improves students’ spatial skills. The paper makes some progress toward a more granular explanation of these relationships, and is an excellent example of a CS-specific theory of programming skill. The program chairs recognized this paper’s rigor and awarded it the best paper award for the conference. Congrats Jack for the excellent work!
The final talk of the session was by my undergrad Harrison Kwik, on an entirely different topic: students who change colleges to study CS (known as transfer students in the US). Harrison presented the results of a survey and interview study investigating what factors might explain differences in transfer student outcomes. Some of the factors that emerged were commuting and the different culture of classrooms in a large university.
Conference committee planning meeting
At lunch on Tuesday I attended the senior program committee meeting to discuss ideas for improvements to the review process. We discussed review quality, reviewer recruiting, ways to be more inclusive to different contribution types, and how to scale the number of papers we accept while preserving the feeling of the single track conference. I was impressed with the diversity of ideas everyone shared and openness of the leadership to consider the more radical ones. I won’t say what they are, mostly because what the next chairs do isn’t up to me, but I personally think we’ll see exciting innovations. It’s hugely rewarding to be part of a community so open to change and innovation in conferencing.
In the Tuesday session on K-12 education, Katie Rich from Michigan State University presented work on teaching decomposition to youth aged 6 to 12. The big question is what decomposition might mean at this age. This work followed the team’s earlier work on developing learning trajectories for this age group. The notion of decomposition they used was that problems can be broken into parts, solved, and then recombined. The substance of this project was clustering learning objectives and synthesizing them into “consensus” learning goals. For example, one learning goal was “systems are made of smaller parts.” While the literature had lots of ideas about how to convey this concept, there were limits to the depth of those ideas, especially in how to link ideas of decomposition to tacit knowledge that youth already have.
Eva Marinus presented her work on 3–6 year olds programming robots. Her interest was in developing a cognitive model of programming, and in doing so, developing an assessment that works for 3–6 year olds. The key idea of the assessment was to use a task that involved translating a sentence with a prompt into a process for computing an answer. Performance on this task was predictive of performance on a programming task with a coding toy. This is really exciting early evidence that verbalization of plan formation may be a subskill of programming.
David Weintrop presented work on early childhood CS learning experiences, investigating the strategies that youth use to learn. The study included 135 students aged 9–12. Students had 15 hours of CS instruction over 15 weeks, taught by their teachers. They were using a blocks editor to construct animations and simulations. The curriculum involved storytelling and game design, with two final projects. David analyzed the processes that students used to work on their final projects. The strategies they observed had a lot to do with working around the abstractions provided in the language, such as using wait blocks to pause execution to coordinate concurrent procedures, or using and coordinating event handlers to decompose programs. I view most of these results as a byproduct of the semantics of the language and the tasks youth were trying to achieve with it.
Conceptions in programming
Dan Zingaro presented a multi-institutional study of student difficulties with data structures. This is important work because most of the literature we have on student difficulties is about basic language semantics, not advanced topics. Dan gave some nice definitions of conceptions, misconceptions, and difficulties: the first two are not observable while difficulties are, and misconceptions are just conceptions that conflict with accepted evidence about the world. Dan’s study used interviews and a test to try to uncover difficulties. Some of the more interesting difficulties they found were that students often had partial solutions to data structure implementations, overlooked parts of implementations that would have been useful in their solutions, or ignored the performance implications of their choices.
Luke Gusukawa, a PhD student from Virginia Tech, presented work on analyzing misconceptions to generate better automated feedback on programs. His particular approach was to focus on instructor-authored automated feedback, indexed by abstract syntax tree matching. In a controlled experiment, feedback authored in this way improved learning immediately after the intervention, and there was an increase in completion of programming problems, but learning by the end of the class was the same. Even though there wasn’t evidence of retention, the approach did achieve its scaling goals.
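To make the general idea concrete, here’s a hedged sketch of instructor-authored feedback indexed by AST matching, using Python’s `ast` module; the rule and message are my own invention, not from Luke’s actual system:

```python
import ast

def uses_while_true(tree):
    """Structural pattern: flag any `while True:` loop in the student's code."""
    return any(
        isinstance(node, ast.While)
        and isinstance(node.test, ast.Constant)
        and node.test.value is True
        for node in ast.walk(tree)
    )

# Each rule pairs a pattern over the AST with instructor-authored feedback.
rules = [(uses_while_true, "Consider a loop condition instead of `while True`.")]

def feedback(source):
    """Parse a student's program and return all matching feedback messages."""
    tree = ast.parse(source)
    return [message for pattern, message in rules if pattern(tree)]

print(feedback("while True:\n    x = 1"))        # one matching message
print(feedback("for i in range(3):\n    x = 1"))  # no messages
```

The appeal of indexing feedback this way is scale: one instructor-authored rule applies automatically to every student submission that matches the structure.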
The last talk on misconceptions was by Alaaeddin Swidan from TU Delft. He focused on children using Scratch in a museum and wanted to uncover common misconceptions affecting program construction. He engaged 145 participants aged 7–17. One interesting misconception they observed was overestimation of the capabilities of the computer. I see this a lot myself.
The conference reception
The reception was at the beautiful Hanaholmen Cultural Center, a place where Finland and Sweden celebrate their shared culture and partnership. The views were stunning, overlooking the sea and the forest, especially on a beautiful summer day. We had drinks outside, then moved inside for a banquet. At my table, we talked about international politics, culture, and what makes teaching so hard. In some ways, there’s nothing more riveting than sharing a dinner with people I’ve just met from around the world who nevertheless share all kinds of interests and experiences.
Later, we listened to Lauri Malmi, one of our conference chairs, play classical piano, and then we all played a trivia game. This year it was full of anagrams of conference organizer names, common terminology in our field, and past locations of ICER. I thought our table was doing well, but to my surprise, my own students—who named their team “Andy’s Angels”—beat us and everyone else in the room. I take no credit; they’re a clever bunch!
The first talk of Wednesday morning was presented by Narjes Tahaei from UC Merced. She talked about an approach to plagiarism detection that used a student’s sequence of assignment submissions to infer the likelihood of plagiarism. They built upon a classification of two programming strategies: planning and tinkering. Her hypothesis was that tinkerers would make large changes only after a sequence of incremental modifications. She derived many possible features for operationalizing this behavior and then used logistic regression to identify which were most predictive. After labeling ground truth via expert judgments of plagiarism, they found that they could identify roughly 80% of plagiarism cases. Unfortunately, the paper did not discuss the larger systematic effects of such a technique in practice: would students game it, and how would teachers use it?
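To illustrate one plausible feature of this kind (my own illustration; the paper’s actual features and model surely differ), consider the sizes of a student’s successive submissions: incremental tinkering yields many small edits, while pasted-in code appears as one large jump:

```python
def edit_sizes(sizes):
    """Absolute change in program size between consecutive submissions."""
    return [abs(b - a) for a, b in zip(sizes, sizes[1:])]

def largest_jump_ratio(sizes):
    """Candidate feature: the largest single edit relative to the mean edit."""
    deltas = edit_sizes(sizes)
    mean = sum(deltas) / len(deltas)
    return max(deltas) / mean if mean else 0.0

tinkerer = [10, 14, 18, 25, 30, 33]   # steady incremental growth
suspicious = [10, 12, 11, 13, 120]    # one sudden large jump

print(largest_jump_ratio(tinkerer))    # modest ratio
print(largest_jump_ratio(suspicious))  # much larger ratio
```

Features like this would then be fed into a logistic regression trained against expert-labeled cases, which is the general shape of the approach Narjes described.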
Paul Ralph from the University of Auckland in New Zealand presented research about object-oriented programming. The key insight of the work was that a count of the number of objects created at runtime could be a useful metric for evaluating the object-oriented programs that students write. Underlying this idea was the notion that program behavior, not program structure, is a better indicator of comprehension of object-orientation. It turns out that, at least in the programs the authors analyzed, object counts at runtime were a good discriminator of the object-oriented design patterns that students chose. The design idea was to leverage these object count metrics to give instructors insights into program behavior without having to read programs, since the counts summarize program behavior. I think the more general idea of summarizing program behavior is a powerful one, both for scaling instructor analysis of programs and perhaps also for giving feedback to learners about their program’s behavior. It raises the question: what kind of runtime feedback supports student reflection?
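Here’s a minimal sketch of the underlying idea, counting the objects a program creates at runtime; the instrumentation approach and class names are my own, not Paul’s:

```python
from collections import Counter

counts = Counter()

class Counted:
    """Base class that records every instantiation of its subclasses."""
    def __new__(cls, *args, **kwargs):
        counts[cls.__name__] += 1
        return super().__new__(cls)

class Node(Counted):
    def __init__(self, value):
        self.value = value

class Tree(Counted):
    def __init__(self, values):
        self.nodes = [Node(v) for v in values]

Tree([1, 2, 3])
print(dict(counts))  # {'Tree': 1, 'Node': 3}
```

A summary like `{'Tree': 1, 'Node': 3}` tells an instructor something about how much structure a student’s program actually builds at runtime, without reading the code.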
The last talk on tools was a broad longitudinal study led by Neil Brown at King’s College London of five years of data on programming projects hosted in BlueJ, an object-oriented learning environment. It’s a lot of data: roughly 40% of BlueJ users (who aren’t necessarily learners) contribute, amounting to millions of people and their 300 million program compilations. They did a mapping study of all the papers that had used the data, analyzed those papers, and surveyed the researchers who used the data. Pretty much all of the work has looked at errors in programs, some in learning contexts, some in software engineering contexts. Neil talked about the many difficulties of using program analysis to analyze the data at scale. Personally, I think there is a bounty of techniques from software engineering to support this work; the bigger challenge is defining good operationalizations of learning issues, rather than properties of programs and their behavior.
Interest in computing
The next session was about careers, recruitment, and mentoring. Amnah Alshahrani discussed a study of students’ post-hoc rationalizations of career choices. Building upon a long history of studies on this topic, this study contributed insight about sociocultural factors in Scotland. For the most part, the factors the study discovered do not seem particularly different from North American factors: secondary education is weak, jobs are a big motivator, and youth don’t get a lot of support; there are stereotypes of engineers as smart, wealthy men. In one sense, this is evidence that Scotland’s CS culture is similar to other western cultures. Of course, many of these measures of career choice are highly problematic: can students really account for their decisions and the factors that influenced them? Lots of evidence in the decision sciences says no.
The next study investigating interest considered the role of parents. Jody Clarke-Midura from Utah State University presented, starting with the important role of social support in shaping interests. Parental support in particular is strongly related to interest, mediated by gender. She built upon the small number of studies on parental support of CS interest, looking at mothers and fathers separately and how they talked to their kids about their experiences in an App Inventor summer camp for 9–13 year olds. The study found that the camps: 1) led to increases in interest and mother support, 2) that mother support was related to gains in interest, and 3) that mother support was related to youth’s view of the utility of learning CS and to interest, while father support was related only to utility, not interest directly. The interviews revealed that youth reported their fathers played the roles of instructor and role model, but mothers were not mentioned in these roles at all. Mothers, however, were more likely to provide emotional support. These findings reinforce the gendered roles in supporting interest development in CS. They make me wonder about the larger sociocultural factors that reinforce how parents view their roles in offering support.
The final study on interest was a continuation of Sebastian Dziallas’ study of STEM education narratives across students’ lives. The probe was to show students narratives about their past experiences in STEM and have them reinterpret them in the present. They interviewed students in 2013 and then again in 2017, after the students had started their careers. The primary contribution of the work was understanding how to use this method of “rephotography”, and what kinds of insights it can reveal about personal history. One example is that people’s narratives can shift dramatically, but often don’t.
The last session of the conference was a random assortment of interesting topics. The first talk, by Brian Harrington from the University of Toronto, was about TA grading sessions. The question in his work was what benefits collocated synchronous grading sessions have over remote asynchronous grading. They ran a controlled experiment with mock exams. The comparison was unequivocal: collocated synchronous grading was faster, more consistent, less error-prone, and better at developing relationships between the TAs and the instructor. This is a fun example of a truly applied and somewhat mundane question that is nevertheless highly actionable.
The last talk of the conference, by Amber Solomon, was a fascinating analysis of the non-verbal gestures used in describing, analyzing, and discussing code. As an exploration of the potential of studying gesture, the study focused on analyzing learners’ gestures themselves. Amber used a taxonomy of deictic (pointing), iconic, metaphoric, and “beat” gestures that simply match the rhythm of speech, but her goal was to build a more discipline-specific taxonomy of gestures about computation. One of the gestures Amber noticed was visualizing control flow with deictic references. Students used iconic gestures to convey program output and metaphoric gestures to convey abstract ideas about control flow and data flow. Beat gestures were used to convey repetition and sequences of data. The work makes clear that students and teachers do gesture, and about what; the next steps might be understanding how gestures facilitate reasoning as a form of informal externalization.
Looking back on the last four days, it seems strange to have been anxious about the conflict that I and my students have stirred up. I had a great Sunday mentoring 19 doctoral students, a great Monday raising big questions that most of the community viewed as interesting, important, and necessary, and the rest of the conference really digging into the fantastic work this community is doing. It’s more rigorous than ever, it’s more important than ever, and the community is stronger than ever. I think I’m far from generating harmful conflict. I think that by being bold, and by empowering my students and future generations of our community to be bold, we’re helping move the community forward. I think the community thinks this too, on the whole, and I’m glad to be a part of it!
Until next year in Toronto!