A tessellation of green triangles and wooden walls with ceiling-mounted lights. — The wall at Storey Hall.

ICER 2024 trip report: winter in Melbourne

Published in Bits and Behavior

16 min read · Aug 15, 2024

In 2016, I went to the ACM ICER conference in Melbourne. It was a strange time in my life: I was just extracting myself from startup life, reentering faculty life, and starting to feel the inescapable gender feels that would eventually lead me into a deep, life-threatening depression, and then eventually out into liberation. My memories of that trip, therefore, were a weird feeling of disorientation, partly from adventuring in a new city, partly from not quite having my footing back in academia, and partly from feeling like my false foundation of manhood was crumbling. I’m sure there was some conferencing in there too!

Returning eight years later, then, was a chance to see Melbourne through new eyes. I was excited to return with a firm footing in my academic home and a life much more full of affirmation, freedom, and joy. I had certainly changed and I wondered if Melbourne had too, and how that might make the conference feel different as well. I left early Friday morning Pacific time for the airport, and yelled upstairs to my wife, “Off to male-born!”, a little transphobic gallows humor to signal my departure. I then proceeded to bus, train, and plane for 33 hours, arriving at 5 am on Melbourne’s Sunday.

A bar with several students milling about in an eclectic space of dim lighting and curious murals. — I took some DC participants to a Thai restaurant called Cookie.

Monday: Doctoral Consortium

Throughout the day on Monday, I participated as a discussant, mentoring, supporting, and networking with eight doctoral students from around the world engaging in computing education research. We practiced elevator talks, talked careers, ate together, critiqued arguments, and contrasted institutional contexts. The day was lively, challenging, and fascinating, and I hope that the students left with new perspectives, friends, and a sense of belonging in our growing community.

I won’t go into detail about the day, since it was too intimate to make public, but I will say that I left feeling ever more excited about the field’s future. There are some truly groundbreaking thinkers in our next generation of scholars, ones going far beyond the introductory classroom to more ambitious questions about critical computing literacy and the world’s global capacity for teaching.

A highly angular but organic light green ceiling with neon blue lighting and a polygonal yellow staircase. — Margaret opens the conference reception.

Tuesday: Kickoff, pedagogy, and assessments

The conference and program chairs Paul Denny, Leo Porter, Margaret Hamilton, and Briana Morrison kicked off the conference, which was celebrating its 20th anniversary. The conference published a record 36 papers this year, marking the continued growth of the field; there are also some great short videos from past chairs about the history of the conference. I simmered knowing that one of our papers, and many others’, would not be in the program, for the usual tragic reasons of epistemic gatekeeping, while also feeling grateful for the program committee’s unseen labor over the past six months.

The first session started with papers about student self-assessment:

Melissa Chen (Northwestern) examined students’ negative self-assessments and found several reasons for these, including self-expectations, performance judgements, or lack of ability to recover. Some students demonstrated resilience, normalizing struggle or indicating self-regulation skills.
Maja Dornbusch (Münster) examined the possible relationship between compiler error messages and sense of belonging through a set of interviews. She generally found that many respondents attributed human-like qualities to the messages, placing human expectations on their communication quality. The authoritative nature of these communication qualities often led to self-doubt.
Yinmiao Li (Northwestern) investigated the interaction between metacognition, affect, and behavior in non-majors. Through an analysis of reflective diaries and retrospective interviews with 20 students in a Python project course, they found avoidance of struggle, persistence through struggle with self-doubt, and persistence through struggle with self-affirmation.

After the session and a lively break of reconnecting, I took a short break from the conference during the next session to grab some lunch for later, since I was scheduled to run a trans youth group remotely on Zoom (at 7 pm Pacific time). I probably should have just found a different volunteer to take my place for the group, especially since I was missing a good chunk of the first day. But it’s also a highlight of my week, and I didn’t want to miss it.

The stage and projector screen showing a Parsons problem example.

I made it back for two talks, one contributing additional evidence about the efficacy of Parsons problems as scaffolding, and another showing that testing checklists as a form of scaffolding also helped in the short term, but not later in the semester, possibly because of the ways they constrained students’ test designs. Both demonstrated and reinforced a long-held instructional design principle that scaffolding works, but requires careful use and eventual removal.

I left before the third talk to set up for my youth group and eat my takeout. I facilitated seven youth supporting each other about interacting with exes, fighting with insurance, and the many small wins that would carry them through their week. The group turned out to be a nice way of staying grounded in my conferencing. There’s nothing like a reminder of marginalized youth’s day to day lives to keep my perspectives on research centered on youths’ actual lives, rather than only through the lens of data that tends to erase their perspectives.

I wrapped my youth group shortly after the start of the afternoon session on generative AI.

I missed the first talk from Stephanie Yang (Harvard) on help seeking with the AI tutor in a large intro course; check out her paper titled Debugging with an AI Tutor: Investigating Novice Help-seeking Behaviors and Perceived Learning, which generally shows that students didn’t view the chatbot they used as a primary source of debugging support, and also reported several unproductive uses.
I did make the next talk, in which Juho Leinonen (Aalto) presented first author Evanfiya Logacheva’s work on personalized programming exercises. They pre-generated exercises and then reviewed them manually, and examined their quality, students’ judgements of quality, and how students interacted with a system that allowed them to choose a context for their exercises. Instructors found them shallow; students found them clear and aligned with learning and liked having control over themes.
Aadarsh Padiyath (Michigan) presented the third talk on Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course. Through a combination of learning performance data, surveys, and interviews, they found that students internalize technological determinism narratives, and that these were shaped by peer use and perceptions of future jobs requiring use of LLMs. But students were very self-aware of the negative impacts on their learning and changed their behavior, demonstrating that how LLMs are used is very much socially shaped, not technologically determined.

After a lovely break talking about disciplinary cultures and moral hazards in fundraising, there was a short session on student challenges.

Björn Fischer (RheinMain University) gave a methods talk about data collection in research studies, especially from a GDPR lens. He talked about an encryption approach in which student advisors serve as data trustees. The team experimented with this model to see if there was an increased consent rate and trust. There was no evidence of increased rates of consent, but there was reduced anxiety about disclosure.
Parthasarathy PD (BITS Pilani) presented a study on the relationship between personality and collusion in learning settings. Building upon inconclusive prior work on plagiarism, they examined this relationship in an AI course that was 93% men by asking students to take a plagiarism pledge and then complete an assignment. Individuals with a high extraversion score demonstrated a greater tendency to engage in plagiarism, and those with higher conscientiousness scores tended to demonstrate less, consistent with prior work on the “fraud triangle” model.
Finally, Artturi Tilanterä (Aalto) talked about student misconceptions about Dijkstra’s algorithm, such as student-constructed rules, conflated concepts, and deviating concepts, all of which are generally invisible to students. They found, while interviewing students about their use of an algorithm visualization, that students had many confusions between spanning trees and priority queues and that students overlooked dynamic programming aspects of the algorithm.

After a mini-break talking about how problematic most plagiarism papers are, the last session had two papers:

The first, presented by Viral Kumar (Indian Institute of Science), investigated the extent to which beginner programming problems can be written that LLMs can’t answer. They built on an older idea of intentionally incomplete problems, as generative AI tools problematically produce code even in the presence of missing information. Their system allowed students to probe aspects of a question and found some evidence for feasibility.
David Smith (UIUC) presented a study of distractors in Parsons problems, replicating prior work that distractors improve learning in writing and debugging code. They also replicated why: distractors cause students to attend more deeply to the meaning of code in the problem. All of this was in the context of an intro Python course on list sorting and the effect size was significant but modest. They did a nice job of curating distractors from prior responses.

Overall, the first day was full of compelling work. I lamented the lack of work on equity, teachers, and schools, but there was a bit more critical work than usual, which made me happy. Fortunately, there was a lot more in the coming days.

Five attendees smiling in front of a table of raw meat and vegetables and boiling broth. — Hot pot adventure!

I quickly shifted my attention toward dinner and had a wonderful evening talking about the day over hot pot with several attendees where we playfully imagined futures of computing education.

Wednesday: teaching, learning, equity, and help

I woke up super early on Wednesday, wide awake at 4:30 am thanks to jet lag and caffeine. I made the most of it by reading about the history of Melbourne’s coffee scene and finding a cute cafe that opened at 5:30, where I found a cozy corner and participated in a 7 am Dean’s review meeting with our university’s Provost. (Never thought I’d write that sentence!) I then went off to a kitschy pancake house in the train station to rally my two PhD students Rotem and Eman, who were both presenting their research at the conference. (I never thought I’d write that sentence either! Take that, LLMs.)

The second day of the conference kicked off with more anniversary retrospectives. Mark Guzdial lamented that the free-wheeling ideas of the early days have been lost to a narrower focus on positivist “rigor”. Raymond Lister talked about the fool’s hat tradition, in which the hat was granted to the paper that pushed boundaries, and reflected on the progress of 20 years of research. Michael Caspersen talked about Marcia Linn’s keynote on how to teach programming and the importance of engaging other disciplines.

The first session focused on teaching:

Vidushi Ojha (Harvey Mudd) talked about instructional transparency about learning goals and values, building upon prior frameworks on the concept. Through linear regressions on survey responses from 11,000+ students, she investigated whether transparency was associated with self-efficacy, belonging, and identity. She found that women, first-generation students, and students with disabilities perceived less transparency, and that perceptions of transparency predicted self-efficacy and belonging, though this relationship was mediated by racial identity.
The second talk was by Naaz Sibia (U Toronto). She explored the effect of grouping Q&A forums in classes by levels of prior knowledge, and giving students agency over their anonymity. Through a thematic analysis of discussions and some quantitative analysis, they found in one class that students tended to hide their identity when they had less experience to protect their psychological safety; in another class it was the opposite. The quasi-experimental nature of the study made it very difficult to disentangle the many factors that shape student help-seeking behavior.
The final talk was by Parthasarathy PD (BITS Pilani) on a survey about teaching accessibility in India. They replicated a study from my lab on faculty perspectives on teaching accessibility, but in India, and with supplementary interviews. They received 75 responses and conducted 13 interviews, generally finding that fewer than 1% teach accessibility, and that those who do were more likely to be women and more likely to know someone with a disability. They also found that most accessibility topics were in HCI and Software Engineering courses, but that students mostly did not interact with people with disabilities in the course.

Eman in front of a podium and a large projector screen with a blue background and the paper title. — Eman kicks off her talk on assessment policies.

After a generous 45-minute break came a short two-paper session on equity and diversity.

The first speaker was my doctoral student Eman Sherif (University of Washington). She spoke on how assessment policies interact with students’ lives, finding through a series of interviews that policies have many unintended consequences, partly shaped by marginalization of students’ identities, and partly shaped by inherent ambiguities in policy.
Melissa Marchner and Maria Christensen (IT Copenhagen) talked about gender representation in teaching materials. They examined the high school informatics materials of 55 different classrooms in 16 different high schools in Denmark. They examined textbooks, audio/video, webpages, assignments, and documents, and counted pronouns, names, and people in videos. They found mostly representation of men (unsurprisingly, because why would we expect the men in charge to dismantle the Danish patriarchy at scale, even after all these decades of feminism?).

Before lunch, there was a flurry of lightning talks on works in progress. They included generative AI chatbots for secure programming education, artistic expression through AI literacy, examinations of concurrency errors when students use LLMs, sociotechnical AI literacy, and more.

After a tasty ramen with a fun group of attendees in which we dreamed about a scholarly world free of epistemic gatekeeping, we had a session on students including three papers that made it through ICER’s post-positivist sieve:

Arif Demirtaș (UIUC) talked about programming plans, and their analysis of learning curves examining practice opportunities and error rates, extending prior work on syntax learning. They found that to some extent, plans explain skill development, and that counting, sum, and filter plans were learned most successfully, unlike more complex plans. These results suggest that learning curves can be a reasonable way of analyzing plan learning.
Sverrir Thorgeirsson (ETH Zurich) used electroencephalography (EEG) to examine cognitive load in two different editing paradigms, programming by demonstration (PBD) and textual Python editing. They found that EEGs could help explain load and that the PBD editing facilitated improved learning but similar cognitive load. It wasn’t clear what to take away from the paper, since it had so many mixed motives, but it did push on interesting boundaries.
Jinyoung Hur (UIUC) talked about the motivations of conversational programmers (students who want to communicate fluently with developer colleagues, but not be them). Through a survey and an expectancy-value theoretical lens, they found that many were learning CS to be conversant only, that students underrepresented in computing are overrepresented in this group, and that their self-efficacy was lower than that of students aspiring to be developers, but they were largely focused on career goals.

After a lovely break of doctoral consortium posters and chatting about CS departments big and small, we had a short session on scalability:

Keith Tran (North Carolina State) talked about developers’ perspectives on scaling educational programming tools. Through a series of interviews, he revealed a lack of incentives post-publication, a lack of funding, significant maintenance effort, a reliance on industry partnerships, and the need to focus on teaching materials for adoption.
Carsten Schulte presented Lucas Höper’s paper (Paderborn) on data literacy and empowerment in K-12. Their study found that explicit instruction on explanatory models of data-driven systems empowered youth to have more self-reported agency in their everyday interactions with digital technology.

After a short break, I was session chair for the last session of the day on student support:

Shao-Heng Ko (Duke) presented on help-seeking behaviors, finding through a survey that the pace of students’ individual help-seeking behaviors is relatively stable across courses, but that their social behaviors are highly context dependent.
Jan Vahrenhold (Münster) presented on regulation in group work, finding that group work self-efficacy mediated the relationship between socially shared regulation and team performance.

After the session, and a short break, we went to the State Library Victoria for our reception, across the street from our RMIT building. Afterwards, I ventured to the night market for treats and shopping.

A Greek style facade of columns with a staircase. — Our library venue across the street.

Thursday: un/critical pondering

The first session of the morning was on teaching practices:

Alice Chung (UC San Diego) talked about media artists teaching computing workshops, finding through a series of interviews with teachers that most were motivated by learning, sharing, and cultivating new aesthetic practices, and that most focused on teaching the repurposing of computing beyond its original intent (e.g., web scraping) and on ways of organizing artist-led technology.
David Torres-Mendoza (UC Santa Cruz) talked about undergraduate research pathways. They analyzed an undergraduate, low-stakes research orientation program and found that it caused students to reconsider career and education goals, particularly by focusing their attention on the nature of research and the benefits of contributing to research.
Brandt Redd (Utah State) talked about project work in cybersecurity education. He studied an informatics cybersecurity capstone course at UC Irvine that included several explicit learning goals around risk, security, ethics, and user experience. They demonstrated learning outcomes.

After a break of careful coordination around journal editing and future conference chairing, we returned for a couple papers on equity and diversity:

Amreeta Chatterjee (Oregon State) presented on inclusive debugging in online course materials. Her work used a rule-based inclusivity defect detector to find workflows that don’t support diverse problem solving styles (e.g., lack of step by step styles). They shared this tool with faculty and found that instructors were able to understand and fix the issues in learning materials.
Rosalinda Garcia (Oregon State) presented a study of students’ behavior shifts after learning about inclusive design. They examined ways of embedding inclusive design into non-HCI courses, and found that many students who received instruction on inclusive design had fewer information-processing inclusivity issues across all GenderMag personas, despite not having an explicit incentive to address inclusivity bugs.

We then had another batch of lightning talks about spatial skills, teaching ecosystem scaling, low-code learning with LLMs, teaching accessibility, and software design specification feedback.

Murtaza in a pink button down in front of a white slide starting with the title “Using Benchmarking”. — Murtaza kicks off his talk on LLM assessment benchmarks.

Next, I had lunch with RMIT HCI faculty George Buchanan and Dana McKay. We chatted about computing education, information schools, decolonization, and more. I came back in time for the generative AI session:

Murtaza Ali (University of Washington) presented on a benchmarking study of how well LLMs perform on CS assessments, and ways of tracking this over time by using the SCS1 and BDSI concept inventories. They used IRT to model question difficulty and found that nearly all LLMs did worse than an average student on the SCS1 and most did better than average on the BDSI. They examined why and found that in many cases, it was LLMs choosing the same distractors.
James Prather (Abilene Christian) reported on a study of benefits and harms of LLMs to student learning. Through participant observation, eye tracking, and interviews (replicating a prior study on metacognition), they found that students with no metacognitive difficulties were able to use GitHub Copilot to accelerate their problem solving; for students who did have metacognitive difficulties, Copilot exacerbated those difficulties and introduced new ones, amplifying disparities in learning outcomes and harming self-efficacy. Interestingly, the evidence supported many of the predictions in my essays on LLMs (More than calculators, LLMs will change programming a little), but also demonstrated that students with strong metacognitive skills tend to be helped (consistent with generally all prior work on learning technologies: they amplify).
James Skripchuk (North Carolina State) investigated students’ intentions around using web resources and generative AI. He argued that LLMs are not so different from web resources, and investigated students’ choices around whether to use web or generative AI resources. Through a scenario-based survey, they found that students generally assess the utility of resources, while accounting for plagiarism policing around use, but also peer opinions about resource use.

Benji with a mic and Rotem in the background in front of the podium with a white slide on stage. — My former PhD student Benji kicks off the last session of the conference, where my student Rotem spoke on moral prisms.

After a lovely and generous poster session and break, we headed into the final session on ethics education, chaired by my former doctoral student Benji Xie (now at Stanford, soon University of Denver):

The first speaker was my wonderful advisee Rotem Landesman (University of Washington), who spoke about engaging ethics education pedagogy when teaching computing ethics. She shared a study in which she combined computing and philosophy education (philosophy for children in particular), examining youth’s ethical sensemaking.
Michelle Tran (University of Colorado Boulder) talked about ethics in computing education group projects, particularly from the perspective of CS obsessions with workforce development. Through semi-structured focus groups with students, she found that students tended to center cynicism about tech companies and tended not to see ethics as a legitimate learning outcome for group learning.

After the session, we spent some time as a community to recognize exemplary work, reflect on the conference, and look forward to next year. There were thank yous for the outgoing chairs Paul Denny and Margaret Hamilton. There were then awards for Best Paper (Melissa Chen’s work on self-assessments), Honorable Mention (Vidushi Ojha’s paper on instructional transparency, Rotem’s paper on ethical sensemaking, and Keith Tran’s work on education tools), and Lasting Impact (Colleen Lewis’s The importance of students’ attention to program state: a case study of debugging behavior, from 2012). Congratulations to all of the amazing students whose work was so well deserving of recognition! Paul then handed off the program chairing to Leo Porter (UC San Diego) for his second year, along with incoming program chair Neil Brown (King’s College).

As always, I found ICER to be a rich, lively, mutually supportive place of growth and humility. There is such a wonderful thread of curiosity through the community, a passion for change and growth, and an eager engagement of research methods as a means to that evolution. It certainly has its problems: continued epistemic gatekeeping, uncritical inquiry, and oversight of key discourses in learning sciences and education research. And by no means is it on the leading edge of progressive, equity-centered thinking in scholarship on learning. But it is a place that I continue to see shift, respond, and adapt in powerful ways that many academic communities do not. I look forward to next year, where we will continue to nudge ourselves forward in Charlottesville, Virginia, USA (and in Sweden the following year).

After the conference, I ventured out into the city with other queer attendees for some food at a rooftop greenhouse bar and a one-actress play about codependency at an intimate community theater. It was a lovely evening of decompressing with new and old friends, listening, laughing, and creating space for each other, playing and plotting for the world we all deserve. It was the perfect capstone to an already warm week in Melbourne’s cozy winter.


Written by Amy J. Ko
