A Decade of Demographics in Computing Education Research

Benji Xie · Published in Bits and Behavior
Aug 18, 2022 · 8 min read

We read 510 papers to understand how computing education research (CER) wields demographic data and identified considerations for more rigorous and equitable research.

Demographic data is critical to computing education because it helps us understand how people from diverse groups experience similar phenomena in different ways. For example, demographic data could help us identify whether a pedagogical technique serves students of different genders fairly. But demographic data is also problematic, with its groupings and categorizations historically used as an oppressive tool. Continuing the previous example, demographic data could fail to consider non-binary genders, effectively erasing non-binary students’ experiences from consideration. Demographic data is simultaneously a tool we can use to support equity and rigor in computing education and a tool that can perpetuate harms and biases. So it’s important we think about how we wield it!

[Figure: a directed process from “which populations studied” to “how data collected” to “how data reported” to “how data used.”]
Researchers make decisions about demographic data throughout these four phases.

Many decisions go into the process of using demographic data. In computing education research (CER), we can think of this across four phases:

  1. Which populations do researchers decide to investigate?
  2. How do researchers decide to collect demographic data?
  3. How do researchers decide to report demographic data?
  4. How do researchers decide to use demographic data?

To explore what decisions CER researchers made, we analyzed 510 papers published at 12 venues from the past decade (2012–2021). Here’s what we found:

CER overwhelmingly investigated post-secondary/university students.

Given that most computing education researchers are affiliated with a university, it comes as no surprise that most CER papers study university students, labeled “older learners: formal” in the figure below. What is surprising is the sheer magnitude of the difference: 3 in 5 CER papers studied university students. And given that most people will not learn computing in a university setting, this is a mismatch between the point of investigation and learners’ actual experiences!

[Figure: horizontal bar chart of the number of papers that studied each population. “Older learners: formal” leads by far with 304 papers, followed by “young learners: formal” (80) and “educators: formal” (68); the remaining categories range from 2 to 49 papers, most toward the lower end.]
Codes reflecting the frequency at which analyzed papers studied different populations. Total number of codes (604) exceeds the number of papers analyzed (510) because 69 papers studied multiple populations.

While post-secondary/university students were investigated more frequently than students at the pre-K, primary, and secondary levels (labeled “young learners” in the chart above), the reverse was true for educators. Educators in formal pre-K, primary, and secondary learning contexts (educators: formal, 68 papers) were investigated in 3.5 times more papers than post-secondary/university educators (educators: post-secondary, 19 papers). This may reflect a growing interest in training and certifying pre-K, primary, and secondary level computing educators. In the United States, a teacher must complete a post-secondary degree or professional development to be able to teach a primary school student how to drag and drop Scratch blocks. But no such teacher training is required to teach university students machine learning.

We hope to see future research that explores how learners, educators, and professionals engage with computing education across formal, informal, and online contexts!

Most CER papers left unclear how researchers collected demographic data.

Consider a multi-ethnic student with a hidden disability in a predominantly white introductory computing class. The demographic data they chose to share with researchers would almost certainly differ from what their teacher would report on their behalf. Or consider a non-binary or trans student filling out a survey question about “gender” where the two options are “female” and “male.” What they would say if given the option to write any response and what they would say when constrained to the survey would likely differ drastically. How demographic data is collected is important to understanding how it is reported and used.

[Figure: horizontal bar chart of the number of papers that collected demographic data in various ways. “Unclear” leads by far with 346 papers, followed by “self-report: custom” with 147; the remaining categories range from 4 to 46 papers.]
How papers collected demographics. Total number of codes (580) exceeds the number of papers analyzed because 65 papers collected demographics 2–3 ways.

But 2 in 3 CER papers left it unclear how at least some of their demographic data was collected, as shown in the figure above. After that, about 30% of papers used custom instruments to have participants self-report demographic data (self-report: custom, 147 papers), but researchers rarely shared these instruments (e.g. the surveys they created). Pairing that with the low use of existing instruments (self-report: existing, only 14 papers), we find that computing education researchers tend not to reuse or enable the reuse of data collection instruments.

School/university enrollment data was a common source of demographic data for the 5% (23) of papers that relied on pre-existing data. In at least one instance, this limited the reporting of demographic data because gender had been recorded only as female or male.

We found that exemplary papers had robust descriptions of how researchers collected demographics in justified, transparent, and responsible ways.

CER papers barely reported many important demographic attributes

The figure below shows how many CER papers reported demographic attributes for all study participants (yes, fully), for some but not all participants (yes, inc), or for no participants (not at all). For example, let’s look at gender. Of the 510 papers in our sample, 163 reported the gender of all participants, 71 reported it for some but not all participants, and the remaining 276 did not report gender for any study participants.

[Figure: a series of 11 horizontal bar charts showing how CER papers reported demographics, one per attribute we coded for; the first 10 use the categories “yes, fully,” “yes, inc,” and “not at all.” Most attributes were rarely reported (large “not at all” bars); exceptions are geographic location (reported in 336 papers), age/grade (296), gender (234), and major/program (154).]
How CER papers reported 11 demographic attributes. The frequency of partial/incomplete reporting (“yes, inc”) was especially concerning!

We analyzed 11 demographic attributes in all. Here are some high-level takeaways for each:

Gender reporting reflected an assumed gender dichotomy and conflated gender with sex.

Gender was among the most commonly reported demographic attributes, in part because of the growing emphasis on broadening participation in computing. Initiatives to broaden participation often focus on women, reflecting an assumed gender dichotomy between women and men.

An assumed gender dichotomy was also prevalent in the reporting of demographic data. This was especially true for partial reporting of gender, where only the proportion of women/females/girls or of men/males/boys was reported. Incomplete gender reporting implicitly reinforces binary gender norms and contributes to erasure: it implies that, given information about participants of a single gender, readers can infer the identities of the unlabeled participants (typically assumed to be the “other” binary gender).

A friendly reminder that gender is a social construct (e.g. non-binary, transgender, woman) and sex is a biological construct (e.g. female, male). While both impact one’s lived experiences, social science and education researchers should typically focus on the social construct!

Exemplary reporting of gender focused on normalizing non-binary genders by allowing participants to self-report and remain authentic to their chosen labels.
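
As one illustration of what such self-report can look like in practice, here is a minimal sketch of an inclusive gender question. It is a hypothetical example of ours; the option labels, field names, and validation logic are assumptions, not an instrument taken from any paper we reviewed:

```python
# Hypothetical sketch of an inclusive, self-reported gender question.
# Option labels and structure are illustrative assumptions, not a validated instrument.

GENDER_QUESTION = {
    "prompt": "What is your gender? Select all that apply.",
    "options": [
        "Woman",
        "Man",
        "Non-binary",
        "Prefer to self-describe",  # paired with a free-text field
        "Prefer not to say",
    ],
    "allow_multiple": True,  # participants are not forced into a single box
    "self_describe_field": True,
}


def record_response(selections: list[str], self_description: str = "") -> dict:
    """Store a participant's response verbatim, without collapsing it into a
    binary or discarding write-in labels."""
    unknown = [s for s in selections if s not in GENDER_QUESTION["options"]]
    if unknown:
        raise ValueError(f"Unexpected options: {unknown}")
    return {
        "selections": selections,
        # Keep participants' own words so reporting can stay authentic to them.
        "self_description": self_description.strip(),
    }


if __name__ == "__main__":
    print(record_response(["Non-binary", "Prefer to self-describe"], "genderfluid"))
```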

Race and ethnicity were reported in only 1 in 5 papers

Terms to describe race and ethnicity include Black, Indigenous, and Hispanic.

Racial categories typically followed political and historical trends, such as using US census-defined categories. Partial reporting of race or ethnicity (e.g. “83% Caucasian”) left readers making assumptions about the race of the remaining participants, further minoritizing and erasing non-dominant identities.

Exemplary reporting of race or ethnicity went beyond the categories to provide descriptions that reflected the diversity of cultural experiences. For example, one paper supplemented racial categories with languages spoken at home to illustrate diversity within seemingly homogeneous categories.

Ability, socio-economic status (SES), language fluency, and family/household info are important but barely reported.

Physical, mental, and social (dis)abilities, SES, fluency in the language of instruction, and family/household context are all important factors when considering students’ learning experiences. But these demographic attributes were barely reported: no more than 17% of papers reported any information about each of these four attributes. These omissions reflect assumptions of ableism, English-language capability, and privilege that can erase or ignore the experiences of many diverse groups.

We hope future papers will consider less commonly reported yet important demographic attributes such as ability, SES, language fluency, and family/household information!

Aggregate terms were left ambiguous

Of the 510 papers we analyzed, 1 in 5 used an aggregate term such as underrepresented, diverse, at-risk, BIPOC, people with disabilities, or non-STEM. These broad terms can serve pragmatic purposes (e.g. privacy, solidarity), but they can also erase individual identities and experiences. Tiffani L. Williams wrote a CACM post on why the aggregate term “Underrepresented Minority” (URM) is considered harmful, racist language.

Most papers that used aggregate terms did not define or disaggregate them, leaving it unclear which demographic groups the terms referred to. For example, one paper described their study participants as “homogenous,” while another described theirs as “heterogeneous.” 🤔

Not only does this ambiguity impact the clarity of a paper, it also requires readers to assume the terms’ meanings, which can implicitly perpetuate norms about dominant and marginalized groups in computing.

Exemplary reporting of aggregate terms would include a definition and disaggregation of the term, as well as a justification for why the term was used.
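
To make that concrete, here is a small hypothetical sketch of what defining and disaggregating an aggregate term could look like when reporting participants. The term, groups, and counts below are invented for illustration, not drawn from any paper we analyzed:

```python
from collections import Counter

# Hypothetical participant records; all labels and counts are invented.
participants = [
    {"id": 1, "race_ethnicity": "Black"},
    {"id": 2, "race_ethnicity": "Latina/o/x"},
    {"id": 3, "race_ethnicity": "White"},
    {"id": 4, "race_ethnicity": "Native Hawaiian or Pacific Islander"},
    {"id": 5, "race_ethnicity": "Black"},
]

# Define the aggregate term explicitly: state which groups it covers...
AGGREGATE_TERM = "students from racial groups minoritized in computing"
AGGREGATE_GROUPS = {"Black", "Latina/o/x", "Native Hawaiian or Pacific Islander"}

# ...and report disaggregated counts alongside the aggregate total.
counts = Counter(p["race_ethnicity"] for p in participants)
aggregate_total = sum(n for group, n in counts.items() if group in AGGREGATE_GROUPS)

print(f"{AGGREGATE_TERM}: {aggregate_total} of {len(participants)} participants")
for group, n in sorted(counts.items()):
    print(f"  {group}: {n}")
```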

CER papers used demographics for descriptive or contextual purposes, rarely to address validity concerns

CER papers used demographic data for many different purposes, as shown in the figure below. They could use it to motivate the paper (e.g. to study the experiences of computing students with disabilities). Papers could use demographic data to describe their study participants (e.g. the age of participants) or to describe the study population to provide context (e.g. the percentage of students at a school whose families had low enough SES to be eligible for free/reduced lunch). Demographic data could also be used for analysis, such as evaluating how an intervention worked for participants of different genders. Finally, papers used demographic data to consider the validity of their study, such as identifying limitations because their study did not include a particular demographic group.

[Figure: horizontal bar chart of the number of papers that used demographics in different ways. “Description” leads with 420 papers, followed by “contextualization” with 268; the remaining categories range from 13 to 152 papers, most toward either end of that range.]

We were particularly surprised that only 14% (70) of papers considered demographic data regarding the validity of their study, even though 94% (478) of studies were single-site studies, overwhelmingly in Western, Educated, Industrialized, Rich, and Democratic (WEIRD) countries. This suggests a rampant and unrecognized WEIRD bias in CER.

We hope that CER papers will normalize considering demographic data as part of validity arguments to foster more rigorous and equitable research.

Conclusion: How to make deliberate and explicit decisions about demographic data

We conclude by identifying four considerations that can foster more responsible wielding of demographic data for rigorous and equitable computing education research:

  1. When choosing populations, consider who is and isn’t there, and why.
  2. When collecting demographics, use justified, transparent, and responsible methods.
  3. When reporting demographics, recognize biases and make assumptions explicit.
  4. When using demographics, provide details to support interpretation and engage with broader contextual factors.
[Figure: the four considerations for wielding demographic data, as listed above.]
To wield demographic data for more equitable and rigorous research, we must be considerate.

To learn more, read the open-access paper or find more information on my website.

Thank you to my awesome co-authors/labmates/mentors: Alannah Oleson, Dr. Jean Salac, Jayne Everson, Megumi Kivuva, Dr. Amy Ko.

Full paper citation:

Alannah Oleson, Benjamin Xie, Jean Salac, Jayne Everson, F. Megumi Kivuva, and Amy J. Ko. 2022. A Decade of Demographics in Computing Education Research: A Critical Review of Trends in Collection, Reporting, and Use. In Proceedings of the 2022 ACM Conference on International Computing Education Research — Volume 1 (ICER ’22). Association for Computing Machinery, New York, NY, USA, 323–343. https://doi.org/10.1145/3501385.3543967


Benji Xie (Bits and Behavior): I design equitable and critical human-data interactions. Embedded Ethics Fellow, Stanford HAI, Ethics in Society. PhD, UW iSchool. Prev MIT CS, Code.org.