We need to learn how to teach machine learning

Published in Bits and Behavior, Aug 21, 2017

*Jackson Pollock’s “Untitled.” I think it represents most students’ understanding of machine learning after taking a course.*

This is a revised version of a position paper I wrote for the ICER 2017 workshop “Learning about Machine Learning.”

Knowledge of how to apply machine learning to products is in high demand but low supply. Journalists write endlessly about it, employers want engineers who have it, college students want to learn it, and yet almost no one actually knows it.

Unfortunately, knowledge of how to teach machine learning effectively is also scarce, and has been for some time. For example, in my first year of graduate school at Carnegie Mellon in 2002, nearly everyone in my doctoral cohort wanted to take machine learning to apply it to our research, but we soon learned that few of the faculty knew how to teach it well. One professor regularly came to lectures with 100 slides, 98 of which were mathematical proofs, and delivered them in 60 minutes. Another professor who co-taught the course spent the next three days of class trying to explain the proofs. The students who thrived in that class already knew the material; the rest of us learned little of either practical or theoretical value.

Some faculty responded by creating more practical courses, but they too struggled to convey what machine learning is and how to use it to solve problems. They focused more on practical skills, such as setting up tooling, understanding a confusion matrix, and preparing data. While these were necessary low-level skills for applying machine learning, fundamental misconceptions lurked underneath those skills, leading my peers and me to use (and sometimes unintentionally abuse) machine learning to advance our research agendas. For example, we assumed the default configurations of machine learning algorithms were suitable for our data. We casually selected features without understanding how they were modeling a phenomenon. We believed that once a classifier was successfully built, it was “correct” in the way procedural programs are correct, despite the broad set of possible data-driven flaws.
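To make that last misconception concrete, here is a minimal sketch in plain Python. It is not drawn from any of the courses described above; the labels are made up and the helper names (`confusion_matrix`, `accuracy`, `recall`) are hypothetical. It shows a "classifier" that builds and runs without error and scores 95% accuracy, yet never finds a single positive example, which only becomes visible when you read the confusion matrix.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) counts for binary labels, where 1 is the positive class."""
    counts = Counter(zip(y_true, y_pred))
    tp = counts[(1, 1)]  # actual positive, predicted positive
    fp = counts[(0, 1)]  # actual negative, predicted positive
    fn = counts[(1, 0)]  # actual positive, predicted negative
    tn = counts[(0, 0)]  # actual negative, predicted negative
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that match the true label."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Fraction of actual positives that were found; 0.0 if there are none."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up, imbalanced data (an assumption for illustration): 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# A "classifier" that always predicts the majority class.
y_pred = [0] * 100

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
rec = recall(tp, fn)
print(f"accuracy={acc:.2f} recall={rec:.2f}")  # accuracy=0.95 recall=0.00
```

Nothing here raises an error, and the headline number looks good. Whether the model is "correct" depends entirely on the data and on what the model is supposed to detect.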

Now that machine learning is reaching the broader undergraduate masses through MOOCs, through university classes, and even indirectly through consumer products, I fear these same misconceptions exist, but at a much grander scale. We still know little about what students need to know, how to teach it, and what knowledge teachers need to have to teach it successfully.

To correct this, I argue that we need to discover the pedagogical content knowledge (PCK) necessary for teaching concepts in machine learning. PCK about machine learning includes:

Useful representations for concepts in machine learning
Effective analogies, examples, and explanations of machine learning
Knowledge of which concepts in machine learning are difficult and why
Knowledge of conceptions that learners bring to learning machine learning
Methods of informally assessing knowledge of machine learning concepts
Common mistakes that learners make when applying machine learning

If we knew these things, and we had ways to train teachers about these things, I hypothesize that students would learn how to apply machine learning more efficiently and effectively.

One could argue that students and teachers are successful without the knowledge above. After all, students are graduating and applying machine learning to products. I would argue that there are actually very few students applying machine learning to products with an expert-level understanding of how to use, configure, and develop for machine learning algorithms. This is likely why the few who do have this expertise are paid so handsomely. Rather than have companies fight over all-too-scarce experts who have successfully acquired knowledge of how to apply machine learning, we need to discover the PCK necessary for helping teachers help many more students successfully learn.

Arguing for the importance of PCK is not new. But, perhaps surprisingly, machine learning PCK is possibly even more important than PCK for broadly taught subjects such as math, science, and writing. This is because the products that people make with machine learning are introducing surprising and problematic biases into a range of decisions in law, ethics, automation, and business. A lack of understanding about machine learning among even a small population of machine learning software developers can have a profound impact on billions of people globally. Helping the 1,000 faculty better teach the next 100,000 developers will impact the next 1,000,000,000 people engaging directly or indirectly with machine-learned systems.

How can we begin to investigate machine learning PCK? We need to:

Study learners’ experiences in learning machine learning.
Rigorously and carefully extract the barriers students face in learning.
Evaluate how well explanations of ML concepts produce understanding.
Discover the prior conceptions that conflict with ML learning.
Invent ways that students and teachers can reliably recognize progress in ML learning.

This will require a broad range of researchers and research activity to achieve, not to mention a lot of funding to support these researchers. With the scarcity of funding for similar questions about introductory computing concepts, generating funding to investigate advanced concepts in machine learning could be quite difficult. Now is the time to build coalitions of faculty who teach machine learning, and applications of it in data science and other areas, to investigate this important but nascent research area.

Unfortunately, there are some major barriers to doing this research:

There’s very little NSF funding for research on learning advanced concepts in computing.
Microsoft, Google, Apple, Facebook, and other large companies with the ability to fund or do research on the learning of machine learning don’t fund computing education research. They only fund new algorithms (that future engineers will fail to use successfully because we don’t know how to teach them).

All that said, I encounter dozens of doctoral students every year who would love to study this topic but cannot, for lack of funding and interest from the broader academic community. Let’s all agree that understanding how to successfully teach machine learning (and how to teach anything in computing) is of fundamental importance if we’re to harness AI effectively.


Written by Amy J. Ko