Anki Design Study: Advanced Machine Learning Concepts

Eric 'Siggy' Scott
Published in Euthyphroria
Dec 23, 2020 · 5 min read

Here are a couple of cards that came up for my biweekly design review in my Anki knowledge base:

Whew! What a pair of doozies! These beasts date from back when I thought drawing things by hand was the best way to create custom diagrams for Anki. Bad choice!

Besides the wonky graphics, these cards have a couple of smells:

  • Each answer is complex, in the sense that I need to recall several parts. As we’ve seen with equations, complex objects often get easier if we ask about their parts one-at-a-time and/or with friendly graphics.
  • The cards are a little more isolated than I’d like (and this makes them hard to answer). Tasks and domains always go together — so we could benefit from a card explaining how they interact.

Learning Advanced Algorithms

These cards belong to what I call my “Algorithms and CS” deck. This deck is devoted to abstract concepts in computing that stand independently of any particular programming language, operating system, etc. I like this deck because the focus is on understanding concepts rather than memorizing syntax (syntax is always somewhat arbitrary!).

This card also belongs to one of the trickiest domains to apply Anki to: cutting-edge research. Creating flash cards from well-trodden textbooks is generally pretty easy, because the community has come up with really clear, canonical, and well-motivated definitions and explanations for things. But research is a wild west: definitions conflict, you’re never quite sure what’s important and what’s just noise, and there is very little clear pedagogical material.

In this case I’m trying to capture some fundamental concepts from a famous survey of transfer learning (by Pan and Yang, 2010), i.e., of machine learning methods that reuse information from one task to improve their performance on a different, but related task. If you’re interested in this area, here’s a post that covers everything you’d want to know and more.
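To make the core idea concrete, here’s a toy sketch of knowledge reuse between related tasks. This is my own illustration, not an example from the survey, and the blending weight `alpha` is an arbitrary choice for demonstration:

```python
# Toy transfer learning: reuse what was learned on a data-rich source
# task to improve an estimate on a related, data-poor target task.
import statistics

source_data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]  # plenty of source examples
target_data = [12.5]                               # only one target example

# Learn a parameter on the source task...
source_mean = statistics.mean(source_data)

# ...and reuse it as a starting point for the target estimate, rather
# than trusting the lone target sample (a crude form of shrinkage).
alpha = 0.5  # how much to trust the transferred knowledge (assumed)
target_estimate = alpha * source_mean + (1 - alpha) * statistics.mean(target_data)
print(round(target_estimate, 2))  # -> 11.25
```

Real transfer-learning methods are far more sophisticated, of course, but the shape of the trade is the same: information from the source task biases the target solution when target data is scarce.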

Understanding the basic concepts of a domain and a task is important, because it allows us to understand several different types of transfer learning that Pan and Yang (among others) cover in their classification schemes. I have cards for some of these, and while they’ve got their own problems, part of the issue is that I haven’t really internalized the basic notions of a “domain” and “task” that they build upon:

Examples of other cards that depend conceptually on Domains and Tasks. I’ll need to refactor these too.

A Better Diagram

Let’s start by addressing those hideous images. Nowadays, when I can’t find a good, specific image for my cards with a Google Image search, I prefer to create custom diagrams in Google Slides.

While I’m at it, I’m actually going to collapse my two original diagrams into one. This allows me to express the connection between a domain and a task.

A custom diagram that shows both the “domain” and “task” components of a machine learning problem.

This diagram is more complex, but it allows me to juice it for all it’s worth by reusing the same diagram for several related concepts.
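The two components the diagram depicts can also be sketched in code. Pan and Yang define a domain as a feature space together with a marginal distribution P(X), and a task as a label space together with a predictive function f(·); the Python names below are my own, and the distributions are just descriptive strings for illustration:

```python
# A rough sketch of Pan and Yang's decomposition of a machine learning
# problem into a domain (feature space + P(X)) and a task (label space
# + predictive function f).
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Domain:
    feature_space: str   # e.g. "bag-of-words over English product reviews"
    marginal_dist: str   # stands in for P(X), the distribution of inputs

@dataclass
class Task:
    label_space: List[str]          # the set of possible labels Y
    predict: Callable[[str], str]   # the learned function f: X -> Y

# A toy sentiment-analysis problem = (domain, task):
domain = Domain("bag-of-words over English product reviews",
                "P(X) over product review documents")
task = Task(["pos", "neg"],
            predict=lambda doc: "pos" if "great" in doc else "neg")

print(task.predict("a great phone"))  # -> pos
```

Seeing the problem split into two named objects like this is exactly the decomposition the diagram is meant to make memorable.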

Analyzing the Diagram

Now for one of my favorite Anki strategies: drawing boxes around parts of a diagram to analyze it into various interacting components.

Of course, this isn’t the only way to define and explain a “machine learning problem” in the abstract. But it’s a good one, and I can choose to make it my personal “canonical” definition in Anki. If that makes me uneasy, I could also qualify the prompt with “according to Pan and Yang” or “in a typical transfer learning context”:

At this point, we’re done. We’ve met our goal for the design review: fixing at least one card.

For completeness, though, I’ll show how I’d continue to improve the related cards that depend on these concepts. In practice, you needn’t do this all at once: leave some for the next refactoring session so that you don’t get overwhelmed!

Variations for Follow-on Concepts

Now I can address the cards I really care about: the ones that use more basic concepts to define variations of transfer learning.

And we’ll add one more card pairing the two together, just to help me remember that there are two of these to keep separate in memory:

Now the ideas are super clear, but I expect I may still get confused about the terms. Pan and Yang don’t explain why one of these is called “inductive” while the other is called “transductive.”

A quick Wikipedia search teaches me, however, that these terms are standard, having been introduced by Vapnik in the 1990s to differentiate between algorithms that compute rules for labeling new instances (induction) and algorithms that reason directly about a given set of new instances from the same domain (transduction).
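The distinction is easy to see in code. Here’s a minimal sketch of my own, using a toy 1-nearest-neighbor labeler on 1-D points: induction returns a reusable rule, while transduction only produces labels for the specific unlabeled instances it was handed:

```python
# Induction vs. transduction, illustrated with toy 1-nearest-neighbor.

def induce_rule(train):
    """Induction: learn a reusable rule that can label ANY future instance."""
    def rule(x):
        # Label x with the label of its nearest training point.
        nearest = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return rule

def transduce_labels(train, test_xs):
    """Transduction: reason directly about a FIXED set of unlabeled
    instances; no general-purpose rule comes out the other end."""
    return [min(train, key=lambda pair: abs(pair[0] - x))[1] for x in test_xs]

train = [(0.0, "neg"), (1.0, "pos")]

rule = induce_rule(train)   # induction yields a function...
print(rule(0.9))            # ...applicable to any new point: pos

print(transduce_labels(train, [0.1, 0.8]))  # labels just these: ['neg', 'pos']
```

In this toy the two computations coincide; the conceptual difference is in what gets produced. Real transductive algorithms (e.g., label propagation) go further and exploit the structure of the specific test set they are given.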

Knowing what “transductive” and “inductive” learning mean will make the above cards easier to keep straight. Adding a couple of new cards to cement these last two concepts will reduce our reliance on rote memorization to recall the names of the different types of transfer learning, as well as the relationships between concepts (e.g., when I say “domain adaptation is a kind of transductive transfer learning,” it helps to have an intuitive sense of what the word “transductive” means!).
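Those relationships can themselves be written down as structure rather than prose. The sketch below is my paraphrase of how I understand Pan and Yang’s taxonomy, not a quote from the survey, so treat the exact wording and examples as my own notes:

```python
# My working summary of Pan and Yang's (2010) taxonomy of transfer
# learning settings (a paraphrase for personal notes, not a quotation).
taxonomy = {
    "inductive transfer learning": {
        "when": "the target task differs from the source task",
        "examples": ["multi-task learning", "self-taught learning"],
    },
    "transductive transfer learning": {
        "when": "same task, but the source and target domains differ",
        "examples": ["domain adaptation", "covariate shift"],
    },
    "unsupervised transfer learning": {
        "when": "the target task differs and no labeled data is available",
        "examples": ["transfer clustering"],
    },
}

# The relationship I keep forgetting, stated as data:
assert "domain adaptation" in taxonomy["transductive transfer learning"]["examples"]
```

Having the taxonomy in one place like this makes it obvious which axis each named variant sits on: whether the *task* changes, or only the *domain*.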

But I won’t show those cards here: you get the idea by now. We’ve easily completed two or three sessions worth of refactoring at this point! Time to move on to the parts of our life (if any) that don’t revolve around Anki!

Eric 'Siggy' Scott is an AI researcher, language enthusiast, and modern Stoic practitioner.