30 Days, 30 Papers: Machine Learning Writing Month

Cody Marie Wild
Dec 2, 2017


A month ago, I embarked on an odd but ultimately fruitful quest. In prior years, I’d participated in the internet project NaNoWriMo (National Novel Writing Month), in which you commit to writing roughly 1,667 words a day, every day, for the 30 days of November. I like the mechanics of this challenge: the way it gets you over the hump of perfectionism and laziness into Actually Doing The Thing. But I’m not really much of a fiction writer, and I realized there was something I wanted that same structured motivation for: reading and summarizing new machine learning literature.

And so, Machine Learning Writing Month was born. The terms of the challenge: write at least 500 words, summarizing an ML paper or concept, every day in November. The lower word count (relative to NaNoWriMo) was to account for the time necessarily spent reading and comprehending the subject matter itself.

The acronym was slightly ill-chosen (MaLeWrimo is a bit uncomfortably on the nose when it comes to ML-community gender dynamics), and the posts definitely vary in quality (I feel some desire to defensively label the ones written entirely after 1 am), but I found the challenge to myself really powerful, and I’m very proud to say I kept it up to completion.

I’ve collected a moderately organized list of the posts at the end of this piece, but first I want to share some overall reflections and strategies I came to during the process.

  • I found it absolutely necessary to put notes in my own words, instead of copying a phrase used by the paper’s authors. If you haven’t done the work of actually mapping the words to concepts, you haven’t internalized enough to be trusted to explain the idea. One workaround I found, for cases where I wanted to keep track of what was said but didn’t want to deceive myself into believing I understood more than I did, was to always use quotation marks when I pulled sentences directly from the paper. Quotes were my signal to myself that “this is a thing being claimed, but I haven’t yet understood what it means, or why it’s true.”
  • Along similar lines, deploying humor in my notes was surprisingly helpful. I don’t have a great theory on this one, but my best guess is that injecting silliness made me feel more like I was talking to a friend, which put me in the frame of mind of explaining (to myself) rather than simply transcribing.
  • Admitting a lack of understanding is enormously freeing. If you force yourself to stay in the register of “fully informed expert”, you push yourself to one of two extremes: either not sharing any understanding of an idea until you’ve grasped it fully, or pretending to understand by simply repeating an assertion from the paper, which doesn’t do your readers any favors. Allowing myself to say “yeah, I didn’t really grok this bit” was functionally necessary when I’d given myself a time limit and needed to sleep eventually, and also, I think, a useful habit to have built.
  • Repetition is a gift to your reader. One frequently frustrating pitfall in papers was explaining the architecture only once, in one set of words, leaving you to hunt around for clues to build yourself a full picture. When you’re writing something down, you have the full concept in your mind, and it can feel sufficient to give just one framing of it; one lower-dimensional projection, as it were. But for someone entirely new to your idea, one framing might not be enough: it might not tell them what’s novel or what’s relevant, and you might have used a confusing term in your first explanation that a second one would remedy. Ensembling slightly orthogonal versions of a thing is a foundational strategy of machine learning practice (see the toy sketch after this list); it should be foundational to machine learning explanations as well.
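
To make the ensembling half of that analogy concrete, here’s a minimal toy sketch (entirely my own illustration, with made-up numbers, not drawn from any of the papers below): several noisy predictors of the same signal, each wrong in its own way, average into something more accurate than any one of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# The "ground truth" all the stand-in models are trying to predict.
true_signal = np.sin(np.linspace(0, 2 * np.pi, 50))

# Three slightly different "models": each sees the same signal through its
# own independent noise, standing in for explanations (or networks) that
# each get the idea partly, but not fully, right.
predictions = [true_signal + rng.normal(scale=0.3, size=50) for _ in range(3)]

# Averaging the slightly orthogonal errors cancels out a chunk of each
# individual model's mistakes.
ensemble = np.mean(predictions, axis=0)

for i, pred in enumerate(predictions):
    print(f"model {i} MSE: {np.mean((pred - true_signal) ** 2):.3f}")
print(f"ensemble MSE: {np.mean((ensemble - true_signal) ** 2):.3f}")
```

The ensemble’s error comes out at roughly a third of any single model’s, which is exactly the effect you want a second or third framing of an idea to have on a reader.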

Summaries List

If any of the ideas covered in these summaries are things you want to learn, I’d love for you to take a read through and let me know what was useful, what was confusing, and, in general, what feedback you have on my approach. In keeping with the tradition of a community obsessed with gradient descent, the best way to learn is with clear feedback!

  • ImageNet Hall of Fame [3 posts]
    (ResNet, Inception, & DenseNet)
  • Machine Translation [4 posts]
    (Translation using monolingual corpora, and the role of attention in state-of-the-art translation models)
  • Generative Models [5 posts]
    (Variational AutoEncoders, GANs, PixelCNN, WaveNet)
  • Theory [4 posts]
    (Capsule Networks, Generalization Theory, Bayesian DL)
  • Reinforcement Learning [8 posts]
    (My attempt to get up to speed on RL, from the basics, through Prioritized Replay and Double Q-Learning, up to AlphaGo Zero)
  • Safety & Semi-Supervision [4 posts]
    (Ladder Networks, Few-Shot Learning, Adversarial Examples, AI Safety)
