Where Have We Been?

Tiger Shen
Published in Paper Club
3 min read · Mar 11, 2018

This blog may have been inactive since the turn of the calendar year, but rest assured Paper Club has not!

After returning from the holidays, we got together to chart a course for Paper Club in 2018 and beyond. The J's had a blast at NIPS in December, and Jason Benn started his first professional machine learning gig at Sourceress. We were pleased with our respective and collective 2017s in AI/ML and wanted to keep the momentum going.

Finding Bayesian Methods

We looked at our goals and surveyed several areas of potential interest. There is so much activity in the domain right now that analysis paralysis and FOMO are almost unavoidable. We met on a boat (long story), waded through some of this, and eventually decided to explore one of our many shared curiosities: Bayesian neural nets. There was a lot of chatter about these architectures at NIPS, we had varying levels of personal interest in Bayesian statistics as a whole (mine was quite high), and we had all seen firsthand the main problem that Bayesian nets purport to solve: the lack of interpretable confidence estimates in deep learning. And so, it seemed as good a place as any to continue learning together.

Briefly and most basically, Bayesian neural nets are a class of neural net that:

  1. model parameters as distributions instead of scalar values, and
  2. let the architect specify prior distributions over those parameters.

These two properties allow for many extensions to the capabilities of neural nets. It is a comparatively greenfield architecture with exciting possibilities.
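To make the first property concrete, here's a toy sketch (my own, with invented numbers, not taken from any of the resources below) of how treating a single weight as a distribution turns a point prediction into a distribution over predictions:

```python
import numpy as np

# A vanilla neural net weight is a single number, e.g. w = 0.7.
# A Bayesian neural net treats w as a distribution instead. Here we
# stand in a fake "posterior" for one weight and show how the
# prediction becomes a distribution too.
rng = np.random.default_rng(0)

# Prior chosen by the architect: w ~ Normal(0, 1)
prior_samples = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Pretend we've seen data and the posterior has tightened around 0.7
posterior_samples = rng.normal(loc=0.7, scale=0.1, size=10_000)

x = 2.0  # a single input
predictions = posterior_samples * x  # each weight sample gives one prediction

print("prediction mean:", predictions.mean())
print("prediction std (our uncertainty):", predictions.std())
```

That spread on the prediction is exactly the kind of confidence estimate a vanilla net doesn't give you.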

Bayesian neural net regression fit to different small samples of a target function, demonstrating uncertainty bounds. From http://mlg.eng.cam.ac.uk/yarin/blog_2248.html

Learning Bayesian Methods

We quickly realized a third property as it applied to our attempts to learn Bayesian NNs: they are conceptually a bigger leap from vanilla neural nets than anything we’d encountered before.

We began with Yarin Gal’s PhD thesis. I’m sure there is immense value in these 150-odd pages, and I think we will end up returning to it, but it was very far above the level of the January version of me (and others…right guys?).

I understood about 5% of the easiest chapters of the document. Even so, that was more than enough to pique my interest: active learning, deep Bayesian reinforcement learning, medical diagnostic confidence bounds, oh my! (See Chapter 5 for a relatively digestible summary of these applications and more.) I understood the neural network-related concepts, but it was clear I'd need more background in Bayesian analysis and probabilistic programming in order to grasp the entire thesis.

The next place I looked was Pyro’s tutorials. Pyro is a new library from Uber for “Deep Universal Probabilistic Programming”. Since I’d spent the past 9 months doing the “deep universal” thing, I figured I might be able to pick up the “probabilistic programming” part on the side as I worked through tutorials. After all, it comes second in the description so how hard can it really be?
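For a sense of what "probabilistic programming" means here, a minimal Pyro model looks something like this — a rough sketch along the lines of Pyro's introductory coin-fairness tutorial, written against today's Pyro API rather than lifted from the tutorial itself:

```python
import torch
from torch.distributions import constraints
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

# Ten coin flips; we want a posterior over the coin's bias.
data = torch.tensor([1., 1., 0., 1., 1., 0., 1., 1., 1., 0.])

def model(data):
    # prior: our belief about the bias before seeing any flips
    fairness = pyro.sample("fairness", dist.Beta(10., 10.))
    with pyro.plate("flips", len(data)):
        pyro.sample("obs", dist.Bernoulli(fairness), obs=data)

def guide(data):
    # variational approximation to the posterior, with learnable parameters
    alpha = pyro.param("alpha", torch.tensor(15.), constraint=constraints.positive)
    beta = pyro.param("beta", torch.tensor(15.), constraint=constraints.positive)
    pyro.sample("fairness", dist.Beta(alpha, beta))

svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
for step in range(2000):
    svi.step(data)

a, b = pyro.param("alpha").item(), pyro.param("beta").item()
print("posterior mean bias:", a / (a + b))
```

The model/guide split and the variational inference machinery were the parts I didn't yet have the background for.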

Too hard for me, as it turned out. Dropping down one more level of simplification, I arrived at Cam Davidson-Pilon's Bayesian Methods for Hackers, which Jason Morrison had previously shared with the group. The book focuses entirely on Bayesian analysis and probabilistic programming, without getting into the deep learning applications. And it was just right.

The rest of the group seemed to share my appreciation for the book's writing style, its engaging tone, and its overall sense of fun, so we decided to work through it together. And that's where we've been! The material has been very interesting and, in my humble opinion, at the absolute perfect level of difficulty.
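For flavor, the models in the book's early chapters look roughly like this in PyMC3 — a stripped-down sketch in the spirit of the first chapter, not an excerpt from it, and with simulated data:

```python
import numpy as np
import pymc3 as pm

# Simulated "events per day" counts; we want a posterior over the rate.
counts = np.random.poisson(lam=4.5, size=60)

with pm.Model():
    # prior over the unknown daily rate
    rate = pm.Exponential("rate", lam=1.0)
    # likelihood of the observed counts given that rate
    pm.Poisson("obs", mu=rate, observed=counts)
    # draw samples from the posterior with MCMC
    trace = pm.sample(2000, tune=1000)

print("posterior mean rate:", trace["rate"].mean())
```

A few lines of priors and likelihoods, press the inference button, get a posterior — that's the rhythm the book teaches.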

So here we are, with about two chapters left in that book. We have a loosely formed goal of taking what we’ve learned and working on a project with Pyro or PyMC3. Hopefully we’ll be sharing more of that and what we’re up to once we get through the book. Cheers!
