FSDL Bootcamp — Day 1

Gerard Simons
Full Stack Deep Learning Bootcamp 2018
Aug 21, 2018 · 9 min read

After a long flight from Amsterdam, I arrived in San Francisco and took the BART up to beautiful Berkeley. I was pretty tired and wanted to be fresh, so the next day I rose early — but not middle-of-the-night jet lag early — ready and excited for the full stack deep learning bootcamp!

A nice half hour hike up University Avenue (these American grid city layouts can be very convenient) got me to UC Berkeley. The air was still crisp, but the sun was out: a beautiful day to start (deep) learning!

UC Berkeley in summer. I bet a lot of geniuses have sat on that bench.

After a short walk through the campus, I arrived at Sutardja Dai Hall, the place of magic for the next three days. I immediately met some people, as it was pretty full already. It was pretty cool to meet people working at big Silicon Valley tech companies like Google, Apple, and Twitch, but there were also great people from farther afield, from countries such as Italy and Slovenia.

Good, beautiful nerds everywhere

After I had made some new friends, a signal, a kind of call of the nerds, was sounded on a triangle by Pieter Abbeel himself: the bootcamp was officially starting! All the nerds shuffled in, and soon the room was packed. They asked us to move in close together, as every seat would be dearly needed.

A cool addition to the regular material was the Slack workspace. This would prove useful, as people would introduce themselves there, discuss topics related to the lectures, and ask for help on the lab work.

Onto the content. The videos and lectures will be made public at some point in the future — although it might not be until after the next bootcamp — at which time I will be sure to add the relevant links. This article serves more as an overview, giving you an idea of what you could focus on then, or why you should try to go to the next FSDL bootcamp.

NOTE: This blog post is not intended as an exhaustive summary. I have dedicated more time to the lectures that I think are of most interest or most unique to this bootcamp. To help you choose which lectures to check out in full, I have also included a “Takeaways” note with each lecture description, summarizing the main points I learned from it.

Lecture 0 : Introduction

A warm welcome and introductory talk by Pieter Abbeel. He introduced us to the staff, and of course the schedule of the bootcamp, and notified us there would be free beer and pizza sponsored by Shell at a place just around the corner after today’s lectures.

Takeaways : A really quick overview of the background and history of ML.

Lecture 1 : Machine Learning Projects

An enlightening introduction to how a machine learning project should be started and structured throughout its lifecycle, both at the codebase level and at the business level. Basically, the life cycle of any ML project should look something like this:

Nice overview: Be sure to check out the slides / video if you see anything you like

Each of these parts was then meticulously fleshed out. Another part that I really liked was prioritising ML projects. This resonated quite a bit with me, since we have a lot of ML projects to work on at Captain AI. Which is great, except that it makes it difficult to prioritize. This problem was tackled by showing ways of assessing the feasibility of an ML project and its impact, where high-impact, relatively easy projects are the ones you want to work on first.

Another point made was the introduction of baselines and metrics. It’s important to make everything measurable, especially your progress. For that you need metrics. But metrics alone don’t mean anything unless you have certain baselines. These can be either external or internal, varying from simple scripted baselines to simple classifiers, all the way up to the best baseline: a human one. Depending on your data and project particulars you should set up the best feasible baseline.
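
To make this concrete, here is a minimal sketch of my own (not from the lecture) of pinning a trained model against a trivial scripted baseline, using scikit-learn's DummyClassifier as the internal baseline:

```python
# Hypothetical illustration: compare a trained model against a trivial baseline.
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

def evaluate_against_baseline(model, X_train, y_train, X_test, y_test):
    """Report the model's accuracy next to a scripted 'most frequent class' baseline."""
    baseline = DummyClassifier(strategy="most_frequent")
    baseline.fit(X_train, y_train)

    baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
    model_acc = accuracy_score(y_test, model.predict(X_test))

    print(f"baseline accuracy: {baseline_acc:.3f}")
    print(f"model accuracy:    {model_acc:.3f}")
    return model_acc - baseline_acc  # improvement over the baseline
```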

Takeaways : This lecture, I felt, was really close to the core of what this bootcamp is about: how to tackle a machine learning project, from requirements and planning to assessing your results, in a professional setup and with a solid codebase. Any time a new ML project arises, you could take this as a guide for setting everything up.

Lab 1 : Plumbing

The first part of the lab work introduced us to the JupyterLab environment provided for us through Weights & Biases. The instance had two K80 GPUs to give us enough power for any training we needed to do. The setup of the codebase was similar to the one we had just learned about in Lecture 1.

In this and future labs we would work on a dataset called EMNIST, which stands for Extended MNIST, so it is NIST all over again. It can be seen as a harder instance of the famous MNIST problem: the classification of hand-written characters (letters as well as digits).

To be honest, I felt that the lab work was a little overwhelming, with too little time given to complete the assignments. Just figuring out where everything was took up most of the assigned lab time. Finishing it on time was pretty much impossible, something the course supervisors also noticed when they decreed that it was better to just follow along and copy-paste the solutions afterwards.

The JupyterLab instance. On the left you see the outline of the lab work; the terminal shows the codebase of an individual lab assignment.

In any case, the purpose of this lab session was to get acquainted with the codebase and implement a simple multi-layer perceptron in the network script.
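
For reference, a simple Keras MLP for EMNIST classification might look roughly like the sketch below; the layer sizes and the 62-class output are my own assumptions, not the lab's exact solution:

```python
# Rough sketch of an EMNIST multi-layer perceptron (assumed 28x28 inputs, 62 classes).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten

def build_mlp(input_shape=(28, 28), num_classes=62):
    model = Sequential([
        Flatten(input_shape=input_shape),          # 28x28 image -> 784-vector
        Dense(128, activation="relu"),
        Dropout(0.2),
        Dense(128, activation="relu"),
        Dense(num_classes, activation="softmax"),  # one probability per character class
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```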

Lecture 2 : Deep Learning Fundamentals

Ah, the fundamentals, of course. These are pretty, erm, fundamental! This was basically a quick refresher of the basics of Deep Learning: loss functions, gradient descent, backpropagation and the hardware involved (CUDA).
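
As a one-screen reminder of what gradient descent on a loss function boils down to, here is a toy NumPy loop (my own illustration, not lecture code) fitting a linear model by repeatedly stepping against the gradient of a squared loss:

```python
import numpy as np

# Toy gradient descent on y = w*x + b with a mean squared error loss.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + 1.0 + 0.1 * rng.normal(size=100)   # ground truth: w=3, b=1

w, b, lr = 0.0, 0.0, 0.1
for step in range(200):
    y_hat = w * x + b
    grad_w = 2 * np.mean((y_hat - y) * x)        # dL/dw
    grad_b = 2 * np.mean(y_hat - y)              # dL/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should land close to 3 and 1
```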

Takeaways : This helped me brush up on some of the core concepts of Deep Learning and why it is taking off, and it gets you excited all over again about this sweet piece of tech.

Lecture 3 : ConvNets

Of course, the prime example of the power of Deep Learning is the Convolutional Neural Network. This lecture refreshed us on how convolution works and the components and operations involved (filters, strides, padding, dilation, pooling, etc.).

Nothing new under the sun, but it’s a good refresher
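
To make the shape arithmetic of filters, strides, padding and pooling concrete, here is a small sketch of my own (not from the lecture) that prints the output shape after each layer of a tiny conv stack:

```python
# Shape bookkeeping for a tiny conv stack on a 28x28 grayscale image.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D

model = Sequential([
    Conv2D(32, kernel_size=3, strides=1, padding="same",
           activation="relu", input_shape=(28, 28, 1)),   # -> (28, 28, 32)
    MaxPooling2D(pool_size=2),                             # -> (14, 14, 32)
    Conv2D(64, kernel_size=3, strides=1, padding="valid",
           activation="relu"),                             # -> (12, 12, 64)
    MaxPooling2D(pool_size=2),                             # -> (6, 6, 64)
])
model.summary()  # prints the output shape after each layer
```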

Takeaways : Helps you refresh the fundamental operations in convolutional neural networks and understand why they work so well for image classification.

Lab 2 : ConvNets

In this lab we would implement a simple LeNet architecture to classify characters. We would then look at how to work with a line of characters. The notebook showed how to create synthetic lines by simply concatenating characters together into a line of text. I liked this approach of first generating synthetic data: it makes it easy to iterate on and evaluate solutions before heading out into harder real-world problems.
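
The synthetic-line idea is simple enough to sketch in a few lines of NumPy; this is a hypothetical illustration of the approach, not the notebook's actual code:

```python
import numpy as np

def make_synthetic_line(char_images, labels, line_length=8, rng=None):
    """Concatenate random 28x28 character crops into one (28, 28*line_length) line image."""
    rng = rng or np.random.default_rng()
    idx = rng.integers(0, len(char_images), size=line_length)
    line_image = np.concatenate([char_images[i] for i in idx], axis=1)
    line_label = [labels[i] for i in idx]
    return line_image, line_label
```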

A naive approach here is to apply a sliding window to the line, effectively separating out each character and feeding it to our ConvNet. Obviously this would not work well for real sentences, as character width varies, but it was a nice start and made sure the entire infrastructure was working at least.

Lab 2 solution where the sliding window is connected to the ConvNet
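
The lab's actual solution is pictured above; just to make the idea concrete, here is my own rough sketch of a fixed-width sliding window feeding a character classifier (the window width and model interface are assumptions):

```python
import numpy as np

def classify_line_sliding_window(line_image, char_model, window_width=28):
    """Slide a fixed-width window over the line and classify each crop independently."""
    height, width = line_image.shape
    predictions = []
    for start in range(0, width - window_width + 1, window_width):
        window = line_image[:, start:start + window_width]
        probs = char_model.predict(window[np.newaxis, ..., np.newaxis])  # (1, num_classes)
        predictions.append(int(np.argmax(probs)))
    return predictions
```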

Lecture 4 : Sequences

Another important type of data is sequential data. Sequential data is everywhere: basically any data where there is a strong relation between consecutive data points is a kind of sequence. Think time series, text corpora and audio snippets, for example. The input does not even have to be a sequence for a problem to count as a sequence-type problem; when the output is a sequence (music generation, image captioning), it can be classified as such too. Recurrent Neural Nets, which are usually the networks used to solve these problems, were discussed, including their issues (vanishing gradients) and ways to counteract them.

Just like the ConvNet lecture, there was nothing all too new here, so I won't go into detail, but if you're in need of a good run-down of these types of problems, this is definitely a good one to check out.

Takeaways : An overview of sequential data, their properties, some common examples and how to tackle them using recurrent neural networks (RNNs).

Lecture 5 : Vision Applications

This was basically an extension of the ConvNet lecture, looking at some of the state-of-the-art ConvNet architectures. It was a great overview, discussing everything from the very start of the hype (AlexNet) to more modern variants like ResNet and Inception. Each network's architecture was concisely described. I found the comparison drawn below especially enlightening.

Comparing architectures: in the right plot, circle size indicates the number of parameters (memory size)

This can be important, since you don't always want the very best. Sometimes the model just won't fit on your crappy (mobile) GPU, or you don't have enough training data to actually get there. There are important trade-offs to be made, and this overview can help you make them.

Takeaways : Perfect if you are in need of a good vision architecture but not sure on which one to use. Also nice if you just wanna read up on the different popular vision architectures out there.

Lab 3 : Sequences

In this lab we trained an LSTM using the CTC (Connectionist Temporal Classification) loss function.
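
CTC isn't a drop-in `loss=` string in Keras; the usual trick at the time was to wrap `keras.backend.ctc_batch_cost` in a Lambda layer or a custom loss. A hedged sketch of the core call (argument shapes and names are my assumptions, not the lab's code):

```python
from tensorflow.keras import backend as K

def ctc_loss(y_true, y_pred, input_lengths, label_lengths):
    """Connectionist Temporal Classification loss over a batch.

    y_true:  (batch, max_label_len) integer label sequences
    y_pred:  (batch, time_steps, num_classes) softmax output of the LSTM
    input_lengths, label_lengths: (batch, 1) true lengths of each sequence
    """
    return K.ctc_batch_cost(y_true, y_pred, input_lengths, label_lengths)
```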

Something I particularly liked here was the introduction of tests to validate your model. In software engineering, tests are a huge part of building a system that will go into production. They are a very solid and pragmatic way of ensuring that your code is (still) valid, even after changes. Any change that breaks the tests is flagged as a bad change.

Coming from a computer science background, I often found it unnerving to not see that same rigor being applied to machine learning, where the process is often very experimental. Data scientists, in my experience, seem to be particularly ill-equipped to validate models in such a way. So it was a breath of fresh air to see model predictions being tested in a similar fashion to how software code is tested. I have included a snapshot of part of the solution below. The full solution can be found here.
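
To give a flavour of what such a test looks like, here is a sketch of my own (the accuracy floor and the way the model and support set are passed in are my choices, not the lab's actual code):

```python
# Hypothetical regression check on model predictions, in the spirit of a unit test.
import numpy as np

ACCURACY_FLOOR = 0.8  # arbitrary threshold, for illustration only

def check_predictor_accuracy(model, images, labels):
    """Fail loudly if accuracy on a small, fixed support set drops below the floor."""
    predictions = np.argmax(model.predict(images), axis=-1)
    accuracy = float(np.mean(predictions == labels))
    assert accuracy >= ACCURACY_FLOOR, (
        f"accuracy {accuracy:.2f} fell below the floor of {ACCURACY_FLOOR}"
    )
    return accuracy
```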

Pizza, Beers and Conclusion

Despite all its sophisticated allure there is that one glorious combination that beats even Deep Learning: 🍕+🍺. Thankfully, this is exactly what we got at the end of the first day, courtesy of Shell.

Beer & Pizza : Gotta Love It

Everybody was still kind of mind-boggled from the intense first day, so blowing off some collective steam was nice. It was also a great opportunity to discuss what everyone’s thoughts were on the first day of bootcamp.

The feeling I had seemed to be shared by a lot of folks: a great start, though perhaps slightly too much of a refresher. The labs, however, could have used a bit more time; there simply wasn't enough to finish all of them adequately. Or perhaps it would have helped to have more of a segue from the lectures into the labs, so you would ease into them a bit more. This was also felt by the course instructors, who indicated to me that they would reserve more time for the labs at future bootcamps.

Nonetheless, everybody seemed genuinely excited about everything and was looking forward to the next two days. We would soon discover that we would not be disappointed.

Stay tuned for my report on Day 2!
