Aurélien Géron Deep Learning crash-course & bonus interview (part 1/3)

Renaud Bauvin
Criteo R&D Blog · Jun 28, 2019

June 26th marked the end of a new session of the Introduction to Deep Learning and TensorFlow crash-course. This intensive course, given by Aurélien Géron, author of the best-selling book Hands-On Machine Learning with Scikit-Learn and TensorFlow, condenses into three days what you need to know about:

  • Deep Learning (DL) fundamentals with TensorFlow 2.0 and Keras,
  • Convolutional neural networks (CNN) and
  • Recurrent neural networks (RNN).

This is the second time that Aurélien has come to Criteo to give this course as part of the Criteo crash-course series. Created a bit more than six months ago, this series has two objectives:

  • training Criteo employees on Machine Learning topics through hands-on workshops given by top practitioners, and
  • giving back to the community by inviting external ML practitioners, free of charge, to learn with us.

So far, Criteo organized three such events:

  • two crash-courses on Deep Learning delivered by Aurélien (one in January/February — watch the short video of the event — and the current one in June) as well as
  • one crash-course on NLP (a session in May) delivered by Vincent Guigue, Professor at LIP6 laboratory.

Following the overwhelmingly positive feedback on the first session, word spread quickly and we were flooded with requests from people, both internal and external, wanting to attend. (By the way, thanks to everyone who applied, and our apologies to those who couldn't be accommodated this session.)

If you want to be the first informed about next events, don’t forget to follow Criteo AI Lab on Twitter.

Aurélien Géron

We took the opportunity of having Aurélien in Paris to interview him on his experience with ML/DL projects, the advice he would share, his tips and tricks to keep up with the pace of the field, his vision for the future, etc.

As we had a lot of questions and Aurélien had a lot of very interesting things to say, we split this blog post into three parts.

Aurélien’s own path and advice to people joining

How did you arrive in the field of ML?

I have been fascinated by A.I. ever since I was a kid. I read a lot of SciFi, most notably Isaac Asimov's robot stories. As a teenager, I wrote a little program that would beat me at checkers. It just used a simple minimax algorithm, but I was still amazed at how smart it seemed. I realized that high-level intelligence may just be the complex result of a simple algorithm.

Then I majored in biology, and at one point I studied the behavior of insects and learned how neural networks were used to model their responses to various chemicals. I was immediately hooked by neural nets. I signed up for all the courses I could on A.I. and neural networks, and I programmed my first neural network back in 1995: a Hopfield network capable of recognizing a few handwritten digits. I also played around with Kohonen networks (also called self-organizing maps), for example to solve the travelling salesman problem. It was exhilarating to see that a program could deal with fuzzy, imprecise inputs, somewhat like a human. Then I worked on a project where we used multi-layer perceptrons to predict the acidity of yogurt from input features such as the temperature and the quantity of sugar. I suspect it didn't perform that well, because I don't remember much about this project!

At the time, my teachers were not too excited about neural networks, and they managed to convince me that the field was doomed. So I worked on other topics, first as a consultant and trainer in various fields; then I founded a wireless ISP, where I was CTO for 10 years. In 2013, I decided to change course and joined Google as head of YouTube's video classification team. That is where I really started working professionally on Machine Learning. It was like a dream come true: I worked with some of the brightest people I've ever met, and they taught me a lot of what I know. Since then, apart from writing my book, I have worked as a Machine Learning consultant and trainer.
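Minimax really does fit in a few lines. As an illustrative sketch (not Aurélien's actual checkers program — the game here is invented for brevity), here is a negamax-style minimax on a toy Nim-like game, where players alternately take one or two stones and whoever takes the last stone wins:

```python
def best_score(stones):
    """Minimax (negamax form): +1 if the player to move can force a win
    with perfect play, -1 otherwise."""
    if stones == 0:
        # No stones left: the previous player took the last one and won,
        # so the player to move has already lost.
        return -1
    # Try every legal move; the opponent's score is negated because the
    # game is zero-sum and perspectives alternate each ply.
    return max(-best_score(stones - take) for take in (1, 2) if take <= stones)

# Piles that are multiples of 3 are losing for the player to move:
print(best_score(3))  # -1
print(best_score(4))  # +1
```

Real game-playing programs add a depth limit, a heuristic evaluation function, and alpha-beta pruning on top of this core recursion, but the principle is the same.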

Is a diploma in the field required today to perform?

I don’t believe it is. Deep Learning is a fairly accessible field; it’s not like quantum physics or medicine. You should know some college-level math, mostly linear algebra (how to work with matrices and vectors), have some intuition about calculus (what a derivative is), and know some fundamentals of statistics (mean, standard deviation, confidence intervals…). Most of the time that’s all you’ll need, and in fact, for many tasks you can just use existing tools without knowing all of that. Some subfields require extra knowledge, in particular Bayesian Deep Learning. Today, there is such a shortage of experienced ML practitioners that companies usually do not require a diploma. But this is changing: as more and more students graduate, companies may become more demanding.

What advice would you have liked someone to give to you when you started in the field?

It’s tempting to jump straight into Deep Learning and train neural nets, but it took me some time to realize that simpler models, like linear models or Random Forests, often perform better, especially when the training set is not huge or the signal-to-noise ratio is low. In particular, when dealing with time series, I have had more success with good old ARIMA models than with deep RNNs. Moreover, the model is actually a fairly small part of a Machine Learning project: I tried to highlight this in my book by starting with an overview of a full project, end-to-end, rather than jumping right into neural networks or other models.
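The "start simple" advice can be made concrete. As a hypothetical sketch (the series and helper names here are invented for illustration), these are the kinds of trivial time-series baselines worth measuring before reaching for an ARIMA model, let alone a deep RNN:

```python
# Two naive forecasting baselines: any fancier model should beat these.

def persistence_forecast(series):
    """Predict each value as the previous observed value."""
    return series[:-1]  # predictions for series[1:]

def moving_average_forecast(series, window=3):
    """Predict each value as the mean of the preceding `window` values."""
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

series = [10, 12, 11, 13, 14, 13, 15, 16, 15, 17]  # made-up toy data

print(f"persistence MAE:    {mae(series[1:], persistence_forecast(series)):.2f}")
print(f"moving-average MAE: {mae(series[3:], moving_average_forecast(series)):.2f}")
```

If a trained RNN cannot clearly beat numbers like these on held-out data, the extra complexity is not paying for itself.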

Good pointers to keep up with the pace of development in the field

How do you stay informed with the progress of ML?

I follow a lot of amazing people on Twitter; I find it’s a great platform for keeping track of hot new topics. It takes a bit of work to keep the feed as clean and useful as possible, but it’s really worth the effort. I try to follow mostly people who tweet technical information, and I don’t hesitate to unfollow accounts that tweet too much about other topics. I also use www.arxiv-sanity.com to find interesting papers to read.

Who do you follow?

Yann LeCun, Andrej Karpathy, Andreas Mueller, Jeremy Howard, Jake VanderPlas, Olivier Grisel, Jeff Dean, Rachel Thomas, Ian Goodfellow, François Chollet, and many more.

I also follow great practitioners I have had the chance to work with, like Eric Lebigot, Martin Andrews, Sam Witteveen, Jason Zaman, and several others.

Finally, I also follow companies and projects like TensorFlow, DeepMind, Facebook, and many people working in these teams, such as Martin Wicke or Paige Bailey.

Which blogs do you read?

I greatly enjoy Distill. I occasionally read other blogs, such as Google’s and DeepMind’s. But mostly I find interesting blog posts through links on Twitter.

Which conferences do you attend?

I enjoy the NeurIPS conference for the latest research, TensorFlow Dev Summit for TF specific content, O’Reilly’s Data and AI conferences for more industry-oriented use cases, and I love going to ML meetups.

What is one good source of information that people don’t think about?

I have found a lot of great information directly in the source code! Reading papers is great, but sometimes it’s hard for me to really grasp the details without actually seeing an implementation. The website paperswithcode.com is great for this.

Part 2 is here.
