Interview with Deep Learning Researcher at fast.ai: Sylvain Gugger
Index and about the series: “Interviews with ML Heroes”
Today, I’m honored to be interviewing another core member of the fast.ai team: Sylvain Gugger.
Sylvain works as a research scientist at the fast.ai research lab at The Data Institute, USF. If you’re part of the community, you have probably come across his great answers on the forum threads. If you aren’t, look up ‘@sgugger’ on the forums; you will learn a lot.
Sylvain has a background as a Math and CS teacher and has authored several textbooks in French covering undergraduate Math, all published by Dunod editions.
About the Series:
I have very recently started making some progress on my self-taught Machine Learning journey. To be honest, it wouldn’t have been possible at all without the amazing online community and the great people who have helped me.
In this series of blog posts, I talk with people who have really inspired me and whom I look up to as role models.
My motivation is that you might see some patterns and, hopefully, learn from the amazing people I have had the chance to learn from.
Sanyam Bhutani: Hello Sylvain, thank you so much for taking the time to do this interview.
Sylvain Gugger: Thank you so much for reaching out, I am very honored!
Sanyam Bhutani: You’ve worked as a Math and CS teacher, you hold a Master’s Degree in Math, and you are currently a researcher at one of the most ‘uncool’ non-profit research labs, fast.ai.
Can you tell us when Deep Learning first came into the picture? What got you interested in it at first?
Sylvain Gugger: I first heard of neural nets in 2002 and, for a school project, I coded a small program (vaguely) recognizing digits. It wasn’t very powerful at that time, so I kind of forgot about it until October 2017. I was randomly reading an article in the New York Times and fast.ai was mentioned as an accessible course for people wanting to study Deep Learning. I was curious to see how the field had progressed — of course, I had heard all the hype around it — so I followed the MOOC version 1.
I instantly loved the top-down approach, how a beginner can actually do things right after the first lesson (and it’s even easier with fastai v1 and the third version of the MOOC!). I have a strong background in Math, but it’s my love for coding practical things that kept me going.
Sanyam Bhutani: What is it like to work with Jeremy Howard? Does the fast.ai team ever sleep?
Sylvain Gugger: We never sleep, but that’s mostly because we both have toddlers! More seriously, working with Jeremy is amazing. I’ve improved a lot as a coder and I keep on learning new things from him. Just seeing how he iterates through your code to refactor it in a simpler or more elegant way is always fascinating. And I really love how he is never satisfied with anything short of perfect, always pushing to polish this bit of code or this particular API until it’s as easy to use as possible.
Sanyam Bhutani: Could you tell us more about your role at fast.ai and what a day at fast.ai looks like?
Sylvain Gugger: Sure! Since I am based in New York City, we mostly work in parallel. We chat a lot on Skype to coordinate and the rest of the time is spent coding or reviewing code, whether it’s to make the library better or try a new research idea.
As for my role, it’s a mix of reviewing the latest papers to see what we could use, as well as helping Jeremy develop new functionality in the library and prepare the next course.
Sanyam Bhutani: The new library is absolutely amazing, to say the least.
Could you tell us a little about its development and how the team manages to push out the amazing library at a “fast.ai speed”? What can we expect next from the library? Which SOTA score is fast.ai going to dethrone soon?
Sylvain Gugger: I’m glad you like the new library, thanks! The first idea was to try to get something more unified in terms of API than v0.7. Since then, we’ve mostly been iterating through notebooks that Jeremy will present in the second course this spring. We’re not that fast: several final pieces of that puzzle only came to us at the end of October, when the course had already started, so we had a few stressful weeks where everything needed to be rewritten behind the scenes without breaking the existing code (too much).
As for future developments, we’ll try to make it easier to put fastai models into production, we’ll focus on the applications we didn’t have time to finalize during the first part of the course (object detection, translation, sequence labeling), we’ll find some way to deal with very big datasets that don’t always fit in RAM, and also play with some research ideas we didn’t get to investigate (training on rectangular images for instance).
Sanyam Bhutani: Jeremy shares many amazing SOTA techniques in the upcoming MOOC, and the library features them.
How do you discover these ideas? What is the methodology of experimentation at fast.ai?
Sylvain Gugger: The methodology could be summarized as “try blah!”, as Jeremy said in one of the courses. We try to have an intuitive understanding of what happens when training a given model, then we experiment with all the ideas we think of to see if they work empirically.
Very often, research papers focus on the Math first and come with this one new theory that is going to revolutionize everything. When you try to apply it though, you often don’t get any good results. We’re more interested in things that work in practice.
Sanyam Bhutani: I have to confess: I’ve been trained as a student of the bottom-up approach, and I’m still often overwhelmed by the top-down approach.
Having taught Math and CS in the same setting, how do you suggest that students of the “ancient setting” best adapt to the “Top Down” approach?
Sylvain Gugger: It may seem odd at first, but if you think it through, a course named “Practical Deep Learning for Coders” kind of has to explain how to do things in practice. It’s not that the theory is useless; it’s going to be a critical piece in helping beginners forge their intuition. But starting with training an actual model and deploying it before explaining what is behind it makes a lot of sense.
Take a leap of faith and trust Jeremy when he says he’ll explain everything that is behind the scenes, then just give it a try. It’s much more motivating to know you can do something before trying to understand why it works (because that part won’t be easy either). And if you’re really more comfortable with the bottom-up approach, you can find plenty of resources to help you with that, but I’d suggest watching/reading them after completing the MOOC.
Sanyam Bhutani: How do you stay up to date with the cutting edge?
Sylvain Gugger: By experimenting a lot! The fastai library isn’t just a great tool for beginners; its high flexibility makes it super easy to implement a research article and see whether its suggestion results in a significant improvement. The callbacks system or the data block API allow you to do pretty much anything with just a few lines of code.
Sanyam Bhutani: What are your thoughts about the Machine Learning Hype?
Sylvain Gugger: It’s not necessarily a bad thing, in the sense that it can bring people to the field who would otherwise never have heard of it. You have to be careful to make the distinction between people who talk a lot and those who actually do things, though.
Sanyam Bhutani: Thank you for the contributions to the library.
How do you think a beginner without any DL expertise can contribute to the fastai library?
Sylvain Gugger: There are plenty of ways! Helping with our documentation is a nice way to really dig into how one part of the library works (or just one function), and the same goes for our test coverage. Another way is to give us candid feedback if a specific API is too hard to use or doesn’t make sense, as it helps us make the library more accessible.
Don’t be shy! Most merged PRs fix typos, change a tiny bit of code, or just add the documentation of one function.
Sanyam Bhutani: The fast.ai philosophy is: Anyone can do DL, you don’t need to have a Masters/Ph.D. to contribute to the field.
Being a Math grad yourself, could you share some of your thoughts about a “Non-Technical” student contributing to the field?
Sylvain Gugger: There are plenty of seats at the table! You don’t need to know advanced Math to actually do deep learning now, as most of the libraries (including fastai) hide it from you to allow you to focus on the important bits. Practical research on how to train models efficiently is as important as theoretical results. It may not lead to publications at prestigious conferences, but it’s very useful nonetheless.
Acquiring the Math needed may seem a bit difficult (which is why beginning with the top might be more motivating), but remember that no one has it easy. In my case, understanding the Math didn’t take too long, but I struggled a lot with bash/Linux commands since I was a Windows user with zero experience with terminals.
Sanyam Bhutani: Before we conclude, any advice for beginners who are afraid to get started with Deep Learning because of the prevalent “lack of expertise” thought?
Sylvain Gugger: Well, first of all, find someone else who is as much a beginner as you are and start the course together. This will help you for two very different reasons. The first one is that you’ll be able to discuss what you learn and help each other understand parts that are unclear. The second one is that it’ll motivate you to stay until the end. You have to be tenacious to do Deep Learning: nothing is ever going to work the first time! But be patient, be meticulous in debugging, and persevere.
Don’t forget that there is a forum where other people may help you with the problems you face; you might also directly find the answer to some of your bugs there.
Start a blog, where you explain what you have learned. Explaining things is often the best way to realize you hadn’t fully understood them; you may discover there were tons of small details you hadn’t dug enough into.
Sanyam Bhutani: Thank you so much for doing this interview.