Thoughts after taking the courses


Between a full-time job and a toddler at home, I spend my spare time learning about ideas in cognitive science and AI. Once in a while a great paper, video, or course comes out and you’re instantly hooked.

Andrew Ng’s new course is like that Shane Carruth or Rajinikanth movie that one yearns for!

Naturally, as soon as the course was released on Coursera, I registered and spent the past 4 evenings binge-watching the lectures and working through the quizzes and programming assignments.

DL practitioners and ML engineers typically spend most days working at an abstract Keras or TensorFlow level. But it’s nice to take a break once in a while to get down to the nuts and bolts of learning algorithms and actually do back-propagation by hand. It is both fun and incredibly useful!

What this course is about:

Andrew Ng’s new adventure is a bottom-up approach to teaching neural networks (powerful algorithms for learning non-linear functions) at a beginner-to-intermediate level.

In classic Ng style, the course is delivered through a carefully chosen curriculum, neatly timed videos and precisely positioned information nuggets. Andrew picks up where his classic ML course left off and introduces the idea of neural networks using a single neuron (logistic regression), then slowly adds complexity: more neurons and more layers. By the end of the 4 weeks (course 1), a student has been introduced to all the core ideas required to build a dense neural network, such as cost/loss functions, learning iteratively using gradient descent, and vectorized, parallel Python (numpy) implementations.
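To make the "single neuron" idea concrete, here is a minimal sketch of logistic regression trained with gradient descent in vectorized numpy. It is not the course's assignment code: the function names, the toy data, the learning rate and the iteration count are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_neuron(X, y, learning_rate=0.1, num_iterations=1000):
    """Train one neuron. X has shape (n_features, m); y has shape (1, m)."""
    n_features, m = X.shape
    w = np.zeros((n_features, 1))  # weights start at zero
    b = 0.0                        # bias starts at zero
    costs = []
    eps = 1e-12                    # guards against log(0)
    for _ in range(num_iterations):
        # Forward pass over all m examples at once (no Python loop over examples)
        a = sigmoid(w.T @ X + b)
        cost = -np.mean(y * np.log(a + eps) + (1 - y) * np.log(1 - a + eps))
        costs.append(cost)
        # Backward pass: gradients of the cross-entropy cost
        dz = a - y
        dw = (X @ dz.T) / m
        db = np.sum(dz) / m
        # Gradient descent update
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b, costs

# Hypothetical toy problem: label is 1 when the two features sum to more than 0
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
y = (X[0:1, :] + X[1:2, :] > 0).astype(float)

w, b, costs = train_logistic_neuron(X, y)
predictions = (sigmoid(w.T @ X + b) > 0.5).astype(float)
accuracy = float(np.mean(predictions == y))
print(f"first cost {costs[0]:.3f}, last cost {costs[-1]:.3f}, accuracy {accuracy:.2f}")
```

Stacking more of these neurons into layers, and chaining the same gradient calculations backwards through them, is roughly the path the course follows.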

Andrew patiently explains the requisite math and programming concepts in a carefully planned order and at a well-regulated pace suitable for learners who could be rusty in math or coding.

Course material and tools:

video lecture

Lectures are delivered using presentation slides on which Andrew writes using digital pens. It felt like an effective way to get the listener to focus. I felt comfortable watching videos at 1.25x or 1.5x speed.

quiz tool

Quizzes are placed at the end of each lecture section and are in multiple-choice format. If you watch the videos once, you should be able to quickly answer all the quiz questions. You can attempt quizzes multiple times and the system is designed to keep your highest score.

Jupyter notebook programming assignment

Programming assignments are done in Jupyter notebooks, which are powerful browser-based applications.

Assignments have a nice guided sequential structure and you are not required to write more than 2–3 lines of code in each section. If you understand the concepts like vectorization intuitively, you can complete most programming sections with just 1 line of code!
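As a rough illustration of what "1 line of code" means here, the sketch below computes a neuron's weighted sum twice: once with an explicit loop and once with a single vectorized numpy call. The function names and numbers are made up for this example and are not taken from the assignments.

```python
import numpy as np

def weighted_sum_loop(w, x, b):
    """Compute w . x + b one element at a time."""
    total = 0.0
    for i in range(len(w)):
        total += w[i] * x[i]
    return total + b

def weighted_sum_vectorized(w, x, b):
    """Same computation as a single vectorized expression."""
    return np.dot(w, x) + b

# Hypothetical weights, inputs and bias
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 3.0])
b = 0.25

loop_result = weighted_sum_loop(w, x, b)
vectorized_result = weighted_sum_vectorized(w, x, b)
print(loop_result, vectorized_result)
```

Both versions give the same number; the vectorized one is shorter and, on large arrays, usually much faster because the loop runs inside numpy rather than in Python.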

Once the assignment is coded, it takes one button click to submit your code to the automated grading system, which returns your score in a few minutes. Some assignments have attempt limits, for example three attempts in 8 hours.

Jupyter notebooks are well designed and work without any issues. Instructions are precise and it feels like a polished product.

Who is this course for:

Anyone interested in understanding what neural networks are, how they work, how to build them and the tools available to bring your ideas to life.

If your math is rusty, there is no need to worry. Andrew explains all the required calculus and provides the derivatives wherever they are needed, so that you can concentrate on building the network and implementing your ideas in code.

If your programming is rusty, there is a nice coding assignment to teach you numpy. But I recommend learning Python first on Codecademy.

How this DL course is different from Jeremy Howard’s course:

Let me explain this with an analogy: Assume you are trying to learn how to drive a car.

Jeremy’s fast.ai course puts you in the driver’s seat from the get-go. He teaches you to turn the steering wheel, press the brake and the accelerator, and so on. Then he slowly explains more details about how the car works: why rotating the wheel makes the car turn, why pressing the brake pedal makes you slow down and stop. He keeps getting deeper into the inner workings of the car and by the end of the course, you know how the internal combustion engine works, how the fuel tank is designed, and more. The goal of the course is to get you driving. You can choose to stop at any point after you can drive reasonably well; there is no need to learn how to build or repair the car.

Andrew’s DL course does all of this, but in the complete opposite order. He teaches you about the internal combustion engine first! He keeps adding layers of abstraction and by the end of the course you are driving like an F1 racer!

The fast.ai course mainly teaches you the art of driving, while Andrew’s course primarily teaches you the engineering behind the car.

How to approach this course:

If you have not done any machine learning before this, don’t take this course first. The best starting point is Andrew’s original ML course on Coursera.

After you complete that course, please try to complete part 1 of Jeremy Howard’s excellent deep learning course. Jeremy teaches deep learning top-down, which is essential for absolute beginners.

Once you are comfortable creating deep neural networks, it makes sense to take this new specialization, which fills any gaps in your understanding of the underlying details and concepts.

Things I liked in this course:

1. Facts are laid out bare. Uncertainties and ambiguities are periodically eliminated.

2. Andrew stresses the engineering aspects of deep learning and provides plenty of practical tips to save time and money. The third course in the DL specialization felt incredibly useful for my role as an architect leading engineering teams.

3. Jargon is handled well. Andrew explains that an empirical process = trial & error. He is brutally honest about the reality of designing and training deep nets. At some point I felt he might as well have called deep learning glorified curve-fitting.

4. Squashes all hype around DL and AI. Andrew makes restrained, careful comments about the proliferation of AI hype in the mainstream media, and by the end of the course it is pretty clear that DL is nothing like the Terminator.

5. Wonderful boilerplate code that just works out of the box!

6. Excellent course structure.

7. Nice, consistent and useful notation. Andrew strives to establish a fresh nomenclature for neural nets and I feel he could be quite successful in this endeavor.

8. Style of teaching that is unique to Andrew and carries over from ML — I could feel the same excitement I felt in 2013 when I took his original ML course.

9. The interviews with deep learning heroes are refreshing. It is motivating and fun to hear personal stories and anecdotes.

Things I found lacking:

I wish that he’d said ‘concretely’ more often!

Some additional things I learnt along the way:

1. DL is not easy. It takes some hard work over time to “get” the concepts and make them work well. Andrew wrote a Quora answer a while ago that deeply resonated with me.

2. Good tools are important and will help you accelerate your learning pace. I bought a digital pen after seeing Andrew teach with one. It helped me work more efficiently.

Black ink is Andrew’s and colors are mine

3. There is a psychological reason why I recommend taking the fast.ai course before this one. Once you find your passion, you can learn uninhibited.

4. You just get that dopamine rush each time you score full points:

hell yeah!

5. Don’t be scared by DL jargon (hyperparameters = settings, architecture/topology = style, etc.) or the math symbols. If you take a leap of faith and pay attention to the lectures, Andrew shows why the symbols and notation are actually quite useful. They will soon become your tools of choice and you will wield them with style!

some scary looking symbols which will begin to make sense when you watch the lecture videos

Closing comments (optional read):

1. Everyone starts in this field as a beginner. If you are a complete newcomer to the DL field, it’s natural to feel intimidated by all the jargon and concepts. Please don’t give up. You probably were drawn to this field hoping to find your calling. Trust your gut and stay focused and you will be successful sooner than you realize! Even Andrew Ng had to learn linear algebra at some point in the past; he wasn’t born with that knowledge.

2. While this is an incredible resource, this is not the only DL course in the world. Many generous teachers like Salman Khan, Jeremy Howard, Sebastian Thrun and Geoff Hinton have shared their knowledge online for free, just like Andrew Ng. I wasn’t lucky enough to pursue a master’s or PhD as I had to enter the workforce and support my family, but that didn’t mean the learning had to stop. I have had the opportunity to customize my own learning plan, thanks to the democratization of knowledge. And I could choose to learn from the people I admired most: Programming (Gerald Sussman), Linear Algebra (Gilbert Strang), AI (Marvin Minsky), Philosophy (Daniel Dennett), Psychology (Jean Piaget), Physics (Hans Bethe)…

3. Most of applied DL is really disciplined engineering, and Prof. Ng provides a fantastic compilation in course 3 (my favorite of the 3 courses released so far). The mindset required for making DL work for your problems is no different from that required when approaching any other hard engineering problem. Everything you need to know was documented clearly by Claude Shannon decades ago.

Thanks for reading and best wishes!

Update: Thanks for the overwhelmingly positive response! Many people are asking me to explain gradient descent and the differential calculus. I hope this helps!
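For a small taste of both ideas, here is a minimal sketch of gradient descent on a one-variable function, with the derivative worked out by hand and cross-checked numerically. The function f(x) = (x - 3)^2, the starting point, the step size and all names are illustrative assumptions, not material from the course.

```python
def f(x):
    """A simple bowl-shaped function whose minimum is at x = 3."""
    return (x - 3.0) ** 2

def df_dx(x):
    """Derivative worked out by hand with the power rule: d/dx (x - 3)^2 = 2(x - 3)."""
    return 2.0 * (x - 3.0)

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the slope, used to check the hand-derived formula."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def gradient_descent(start, learning_rate=0.1, num_steps=100):
    """Repeatedly step downhill, against the sign of the slope."""
    x = start
    for _ in range(num_steps):
        x = x - learning_rate * df_dx(x)
    return x

slope_by_hand = df_dx(0.0)
slope_numerical = numerical_derivative(f, 0.0)
x_min = gradient_descent(start=0.0)
print(slope_by_hand, slope_numerical, x_min)
```

The derivative tells you which way is downhill and how steep it is; gradient descent simply keeps taking small steps in that direction until the slope flattens out near the minimum.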