Review of the Intro to Machine Learning With TensorFlow Nanodegree Program on Udacity

Michał Oziębło
Published in The Startup
6 min read · Aug 14, 2020
TensorFlow logo from https://upload.wikimedia.org/wikipedia/commons/2/2d/Tensorflow_logo.svg on CC licence

About the service provider, for context:

Udacity is a for-profit organization offering online educational courses. The name Udacity comes from the company’s desire to be “audacious for you, the student”. Udacity issues Nanodegree diplomas as proof of the skills acquired.

TL;DR (summary in a nutshell):

The Introduction to Machine Learning with TensorFlow Nanodegree Program on the Udacity platform is a short but highly informative course. It is based on three pillars: Supervised Learning, Deep Learning and Unsupervised Learning. It is worth highlighting the great presentation of Support Vector Machines and Ensemble Methods, as well as the Backpropagation Algorithm. The course contains many exercises to work through independently and requires completing three projects, each of which is reviewed by a human reviewer.

I recommend it to people who are already building their knowledge of machine learning and want to review the basics over several evenings, as well as to beginners who meet the course prerequisites and have a lot of free time.

Link: https://www.udacity.com/course/intro-to-machine-learning-with-tensorflow-nanodegree--nd230

More details below.

TensorFlow or PyTorch version?

Udacity sells two versions of the Intro to Machine Learning Nanodegree, each built around a different deep learning framework: TensorFlow or PyTorch. Both versions start with the scikit-learn machine learning library before moving on to the Deep Learning section.

Each of these two frameworks has its own syntax, and both are popular and widely used. TensorFlow was created by Google and open-sourced in late 2015, with version 1.0 following in early 2017, while PyTorch, which comes from Facebook, was first released publicly around the start of 2017 and reached version 1.0 in late 2018. Both continue to be developed. TensorFlow 2.0, released in late 2019, is used in the current edition of the Nanodegree program. PyTorch feels a bit more native to Python, while TensorFlow gets closer to it through its Keras integration (Keras is a high-level API layered on top of TensorFlow that simplifies its use).

Which framework should you choose? It’s a subjective decision, so I will argue for mine. My learning path in this area began with the book Deep Learning with Python, written by Keras creator and Google AI researcher François Chollet and built around TensorFlow, so I remained faithful to the framework I know. There is also much less material on the internet covering the TensorFlow version of the course, which I believe gives this Nanodegree more value on the market, as its projects are harder to pass by plagiarism. Additionally, if you consider the developer communities, TensorFlow’s is the bigger one. As with most programming languages, learning one framework will help you pick up another faster if you switch. Check which syntax is more pleasant for you. Especially if you are just starting with this topic, choosing one of them does not lock you in.

Projects

The program gives you opportunities to apply the skills from the lectures in projects relevant to key industry problems. What is very valuable is that the projects are assessed by Udacity mentors, who can also be contacted to discuss problems. There is also a forum for users. Every project holds you to the standards expected on the job market, and the challenges to overcome are presented to the student in a fairly comprehensible way.

After completing the projects, you have knowledge of many relevant topics, such as:

· When preprocessing is needed, and how to apply it?

· How to establish a benchmark for a solution to the problem?

· What does each algorithm accomplish given a specific dataset?

· How to investigate whether a candidate solution model is adequate for the problem?

Project 1 — Finding donors for the charity (Supervised Learning)

The first problem to solve is to construct several supervised models that accurately predict individuals’ income, then to choose the best algorithm based on preliminary results and optimize it further.

This project is designed to acquaint you with the many supervised learning algorithms available in scikit-learn (I used Decision Tree, Random Forest and SVC) and to provide methods for evaluating how each model works and performs on a certain type of data. The project includes open-ended questions as part of the challenge.
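
To make the workflow above concrete, here is a minimal sketch of comparing the three scikit-learn classifiers I mentioned against a naive benchmark. It is not the project's actual code: the data is synthetic (generated with make_classification as a stand-in for the income data), and the choice of accuracy and F1 as metrics is my assumption for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the project data (assumption: binary target,
# numeric features, roughly 75% of samples in the negative class).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.75, 0.25], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The benchmark always predicts the most frequent class; any useful
# model should beat it.
models = {
    "benchmark": DummyClassifier(strategy="most_frequent"),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVC": SVC(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        # zero_division=0 avoids a warning when a model never predicts class 1
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, scores in results.items():
    print(f"{name:14s} accuracy={scores['accuracy']:.3f} f1={scores['f1']:.3f}")
```

The benchmark row is the interesting one: on imbalanced data it can reach a decent accuracy while scoring zero on F1, which is why looking at more than one metric matters when picking the best model.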

Project 2 — Image classifier (Deep Learning)

In this project, you will first develop code for a flower image classifier using a deep neural network built with TensorFlow, then convert it into a command line application that uses the trained model to classify new images. The aim of the project is to familiarize the student with the practice of working with the TensorFlow framework.

Despite the relatively narrow scope of the tasks, the project teaches a very good workflow for writing scripts and justifying the results. Be prepared for the fact that a broader treatment of Deep Learning is covered by other courses on the platform.
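
For readers who have not seen the Keras API yet, the following is a rough sketch of what "a classifier built with TensorFlow" can look like. It is not the project's solution: the images are random noise, and NUM_CLASSES, IMAGE_SIZE, the tiny network and the predict_top_k helper are all hypothetical names and values chosen only so that the example runs quickly.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 5   # hypothetical number of flower categories
IMAGE_SIZE = 32   # hypothetical image size, kept small for speed

# Random arrays standing in for a real, labelled image dataset.
rng = np.random.default_rng(0)
images = rng.random((64, IMAGE_SIZE, IMAGE_SIZE, 3), dtype=np.float32)
labels = rng.integers(0, NUM_CLASSES, size=64)

# A deliberately tiny convolutional network defined with Keras.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(images, labels, epochs=1, batch_size=16, verbose=0)


def predict_top_k(model, image, top_k=3):
    """Return the top_k (class index, probability) pairs for one image."""
    probs = model.predict(image[np.newaxis, ...], verbose=0)[0]
    best = np.argsort(probs)[::-1][:top_k]
    return [(int(i), float(probs[i])) for i in best]


top = predict_top_k(model, images[0])
print(top)
```

A function like predict_top_k is the natural piece to wrap in a command line script (for example with argparse), taking an image path and a saved model as arguments.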

Project 3 — Customer segmentation (Unsupervised clustering)

In this project, you apply unsupervised learning to identify segments of the population that form the customer base for a mail-order sales company. These segments can then be used to direct marketing campaigns. The provided data represents a real-life data science task.

The steps needed to pass the project include, among others, data cleaning and a PCA transformation, followed by applying the k-means clustering algorithm to segment the transformed customer data. While some tasks come with precise guidelines, other parts come with no specification at all. As part of the challenge, the student must justify their own decisions on how to deal with the data.
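
The sequence described above (cleaning, PCA, then k-means) can be sketched in scikit-learn as follows. This is only an illustration under my own assumptions: the data is synthetic, missing values are filled with the median, and the numbers of components and clusters are arbitrary rather than the ones the project leads you to.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic "customers": three groups of 100 rows with 8 numeric features.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 8))
data = np.vstack([c + rng.normal(size=(100, 8)) for c in centers])

# Knock out about 5% of the values to mimic missing data in a real dataset.
data[rng.random(data.shape) < 0.05] = np.nan

# Cleaning and dimensionality reduction: impute, scale, then PCA.
prep = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    PCA(n_components=3, random_state=0),
)
reduced = prep.fit_transform(data)
explained = prep.named_steps["pca"].explained_variance_ratio_.sum()

# Segment the transformed data with k-means.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
segments = kmeans.fit_predict(reduced)

print(f"variance explained by 3 components: {explained:.2f}")
print("segment sizes:", np.bincount(segments))
```

In the real project, choices such as how to treat missing values, how many components to keep and how many clusters to use are exactly the decisions you are expected to justify.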

The final review

I started this program as a side self-development challenge at the start of the pandemic. It took me exactly two months, spending about two hours a day.

As I wrote in the summary, I recommend it to people who are already building their knowledge of machine learning and want to review the basics over several evenings, as was the case for me, as well as to beginners who meet the course prerequisites and have a lot of free time.

It is worth doing this course if you want to really learn the topics in this area, and not just get acquainted with the buzzwords. The Udacity website contains the full syllabus and the prerequisites, which are worth meeting so that you do not end up disappointed, because a large part of the material is aimed at people who are comfortable with, for example, advanced mathematics. Projects were evaluated not just on the code, but also on the written discussion of observations, the conclusions at each stage and the final justification of the chosen solutions. It is worth highlighting the great presentation of Support Vector Machines and Ensemble Methods, as well as the Backpropagation Algorithm (although I still have the impression that you could spend your whole life studying the mathematics behind the backpropagation algorithm if you want to understand how it works, and not just what it is for).

Although not every course on this platform is worth recommending, compared to other courses in this field, learning through this program was fun and challenging, and I can honestly recommend it to you. I find it good value for the price, especially when bought on sale.

Michał Oziębło: clinical data scientist/engineer, biotechnologist, financier. Python / R / SQL