Computer Vision Nanodegree Program: What You’ll Learn

Enrollment is now open for our new Computer Vision Nanodegree program!

Cezanne Camacho
Udacity Inc
6 min read · Apr 23, 2018


Computer Vision is a specialized branch of Artificial Intelligence focused on enabling machines to visually perceive the world and respond to it. As with human vision, this is a process of taking in visual information, analyzing and processing that information, and correctly identifying the objects it contains. Thanks to advances in the field of Computer Vision, and significant increases in available computing power, machines can now “see” many thousands of images and process them far more rapidly than a human could, in some tasks with comparable or better accuracy.

Our new Computer Vision Nanodegree program covers current techniques used in the field. You’ll learn about deep learning architectures for multi-object detection, such as R-CNN and YOLO (You Only Look Once), and you’ll implement tracking and localization methods like SLAM (Simultaneous Localization and Mapping).

YOLO multi-object detection on an image of a road. This example shows detected people, cars, traffic lights, and even handbags.

Computer Vision and Healthcare

The applications of this pioneering technology are almost limitless, and Computer Vision is already having a profound impact on so many industries. Can you imagine being a cancer researcher, experiencing the power of Computer Vision for the first time? Using applications trained on thousands of images, doctors are now able to rapidly and accurately distinguish between cancerous and non-cancerous tissue, and diagnose patients much earlier.

In our new Computer Vision Nanodegree program, you’ll learn the very same image processing techniques that are revolutionizing healthcare and saving lives.

Computer Vision and Safety, Conservation, and Disaster Relief

The field of autonomous transportation depends on Computer Vision: it’s the technology that enables self-driving vehicles to “see” the world around them. Self-driving cars are predicted to save millions of lives, and to enable efficiencies that will significantly improve air quality and reduce traffic congestion. Already today, Computer Vision is being applied in real-world settings that include helping to eradicate malaria in Malaysia, supporting wildlife conservation in Canada, and aiding disaster relief efforts in Puerto Rico.

In this program, you’ll learn how a self-driving vehicle uses sensor input from lasers, radar, and cameras to safely navigate roads by itself. You’ll explore the ways Computer Vision is used to analyze camera images and to identify objects like other cars or pedestrians.

Computer Vision: The #3 Best Job in the US, 2018

This is an incredible time to enter the field of Computer Vision. A recent report published by Indeed ranked Computer Vision Engineer as the #3 Best Job in the US in 2018! This is but one indication of how high demand is for Computer Vision talent — ZDNet recently put Computer Vision Engineer at the top of its list of jobs that will be most in-demand in 2020. TechRepublic identified Computer Vision Engineer as one of the 6 most in-demand AI jobs, and a recent article in Inc. declared that Computer Vision Will Be The Most Disruptive Innovation Driver.

In Udacity’s new Computer Vision Nanodegree program, you’ll learn the in-demand skills that will enable you to take advantage of the extraordinary opportunities in this exciting field.

Choosing the Computer Vision Nanodegree Program

If you’re new to Computer Vision, but you have a working knowledge of machine learning and Python, the Computer Vision Nanodegree program is ideal for you. You’ll learn the Computer Vision and deep learning techniques that are used to analyze images and spatial information. You’ll start by writing image classifiers in Python, and build up to using deep learning frameworks like PyTorch for more complex classification and regression tasks. You’ll learn from a curriculum built in collaboration with Affectiva and NVIDIA, and you’ll hear from experts in the fields of emotion recognition and scene understanding.

This course will also cover the latest in deep learning architectures used in industry, including region-based convolutional neural networks and fast object recognition algorithms such as YOLO (“You Only Look Once” multiple object detection). With the practical skills you gain in this program, you’ll be able to program your own computer vision applications, extract information from any kind of image and spatial data, and solve real-world challenges.

Throughout the course, you’ll use real data to inform your work and train your deep learning models. You’ll complete three major computer vision projects, and build a strong portfolio in the process!

What You’ll Learn

The Computer Vision Nanodegree program consists of three main sections, each with an associated project.

1. Introduction to Computer Vision
First, you’ll learn the foundational math and programming concepts behind pattern recognition and classification tasks. This section will be all about creating algorithms that can: 1) isolate important, distinguishing information about an object in an image (like an object’s unique shape or color), and 2) ignore irrelevant parts of an image (like a plain background or noise). You’ll learn to program a green screen and define your own clothing classifier.
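To make the green screen idea concrete, here is a minimal sketch in plain Python. It is not the course's implementation: the nested-list image representation, the helper names `is_green` and `replace_green_screen`, and the green threshold values are all assumptions chosen for illustration. It shows the two steps described above: pick out pixels by a distinguishing property (color), then ignore them by substituting a background.

```python
# Illustrative sketch only: an image is a nested list of (R, G, B) tuples.
# The threshold range below is an assumed definition of "green screen" green.

def is_green(pixel, lower=(0, 180, 0), upper=(100, 255, 100)):
    """Return True if every channel of `pixel` falls inside the green range."""
    return all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))


def replace_green_screen(foreground, background):
    """Swap green pixels in `foreground` for the matching `background` pixels."""
    return [
        [bg if is_green(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


GREEN, RED, BLUE = (0, 255, 0), (200, 30, 30), (10, 10, 240)
foreground = [[GREEN, RED], [RED, GREEN]]     # subject (red) on a green screen
background = [[BLUE, BLUE], [BLUE, BLUE]]     # new backdrop
composite = replace_green_screen(foreground, background)
```

In practice, image libraries perform the same masking on whole arrays at once, but the logic is the same per-pixel test shown here.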

Project 1: Facial Keypoint Detection
Combine image processing techniques and deep learning techniques to detect faces in any image, and then detect facial keypoints, such as the position of the eyes, nose, and mouth on a face. In this project, you’ll define and train a convolutional neural network to recognize these keypoints.

2. Advanced Deep Learning & Computer Vision
Here, you’ll learn about the deep learning algorithms that have led to state-of-the-art advances in computer vision technology! This section covers architectures like Faster R-CNNs that identify where objects are in an image. You’ll get to work with a code implementation of YOLO, and learn about models that use recurrent neural networks for generating sequences of data. This section will be all about applications that aim to reach human levels of scene understanding.
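Detectors that identify where objects are typically describe each object with a bounding box, and a common way to compare a predicted box with a reference box is intersection over union (IoU). The sketch below is an illustration rather than code from the program; the function name `iou` and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so boxes that do not overlap give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))   # partially overlapping boxes
```

A value of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.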

Project 2: Automatic Image Captioning
Image captioning requires that you create a deep learning model with two components: a CNN that transforms an input image into a set of features, and an RNN that turns those features into rich, descriptive language. In this project, you’ll focus on the part of the model that can generate descriptive sentences and demonstrate your mastery of deep learning architectures.

3. Object Tracking and Localization
To conclude, you’ll learn about object tracking techniques. Using spatial information gathered over time, you’ll learn to predict the location of an object and determine its movement. This is an ongoing area of research, especially in the field of autonomous vehicles like self-driving cars and drones!

Project 3: Landmark Detection and Tracking (SLAM)
Implement a robust method for tracking an object over time, using elements of probability, motion models, and linear algebra. Use feature detection and keypoint descriptors to build a map of the environment with SLAM (Simultaneous Localization and Mapping).
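As a rough illustration of how probability and a motion model combine when tracking over time, here is a one-dimensional Kalman filter sketch. The article does not say which tracking method the project uses, so treat the choice of filter, the function names (`predict`, `update`, `track`), and all variance values as assumptions made for this example.

```python
def predict(mean, var, motion, motion_var):
    """Motion step: shift the estimate by `motion` and grow its uncertainty."""
    return mean + motion, var + motion_var


def update(mean, var, measurement, meas_var):
    """Measurement step: blend the estimate with a noisy observation."""
    gain = var / (var + meas_var)              # how much to trust the measurement
    new_mean = mean + gain * (measurement - mean)
    new_var = (1 - gain) * var
    return new_mean, new_var


def track(measurements, motion, mean=0.0, var=1000.0, motion_var=1.0, meas_var=4.0):
    """Alternate update and predict; return per-step estimates and the next prediction."""
    estimates = []
    for z in measurements:
        mean, var = update(mean, var, z, meas_var)
        estimates.append(mean)
        mean, var = predict(mean, var, motion, motion_var)
    return estimates, (mean, var)


# An object moving one unit per step, observed at positions 1, 2, 3.
estimates, (next_mean, next_var) = track([1.0, 2.0, 3.0], motion=1.0)
```

The large initial variance says the starting position is essentially unknown, so the first measurement dominates; later estimates lean more on the motion model as the uncertainty shrinks.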

A CNN architecture for facial keypoint detection. The input is an image of a cat’s face, and the output gives the locations of the cat’s eyes, nose, ears, and mouth.

The Support You’ll Receive

Udacity offers you a wide array of support options to help you proceed through the program successfully. You’ll have access to a personal guide through our Classroom Mentorship program, and you’ll get detailed feedback on your project submissions from one of our project reviewers. You’ll also be able to join a Slack community where you can engage with our Community Manager, your fellow students, and even your instructors.

How to Enroll

We are now accepting new students to the Computer Vision Nanodegree program.

The program consists of a single three-month term. The tuition for the term is $999, paid prior to commencing your studies.

To learn more, you can also explore a Free Preview of this program (but don’t delay enrolling, or you may miss your chance to save). You’ll meet your instructors, and explore some early lessons on pattern recognition. You’ll even have the opportunity to experiment with our in-classroom programming environment.

Computer vision is changing the way the world sees, and if you’d like to incorporate this in-demand skill into your work and learning journey, this is your invitation to join. Come learn Computer Vision and start building your skills today!

Udacity’s Computer Vision Nanodegree Program [TRAILER]