Photo by fabio on Unsplash

Demystifying Deep Learning at NVIDIA GTC

Eshan Chatty · The Startup · Oct 7, 2020 · 9 min read

This blog post summarizes what I learned in a presentation at NVIDIA’s GPU Technology Conference (GTC) by Will Ramey, Sr. Director, Global Head of Developer Programs, titled “Deep Learning Demystified”, as well as my own studies and experience.

Let us start with the basics: what is accelerated data science?

When data analytics, machine learning, deep learning, and data cleansing are run on GPUs to speed up the process, it is called accelerated data science. Governments and industries alike are realizing that GPU-accelerated data science is now a necessity, as it has use cases in every industry.

If our data is organized as tabular or sparse data, then data analytics and classical machine learning approaches such as regressions and decision trees can be used to solve the problem.
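
To make the tabular case concrete, here is a minimal sketch (my own, not from the talk) using scikit-learn on a synthetic dataset. The decision tree stands in for the classical approaches mentioned above, and GPU-accelerated libraries such as RAPIDS cuML expose a very similar API.

```python
# A classical ML workflow on tabular data (illustrative; swap in your own columns).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic tabular data: 1,000 rows, 20 feature columns
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A decision tree, one of the approaches mentioned above
model = DecisionTreeClassifier(max_depth=5)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```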

$274B annual revenue by 2022 for big data and business analytics — IDC

But when it comes to dense data, which consists of 5-D tensors, high-resolution images, and millions of dimensions, simple data analytics and machine learning algorithms just cannot take the heat. Hence, deep learning needs to be introduced. Deep learning comes with its full arsenal of neural networks and has broad applications in every industry.

Source: NVIDIA

Deep learning and AI are the new electricity. Just as electricity transformed lives 100 years ago, so will the implementation of deep learning fuel AI. — Andrew Ng

After the Big Bang of modern AI, early researchers from various universities released their software as open-source frameworks, so that people from technical backgrounds could build on them. All of these frameworks support GPU acceleration to speed up their applications. About 5 years ago, the cost of building your own AI startup was very high because of long processing times and hardware limitations, but recent advancements in cloud computing and GPUs have drastically reduced the capital cost for deep learning startups. On top of these frameworks sit cloud services such as Amazon Web Services, which provide a platform to build very deep models cost-effectively and deliver solutions within minutes. Startups use these platforms to solve problems in their own domains.
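
As a small illustration of how these frameworks expose GPU acceleration, here is a minimal sketch assuming PyTorch; the model and data are placeholders, and the code simply falls back to the CPU when no GPU is present.

```python
# Opting into GPU acceleration is a one-line device switch in frameworks like PyTorch.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2)).to(device)
x = torch.randn(32, 128, device=device)   # a batch of 32 examples
logits = model(x)                         # runs on the GPU when one is available
print(logits.shape, "computed on", device)
```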

I encourage you to go through the image below in detail and understand how each of these pieces is linked.

Growth of Modern AI (Surely one of my favorite slides). Source: NVIDIA

Deep learning can be thought of as a map from an input domain to an output domain. The input domain can be divided into text, images, video, and audio. There are other types of data as well, such as geospatial and time-series data, but we shall focus on these four for now. The output domain is the question asked of the input domain, and the deep learning task is to connect the dots (rather repeatedly) from the input to the output. Look at the systematic table shown during this session on how to think when solving an AI/deep learning problem.

Source: NVIDIA

One can see that artificial intelligence has spread into almost every sector, including Healthcare, Agriculture, Aerospace, and Retail, and is not limited to the IT sector.

2.2 exabytes (2.2 billion gigabytes) of data is created daily — McKinsey

The challenging part is keeping up with the ever-growing data and keeping our algorithms and models up to date so that they perform optimally on big data. Let us see how deep neural networks have become deeper and more capable. Here is a timeline from left to right, showing where we started and where the latest technology is taking us.

Source: NVIDIA

If you're worried you didn't know all of these, don't worry, neither did I! But I’m here to help you out :D

Convolutional neural networks: These are a special type of neural network used effectively for image recognition and classification. They are highly proficient at identifying objects, faces, and traffic signs, and they also power vision in self-driving cars and robots.
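
Here is a minimal CNN sketch (my own illustration, not from the talk), assuming PyTorch and small 32x32 RGB images:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learn local image filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = TinyCNN()(torch.randn(4, 3, 32, 32))  # batch of 4 fake images
print(logits.shape)                            # torch.Size([4, 10])
```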

Recurrent neural networks: These exhibit temporal dynamic behavior, i.e. they allow previous outputs to be fed back in as inputs through hidden states. They are used in music generation, sentiment classification, and machine translation.
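
A minimal recurrent-network sketch for sentiment classification, again my own illustration in PyTorch; the vocabulary size and dimensions are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class SentimentRNN(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)          # positive / negative

    def forward(self, token_ids):
        embedded = self.embed(token_ids)              # (batch, seq_len, embed_dim)
        _, (last_hidden, _) = self.rnn(embedded)      # hidden state after the last step
        return self.head(last_hidden[-1])

fake_batch = torch.randint(0, 5000, (8, 20))  # 8 sentences of 20 token IDs each
print(SentimentRNN()(fake_batch).shape)       # torch.Size([8, 2])
```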

Generative Adversarial Networks: These are generative models, i.e. they create new data instances that resemble your training data. They can be used to create synthetic data such as a human face, or a cityscape photograph generated from a semantic image.
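
A bare-bones GAN sketch (illustrative only): a generator maps random noise to fake samples, and a discriminator scores how real they look. Real training would alternate updates between the two networks.

```python
import torch
import torch.nn as nn

generator = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.Tanh(),        # 784 = a flattened 28x28 "image"
)
discriminator = nn.Sequential(
    nn.Linear(784, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),       # probability the input is real
)

noise = torch.randn(16, 64)                # 16 random latent vectors
fake_images = generator(noise)             # new data instances resembling training data
realism_scores = discriminator(fake_images)
print(fake_images.shape, realism_scores.shape)
```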

Reinforcement Learning: Reinforcement learning trains models to make a sequence of decisions. The agent learns to achieve a goal in an uncertain, potentially complex environment. Think of how a cat observes its external environment and makes decisions based on it.
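
A toy reinforcement-learning sketch (my own, not from the talk): tabular Q-learning on a made-up five-state line world where the agent learns that moving right reaches the reward.

```python
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = np.random.default_rng(0)
for episode in range(200):
    state = 0
    while state != n_states - 1:    # state 4 is the goal
        explore = rng.random() < epsilon
        action = int(rng.integers(n_actions)) if explore else int(Q[state].argmax())
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: move the estimate toward reward + discounted future value
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.argmax(axis=1))  # learned policy: mostly 1s ("go right")
```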

New species: Neural collaborative filtering uses the flexibility, complexity, and non-linearity of neural networks to build recommender systems. Block-sparse LSTM is a technique for reducing the compute and memory requirements of deep learning models. An LSTM has feedback connections, unlike standard feedforward neural networks, so it can process not only single data points but entire sequences of data.
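
A minimal neural collaborative filtering sketch (illustrative): user and item embeddings pass through a small MLP, which supplies the non-linearity mentioned above. The sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class NCF(nn.Module):
    def __init__(self, n_users=1000, n_items=500, dim=32):
        super().__init__()
        self.user_embed = nn.Embedding(n_users, dim)
        self.item_embed = nn.Embedding(n_items, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 64), nn.ReLU(),   # the non-linear part
            nn.Linear(64, 1),
        )

    def forward(self, user_ids, item_ids):
        pair = torch.cat([self.user_embed(user_ids), self.item_embed(item_ids)], dim=-1)
        return self.mlp(pair).squeeze(-1)        # predicted preference score

scores = NCF()(torch.tensor([3, 7]), torch.tensor([42, 9]))
print(scores.shape)  # torch.Size([2])
```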

Let us now get down to the basics of deep learning deployment. Suppose we’ve got a training dataset that classifies a particular image as a cat or some other animal. Based on the number of outputs, it could be either a binary classifier or a multinomial classifier. Here, we take up a simple binary classification of whether the input image is a cat or a dog. We then have an untrained neural network model with weights initialized (usually randomly) according to which architecture we are dealing with. The model is trained on the data and we get our first output values. With this training, our model learns to distinguish the major features of a cat versus a dog. The more training is done, the better the model learns these features. (Make sure the model learns from the data and does not simply memorize it, as that would lead to overfitting.) This training amounts to minimizing the loss function via gradient descent and backpropagation.

After we’ve trained and validated the model, we deploy it in an app or a service that needs this type of classification. Since real-world workloads involve huge amounts of data, the deployed models are usually paired with GPUs, and there's no better company in that business than NVIDIA.
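
Here is a compressed sketch of that training loop, assuming PyTorch and using random tensors as stand-ins for real cat/dog images; it shows the forward pass, the loss, and the backpropagation step that nudges the weights.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU(), nn.Linear(128, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

images = torch.randn(32, 3, 64, 64)          # stand-in for a real cat/dog dataset
labels = torch.randint(0, 2, (32,))          # 0 = cat, 1 = dog

for epoch in range(5):
    logits = model(images)                   # forward pass: current predictions
    loss = loss_fn(logits, labels)           # how wrong are we?
    optimizer.zero_grad()
    loss.backward()                          # backpropagation: compute gradients
    optimizer.step()                         # update weights to reduce the loss
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```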

A schematic representation of deployment of DL-models. Source: NVIDIA

After all this great modeling and all these complex neural networks, one might think: well, it's all done, isn't it? Deep learning has solved all our problems! Well, real-world deployment requires a lot more than some training and testing. Deep learning needs to keep absorbing new skills and the latest algorithms, because the AI sector is evolving rapidly and models must keep up with its pace. It also needs shorter training times so that previously impractical solutions become achievable. Finally, deep learning must be flexible when it comes to deployment across sectors; as we know, AI has spread everywhere, so the application of a deep learning model must not be restricted to a particular domain.

A 5-layer convolutional neural network for classifying images has about 138,240 parameters, while InceptionV3 (a convolutional neural network that is 42 layers deep) uses 21,802,784 parameters!
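
If you want to check figures like these yourself, a quick parameter count looks like this (the small CNN below is an arbitrary example of mine, and exact totals depend on the architecture and framework):

```python
import torch.nn as nn

def count_parameters(model):
    return sum(p.numel() for p in model.parameters())

small_cnn = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(32 * 8 * 8, 10),
)
print(count_parameters(small_cnn))  # tens of thousands of parameters

# For a large pre-built model (requires torchvision):
# from torchvision.models import inception_v3
# print(count_parameters(inception_v3()))  # tens of millions of parameters
```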

How is NVIDIA contributing to the deep learning ecosystem to overcome these challenges?

  1. The NVIDIA DEEP LEARNING INSTITUTE provides hands-on training for data scientists and software engineers, helping the world solve challenging problems using AI and deep learning. The courses cover complete workflows for applications in autonomous vehicles, healthcare, video analytics, and more. Click on the link above for more information.
  2. NVIDIA INCEPTION has become a credible platform for AI startups, providing benefits like AI expertise from the NVIDIA DLI, technology access from AWS, and cloud support from Oracle. It also provides a global community for showcasing your innovation, plus go-to-market support.
  3. The latest algorithms are provided via GPU-accelerated frameworks and deep learning software development kits.
Source: NVIDIA

4. They provide fast training using DGX, A100, V100, and TITAN. DGX is NVIDIA's line of servers and workstations, A100 and V100 are HPC data center GPUs, and TITAN is one of the most powerful GPUs built for PCs.

5. Deployment platforms are provided through EGX, NGC, TensorRT, A100/T4, Drive AGX, and Jetson AGX. EGX securely deploys and manages containerized AI frameworks and applications, including NVIDIA TensorRT, TensorRT Inference Server, and DeepStream. It also includes a Kubernetes plug-in, container runtime, NVIDIA drivers, and GPU monitoring. More here: www.nvidia.com/egx. NGC provides optimized containers, pre-trained models, and model scripts that are secure, scalable, and run on any platform. For more, see www.nvidia.com/ngc.
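
To make item 5 slightly more concrete, here is a minimal, hedged sketch of a common hand-off into NVIDIA's deployment stack: exporting a trained PyTorch model to ONNX, a format that tools such as TensorRT can then optimize. The model, shapes, and file name are illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in for a trained model
model.eval()

dummy_input = torch.randn(1, 3, 32, 32)      # one example input, batch size 1
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["logits"])
print("exported model.onnx")                 # ready for downstream optimization
```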

Additional resources in case you want to keep updated:

These are a few of the interesting questions that were covered by Will Ramey and NVIDIA engineers during the presentation!

How do you get results if you have very high-resolution images like 2500×2500 (like mammograms)?

For large images, you just need to increase your input/output size. However, as you increase the resolution of your input and output, your limiting factor will be GPU memory, since those large images get even larger as they are fed through the neural network. If your GPU is running out of memory, there are a few ways to work around this. You can try processing your image in ‘tiles’: break the main image down into smaller chunks that your GPU can handle, then re-assemble them in a post-processing step. You can also train over multiple GPUs and have each GPU take a portion of the image.
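
Here is a minimal sketch of the tiling idea (my own illustration): split a large image into GPU-sized chunks, run each through the network (an identity function stands in for it here), then stitch the results back together.

```python
import numpy as np

def process_tile(tile):
    # Stand-in for running a tile through the network (here: identity).
    return tile

def run_in_tiles(image, tile_size=512):
    h, w = image.shape[:2]
    output = np.empty_like(image)
    for y in range(0, h, tile_size):
        for x in range(0, w, tile_size):
            tile = image[y:y + tile_size, x:x + tile_size]
            output[y:y + tile_size, x:x + tile_size] = process_tile(tile)
    return output

big_image = np.random.rand(2500, 2500)       # e.g. a mammogram-sized input
print(run_in_tiles(big_image).shape)         # (2500, 2500)
```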

How does deep learning apply to software testing, to identify bugs early and make the testing process easier?

One use case I have seen at a large tech company is the use of DL to analyze code check-ins to determine which of a set of possible tests to run as a post-commit hook. Oftentimes, tests that are run post-check-in have nothing to do with the code that was checked in, so skipping some of these tests on a per-check-in basis can speed up the testing process considerably. Generally speaking, you can think of problems involving code as ‘natural language’ problems (even though they aren’t really natural language), and a lot of natural language advancements in the DL space have happened within the last 1–2 years with the Transformer family of deep learning models.
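
As a sketch of that framing (not any company's actual system), here is a toy example that treats code diffs as text and predicts whether a particular test suite is relevant; a TF-IDF plus logistic-regression model stands in for a Transformer, and the diffs and labels are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical history: past diffs and whether the "database" test suite was relevant.
diffs = [
    "fix null check in db/connection_pool.py",
    "update README wording",
    "add retry logic to db/query_runner.py",
    "bump CSS padding in frontend/styles.css",
]
db_suite_relevant = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(diffs, db_suite_relevant)
print(model.predict(["refactor db/transaction.py error handling"]))  # likely [1]
```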

The recently launched NVIDIA Broadcast App magically removes background noise from ordinary microphones using AI. What kind of training data might have been used to train it?

This app takes noisy audio as input and outputs clean audio. Therefore the training data likely looked like a collection of paired audio clips: a noisy version and a clean version of each. To build this, they could have recorded a bunch of noisy audio clips and manually cleaned them up, OR (an easier way) taken a bunch of clean audio clips and added noise to them.
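
The "add noise to clean clips" route is easy to sketch (illustrative only): pair each clean waveform with a synthetically noised copy to form (noisy, clean) training examples.

```python
import numpy as np

def make_training_pair(clean_audio, noise_level=0.05, seed=0):
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_level, size=clean_audio.shape)
    noisy_audio = clean_audio + noise
    return noisy_audio, clean_audio          # model input, training target

clean = np.sin(np.linspace(0, 440 * 2 * np.pi, 16000))  # a fake 1-second tone
noisy, target = make_training_pair(clean)
print(noisy.shape, target.shape)
```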

Can an NLP machine translation system using DL work with only monolingual data?

It depends on what other data you have access to. Generally speaking, DL models do need supervision and labeled data to do discriminative tasks like translation — however, we have become very efficient at transfer learning these days — so if you can find supervised data that is sufficiently similar to your monolingual data, you can transfer from your supervised data onto your monolingual test data. Models such as the BERT family of models are pretty good at this — being trained on one set of data, but ultimately evaluated on another.
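
Here is a minimal sketch of that transfer-learning idea, with a tiny frozen network standing in for a large pretrained encoder such as BERT; only the new task-specific head is trained on the small labeled set.

```python
import torch
import torch.nn as nn

pretrained_encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())  # stand-in for a BERT-like encoder
for p in pretrained_encoder.parameters():
    p.requires_grad = False                  # freeze what was learned on the big supervised corpus

new_head = nn.Linear(128, 2)                 # task-specific layer trained on your small dataset
model = nn.Sequential(pretrained_encoder, new_head)

optimizer = torch.optim.Adam(new_head.parameters(), lr=1e-3)
x, y = torch.randn(16, 300), torch.randint(0, 2, (16,))
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()                              # gradients flow only into the new head
optimizer.step()
print("fine-tuning step done, loss:", loss.item())
```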

Check out NVIDIA GTC, which set me off on an endless wander through deep learning, GPUs, AI, and machine learning.

Please do follow me on Medium for more DL and AI blogs, and connect with me on LinkedIn!
