My experience with the “Udacity Secure and Private AI Challenge”
I started this challenge course with little to no background in machine learning, AI, PyTorch, or Python. I had already accepted and completed another Udacity challenge and earned a certificate for the Front End Web Developer (FEND) Nanodegree. There I experienced a Slack community for the first time and published my first Medium article. I was at the point in my life where I had taken 20 years away from a traditional office environment to raise a family, but my desire to learn and regain my footing in technology had never dimmed.
How I learned about the challenge
Around the time that I completed the FEND challenge, I met other FEND Udacity scholars at the 2018 Google Chrome developer conference in San Francisco who said they had applied and been accepted into the “PyTorch Challenge.” I felt like I had missed the boat. What was PyTorch? I wondered. What do machine learning and AI involve? They were new fields for me, but ones being relentlessly debated in the news, especially in regard to privacy and data breaches. In the meantime, I read a Medium article (Neural Style Transfer: Creating Art with Deep Learning using tf.keras and eager execution) about neural style transfer using Keras, with step-by-step instructions on how to implement it. In this way, I was introduced to the Jupyter Notebook environment on Google Colab. This was my first taste of AI, and I enjoyed the new approach of working with code to train a model on my own images and artwork. Always involved in the arts, I had been using Photoshop and Illustrator for graphic design and was aware that those and other apps used machine learning (ML) and AI in the background to enhance their functionality. I was also playing with the language Processing to build data-driven graphics. However, training a model was a different type of coding for me, one where you teach the computer to “learn” rather than detailing every single step for it to implement, as I was used to doing in JavaScript.
About 8 months later I saw the announcement for the Udacity Secure and Private AI Challenge in my Inbox. I applied right away and was thrilled to be accepted for the challenge!
The initiatives
In addition, the challenge course offers various initiatives you can take on if you like. These initiatives increase your chances of making it into phase two and offer additional opportunities to engage with your peers in study groups, group projects, and so on. My reality was that I had a very limited amount of time to spend on this course each day: it launched at the start of summer break (I have two kids), I also run a concert series in my home, and I was going to be on the road visiting family quite a bit. I'll bet that anyone reading this has limitations on their time: family, health, bandwidth, school, job. This challenge course was designed with a wide variety of availabilities and capacities in mind, and for that I am grateful.
One of the major initiatives (goals) in this course is to participate in the Slack community. For someone who has been professionally “isolated” at home for 20 years, this has been a major benefit. Although it is not in my nature to be constantly connected online, I made an effort to engage for at least 5–30 minutes a day. Even as a beginner I was able to contribute by encouraging other students and by asking questions when I was up to it. I learned that the Udacity Slack communities draw members from all over the world, which is refreshing and exciting. In addition, many people in my daily life don't know what I'm doing and don't know how to support me in this educational goal. When you are lost and in doubt, a supportive community really takes you to the next step in learning, and its value cannot be overstated. The Udacity learning community has been a positive experience for me.
The other initiative, which had also helped me in my past learning experience with Udacity, is the #60daysofudacity challenge (it was #100 days of Udacity for FEND). The challenge asks you to code or study for at least 30 minutes a day and to engage with the Slack community. For a beginner like me, this course was demanding, and all of the material took a lot of extra work to understand and research. Having a goal of studying every day and engaging with the community really helped hold me to the timeline and finish on time. Again, the goal is to push you a little outside of your comfort zone, because we all have very busy lives outside of this course. That small daily commitment contributes to the greater success and is a wonderful learning tool. It also meant that I was studying in some very strange situations (in a hotel closet, in a car, at a fiddle camp, and in a bathroom where I rigged up a standing desk on the road).
Getting started was a bit rocky
I’ll admit that without a background in Python, machine learning, statistics, or calculus I floundered a bit in the beginning. However, because of my previous experience with a Udacity challenge course, I knew that persistence and reliance on the Udacity learning community were the keys to approaching this course.
In the beginning I had a “deer in the headlights” feeling: everything was new and I didn’t know what was important to focus on. As many of you know, when you start learning about neural networks, deep learning, and AI, many of the explanations and papers put mathematical depictions of the algorithms front and center. While that kind of description is useful to some, for me it was like being handed a complex physics formula for why you should go 30 mph around a tight curve in a car, when in practice you know why because you have made the maneuver over and over in a real-life context. In deep learning there is a phrase that kept coming up again and again: “develop an intuition for.” It means that by spending lots of time with the code, either under the hood or just working with it over and over while varying the parameters, optimizers, and so on, you develop an intuition for how things behave within the context of neural networks.
In hindsight, what I should have done is first take another course also offered by Udacity: Intro to Deep Learning with PyTorch | Udacity. Instead, I decided to take the scenic route and delve right into multi-variable calculus as a newbie (which, by the way, is a place I wanted to hang out in for longer, but the clock was ticking!). The next very tantalizing place was the land of linear algebra and vector transformations in multiple dimensions. (I’d recommend the 3Blue1Brown series for some really binge-worthy vector math videos.) In addition, I worked through the first few chapters of the frequently recommended Andrew Ng Coursera course on deep learning.
My tools
I completed this course entirely on a MacBook Air running Mojave with PyTorch and PySyft installed. (This took a bit of wrangling, but it was worth it when I had to code without an internet connection; I had also downloaded videos for offline viewing.) I also ran some notebooks in the Udacity workspace and on Google Colab to test in an alternative environment and to use their free GPU!
A month had already flown by, and after following many learning rabbit holes I finally gave myself permission to launch into the challenge course itself.
The course lessons — an overview
The course is organized into 10 sections, which are all self-evaluated. This means that no one is really looking over your shoulder. (The previous challenge course I took was not self-evaluated.) There is no personal gain in following the course without making an effort to understand the material.
Lesson 2: Deep Learning with PyTorch (taught by Mat Leonard, the head of Udacity’s School of AI). In this lesson I learned how to build a basic neural network to classify images using PyTorch, a machine learning library for Python developed by the Facebook AI group. PyTorch has a module, nn, that provides the classes, methods, and functions for efficiently building large neural networks rather than doing the matrix multiplication manually. We also learned about the torchvision package, which contains datasets, model architectures, and image transformations for computer vision and lets you easily import those datasets (https://pytorch.org/docs/stable/torchvision/datasets.html).
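To give a feel for what that looks like, here is a minimal sketch of loading MNIST through torchvision and defining a small classifier with nn. The layer sizes and normalization values are just typical choices, not the course's exact notebook code.

```python
import torch
from torch import nn
from torchvision import datasets, transforms

# Download MNIST and convert the images to normalized tensors
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True,
                          train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# A small fully connected network: 784 pixels in, 10 digit classes out
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))
```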
In addition to defining the neural network, we learned how to instantiate it, train it, test it, and validate it. This chapter took me the longest to work through, as many of the concepts were new to me. I also wanted to understand some of the common activation functions: sigmoid, ReLU, and softmax (the last of which normalizes the network’s outputs into a proper probability distribution, making it easier to classify among many categories). All of these concepts take some time to wrap your head around and come with their own complex formulas to contemplate and unpack. In essence, we learned how to build a model that takes an image dataset (MNIST from the torchvision package) and train it to classify the images into the digit categories 0 to 9.
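A training loop for that model might look roughly like this. It is a sketch that assumes the model and trainloader from the previous snippet; the optimizer, learning rate, and epoch count are illustrative choices, not the course's exact settings.

```python
from torch import nn, optim

criterion = nn.NLLLoss()                          # pairs with LogSoftmax output
optimizer = optim.SGD(model.parameters(), lr=0.01)

for epoch in range(5):
    running_loss = 0
    for images, labels in trainloader:
        images = images.view(images.shape[0], -1)  # flatten 28x28 images to 784
        optimizer.zero_grad()                      # clear gradients from the last step
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()                            # backpropagate
        optimizer.step()                           # update the weights
        running_loss += loss.item()
    print(f"Epoch {epoch + 1} - training loss: {running_loss / len(trainloader):.3f}")
```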
Lessons 3–9 were taught by Andrew Trask, the author of “Grokking Deep Learning” and a researcher at DeepMind and Oxford University working on deep learning and privacy. He leads OpenMined.org, an open source community of over 3,000 researchers and developers dedicated to building technology for safe AI.
Differential Privacy
In lessons 3–6 we learned all about differential privacy in the context of deep learning. It’s really helpful to understand the modern definition by Cynthia Dwork: “Differential privacy describes a promise, made by a data holder, or curator, to a data subject, and the promise is this: You will not be affected, adversely or otherwise, by allowing your data to be used in any study or analysis, no matter what other studies, data sets, or information sources are available.” When training neural networks on sensitive data, we want to make sure that the network is only learning what it’s supposed to learn. To this end we learned to generate parallel databases that simulate one user’s data being removed, and to test how much that changes the result of a query. We learned that the maximum amount a query changes when removing an individual from the database is called the sensitivity of the query. We also learned how to perform differencing attacks, where we can expose the value of a person represented on a certain row of a database using different kinds of queries: sum, mean, and threshold.
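Here is a minimal sketch of those exercises on a toy database of 0/1 entries. The database size and the row attacked are arbitrary choices made for illustration.

```python
import torch

def get_parallel_dbs(db):
    """Return every version of the database with one entry removed."""
    return [torch.cat((db[:i], db[i + 1:])) for i in range(len(db))]

def sensitivity(query, db):
    """Maximum change in the query result when any single person is removed."""
    full_result = query(db)
    return max(abs(full_result - query(pdb)) for pdb in get_parallel_dbs(db))

db = (torch.rand(100) > 0.5).float()      # toy database of 0/1 entries

print(sensitivity(torch.sum, db))         # sum query: sensitivity is at most 1
print(sensitivity(torch.mean, db))        # mean query: much smaller sensitivity

# Differencing attack: expose person 10's value with two sum queries
pdb = torch.cat((db[:10], db[11:]))
print(torch.sum(db) - torch.sum(pdb))     # equals db[10]
```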
At the core of the differential privacy technique is adding noise to the data. There are two ways to do this: locally (local differential privacy, where noise is added to each data point) and globally (global differential privacy, where noise is added to the output of the query). We learned that with local differential privacy, you gain privacy but lose some accuracy, because adding noise to every data point adds up to a lot of noise collectively, and you need a very large dataset to offset it. With global differential privacy we add less noise, gaining accuracy but losing some privacy, and a smaller dataset suffices. Differential privacy aims to capture general characteristics while filtering out unique private information. The two flavors can be sketched in a few lines, as shown below.
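The sketch below uses the coin-flip (randomized response) mechanism for local privacy and a Laplacian mechanism for a global sum query; the epsilon value and database size are illustrative assumptions.

```python
import torch

def local_dp_mean(db):
    """Local DP via randomized response: each person flips a coin.
    Heads -> answer truthfully; tails -> answer with a second coin flip."""
    first_flip = (torch.rand(len(db)) > 0.5).float()
    second_flip = (torch.rand(len(db)) > 0.5).float()
    noisy_db = db * first_flip + (1 - first_flip) * second_flip
    # De-skew: half of the answers are pure 50/50 noise
    return torch.mean(noisy_db) * 2 - 0.5

def global_dp_sum(db, epsilon=0.5):
    """Global DP: run the true query, then add Laplacian noise
    scaled to sensitivity / epsilon (sensitivity of a sum over 0/1 data is 1)."""
    noise = torch.distributions.Laplace(0.0, 1.0 / epsilon).sample()
    return torch.sum(db) + noise

db = (torch.rand(10000) > 0.5).float()
print(torch.mean(db), local_dp_mean(db))   # noisy mean is close on a large db
print(torch.sum(db), global_dp_sum(db))
```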
We also learned about differential privacy in deep learning through the example of protecting the privacy of patients in hospitals. Let’s say we are a hospital with unlabeled x-rays, and we want to use the labeled data from 10 partner hospitals to label our data. To do that, we could ask the 10 partner hospitals to train 10 models on their labeled data. Next, we can use their models to generate predictions on our data and choose the most common result for each image (by combining a differentially private query, the max result, and Laplacian noise). Finally, we can train our own differentially private model on our local dataset. In the course, we look in detail at how all of this works under the hood.
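A bare-bones sketch of that noisy aggregation step for a single unlabeled image might look like this; the votes, epsilon, and class count are made up for illustration.

```python
import numpy as np

def noisy_label(teacher_preds, epsilon=0.1, num_classes=10):
    """Aggregate the partner hospitals' predictions for one x-ray:
    count the votes per class, add Laplacian noise, and take the argmax."""
    counts = np.bincount(teacher_preds, minlength=num_classes).astype(float)
    counts += np.random.laplace(0, 1.0 / epsilon, size=num_classes)
    return int(np.argmax(counts))

# Hypothetical predictions from 10 partner hospitals for one unlabeled image
teacher_preds = np.array([3, 3, 3, 5, 3, 3, 2, 3, 3, 3])
print(noisy_label(teacher_preds))   # most likely 3, with plausible deniability
```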
Federated Learning
In lessons 7–9 we learn about federated learning, a widely used technique for privacy in deep learning. It is used to train machine learning models on data to which we do not have direct access, and it supports remote execution so models can be trained in parallel on a large number of machines. What does this mean? Imagine mobile phones holding sensitive data that we don’t want anyone else to access. Rather than sending the data to some central server to train a model, the model is sent to the phone (or millions of phones) to be trained there. No data leaves the phone, and only the updated model returns to a trusted server. The toolkit that supports federated learning in this course is PySyft, an extension of PyTorch created by the OpenMined open source community for privacy-related deep learning tools (openmined.org). In addition, we learned how to further enhance privacy with additive secret sharing, a protocol for Secure Multi-Party Computation (SMPC) that performs the aggregation across several “virtual workers” while the gradients are still encrypted.
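To give a feel for the idea behind additive secret sharing, here is a plain-Python sketch of the protocol rather than PySyft’s actual API; the field size Q and the worker count are arbitrary assumptions.

```python
import random

Q = 23740629843760239486723  # a large field size; all arithmetic is done mod Q

def share(secret, n_workers=3):
    """Split a value into n additive shares that sum to the secret mod Q."""
    shares = [random.randrange(Q) for _ in range(n_workers - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Recombine the shares; only possible when all workers cooperate."""
    return sum(shares) % Q

def add_shared(a_shares, b_shares):
    """Each worker adds its own shares locally; no one ever sees the plaintext."""
    return [(a + b) % Q for a, b in zip(a_shares, b_shares)]

a, b = share(5), share(7)
print(reconstruct(add_shared(a, b)))   # 12, computed without revealing 5 or 7
```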
That is just a taste of what we learned in the Udacity Secure and Private AI Challenge course. In fact, this Medium post was one of the initiatives the course proposed. It’s pushing me a little out of my comfort zone, but I hope that some of you have found it helpful and maybe even a little inspiring for your own learning journeys!