Udacity AI Product Manager Program Review (Part I)

A Project Centric Review Approach

Erkan Hatipoğlu
Geek Culture
8 min read · Nov 12, 2021


A man working at a desk with a computer screen and a keyboard.
Photo by Austin Distel on Unsplash

Overview

This is the first episode of a three-part series reviewing the AI Product Manager Nanodegree program offered by Udacity.

I was accepted to phase 1 of the Bertelsmann Technology Scholarship program in December 2020 and to phase 2 in March 2021. During phase 2, I completed the AI Product Manager Nanodegree by Udacity and recently graduated.

The Nanodegree consists of four chapters and three project assignments. I believe the project assignments are the most critical part of an online course, so I will not go into the details of the lectures. Instead, I will take a different approach and review the program through its projects.

I will cover each project in a separate episode: the first project in Part I, the second in Part II, and the third in Part III.

For each project, I will try to answer the following questions:

  • What is the project about?
  • What is the outcome of the project?
  • What are the sources and tools to complete the project?
  • What is the solution?
  • What are the problems faced during the project implementation?
  • How can those problems be solved?
  • How can the project be improved?
  • What are the references for the project?

Without further ado, let's dive into the first project.

Project 1 — Create a Medical Image Annotation Job

The first project of the Nanodegree is about designing a data labeling job for a given dataset and business goal.

The business goal is to build an AI product that distinguishes between healthy and pneumonia chest x-ray images of children.

As AI product managers, our task is to build a labeled dataset for this product. ML engineers can later use this dataset to create a classification model that doctors can use for diagnostic support.

Outcome

This project is about learning to create a high-quality dataset. Data quality is a crucial element of a successful machine learning project. To succeed, we need to be sure that the data is large enough, complete, correctly labeled, and matched to our use case.

We will work on a modified subset of this Kaggle chest x-ray dataset, with most labels removed. To complete the project, we need to deliver a project proposal as a PDF file that includes the design details and strategies for quality assurance, along with an HTML file that contains the instructions, examples, and some sample test questions. We will use Appen's platform for the data labeling job and create the HTML file from it as a project requirement. Appen is a platform that collects and labels data for artificial intelligence systems.

Data Annotation Platform

To annotate an image data source with no labels, we can use a data annotation platform like Appen. Annotation platforms work with human annotators who label the data as required. To do that, we must explain to the annotators what to do and how to do it for all use cases. In addition, we must design for uncertainty and give the annotators a way to handle unclear images. Finally, we need to prepare some test questions (unknown to the annotators) to measure the job's quality and the annotators' performance.

The Design

Our design will be as follows. Since we are trying to differentiate between healthy and pneumonia images, we need two labels (yes or no). In addition, we want to give the annotator a way to handle unclear images (not sure). As a result, the first step presents the image with three choices, as shown below.

A chest x-ray image on the left and a question with three answer options on the right.
Step-1 — Image by Author

We also want to learn which symptoms the annotator sees in the image. So, if the annotator selects yes, the second step asks about the symptoms.

A chest x-ray image on the left and two questions with different answer options on the right.
Step-2 — Image by Author

Finally, if the annotator is not sure about the image, the third step asks for their opinion on the likelihood of pneumonia.

A chest x-ray image on the left and two questions with different answer options on the right.
Step-3 — Image by Author

Appen's Platform

To use Appen's platform, we must first create an account as a client. After signing in, we will be directed to the job creation page, where we can find several job templates for specific use cases. We will use the Image Categorization template for our use case since we want to label full images. To do that, we must first click on the Image Categorization template, and after the template is loaded, we need to click the Use this template button at the top right of the screen.

The Data Tab

The first thing to do on the job page is to upload the data, and we can use the Data tab for this purpose. As previously mentioned, we will use a subset of the Kaggle chest x-ray dataset with 117 images. Sixteen of the images are labeled and will be used as examples or test questions, while 101 of the images are unlabeled and will be used for the annotation job.

The Design Tab

The next step is designing the job by clicking the Design tab. The Design tab consists of the title, the CML code, and the instructions sections. We need to give our job a relevant title so that the annotators can understand what needs to be done. Pneumonia Identification is a good choice for the title.

Custom Markup Language

CML (Custom Markup Language) is an HTML-based language specific to Appen. It is used to define the elements (such as radio buttons) that the annotators use to interact with the images in our dataset.

The full CML code of Project 1 is available in my GitHub repository.
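To give a feel for the structure, here is a minimal sketch of what such a three-step CML form can look like. The field names, answer values, and the {{image_url}} data column are illustrative rather than the exact markup I submitted, and the conditional display assumes Appen's only-if syntax.

<!-- Display the image to annotate; {{image_url}} is an assumed data column name -->
<img src="{{image_url}}" />

<!-- Step 1: required question with three choices -->
<cml:radios label="Does this chest x-ray show signs of pneumonia?" name="pneumonia" validates="required">
  <cml:radio label="Yes" value="yes" />
  <cml:radio label="No" value="no" />
  <cml:radio label="Not sure" value="not_sure" />
</cml:radios>

<!-- Step 2: shown only if the annotator selects Yes -->
<cml:checkboxes label="Which symptoms do you see in the image?" name="symptoms" only-if="pneumonia:[yes]">
  <cml:checkbox label="Opacity in the lungs" value="opacity" />
  <cml:checkbox label="Other" value="other" />
</cml:checkboxes>

<!-- Step 3: shown only if the annotator selects Not sure -->
<cml:radios label="How likely is pneumonia in this image?" name="likelihood" only-if="pneumonia:[not_sure]">
  <cml:radio label="Unlikely" value="unlikely" />
  <cml:radio label="Possible" value="possible" />
  <cml:radio label="Likely" value="likely" />
</cml:radios>

The only-if conditions are what implement the branching between the three steps described above.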

Instructions

As previously stated, we need to give the annotators detailed instructions so that they can label the images as required. Consequently, the instructions section, which is written in HTML, must consist of an Overview part that describes the job, a Steps part that lists what to do in order, a Rules and Tips part that explains the details, and an Examples part that clarifies the job for all use cases.
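The actual instructions in my submission are more detailed, but a stripped-down skeleton of that HTML, with illustrative wording, can look like this:

<h1>Overview</h1>
<p>You will look at chest x-ray images of children and decide whether each image shows signs of pneumonia.</p>

<h1>Steps</h1>
<ol>
  <li>Examine the x-ray image carefully.</li>
  <li>Select Yes, No, or Not sure.</li>
  <li>If you select Yes, mark the symptoms you see; if you select Not sure, rate how likely pneumonia is.</li>
</ol>

<h1>Rules and Tips</h1>
<ul>
  <li>Compare the image with the healthy and pneumonia examples below before answering.</li>
  <li>Choose Not sure only when you genuinely cannot decide.</li>
</ul>

<h1>Examples</h1>
<p>Include at least one labeled example image for every answer option, including borderline cases.</p>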

The Quality Tab

After finalizing the design, we need to switch to the Quality tab. In this tab, we prepare test questions for the annotators, which will be used to track their performance. Udacity suggests preparing test questions for at least 5% of the dataset, and Appen suggests 8 for our case (5% of our 117 images is about six, so 8 comfortably covers both suggestions). Since we have 16 labeled images, we can easily prepare 8 test questions. One thing to note is that we must prepare a balanced number of test questions for each label. Otherwise, annotators may lean toward the most frequent label while annotating.

While preparing the test questions, we must provide the answers and explanations so that if annotators answer a test question incorrectly, they can learn why. The annotators must understand why they are wrong in order to improve their performance.

Saving the HTML File

After finishing the test questions, we are done with the platform. We can preview our job using the eye icon at the top right of the screen. We can then save the HTML file by right-clicking the page, selecting Save as, and setting the file type to Webpage, HTML Only. Note that we will not launch the data labeling job! Now it is time to work on the proposal.

The Proposal

The proposal consists of three sections with seven questions, some of which have been answered above. I will not cover all the details of the proposal; instead, I will focus on some critical aspects.

In our labeling job, we used three labels: 'Has Pneumonia,' 'Healthy (no pneumonia),' and 'Not Sure.' However, our model needs only two labels: 'Has Pneumonia' and 'Healthy.' We therefore need a way to reduce the three labels to two. If 'Not Sure' answers are rare, there is no problem; we can check those images manually. But if they are frequent, especially on specific images, we need a rule to decide on the final label. This is why the third step asks 'Not Sure' annotators to rate the likelihood of pneumonia: we can take the mean of those ratings and assign the final label accordingly. For example, if most annotators who are unsure about an image still rate pneumonia toward the high end of the scale, we can label it 'Has Pneumonia.'

We can use Appen's monitoring tools to track the annotators during the job and take the necessary actions. For example, a significant share of the annotators may miss a specific test question. This may happen because the test question is tricky or because the instructions and examples are not clear. In this case, we may want to revise the instructions and examples. We may even turn the tricky test question into an example.

We should also pay attention to the contributors' feedback and make necessary changes if needed. We can use Appen's platform monitoring tools for this purpose.

We must consider limitations and improvements for our job as well.

To do that, we must consider any sources of bias and how to handle them. Since our dataset is small, we may have sampling bias. Moreover, we may have measurement bias since the images differ slightly in size and were taken under slightly different exposure times. Acquiring more data and using the same imaging procedure for new data may reduce these biases.

In addition, we must decide how to get new data and what to do with it in the long run. For our use case, it is reasonable to use a dynamic model that is continuously retrained with new data, since the domain evolves with new imaging technologies and new symptoms or diseases (COVID-19, for example).

After finishing the proposal and exporting it as a PDF file, we can submit our project for review.

Troubleshooting

I think the curriculum for this Nanodegree was designed in 2018, maybe even 2017. As a result, some information, especially about Appen's platform, is outdated, and students may have difficulty understanding the interface. In addition, since CML is specific to Appen, it may not be easy to find tutorials on it. I suggest Udacity's mentor help platform and Appen's help pages in case of trouble.

Conclusion

Gathering clean data is one of the essential steps of an ML workflow. In this first project of the AI Product Manager Program, we have learned how to create a labeling job, which is key to a successful ML project.

We started with a business goal and then gathered the dataset. Since the dataset was unlabeled, we decided on the labels and used Appen's platform to annotate it. To do that, we first uploaded the dataset to Appen's platform. Then we created the design, which includes the instructions, examples, test questions, and the unlabeled data.

While making the design, we focused on comprehensibility and quality. We also considered possible biases in the data and planned for longevity.

You can also find a complete project implementation in my GitHub repository.

It is now time to move on to the next project. Stay tuned for Part II.
