Crowdsourcing ML training data with the AutoML API and Firebase

Want to build an ML model but don’t have enough training data? In this post I’ll show you how I built an ML pipeline that gathers labeled, crowdsourced training data, uploads it to an AutoML dataset, and then trains a model. This example uses an image classification model with AutoML Vision, but the same pipeline could be adapted to AutoML Natural Language. Here’s an overview of how it works:

  • A web app asks users to upload an image and assign a label
  • Using Cloud Functions for Firebase, the labeled image gets uploaded to Cloud Storage
  • Once we have a specified number of training images per label, a function creates a CSV of image paths and their corresponding labels
  • Upload the CSV to an AutoML dataset
  • Kick off model training

Here’s an architecture diagram:

Want to jump to the code? The full example is available on GitHub.

Collecting crowdsourced images

Let’s say I’m building a model to detect types of cheese. I don’t have enough training images of my own, so I’d like to collect images from cheesemongers around the world. For that I’ve built a simple web app:

This model will only have three labels (brie, blue, camembert), and the user selects the label for the photo they are uploading. I’m using the Firebase SDK for Cloud Storage to upload images to Cloud Storage directly from the web client.
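A minimal sketch of the client-side upload, assuming the namespaced Firebase web SDK, where `firebase.storage().ref()` returns a root reference with `child()` and `put()`. The helper names, the folder-per-label layout, and the `label` custom metadata key are illustrative assumptions, not necessarily what the app in this post does.

```javascript
// The three labels from this post.
const LABELS = ['brie', 'blue', 'camembert'];

// Build a storage path of the form "<label>/<timestamp>-<fileName>".
// The folder-per-label layout is an assumption made for this sketch.
function buildUploadPath(label, fileName, timestamp = Date.now()) {
  if (!LABELS.includes(label)) {
    throw new Error(`Unknown label: ${label}`);
  }
  // Replace path separators so the file name stays inside the label folder.
  const safeName = fileName.replace(/[\/\\]/g, '_');
  return `${label}/${timestamp}-${safeName}`;
}

// rootRef is expected to behave like firebase.storage().ref().
// The label is also stored as custom metadata so a backend function can read it.
function uploadLabeledImage(rootRef, file, label) {
  const path = buildUploadPath(label, file.name);
  return rootRef.child(path).put(file, { customMetadata: { label } });
}

// In the browser this might be called as:
//   uploadLabeledImage(firebase.storage().ref(), fileInput.files[0], 'brie');
```

Passing the root reference in as an argument keeps the helper easy to exercise without a live Firebase project.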

Images will automatically be uploaded to a bucket called <project_name>.appspot.com. Because AutoML requires images to be in a bucket called <project_name>-vcm, I’ve created a Cloud Function that copies each uploaded image to this vcm bucket:
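A sketch of what such a copy step could look like. `copyToVcmBucket` is a hypothetical helper; `storage` is assumed to behave like a `@google-cloud/storage` client, and `object` like the payload a Cloud Storage trigger receives (`bucket`, `name`, `contentType`). The image-only guard is an addition for illustration.

```javascript
// Copy a newly uploaded object into the "<projectId>-vcm" bucket.
// Resolves to the gs:// URI of the copy, or null if the object is not an image.
async function copyToVcmBucket(storage, object, projectId) {
  if (!object.contentType || !object.contentType.startsWith('image/')) {
    return null; // Skip anything that is not an image upload.
  }
  const vcmBucket = `${projectId}-vcm`;
  const destination = storage.bucket(vcmBucket).file(object.name);
  await storage.bucket(object.bucket).file(object.name).copy(destination);
  return `gs://${vcmBucket}/${object.name}`;
}

// Possible wiring with firebase-functions (assumed, not run here):
//   const functions = require('firebase-functions');
//   const { Storage } = require('@google-cloud/storage');
//   exports.copyImage = functions.storage.object().onFinalize((object) =>
//     copyToVcmBucket(new Storage(), object, process.env.GCLOUD_PROJECT));
```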

Once the image is copied, the function will write to my Firebase Realtime Database, where I’m keeping track of the number of images we’ve collected for each label:

I make use of Firebase transactions to update the label count in my Firebase database:
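A minimal sketch of a transactional counter, assuming the counts live under a `labels/<label>` path (a hypothetical layout) and that `db` behaves like `admin.database()`. A transaction passes the current value to an update function and retries if another writer changed it in the meantime, so concurrent uploads don’t overwrite each other’s increments.

```javascript
// Update function for a Realtime Database transaction.
// The current value is null when the label has no count yet.
function incrementCount(current) {
  return (current || 0) + 1;
}

// Increment the stored count for one label.
// The "labels/<label>" path is an assumption made for this sketch.
function recordImage(db, label) {
  return db.ref(`labels/${label}`).transaction(incrementCount);
}

// With the Admin SDK this might be called as:
//   recordImage(admin.database(), 'brie');
```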

Uploading crowdsourced images to an AutoML dataset

I have a separate function that’s triggered whenever a label count is updated in my Firebase database shown above. AutoML Vision requires at least 10 images per label to train a model, but we’ll likely want more for higher accuracy. Here I’ve specified the number of images I’d like to collect for each label. If we’ve reached that number, we’re ready to upload our images to AutoML:
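The check itself can be sketched as a small pure function. The constant and function names are hypothetical, `counts` is assumed to be an object mapping each label to its image count, and the threshold here is simply the 10-image minimum mentioned above.

```javascript
// Target number of images per label. 10 is the AutoML Vision minimum;
// a real project would likely pick a higher number.
const IMAGES_PER_LABEL = 10;

// True only when every label has reached the threshold.
// Labels missing from counts are treated as having zero images.
function readyToTrain(counts, labels, threshold = IMAGES_PER_LABEL) {
  return labels.every((label) => (counts[label] || 0) >= threshold);
}

// Possible wiring with a database trigger (assumed, not run here):
//   functions.database.ref('/labels').onWrite((change) => {
//     const counts = change.after.val() || {};
//     if (readyToTrain(counts, ['brie', 'blue', 'camembert'])) { /* start import */ }
//   });
```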

You could also write this function to kick off training periodically, for example every time you have 500 new labeled images.
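One way to sketch that periodic variant, assuming you track the total image count before and after each write. The function name is hypothetical and the batch size of 500 is just the example figure above.

```javascript
// True when the running total crosses a multiple of batchSize,
// e.g. going from 499 to 500 images, or from 998 to 1001.
function crossedBatchBoundary(previousTotal, newTotal, batchSize = 500) {
  return Math.floor(newTotal / batchSize) > Math.floor(previousTotal / batchSize);
}
```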

To upload labeled images to AutoML, we can create a CSV where the first column contains the GCS path of the image and the second column contains the label for that image. Once we’ve collected enough images, we’ll create a CSV:
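A minimal sketch of building that CSV, assuming the collected images are available as an array of `{ gcsPath, label }` objects (a hypothetical shape) and that neither value contains a comma.

```javascript
// Build CSV text with one "gs://path,label" row per image.
function buildCsv(images) {
  const rows = images.map(({ gcsPath, label }) => {
    if (!gcsPath.startsWith('gs://')) {
      throw new Error(`Expected a Cloud Storage path, got: ${gcsPath}`);
    }
    return `${gcsPath},${label}`;
  });
  return rows.join('\n') + '\n';
}

// Example input: two images in the vcm bucket of a project named "my-project".
const exampleCsv = buildCsv([
  { gcsPath: 'gs://my-project-vcm/brie/1.jpg', label: 'brie' },
  { gcsPath: 'gs://my-project-vcm/blue/2.jpg', label: 'blue' },
]);
```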

Next, we’ll upload our CSV to Cloud Storage and then use the importData method from the AutoML API to add these images to our AutoML dataset. Our JSON request to importData includes our project and dataset IDs, along with the CSV filepath:
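A sketch of the request shape, assuming the AutoML dataset resource name format `projects/<project>/locations/<location>/datasets/<dataset>` and the `us-central1` location. The helper name and the example IDs are hypothetical.

```javascript
// Build the request object passed to importData.
// name identifies the dataset; inputUris points at the CSV in Cloud Storage.
function buildImportRequest(projectId, datasetId, csvUri, location = 'us-central1') {
  return {
    name: `projects/${projectId}/locations/${location}/datasets/${datasetId}`,
    inputConfig: {
      gcsSource: { inputUris: [csvUri] },
    },
  };
}

// Example with placeholder IDs.
const exampleImportRequest = buildImportRequest(
  'my-project',
  'ICN123',
  'gs://my-project-vcm/labels.csv'
);
```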

We could alternatively pass a list of individual image paths to inputUris, but currently the only way to upload labeled data to AutoML through the API is to provide a CSV path. With our request above, we’re ready to call importData:
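A sketch of the call, assuming `client` behaves like the `AutoMlClient` from the `@google-cloud/automl` Node.js library, whose `importData` resolves to an array containing a long-running operation. `importCsv` is a hypothetical wrapper name.

```javascript
// Start the import and wait for the long-running operation to finish.
// request is the importData request object (dataset name plus inputConfig).
async function importCsv(client, request) {
  const [operation] = await client.importData(request);
  // operation.promise() resolves once the import completes.
  const [response] = await operation.promise();
  return response;
}

// With the real client this might look like (assumed, not run here):
//   const automl = require('@google-cloud/automl');
//   await importCsv(new automl.AutoMlClient(), request);
```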

Note that the maximum duration for a single Cloud Function is 9 minutes. If you’re collecting lots of images, expect the AutoML import to take longer than this; in that case you’ll want to use something other than Cloud Functions to kick off your model training.

When you check your AutoML project, you should see images being added to your dataset. Once the import completes, the function calls the createModel method, which creates a new model, trains it, and deploys it:
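A sketch of the training step, again assuming an `AutoMlClient`-like `client`. The helper names are hypothetical, and the image classification metadata is left empty so the API defaults apply, since the name of the training budget field differs between API versions.

```javascript
// Build the createModel request for an image classification model.
function buildCreateModelRequest(projectId, datasetId, displayName, location = 'us-central1') {
  return {
    parent: `projects/${projectId}/locations/${location}`,
    model: {
      displayName,
      datasetId,
      // Left empty on purpose: the training budget uses the API default.
      imageClassificationModelMetadata: {},
    },
  };
}

// Start training and return the operation name without waiting for it.
// Training can outlast a Cloud Function's time limit, so the function only
// kicks the operation off.
async function startTraining(client, request) {
  const [operation] = await client.createModel(request);
  return operation.name;
}
```

Returning the operation name rather than awaiting the result lets the function finish quickly; the name can be stored and polled later.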

That’s all you need to build a pipeline for collecting labeled training data!

Get started

Inspired to start crowdsourcing training data for your own ML models? Dive into the AutoML API docs. I used the Node.js client in this example, but there are also client libraries for Python and Java. In addition to the importData and createModel methods of the AutoML API I’ve shown, it has many other cool features like exporting a dataset, updating the IAM policy for a model, and of course generating predictions. It’s also worth noting that AutoML Vision has a human labeling feature. If you’ve already collected the images you’ll use to train your model but need help labeling them, it’s worth a try. Check out the full code for this example on GitHub, and let me know what you think on Twitter at @SRobTweets.