Training a Computer Vision Model on Vertex AI (with and without the Explainable AI option)
(For a Google Doc version of this post: https://docs.google.com/document/d/1MUFtCEFXfZQDuDRA3VG6ciRvqQI6o-kYbFw9FWfKymM/edit?usp=sharing)
Create a project in Google Cloud Platform
- Access the Google Cloud Console at: https://console.cloud.google.com
- Create a new project:
If it is the first time you create a project, you will be directed to create a new project.
Otherwise, click on the arrow next to the existing project:
Click on NEW PROJECT
Provide a name of your choice and click on CREATE
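If you prefer the command line, the same step can be sketched with the gcloud CLI. This is a sketch, not part of the console flow above: the project ID below is a hypothetical placeholder (project IDs must be globally unique), and the commands are written to a script and syntax-checked rather than executed, since running them requires an authenticated Cloud SDK session.

```shell
#!/bin/sh
# Sketch: create and select a project from the CLI instead of the console.
# "my-cats-dogs-demo" is a hypothetical placeholder ID; choose your own.
PROJECT_ID="my-cats-dogs-demo"

# Write the commands to a script and syntax-check it; run it yourself with
# `sh create_project.sh` once you are signed in with gcloud.
cat > create_project.sh <<EOF
gcloud projects create ${PROJECT_ID} --name="cats-dogs"
gcloud config set project ${PROJECT_ID}
EOF
sh -n create_project.sh
cat create_project.sh
```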
Download a Kaggle dataset inside a bucket in Google Storage
1. Click on the Navigation Menu, then search for STORAGE -> Cloud Storage -> Browser:
2. Click on CREATE BUCKET:
3. Provide a name for your bucket and select us-central1 (Iowa) as the Region
4. Click on CREATE.
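Steps 2-4 also have a one-line CLI equivalent via `gsutil mb`. As a sketch: the bucket name below is a placeholder (bucket names are globally unique), and the command is written to a file and syntax-checked here rather than run, because it needs an authenticated gcloud session.

```shell
#!/bin/sh
# Sketch: create the bucket in us-central1 from the CLI.
BUCKET_NAME="my-cats-dogs-bucket"   # placeholder; pick your own unique name
cat > create_bucket.sh <<EOF
gsutil mb -l us-central1 gs://${BUCKET_NAME}
EOF
sh -n create_bucket.sh   # syntax check only; run with `sh create_bucket.sh`
cat create_bucket.sh
```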
5. You should see something similar to this:
6. Click on Activate Cloud Shell
7. Type the following command:
curl https://users.rcc.uchicago.edu/~tszasz/datascienceworkshop/cats_dogs.zip | gsutil cp - gs://[YOUR_BUCKET_NAME]/cats_dogs.zip
- [YOUR_BUCKET_NAME] is the name of the bucket you chose to store the data
- The data source is available at: https://www.kaggle.com/pybear/cats-vs-dogs
- The curl command uses this location where we copied the data: https://users.rcc.uchicago.edu/~tszasz/datascienceworkshop/
8. Click on AUTHORIZE
9. You should see the data “cats_dogs.zip” inside the bucket:
10. Unzip the data. In Cloud Shell:
a. Copy the zip file inside the cloud shell. Type:
gsutil cp gs://[YOUR_BUCKET_NAME]/cats_dogs.zip .
Note: [YOUR_BUCKET_NAME] is the name of the bucket you chose to store the data
b. Unzip "cats_dogs.zip". Type: unzip cats_dogs.zip
c. Copy the “cat” folder to the [YOUR_BUCKET_NAME] bucket:
gsutil -m cp -r dataset/training_set/cats gs://[YOUR_BUCKET_NAME]
d. Copy the “dog” folder to the [YOUR_BUCKET_NAME] bucket:
gsutil -m cp -r dataset/training_set/dogs gs://[YOUR_BUCKET_NAME]
You should see the "cats" and "dogs" folders in your storage bucket:
11. Generate a .csv file with the filenames:
a. Add the filenames from the “cats” folder to the “cat_dog_filenames.csv”:
gsutil ls -l gs://[YOUR_BUCKET_NAME]/cats/** | head -n -1 | awk 'BEGIN { OFS = "," }{print $3,"cat"}' > cat_dog_filenames.csv
b. Append the filenames from the “dogs” folder to the “cat_dog_filenames.csv”:
gsutil ls -l gs://[YOUR_BUCKET_NAME]/dogs/** | head -n -1 | awk 'BEGIN { OFS = "," }{print $3,"dog"}' >> cat_dog_filenames.csv
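To sanity-check the awk pipeline without touching a bucket, you can run it on a couple of fake `gsutil ls -l` lines (each line is size, timestamp, object URL, with a TOTAL footer that `head -n -1`, a GNU coreutils option, strips off). The bucket and filenames below are made up for the demo.

```shell
#!/bin/sh
# Simulate `gsutil ls -l` output: size, timestamp, URL columns plus a footer.
printf '%s\n' \
  '  12345  2021-01-01T00:00:00Z  gs://my-bucket/cats/cat.1.jpg' \
  '  23456  2021-01-01T00:00:00Z  gs://my-bucket/cats/cat.2.jpg' \
  'TOTAL: 2 objects, 35801 bytes' \
  > ls_output.txt

# Drop the footer, keep column 3 (the gs:// URL), and append the label.
head -n -1 ls_output.txt | awk 'BEGIN { OFS = "," }{print $3,"cat"}' > sample.csv
cat sample.csv   # each row: gs://my-bucket/cats/cat.N.jpg,cat
```

This is exactly the row format the Vertex AI import step expects: one object path and one label per line.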
12. If you would like to see the contents of “cat_dog_filenames.csv”, you can Open Editor and open the file:
13. Save the cat_dog_filenames.csv file inside the bucket:
gsutil cp cat_dog_filenames.csv gs://[YOUR_BUCKET_NAME]
You should see the “cat_dog_filenames.csv” in your storage bucket:
Import a Vertex AI dataset
1. Click on the Navigation Menu and search for Vertex AI under ARTIFICIAL INTELLIGENCE. Then click on Datasets.
a. If it is the first time Vertex AI is used in this project, click on Enable Vertex AI API
2. Click on CREATE
3. Provide the "Dataset name" of your choice ("cat_dogs" in this example), select "Image classification (Single-label)", and the "us-central1 (Iowa)" region. Then click CREATE.
4. Click on Select import from Cloud Storage and BROWSE for the location of the .csv file that contains the filenames of the “cats” and “dogs” folders and the label. Click on CONTINUE.
5. This will take about 10 minutes to create the training dataset. Once it is ready, you will see the dataset in the “Dataset” section:
6. If you click on the “cat_dogs” link, you can visualize the dataset and its labels:
Train a Vertex AI model
1. Click on Training, then click on CREATE
2. Select AutoML and click Continue
3. Then click CONTINUE.
4. Click CONTINUE
5. Add the Budget: we added 8 node hours. The training can stop earlier because we enabled early stopping. Then click on START TRAINING.
6. This will take at least 1 hour and 30 minutes to complete.
Note: Sometimes, the training may fail with the following error: "Internal error occurred. Please retry in a few minutes"
If this happens, simply retry.
7. Once the model has finished training, you will see it inside Vertex AI -> Training
- If you click on the link corresponding to the model, you can see its performance in the EVALUATE section.
Run predictions
1. In Models -> DEPLOY & TEST, click on DEPLOY TO ENDPOINT
2. Provide an Endpoint name and then click CONTINUE.
3. Enter 1 in the Number of compute nodes field.
4. Click DEPLOY.
5. The deployment process will take about 10 minutes. Once the endpoint is ready, its status will be “Active” and you will be able to Upload Image.
6. You can run API requests (see “Sample request”)
7. Or UPLOAD IMAGE to test the model. The image has to be smaller than 1.5 MB.
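The console's "Sample request" posts a JSON body containing a base64-encoded image. Here is a minimal sketch of building that body locally, assuming the AutoML image-classification request schema (a `content` field per instance, plus `confidenceThreshold`/`maxPredictions` parameters); PROJECT_ID and ENDPOINT_ID are placeholders you would take from the console, and the stand-in image bytes are fake so the sketch is self-contained.

```shell
#!/bin/sh
set -eu
# Stand-in image so the sketch runs end to end; use your real photo instead.
IMAGE="cat.jpg"
printf 'fake-image-bytes' > "$IMAGE"

# Base64-encode the image with newlines stripped, then build the request body.
B64=$(base64 < "$IMAGE" | tr -d '\n')
cat > request.json <<EOF
{
  "instances": [{"content": "${B64}"}],
  "parameters": {"confidenceThreshold": 0.5, "maxPredictions": 5}
}
EOF

# The actual call (written out and syntax-checked, not sent from here):
cat > predict.sh <<'EOF'
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID:predict" \
  -d @request.json
EOF
sh -n predict.sh
```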
I used a picture of my cat; it looks like she is 100% a cat :-)
8. Enjoy building computer vision models using Vertex AI!
Adding Explainability to the Vertex AI model
1. CREATE a new model from the Models section of Vertex AI:
2. Select the same dataset, and the AutoML option. Click CONTINUE.
3. In the Model details, click on ADVANCED OPTIONS.
4. Edit the Model name (we named it cats_dogs_XAI to distinguish it from the model without explainability), set the percentage of data in Test to 3% (explainability requires fewer than 300 images in the test data), and the percentage in Validation to 17%:
5. Click CONTINUE.
6. Select Generate explainable bitmaps for each image in the test set. Then, CONTINUE.
7. Edit the Budget section with 8 node hours.
8. Click on START TRAINING. This will take about 1 hour to complete.
9. In Models -> DEPLOY & TEST, click on DEPLOY TO ENDPOINT
10. In the Model settings, select Enable feature attributions for this model in the Explainability options section, then click EDIT. Select XRAI, then click Done. Click DEPLOY.
11. Inside the Models section, select the one with Explainability and click on UPLOAD & EXPLAIN. Upload an image you would like to test and explain:
I chose Ziggy again. We can see that the area around the ears, nose, and eyes contributes most to the prediction:
And another image:
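Outside the console, the deployed endpoint also exposes an :explain REST method that accepts the same request body as :predict, so attributions can be requested programmatically. A hedged sketch follows: PROJECT_ID and ENDPOINT_ID are placeholders, request.json is a body built the same way as in the prediction step, and the command is only written to a file and syntax-checked here, since sending it requires an authenticated session and a live endpoint.

```shell
#!/bin/sh
# Sketch: request feature attributions over REST via the :explain method.
cat > explain.sh <<'EOF'
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID:explain" \
  -d @request.json
EOF
sh -n explain.sh   # syntax check only; run with `sh explain.sh` when ready
cat explain.sh
```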