Creating a Computer Vision API in 60 minutes

Thomas Nguyen
Sep 4, 2019 · 6 min read

How to create a Computer Vision API in 60 minutes using fastai, Flask, and Google Colab.


The requirements are straightforward. You need a Google account in order to be able to access your Google Drive and use Google Colab.

1. Create your Image Dataset

We are going to classify flowers, but instead of taking an existing dataset from the Internet, we are going to build our own labeled image dataset using Google Images.

First, open a new Google Colab notebook and connect it to your Drive with the following code:

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"

You will be asked to click a link and copy/paste the authorization code you are given into the box that appears in the output of the cell.

Now you can access the data in your Drive from your notebook.

Download the images and store them in your Drive. To do this, go to Google Images and search for the objects you want to classify. In our case, we want to classify sunflowers, yellow daisies, and yellow tulips. Let’s start with the sunflowers. Search for sunflowers on Google Images and scroll to the bottom until you see Show more results. Then, open your browser’s developer tools and go to the Web Console.

There, paste the following code to download a CSV of all the image URLs:

urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));

This triggers the download of the file. Save it and upload it using the widget on the left of the Notebook: Files > Upload. Do the same for the other flowers.

We are now going to use the fastai library. Create a cell and paste:

!curl -s https://course.fast.ai/setup/colab | bash

This installs the library and configures your Colab notebook to run smoothly with it. Then, activate the GPU by going to Runtime > Change Runtime Type > GPU.

Now, import the library and download the images:

from fastai.vision import *
from fastai.metrics import error_rate

path = Path(root_dir + 'DeepLearning/Datasets/')
folders_files = [('sunflowers', 'sunflowers.csv'),
                 ('yellow_daisy', 'yellow_daisy.csv'),
                 ('yellow_tulips', 'yellow_tulips.csv')]
for folder, file in folders_files:
    dest = path/folder
    dest.mkdir(parents=True, exist_ok=True)
    download_images(path/file, dest, max_pics=200)
    verify_images(dest, delete=True, max_size=500)

Here we create a folder for each class, download up to 200 images per class, and verify they are not corrupted.

Next, we create an ImageDataBunch from the downloaded images. This object represents our data together with its labels. To optimise the process, we also:

  • Split the data, keeping 20% for validation
  • Resize the images to 224×224-pixel squares
  • Apply image augmentation with get_transforms
  • Normalize the data

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
        ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

Let’s visualize our data:

data.show_batch(rows=3, figsize=(7,6))

Everything seems correct. However, to ensure optimal performance, you should check the images manually and remove inconsistent images from your dataset. The Google Drive UI is quite convenient for validating and filtering images manually.

Now we are going to train our model. The fastai library is very concise and comes with good defaults.

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

With this line, we download the pre-trained ResNet34 model, pass in our data, and set the metric to the error rate.

Then, we fit the model with the one-cycle policy, an approach that usually performs well.

We get an error rate of 6%, which is not bad, but we can do better. So far we have only been training the top layers of the pre-trained model. Let’s unfreeze all layers so their parameters can be updated during training. fastai provides lr_find, which launches an LR range test to help you select a good learning rate. Plotting the curve:
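The corresponding notebook cell is not shown in the article; with the fastai v1 API it would look roughly like this:

```python
# Make all layers trainable, run the LR range test, and plot it.
learn.unfreeze()
learn.lr_find()
learn.recorder.plot()  # loss vs. learning rate curve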


A rule of thumb is to spot the strongest downward slope. Therefore we pick the beginning of the range at 1e-5 and choose to stop at 1e-4, since we were already training at 1e-3 by default. That way, the first layers will be trained with a learning rate of 3e-5 and the last ones at 3e-4.

learn.fit_one_cycle(10, max_lr=slice(3e-5,3e-4))

We get a 3.5% error rate, which is better.

Let’s analyze the classification errors:

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

We see that most errors come from confusion between yellow daisies and sunflowers, which is understandable given their visual similarity. Plotting the images with the highest losses can help us further understand the causes of error.

interp.plot_top_losses(9, figsize=(15,11))

Some normal-looking images have been misclassified, but others don’t look like flowers at all. This shows that we did not clean the dataset as thoroughly as we should have. Therefore, to improve performance, we should clean the image dataset and rerun the process to get a reliable score.

After cleaning the data and retraining our model, it would be a shame to leave it in the notebook, so let’s put it in production!

Save your model:


Download it and save it at the root of your API’s local folder. We are now going to build a basic API that allows the user to upload an image and get a prediction.

Create a virtual environment with Python 3 that has Flask and fastai installed.
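A sketch of the environment setup (the version pin is an assumption; the article predates fastai v2, so a 1.0.x release is implied):

```shell
# Create and activate a Python 3 virtual environment
python3 -m venv venv
source venv/bin/activate

# Install the API dependencies (fastai v1-era API is assumed)
pip install flask "fastai<2"
```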

Create a Python file and paste the following code:

import os
from flask import Flask, flash, request, redirect, url_for, send_from_directory, jsonify
from werkzeug.utils import secure_filename
from fastai.vision import *

UPLOAD_FOLDER = os.getcwd() + '/files/'
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg', 'gif'}

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        # check if the post request has the file part
        if 'file' not in request.files:
            flash('No file part')
            return redirect(request.url)
        file = request.files['file']
        # if the user does not select a file, the browser may
        # submit an empty part without a filename
        if file.filename == '':
            flash('No selected file')
            return redirect(request.url)
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            img = open_image(file)
            pred, _, losses = learner.predict(img)
            print(pred)
            file.stream.seek(0)  # rewind the stream before saving the upload
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            return jsonify(str(pred))
    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form method=post enctype=multipart/form-data>
      <input type=file name=file>
      <input type=submit value=Upload>
    </form>
    '''


@app.route('/uploads/<filename>')
def uploaded_file(filename):
    # return send_from_directory(app.config['UPLOAD_FOLDER'],
    #                            filename)
    return 'File updated!'


if __name__ == '__main__':
    defaults.device = torch.device('cpu')
    learner = load_learner('.')
    app.run(host="0.0.0.0", port=int("80"), debug=True)

We load our model in the __main__ block with load_learner.

Launch your API (for example, python app.py) and open http://localhost in your browser.

There you can upload your image and get your classification. For example with this image:

You get a JSON response:

To go deeper into this topic, I highly recommend the excellent fast.ai courses.

You can also find the code in the repository

DAIN Studios

Data Science, Artificial Intelligence and Advanced Analytics for Business.

Written by Thomas Nguyen

I assist companies in building modern data platforms as a Data Engineer Consultant @DainStudios
