Automatically classify images with Microsoft Cognitive Service — Vision API

Rajesh Sitaraman
Aug 9 · 3 min read


Let’s see how to automatically classify images uploaded to a SharePoint library. We will add tags and a description as metadata to every image uploaded to the library, using one of the Microsoft Cognitive Services, the Computer Vision API. It analyzes the image, returns a full-sentence description of it, and generates tags for the objects it contains.

For event handling we will use Microsoft Flow. The Flow is triggered when a file is uploaded or modified, invokes the Vision API to get the caption and tags, and then stores them as the description and tag properties of the image in SharePoint.

Let’s start by creating the Computer Vision API resource.

Azure Cognitive Service: Computer Vision API

  1. Log in to your Azure subscription.
  2. Create a new resource and search for Computer Vision.
  3. Provide the required details, such as name, location, pricing tier, and resource group.
  4. Once the resource is successfully deployed, copy the key and endpoint URL from the “Quickstart” section. We need this information later, when we establish the connection from the Flow.
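
To make the role of the key and endpoint concrete, here is a minimal Python sketch of the kind of request the Flow sends later. It is not the Flow itself. The path `/vision/v3.2/analyze` and the helper names `build_analyze_request` and `analyze_image` are assumptions for illustration; use the API version your own resource exposes.

```python
import json
import urllib.parse
import urllib.request

# Assumption: API version and path; adjust to match your Computer Vision resource.
API_PATH = "/vision/v3.2/analyze"


def build_analyze_request(endpoint: str, key: str, image_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a caption and tags."""
    query = urllib.parse.urlencode({"visualFeatures": "Description,Tags"})
    url = endpoint.rstrip("/") + API_PATH + "?" + query
    headers = {
        # The key copied from the Azure portal goes in this header.
        "Ocp-Apim-Subscription-Key": key,
        # The image is sent as raw binary content.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def analyze_image(endpoint: str, key: str, image_bytes: bytes) -> dict:
    """Send the request and return the parsed JSON response (needs network access)."""
    request = build_analyze_request(endpoint, key, image_bytes)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `analyze_image(endpoint, key, image_bytes)` with the values from the “Quickstart” section should return the analysis as a dictionary, provided the key and endpoint belong to the same resource.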

SharePoint Online

We can associate the Flow with either a document library or a Pictures Library. By default, the Pictures Library provides the required metadata columns, such as Description and Keywords. With a document library, you need to add these columns explicitly.

If you want the metadata to be searchable, make sure to use site columns instead of library columns.

Microsoft Flow

  1. The Flow is triggered when a file is created or modified.
  2. It checks whether the uploaded file is an image by looking at the file extension.
  3. It gets the file content of the image and passes it to the Computer Vision API endpoint.
  4. It parses the JSON output from the Vision API and updates the file properties, using the caption as the description and the tags as keywords.
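
The logic of steps 2 and 4 can be sketched in Python as follows. This is an illustration under assumptions, not an export of the Flow: the extension list, the function names `is_image` and `extract_metadata`, and the choice to join tags with `"; "` are all hypothetical. The response shape (a `description.captions` list and a top-level `tags` list) follows the Vision API's analyze output.

```python
import os

# Assumption: the extensions the Flow treats as images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def is_image(filename: str) -> bool:
    """Step 2: decide by file extension, ignoring case."""
    return os.path.splitext(filename)[1].lower() in IMAGE_EXTENSIONS


def extract_metadata(analysis: dict) -> dict:
    """Step 4: map the first caption to Description and the tag names to Keywords."""
    captions = analysis.get("description", {}).get("captions", [])
    description = captions[0]["text"] if captions else ""
    tag_names = [tag["name"] for tag in analysis.get("tags", [])]
    # Assumption: keywords are stored as a single "; "-separated string.
    return {"Description": description, "Keywords": "; ".join(tag_names)}
```

The returned dictionary uses the Pictures Library column names, so it maps directly onto the properties that the final update step writes back to the file.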

For the sample image uploaded to the SharePoint picture library, the Computer Vision API responds with a caption and tags. It identifies objects such as a fire hydrant and a sidewalk, and returns the caption “a red fire hydrant sitting on the side of the street”. As of now, the Vision API only supports files up to 4 MB in size.
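
Because of the size limit, it can help to check the file before calling the API. Here is a small Python sketch of such a guard. The name `within_size_limit` is hypothetical, and the boundary is treated conservatively as strictly less than 4 MB, which is an assumption about how the limit is enforced.

```python
# 4 MB limit mentioned above, expressed in bytes.
MAX_IMAGE_BYTES = 4 * 1024 * 1024


def within_size_limit(image_bytes: bytes) -> bool:
    """Return True when the image is small enough to send to the Vision API."""
    # Assumption: a file of exactly 4 MB is rejected, to stay on the safe side.
    return len(image_bytes) < MAX_IMAGE_BYTES
```

In a Flow, the equivalent would be a condition on the file size placed before the HTTP action, so that oversized images are skipped instead of failing the run.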



You can download this Microsoft Flow template from this GitHub repository.

Written by Rajesh Sitaraman

former @microsoft MVP | architect @corebts | @cricket fan
