Implementing ML APIs in GCP

Rajendra Sarpal
Published in Gen-Y
5 min read, Jun 1, 2020

There are two ways the Google Cloud Platform can help you add machine learning to your applications.

First, Google Cloud Platform offers tools such as TensorFlow to help us build custom machine learning models.

Second, it offers what I like to call friendly machine learning: pre-trained ML APIs.

API

API stands for Application Programming Interface. An API is a software intermediary that allows two applications to talk to each other. In other words, an API is the messenger that delivers your request to the provider that you’re requesting it from and then delivers the response back to you.

The pre-trained ML APIs give you access to pre-trained machine learning models with a single REST API request.
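To make "a single REST API request" concrete, here is a minimal sketch using only the Python standard library. It assumes the API accepts an API key as a `key` query parameter and a JSON body sent with POST; the helper names `build_request` and `call_api` are made up for illustration and are not part of any Google library.

```python
import json
import urllib.parse
import urllib.request


def build_request(endpoint, api_key, body):
    """Build (without sending) a JSON POST request that carries the API key."""
    url = endpoint + "?" + urllib.parse.urlencode({"key": api_key})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(endpoint, api_key, body):
    """Send the request and decode the JSON reply (needs network and a valid key)."""
    with urllib.request.urlopen(build_request(endpoint, api_key, body)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example: prepare (but do not send) a sentiment request for the Natural Language API.
request = build_request(
    "https://language.googleapis.com/v1/documents:analyzeSentiment",
    "CHANGE-THIS-KEY",  # placeholder, replace with a real API key before calling
    {"document": {"type": "PLAIN_TEXT", "content": "What a lovely day"}},
)
print(request.get_method(), request.full_url)
```

Calling `call_api(...)` with the same arguments would perform the actual request; it is left out of the example so the sketch runs without credentials.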

Pretrained ML APIs

Google offers a set of pre-trained model APIs that can be used directly in our applications:

  • Vision API
  • Video Intelligence API
  • Speech API
  • Translation and Natural Language API

Vision API

Using Vision API

Cloud Vision is an API that lets you perform complex image detection with a single REST API request.

It can detect landmarks, faces, emotions, labels, image properties, Safe Search categories, objects and more. We can also inspect the full JSON response from the API.
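As a rough sketch, each kind of detection listed above corresponds to a feature type in the body of an `images:annotate` request. The short names on the left of the mapping and the bucket path are made up for this example; emotions are not a separate feature, since they come back as likelihood fields inside the face detection results.

```python
# Map the detections mentioned above to Vision API feature types.
# The dictionary keys are illustrative names chosen for this sketch.
FEATURE_TYPES = {
    "landmark": "LANDMARK_DETECTION",
    "face": "FACE_DETECTION",          # also reports emotion likelihoods
    "labels": "LABEL_DETECTION",
    "properties": "IMAGE_PROPERTIES",
    "safe_search": "SAFE_SEARCH_DETECTION",
    "objects": "OBJECT_LOCALIZATION",
}


def vision_request(image_uri, detections, max_results=5):
    """Build the JSON body for one image and several detection features."""
    return {
        "requests": [{
            "image": {"source": {"imageUri": image_uri}},
            "features": [
                {"type": FEATURE_TYPES[name], "maxResults": max_results}
                for name in detections
            ],
        }]
    }


# Hypothetical image location, used only to show the shape of the body.
body = vision_request("gs://your-bucket/photo.jpg", ["labels", "face"])
print(body["requests"][0]["features"])
```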

Video Intelligence API

Using Video Intelligence API

Cloud Video Intelligence is an API that lets you understand your video’s entities at the shot, frame, or video level.

It can tell you what’s happening in every scene of your video, and it supports shot change detection, inappropriate scene detection, object detection and more.
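A request to the Video Intelligence API can be sketched in the same way: the body names the video and the list of features to run. The function name, the subset of features and the bucket path below are assumptions for illustration. Unlike the image case, annotation is a long-running operation, so the reply to `videos:annotate` is an operation to poll rather than the final result.

```python
# A subset of Video Intelligence feature names, matching the capabilities above.
KNOWN_FEATURES = {
    "LABEL_DETECTION",             # what is happening in the video
    "SHOT_CHANGE_DETECTION",       # where one shot ends and the next begins
    "EXPLICIT_CONTENT_DETECTION",  # inappropriate scenes
    "OBJECT_TRACKING",             # objects followed across frames
}


def video_annotate_body(input_uri, features):
    """Build the JSON body for a videos:annotate request."""
    unknown = set(features) - KNOWN_FEATURES
    if unknown:
        raise ValueError(f"Unsupported feature(s): {sorted(unknown)}")
    return {"inputUri": input_uri, "features": list(features)}


# Hypothetical video location, used only to show the shape of the body.
video_body = video_annotate_body(
    "gs://your-bucket/clip.mp4",
    ["LABEL_DETECTION", "SHOT_CHANGE_DETECTION"],
)
print(video_body)
```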

Speech API

Using Speech API

Cloud Speech is an API that lets you perform speech to text transcription in over 100 languages.

The Speech API takes an audio file and returns a text transcription of that file. It also supports speech timestamps: it returns the start and end time for every word in the transcription, which makes it easy to search within your audio.
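Here is a small sketch of how those word timestamps could be used to search within audio. It assumes the request was made with `enableWordTimeOffsets` set to true, in which case each word carries `startTime` and `endTime` strings such as `"1.200s"`. The `SAMPLE` response below is hand-written to show the shape of the data; it is not real API output.

```python
def parse_seconds(duration):
    """Convert a duration string such as '1.200s' to a float number of seconds."""
    return float(duration.rstrip("s"))


def find_word(response, query):
    """Return (start, end) times in seconds for every occurrence of `query`."""
    hits = []
    for result in response.get("results", []):
        best = result["alternatives"][0]  # alternatives are ordered best first
        for word in best.get("words", []):
            if word["word"].lower() == query.lower():
                hits.append((parse_seconds(word["startTime"]),
                             parse_seconds(word["endTime"])))
    return hits


# Hand-written sample in the shape of a recognize response (not real output).
SAMPLE = {"results": [{"alternatives": [{
    "transcript": "cross the bridge",
    "words": [
        {"word": "cross", "startTime": "0s", "endTime": "0.500s"},
        {"word": "the", "startTime": "0.500s", "endTime": "1.200s"},
        {"word": "bridge", "startTime": "1.200s", "endTime": "1.700s"},
    ],
}]}]}

print(find_word(SAMPLE, "bridge"))
```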

Translation and NL API

Cloud Natural Language is an API that lets you understand text with a single REST API request.

It can extract entities, detect sentiment, analyze syntax and classify content.

Using APIs on GCP

Step 1: Navigate to APIs & Services and then Credentials.

Step 2: Click Create Credentials and then API Key to generate the API key, and copy that key for future use.

Step 3: Paste the copied API key in place of “CHANGE-THIS-KEY” in the notebook.
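One possible way to guard against running with the placeholder still in place is sketched below. The helper `resolve_api_key` and the environment variable name `GCP_API_KEY` are made up for this example; reading the key from the environment is simply an alternative to pasting it into the code.

```python
import os

PLACEHOLDER = "CHANGE-THIS-KEY"


def resolve_api_key(pasted_value=PLACEHOLDER, env_var="GCP_API_KEY"):
    """Return the key from the environment if set, else the pasted value.

    Raises ValueError if the result is empty or still the placeholder.
    """
    key = os.environ.get(env_var, pasted_value).strip()
    if not key or key == PLACEHOLDER:
        raise ValueError(
            f"No API key found: set {env_var} or replace {PLACEHOLDER!r}."
        )
    return key


# Usage (commented out so the sketch runs without a key):
# APIKEY = resolve_api_key("paste-your-key-here")
```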

AI Platform Notebooks

AI Platform Notebooks are a fully hosted version of the popular JupyterLab notebook environment, and they run on virtual machines.

Two things follow from the fact that AI Platform Notebooks run on a virtual machine.

First, it means that you can control and change what sort of machine is running your notebook, for example by giving it more memory or adding a GPU, without having to rewrite your notebook from scratch.

Second, VMs are ephemeral, so anything you want to keep should be saved outside the VM, for example in source control.

Step 4: After logging in to Google Cloud Platform, navigate to AI Platform and then Notebooks.

Loading AI Notebooks in GCP

Step 5: Create a new instance and select the configuration that fits your requirements.

Creating Instance

Step 6: Select the required notebook.

Launching Notebooks

Invoking Translate API

Type the sample text to be translated in “inputs” and change the source and target as required. Here, source = en (the input text is in English) and target = fr (the output text is in French).
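The notebook cell itself is not reproduced here, so the following is only a sketch of an equivalent call against the Translation v2 REST endpoint, using the standard library. The function names and the sample sentences are assumptions; `inputs`, `source` and `target` mirror the names used above, and the sample response is hand-written to show the expected shape.

```python
import json
import urllib.request

TRANSLATE_URL = "https://translation.googleapis.com/language/translate/v2"


def translate_body(inputs, source="en", target="fr"):
    """Build the JSON body: a list of strings plus source and target languages."""
    return {"q": list(inputs), "source": source, "target": target, "format": "text"}


def extract_translations(response):
    """Pull the translated strings out of a Translation v2 response."""
    return [item["translatedText"] for item in response["data"]["translations"]]


def translate(inputs, api_key, source="en", target="fr"):
    """Send the request (needs network access and a valid API key)."""
    req = urllib.request.Request(
        f"{TRANSLATE_URL}?key={api_key}",
        data=json.dumps(translate_body(inputs, source, target)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return extract_translations(json.load(resp))


inputs = ["is it really this easy?", "good morning"]  # illustrative sample text
print(translate_body(inputs))

# Hand-written response in the documented shape (not real API output).
fake_response = {"data": {"translations": [{"translatedText": "bonjour"}]}}
print(extract_translations(fake_response))
```

With a real key, `translate(inputs, APIKEY)` would return one translated string per input.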

Invoking Vision API

Set the path of the image in the IMAGE variable. We have set the feature type to TEXT_DETECTION for our image. The target language is English (en), which can be changed as required. After running the cells, we get the translated output.
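The two-step flow described above (detect text in an image, then translate it) might look roughly like this. Only the request bodies and the response parsing are shown, so the sketch runs without credentials; the image path, the function names and the sample response are assumptions made for illustration.

```python
IMAGE = "gs://your-bucket/sign.jpg"  # hypothetical image location


def text_detection_body(image_uri, max_results=3):
    """Build an images:annotate body that asks only for TEXT_DETECTION."""
    return {
        "requests": [{
            "image": {"source": {"gcsImageUri": image_uri}},
            "features": [{"type": "TEXT_DETECTION", "maxResults": max_results}],
        }]
    }


def detected_text(vision_response):
    """Return (text, locale) from the first text annotation, or (None, None)."""
    annotations = vision_response["responses"][0].get("textAnnotations", [])
    if not annotations:
        return None, None
    first = annotations[0]  # the first annotation covers the whole detected text
    return first["description"], first.get("locale")


def translation_body(text, source, target="en"):
    """Build the Translation v2 body for the detected text."""
    return {"q": [text], "source": source, "target": target, "format": "text"}


# Hand-written response in the documented shape (not real API output).
fake_vision_response = {"responses": [{"textAnnotations": [
    {"locale": "fr", "description": "Sortie de secours"},
]}]}

text, locale = detected_text(fake_vision_response)
print(translation_body(text, source=locale))
```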

Sentiment Analysis with Language API

Let’s evaluate the sentiment of some famous quotes using the Google Cloud Natural Language API. With the NL API we can detect the polarity and magnitude of the sentiment in each quote.
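A sketch of the request body and the response handling for sentiment analysis follows. One assumption to flag: in the v1 REST API the polarity value is returned in a field named `score`, while the earlier beta version named it `polarity`, so the helper accepts either. The quote and both sample responses are hand-written for illustration and are not real API output.

```python
def sentiment_body(text):
    """Build the JSON body for a documents:analyzeSentiment request."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def polarity_and_magnitude(response):
    """Return (polarity, magnitude) from the document-level sentiment."""
    doc = response["documentSentiment"]
    polarity = doc["score"] if "score" in doc else doc["polarity"]
    return polarity, doc["magnitude"]


quote = "To succeed, you must have tremendous perseverance."  # illustrative
print(sentiment_body(quote))

# Hand-written responses in the two shapes described above (not real output).
v1_style = {"documentSentiment": {"score": 0.6, "magnitude": 0.6}}
beta_style = {"documentSentiment": {"polarity": -0.4, "magnitude": 0.9}}
print(polarity_and_magnitude(v1_style), polarity_and_magnitude(beta_style))
```

Polarity indicates whether the text is negative or positive, and magnitude indicates how strongly the sentiment is expressed regardless of direction.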

Invoking Speech API

Pass the location of the input audio file in ‘uri’. On execution, it returns a transcript with a confidence score of 0.98.
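The request behind this step can be sketched as a `speech:recognize` body with a `config` section and an `audio` section holding the `uri`. The encoding, sample rate and language code below are assumed defaults that must match the actual audio file, and the bucket path is hypothetical. The sample response is hand-written to show where the transcript and confidence live; its values are not the output reported above.

```python
def recognize_body(uri, language_code="en-US", encoding="LINEAR16",
                   sample_rate_hertz=16000):
    """Build the JSON body for a speech:recognize request."""
    return {
        "config": {
            "languageCode": language_code,
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
        },
        "audio": {"uri": uri},
    }


def best_transcript(response):
    """Return (transcript, confidence) of the top alternative, or (None, None)."""
    results = response.get("results", [])
    if not results:
        return None, None
    top = results[0]["alternatives"][0]
    return top["transcript"], top.get("confidence")


print(recognize_body("gs://your-bucket/audio.raw"))  # hypothetical audio location

# Hand-written response in the documented shape (not real API output).
fake_speech_response = {"results": [{"alternatives": [
    {"transcript": "hello world", "confidence": 0.9},
]}]}
print(best_transcript(fake_speech_response))
```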

As machine learning matures, many of the reusable tasks will be available in pre-trained form, whether it’s Vision or Speech.

A key point these machine learning APIs teach us is that we want to make our own machine learning models just as easy to use.
