Comparing Machine Learning (ML) Services from Various Cloud ML Service Providers

Tanya Thakur
Jan 23, 2018

The current activity in the AI world shows that integrating ML and Natural Language Processing (NLP) into nearly any area, even getting a machine and a human to converse by voice, is not rocket science. In fact, the AI applications that enable this, chatbots and voice bots, are evolving fast.

The launch of Amazon Echo and Google Home has shown the promise of voice, especially to businesses. This gives us the impetus to start planning and building voice bots. The arrival of deep-learning-based technologies such as Automatic Speech Recognition (ASR) and NLP from tech giants like Amazon and Google has made the technology accessible.

These companies provide ML services, which help in:
● Extracting keywords from text to understand what you are writing.
● Scanning and displaying the elements in an uploaded image or video.
● Converting text to speech and vice-versa.

We at Kontiki Labs understand that there are several providers and services available for NLP, vision APIs, and text and speech, which makes it difficult to decide which one is right for your business. So we have looked through them and created a ready reckoner that will help you make the right choice based on your business need. We considered many crucial factors in creating this quick guide, which you can use to build a better product.

These are the Machine Learning services that we are going to cover:
1. Natural Language Processing services
2. The Vision APIs
3. Text-to-Speech & Speech-to-Text services

1. Natural Language Processing Services

Amazon Comprehend NLP vs. Google Cloud Natural Language vs. IBM Watson NLU vs. Microsoft Azure Text Analytics

When building a chatbot, there are numerous factors to consider. The first among them is: How will the machine know what the user is writing?

The Solution:
Use Amazon Comprehend, Google Cloud Natural Language, or a similar service from another provider; all of them offer easy-to-use APIs. These NLP services use ML to extract insights from the text entered by the user.

How important keywords are extracted from the text.

Below are the 3 key aspects that we have considered to compare the NLP services — features, code execution and output, and price:

i. Feature Comparison

Feature Comparison of NLP services

ii. Code Execution & Output Comparison

The idea is to pass the same text to all the NLP services and see which one returns the most relevant keywords as output.

Amazon Comprehend:
Below is a very simple example prepared by our developers. The code will help you execute the Amazon Comprehend ML service:
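
A minimal sketch of such a call, assuming Python with the boto3 SDK and AWS credentials already configured. It is an illustration, not the code our developers ran: the helper names, the default region and the `min_score` filter are assumptions added here.

```python
def extract_key_phrases(client, text, language_code="en", min_score=0.0):
    """Return (phrase, score) pairs from Amazon Comprehend, best score first.

    `client` is a boto3 Comprehend client, or any object exposing the same
    detect_key_phrases method.
    """
    response = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    phrases = [
        (item["Text"], item["Score"])
        for item in response["KeyPhrases"]
        if item["Score"] >= min_score  # hypothetical filter, not part of the API
    ]
    return sorted(phrases, key=lambda pair: pair[1], reverse=True)


def make_comprehend_client(region_name="us-east-1"):
    """Build a real client. Assumes boto3 is installed and credentials are set up."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("comprehend", region_name=region_name)
```

With credentials in place, `extract_key_phrases(make_comprehend_client(), "your text here")` would return the key phrases that Comprehend finds, each with its confidence score.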

Google Cloud Natural Language:
Execute the code below to see how the Google NLP service responds to the same text you passed to Amazon Comprehend above.
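
As a rough illustration of what such a call can look like, here is a sketch that assumes Python, the `google-cloud-language` client library, and a service-account key referenced by `GOOGLE_APPLICATION_CREDENTIALS`. The function names and the `limit` parameter are hypothetical, and this is not the code our team executed.

```python
def top_entities(response, limit=5):
    """Return up to `limit` (name, salience) pairs, most salient first.

    `response` is the object returned by analyze_entities; only its
    `entities` list and each entity's `name` and `salience` are used.
    """
    ranked = sorted(response.entities, key=lambda e: e.salience, reverse=True)
    return [(entity.name, entity.salience) for entity in ranked[:limit]]


def analyze_text(text):
    """Send plain text to Cloud Natural Language for entity analysis."""
    # Assumption: google-cloud-language is installed and credentials are set.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    return client.analyze_entities(request={"document": document})
```

Calling `top_entities(analyze_text("your text here"))` gives a short list of entities ranked by salience, which is the closest equivalent to the key phrases returned by the other services.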

IBM Watson NLU:
The code below shows the features provided by the IBM Watson NLU service.

Microsoft Azure Text Analytics:
Below is the code for executing the Microsoft Azure Text Analytics service.
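
One possible shape for this call, sketched under the assumption that you use Python with the `azure-ai-textanalytics` package and have an endpoint and key for your resource. The helper names are hypothetical, and the snippet is an illustration, not the code used for the comparison.

```python
def key_phrases_per_document(client, documents):
    """Return one list of key phrases per input document.

    `client` is a TextAnalyticsClient, or any object with the same
    extract_key_phrases method. Documents that the service reports as
    errors yield an empty list.
    """
    results = client.extract_key_phrases(documents)
    return [[] if result.is_error else list(result.key_phrases) for result in results]


def make_text_analytics_client(endpoint, key):
    """Build a real client. `endpoint` and `key` come from your Azure resource."""
    # Assumption: azure-ai-textanalytics is installed.
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    return TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
```

The service accepts a batch of documents, so `key_phrases_per_document(client, ["your text here"])` returns a list with a single inner list of phrases.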

The output comparison for all the NLP service providers:

Time taken by the NLP services to execute the features provided.
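
If you want to take similar measurements yourself, the sketch below shows one simple approach in Python. It is an assumption about how such timings could be gathered, not a description of how the figures above were produced; `time_call` and `compare_services` are hypothetical helper names.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Call func(*args, **kwargs) `repeats` times.

    Returns (last_result, best_seconds), where best_seconds is the fastest
    wall-clock time observed across the repeats.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


def compare_services(services, text, repeats=3):
    """Time every callable in `services` (name -> callable) on the same text.

    Returns a dict of name -> best seconds, ordered fastest first.
    """
    timings = {
        name: time_call(call, text, repeats=repeats)[1]
        for name, call in services.items()
    }
    return dict(sorted(timings.items(), key=lambda item: item[1]))
```

Each callable would wrap one provider's API call. Because the measured time includes network latency, results depend on your location and the region of each service, so it helps to run the comparison several times.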

iii. Price Comparison of NLP Services

Comparing price plans of NLP services

2. Vision APIs: Moving beyond text

A bot that extracts text and tags from an image.

Amazon Rekognition vs. Google Vision vs. IBM Watson Visual Recognition vs. Azure Computer Vision API

The services allow you to add image and video analysis to your bot/application. The easy-to-use APIs return the content in the images or videos uploaded.

Below are the 3 key aspects that we have considered to compare the vision API services — features, code execution and output, and price:

i. Feature Comparison

Feature Comparison of Vision services

ii. Code Execution & Output Comparison

We sent a single image to the APIs of all the service providers and compared the image recognition results.

We chose this image because it exercises the most features: facial recognition, landmark detection and text extraction.

Amazon Rekognition:
Execute the code below to see how the Amazon Rekognition service scans the image above.
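
A minimal sketch of label and text detection, assuming Python with boto3 and configured AWS credentials. The function name and default thresholds are assumptions, and the snippet is illustrative, not the exact code we ran.

```python
def describe_image(client, image_bytes, max_labels=10, min_confidence=80.0):
    """Return labels and detected text lines for one image.

    `client` is a boto3 Rekognition client, or any object with the same
    detect_labels and detect_text methods.
    """
    image = {"Bytes": image_bytes}
    labels = client.detect_labels(
        Image=image, MaxLabels=max_labels, MinConfidence=min_confidence
    )
    text = client.detect_text(Image=image)
    return {
        "labels": [(label["Name"], label["Confidence"]) for label in labels["Labels"]],
        # Rekognition reports both whole lines and single words; keep lines only.
        "lines": [
            detection["DetectedText"]
            for detection in text["TextDetections"]
            if detection["Type"] == "LINE"
        ],
    }
```

To try it, read an image file in binary mode and pass the bytes together with `boto3.client("rekognition")`. Face analysis is a separate Rekognition call (`detect_faces`) and is left out here to keep the sketch short.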

Google Vision:
The code below, prepared by our team, shows how Google Vision reads the image.

Microsoft Computer Vision API:
To analyse your image with Microsoft Azure, use the sample code provided in the section below.

The output comparison for the Vision APIs:

Time taken by the vision services to execute the features provided.

iii. Price Comparison of Image Services

Comparing price plans of image services

3. Processing Text and Speech

Thinking of building a voice bot? The API-driven services from the providers below can become the backbone of your bot.

A shop bot example: it perceives a human voice and transforms it into text.
Different providers and their services for processing text

a. Text-to-Speech (TTS)

Below are the 3 key aspects that we have considered to compare text-to-speech services — TTS feature analysis, code execution and output, and price:

i. TTS Analysis

Comparing Text-to-Speech

ii. Code Execution & Output Comparison

We passed the same text to all the speech APIs so that you can easily judge which one produces the clearest output.

Amazon Polly:
The section below provides the code for executing text-to-speech via Amazon.
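
As an illustration, here is a minimal sketch that assumes Python with boto3 and configured AWS credentials. The function name, the `Joanna` voice and the MP3 output format are choices made for this example and are not taken from our original run.

```python
from contextlib import closing


def synthesize_to_file(client, text, out_path, voice_id="Joanna", output_format="mp3"):
    """Ask Amazon Polly to speak `text` and save the audio to `out_path`.

    `client` is a boto3 Polly client, or any object with the same
    synthesize_speech method. Returns the number of bytes written.
    """
    response = client.synthesize_speech(
        Text=text, OutputFormat=output_format, VoiceId=voice_id
    )
    # AudioStream is a file-like streaming body; read it fully, then close it.
    with closing(response["AudioStream"]) as stream:
        audio = stream.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

With a real client, `synthesize_to_file(boto3.client("polly"), "Hello from the shop bot", "hello.mp3")` would write a playable MP3 file.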

The Output from Amazon Polly

Microsoft Bing Speech API — Text-to-Speech:
Convert your text to audio by using the code below:

The Output from the Microsoft Bing Speech API (Text-to-Speech):

IBM Watson Text-to-Speech:
Convert your written text to voice by using the code below.

The Output from IBM Watson Text-to-Speech

iii. Price Comparison for Text-to-Speech

Price comparison for text-to-speech

b. Speech-to-Text (STT)

Below are the 3 key aspects that we have considered to compare the speech-to-text services: STT feature analysis, code execution and output, and price.

i. STT Analysis

Comparing Speech-to-Text

ii. Code Execution & Output Comparison

Google Cloud Speech API:
Below is the code for executing the Google STT API.
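
A sketch of a synchronous recognition request, assuming Python, the `google-cloud-speech` client library, configured credentials, and a short 16 kHz LINEAR16 (uncompressed WAV) recording. The function names and default values are assumptions for illustration, not the code used in our tests.

```python
def transcripts(response):
    """Return the top transcript of each result in a recognize() response."""
    return [
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    ]


def recognize_file(path, sample_rate_hertz=16000, language_code="en-US"):
    """Send a short audio file to Google Cloud Speech and return the response."""
    # Assumption: google-cloud-speech is installed and credentials are set.
    from google.cloud import speech

    client = speech.SpeechClient()
    with open(path, "rb") as handle:
        audio = speech.RecognitionAudio(content=handle.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate_hertz,
        language_code=language_code,
    )
    return client.recognize(config=config, audio=audio)
```

`transcripts(recognize_file("order.wav"))` would then return the recognized sentences, where `order.wav` is a placeholder file name. The encoding and sample rate must match the actual recording, otherwise the service is likely to reject the request or return poor results.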

Microsoft Bing Speech API — Speech-to-Text:
The section below will help you in converting your speech to text.

IBM Watson Speech-to-Text:
Execute the code below to convert your speech to text using IBM Watson.

Time taken by Speech-to-Text services to execute the features provided.

iii. Price Comparison for Speech-to-Text

Price comparison for speech-to-text

4. Conclusion

While most ML service products share common features, each has plenty that makes it unique. Before selecting one, consider the type of product that you want to build. To help with this decision, we at Kontiki Labs have provided the code above for the ML services, which you can use to evaluate the available services and choose the one that meets your needs.
