AWS Certified Machine Learning Cheat Sheet — High Level Machine Learning Services 1/2

tanta base
Nov 20, 2023


AWS has it all thought out! If you want a machine learning system in your workflow or enterprise but are not an engineer, AWS offers high-level machine learning services that are ready to use and friendly to non-engineers. This article covers Comprehend, Translate, Transcribe and Polly.

Machine Learning certifications are all the rage now and AWS is one of the top cloud platforms.

Getting AWS certified can show employers your machine learning and cloud computing knowledge. AWS certifications can also give you lifetime bragging rights!

So, whether you want a resume builder or just to consolidate your knowledge, the AWS Certified Machine Learning Exam is a great start!

This series has you covered on the high level machine learning services that are fully managed and cloud based. These services can be used without any machine learning expertise.

Want to know how I passed this exam? Check this guide out!

The installments in this series are:

[Image: a robot sitting at a desk, looking at a monitor and typing on a keyboard. Caption: machine learning is also human learning!]

TL;DR

  • Comprehend is an NLP service that discovers insights from documents. Besides its pre-trained APIs, it supports custom models through Custom Classification and Custom Entity Recognition. The most common use cases are customer sentiment analysis, indexing key phrases, entities and sentiment for search, and organizing documents by topic.
  • Translate is a neural machine translation service that can translate text. You can define unique terms or names to customize the output. Any content passed to Translate is encrypted.
  • Transcribe is an AI service that transcribes speech to text. It can transcribe calls, videos and clinical conversations.
  • Polly is a neural engine that converts text to speech. It is HIPAA eligible.

Comprehend

What is it?

Comprehend is an NLP service that discovers insights from documents. You can call the pre-trained Comprehend APIs from your application with your text (or, for batch jobs, the S3 location of your text files), or you can train your own model. The API returns its results as JSON. If you have medical documents, Comprehend Medical may be an option for you.
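As a rough illustration of "call the API, get JSON back", here is a minimal Python sketch using the boto3 SDK's `detect_sentiment` call. The helper names and the sample response are made up for this example, and the live call assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict, Tuple


def top_sentiment(response: Dict[str, Any]) -> Tuple[str, float]:
    """Return the overall sentiment label and the confidence score for that label."""
    label = response["Sentiment"]        # e.g. "POSITIVE"
    scores = response["SentimentScore"]  # keys: Positive, Negative, Neutral, Mixed
    return label, scores[label.capitalize()]


def detect_sentiment(text: str, language_code: str = "en") -> Dict[str, Any]:
    """Call Comprehend's real-time sentiment API (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parsing helper works without the SDK

    client = boto3.client("comprehend")
    return client.detect_sentiment(Text=text, LanguageCode=language_code)


# Hypothetical response, shaped like the JSON the API returns.
sample = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {"Positive": 0.97, "Negative": 0.01, "Neutral": 0.01, "Mixed": 0.01},
}
print(top_sentiment(sample))
```

In a real application you would pass the result of `detect_sentiment("...")` to `top_sentiment` instead of the hard-coded sample.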

The most common use cases are customer sentiment analysis, a better search experience (by letting your search engine index key phrases, entities and sentiment), and organizing documents by topic.

Comprehend supports custom models through Custom Entity Recognition and Custom Classification; the remaining features use pre-trained models.

Services include:

  • Custom Entity Recognition, ex. find entities specific to your business, such as policy numbers
  • Custom Classification, ex. build a custom model using your own labels. Afterwards, you can send your text to the model and receive predictions
  • Entity Recognition, ex. the API finds entities like dates, locations and organizations and returns a confidence score for each
  • Sentiment Analysis, ex. the API returns the overall sentiment of the text (Positive, Negative, Neutral, or Mixed) with a confidence score
  • Targeted Sentiment, ex. a more granular analysis that returns the sentiment (Positive, Negative, Neutral, or Mixed) toward specific entities in the text
  • PII Identification and Redaction, ex. the API identifies PII with a confidence score and can redact it
  • Toxicity Detection, an out-of-the-box solution to moderate peer-to-peer conversations and generative AI
  • Prompt Safety Classification, a pre-trained binary classifier that classifies input as harmful or not and can be integrated with LLMs
  • Keyphrase Extraction, extracts key phrases or talking points and returns a confidence score
  • Events Detection, ex. answer who, what, when, where questions over large documents
  • Language Detection, ex. the API parses text and returns the dominant language with a confidence score
  • Syntax Analysis, an API that analyzes text using tokenization, word boundaries and part-of-speech tagging (labels like nouns and adjectives), and returns a confidence score
  • Topic Modeling, returns topics from a collection of documents stored in S3. It can group keywords by topic or group documents by topic
  • Multiple Language Support, perform text analysis in multiple languages. You can use Amazon Translate to convert text into a language supported by Comprehend.
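Many of the services above return spans of text together with a confidence score. As one hedged example, the sketch below redacts PII on the client side from a list of entities shaped like the output of boto3's `detect_pii_entities` (fields `Type`, `Score`, `BeginOffset`, `EndOffset`). The function names, the sample text, the scores and the 0.8 threshold are assumptions for illustration, not values from AWS.

```python
from typing import Any, Dict, List


def redact_pii(text: str, entities: List[Dict[str, Any]], min_score: float = 0.8) -> str:
    """Replace each PII span whose confidence is at least min_score with its type label."""
    # Work from the end of the text so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if ent["Score"] >= min_score:
            text = text[: ent["BeginOffset"]] + f"[{ent['Type']}]" + text[ent["EndOffset"]:]
    return text


def make_entity(text: str, phrase: str, pii_type: str, score: float) -> Dict[str, Any]:
    """Build a hypothetical entity record by locating phrase inside text."""
    begin = text.index(phrase)
    return {"Type": pii_type, "Score": score, "BeginOffset": begin, "EndOffset": begin + len(phrase)}


text = "Call Jane Doe at 555-0100."
entities = [
    make_entity(text, "Jane Doe", "NAME", 0.99),   # high confidence
    make_entity(text, "555-0100", "PHONE", 0.55),  # low confidence
]

print(redact_pii(text, entities))                 # only the name clears the default threshold
print(redact_pii(text, entities, min_score=0.5))  # a lower threshold redacts both
```

The threshold is the design choice that matters here: a higher value redacts less but with fewer false positives, and a lower value does the opposite.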

Amazon Translate

What is it?

A neural machine translation service that translates text. It uses deep learning models to produce more natural-sounding translations. It supports real-time and batch translation of text, as well as real-time document translation. It currently supports 75 languages and can identify the source language. You can use Active Custom Translation to customize the output of batch translation jobs, and custom terminology to define how unique terms or names are translated. Any content passed to Translate is encrypted.
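To make the real-time path concrete, here is a small Python sketch around boto3's `translate_text`. The helper names and the terminology name `brand-terms` are hypothetical; the live call assumes boto3 and AWS credentials, and any terminology has to be uploaded to your account first.

```python
from typing import Any, Dict, Iterable, Optional


def build_translate_request(
    text: str,
    target_language: str,
    source_language: str = "auto",  # "auto" asks the service to detect the source language
    terminology_names: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Assemble the keyword arguments for a real-time translation call."""
    request: Dict[str, Any] = {
        "Text": text,
        "SourceLanguageCode": source_language,
        "TargetLanguageCode": target_language,
    }
    if terminology_names:
        # Custom terminology pins how specific terms or names are translated.
        request["TerminologyNames"] = list(terminology_names)
    return request


def translate(text: str, target_language: str, **options: Any) -> str:
    """Translate text in real time (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without the SDK

    client = boto3.client("translate")
    response = client.translate_text(**build_translate_request(text, target_language, **options))
    return response["TranslatedText"]


# "brand-terms" is a hypothetical terminology name.
print(build_translate_request("Hello from Acme Cloud", "es", terminology_names=["brand-terms"]))
```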

Amazon Transcribe

What is it?

An AI service that transcribes speech to text. It can transcribe calls, videos and clinical conversations in formats such as FLAC, MP3, MP4 and WAV.

Typical use cases would be:

  • get insights from call analytics, ex. scan customer service calls for specific phrases, sentiment, etc.
  • transcribe conversations
  • create subtitles, including subtitles for live events
  • detect toxic content in audio
  • improve clinical documentation: it is trained on medical terminology and is HIPAA eligible
  • identify speakers, ex. Speaker 1, Speaker 2
  • channel identification, ex. two callers can be transcribed separately, then merged based on the timing of utterances
  • detect the language and work with custom words
  • work with custom vocabulary tables, where you can include the pronunciation of custom words and how you want them transcribed
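Several of these options map to parameters of a batch transcription job. The sketch below builds a request for boto3's `start_transcription_job`; the helper name, job name, bucket path and vocabulary name are hypothetical. It assumes, as I understand the API, that speaker labels and channel identification cannot be combined in one request, and that a custom vocabulary passed through `Settings` needs an explicit language code.

```python
from typing import Any, Dict, Optional


def build_transcription_job(
    job_name: str,
    media_uri: str,
    media_format: str = "mp3",
    language_code: Optional[str] = None,
    max_speakers: Optional[int] = None,
    channel_identification: bool = False,
    vocabulary_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a batch transcription job."""
    if max_speakers and channel_identification:
        raise ValueError("choose speaker labels or channel identification, not both")
    if vocabulary_name and not language_code:
        raise ValueError("this sketch requires a language code with a custom vocabulary")

    request: Dict[str, Any] = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
    }
    if language_code:
        request["LanguageCode"] = language_code
    else:
        request["IdentifyLanguage"] = True  # let the service detect the language

    settings: Dict[str, Any] = {}
    if max_speakers:
        settings["ShowSpeakerLabels"] = True  # output is labeled per speaker
        settings["MaxSpeakerLabels"] = max_speakers
    if channel_identification:
        settings["ChannelIdentification"] = True  # transcribe each audio channel separately
    if vocabulary_name:
        settings["VocabularyName"] = vocabulary_name  # custom words
    if settings:
        request["Settings"] = settings
    return request


# Hypothetical job: a two-speaker support call with a custom vocabulary.
job = build_transcription_job(
    "support-call-001", "s3://example-bucket/call.mp3",
    language_code="en-US", max_speakers=2, vocabulary_name="product-names",
)
print(job)
```

With credentials in place, the request would be sent with `boto3.client("transcribe").start_transcription_job(**job)`.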

If custom words show up too often in the transcription, you can limit the custom words to rare words, limit them to words that are expected in that audio file, or split them into different lists for each use case. If the custom words sound too similar to each other, you can try including them in a phrase.

It supports real-time transcription and is device agnostic. The batch service is limited to 4 hours or 2 GB per API call. The streaming service supports connections of up to 4 hours.
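A quick pre-flight check against the batch limits quoted above could look like this. The constant and function names are made up, and treating 2 GB as 2 × 1024³ bytes is an assumption of this sketch.

```python
MAX_BATCH_SECONDS = 4 * 60 * 60  # 4 hours, per the limit quoted above
MAX_BATCH_BYTES = 2 * 1024**3    # 2 GB, assuming binary gigabytes


def fits_batch_limits(duration_seconds: float, size_bytes: int) -> bool:
    """Return True if a file is within both batch limits for a single call."""
    return duration_seconds <= MAX_BATCH_SECONDS and size_bytes <= MAX_BATCH_BYTES


print(fits_batch_limits(90 * 60, 150 * 1024**2))     # 90-minute, 150 MB recording
print(fits_batch_limits(5 * 60 * 60, 150 * 1024**2))  # 5-hour recording
```

A file that fails the check would need to be split into shorter segments before submitting it.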

Amazon Polly

What is it?

A neural engine that converts text to speech. It supports multiple languages and you can select the ideal voice for the use case. Polly can return the speech to your application so you can play it directly or you can store it in an audio file format. Polly is HIPAA Eligible.

Polly supports Speech Synthesis Markup Language (SSML) tags such as prosody, which adjusts the speech rate, pitch, or volume. SSML gives you control over emphasis, pronunciation, breathing, whispering, pauses, etc. You can also modify the pronunciation of the text using Lexicons. In addition, you can use Lexicons for acronyms, ex. if the text has “World Wide Web”, say “WWW” instead.
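Here is a minimal sketch of building SSML with a prosody tag and a pause, then sending it to Polly through boto3's `synthesize_speech`. The helper names are made up, and the voice `Joanna` is just one example voice; the live call assumes boto3 and AWS credentials. The sketch sticks to rate and volume because, as far as I know, not every voice engine supports every prosody attribute.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", volume: str = "medium", pause_ms: int = 0) -> str:
    """Wrap text in <speak> and <prosody> tags, optionally followed by a pause."""
    body = f"<prosody rate={quoteattr(rate)} volume={quoteattr(volume)}>{escape(text)}</prosody>"
    if pause_ms:
        body += f'<break time="{int(pause_ms)}ms"/>'
    return f"<speak>{body}</speak>"


def synthesize(ssml: str, voice_id: str = "Joanna") -> bytes:
    """Send SSML to Polly and return MP3 bytes (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the SSML builder works without the SDK

    client = boto3.client("polly")
    response = client.synthesize_speech(
        Text=ssml, TextType="ssml", OutputFormat="mp3", VoiceId=voice_id
    )
    return response["AudioStream"].read()


ssml = build_ssml("Tom & Jerry", rate="slow", volume="loud", pause_ms=500)
print(ssml)
```

Escaping the text matters because characters like `&` and `<` would otherwise make the SSML invalid.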

You can request Speech Marks, which Polly returns as JSON. Speech Marks are designed to complement the synthesized speech that is generated from the input text. They encode when a sentence or word starts and ends in an audio stream, ex. for use in lip-syncing animation.
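Speech marks come back as one JSON object per line, which you get by calling `synthesize_speech` with `OutputFormat="json"` and a `SpeechMarkTypes` list. A rough sketch of parsing them is below; the function names and the sample timings are invented for illustration. In each mark, `time` is the offset in milliseconds from the start of the audio, and `start`/`end` are offsets into the input text.

```python
import json
from typing import Any, Dict, List, Tuple


def parse_speech_marks(raw: str) -> List[Dict[str, Any]]:
    """Parse a line-delimited JSON speech mark stream into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_timings(marks: List[Dict[str, Any]]) -> List[Tuple[str, int]]:
    """Return (word, start time in ms) pairs, e.g. to drive a lip-sync animation."""
    return [(m["value"], m["time"]) for m in marks if m["type"] == "word"]


# Hypothetical speech marks for the text "Mary had a little lamb."
raw = "\n".join([
    json.dumps({"time": 0, "type": "sentence", "start": 0, "end": 23, "value": "Mary had a little lamb."}),
    json.dumps({"time": 6, "type": "word", "start": 0, "end": 4, "value": "Mary"}),
    json.dumps({"time": 373, "type": "word", "start": 5, "end": 8, "value": "had"}),
])

print(word_timings(parse_speech_marks(raw)))
```

An animation loop could then show each word (or mouth shape) once playback reaches its start time, and treat the next mark's time as the end of the current one.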

Polly is a cloud-based service, so it reduces the need for local resources, supports many languages at high quality, and makes enhancements available instantly without any additional updates.

Want more AWS Machine Learning Cheat Sheets? Well, I got you covered! Check out this series for SageMaker Built In Algorithms:

  • 1/5 for Linear Learner, XGBoost, Seq-to-Seq and DeepAR here
  • 2/5 for BlazingText, Object2Vec, Object Detection and Image Classification and DeepAR here
  • 3/5 for Semantic Segmentation, Random Cut Forest, Neural Topic Model and LDA here
  • 4/5 for KNN, K-Means, PCA and Factorization Machines here
  • 5/5 for IP insights and reinforcement learning here

and this installment for SageMaker Features:

and this article on lesser known high level features for industrial or educational purposes

and for ML-OPs in AWS:

and this article on Security in AWS

Thanks for reading and happy studying!


tanta base

I am a data and machine learning engineer. I specialize in all things natural language, recommendation systems, information retrieval, chatbots and bioinformatics.