AWS Announces AI Chips & 13 New ML Features; Consolidating Its Cloud Dominance

Amazon Web Services has unveiled two chips and 13 machine learning capabilities and services at its AWS re:Invent conference in Las Vegas. The releases reflect Amazon’s determination to attract more developers to AWS by broadening its range of tools and services. The stock market reacted favourably, with Amazon shares rising six percent after the announcement.

Industry leader Amazon’s cloud business has seen heated competition from the Google Cloud Platform and Microsoft Azure for years. Google’s homegrown AI chip — the Tensor Processing Unit (TPU) — was introduced in 2016 and is already in its third generation. Although Microsoft Azure has yet to release its own custom chips, its rapid expansion and strong growth illustrate just how competitive the cloud business has become, and how important innovation is for players who want to stay in the race.

New chips

Catching up with other public cloud vendors in the AI chip market, AWS has finally launched its own machine learning inference chip, AWS Inferentia. “Inference” is the process by which a trained machine learning model is applied to new data to make predictions. Inferentia is designed for inference, is built to handle larger workloads as well, and is compatible with popular frameworks including TensorFlow, Apache MXNet, and PyTorch.

AWS says Inferentia delivers hundreds of teraflops per chip and thousands of teraflops per Amazon EC2 instance, and supports data types including INT8, mixed-precision FP16, and bfloat16. According to AWS, the chip improves performance while lowering the power consumption and cost of inference.
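To make the three data types concrete, here is a minimal Python sketch of how each one trades precision or range for compactness. It illustrates the number formats only and says nothing about how Inferentia implements them. The helper names (to_bfloat16, to_fp16, quantize_int8) and the sample values are assumptions introduced for this example, and the bfloat16 conversion uses simple truncation where real hardware typically rounds.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round-trip x through bfloat16 by keeping only the top 16 bits of its float32 pattern."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    truncated = bits & 0xFFFF0000  # keep sign, 8 exponent bits, 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", truncated))[0]


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE 754 half precision (5 exponent bits, 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values, scale):
    """Symmetric INT8 quantization: divide by scale, round, clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


# Same input, three compact representations (illustrative values only).
print("bfloat16:", to_bfloat16(3.14159))
print("fp16:    ", to_fp16(3.14159))
print("int8:    ", quantize_int8([0.5, -1.0, 2.0], scale=0.01))
```

bfloat16 keeps the full exponent range of float32 but only 7 mantissa bits, while FP16 has more mantissa bits and a much smaller range (its largest finite value is 65504). INT8 stores small integers and relies on a separate scale factor to map them back to real values.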

Amazon also introduced Elastic Inference, which allows developers to size inference acceleration capacity and cloud deployments to match their workload. AWS says inference typically accounts for about 90 percent of the compute cost of running a machine learning application. Because inference capacity can now be tailored via Elastic Inference, users pay only for what they need. AWS says this can reduce the cost of machine learning predictions by up to 75 percent.
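As a rough back-of-the-envelope check, the two figures quoted above can be combined. The sketch below assumes the 90 percent figure is inference's share of a total machine learning bill and that the full 75 percent reduction applies to that share. Both are best-case assumptions made for illustration, and the function name overall_saving is hypothetical.

```python
def overall_saving(inference_share: float, inference_reduction: float) -> float:
    """Fraction of the total bill saved when only the inference portion gets cheaper."""
    return inference_share * inference_reduction


# Best case using the figures AWS quoted: 90% share, 75% reduction.
saving = overall_saving(inference_share=0.90, inference_reduction=0.75)
print(f"Best-case overall saving: {saving:.1%}")
```

Under those assumptions the overall bill would fall by roughly two-thirds; a smaller inference share or a smaller reduction shrinks the saving proportionally.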

Inferentia is not the only new AWS chip. In a surprising move, the company also revealed its homegrown Arm-based CPU, “Graviton”, which will begin powering AWS instances (EC2 A1) this week.

Graviton is akin to an Arm-based AMD chip. According to a report in The Register, its CPU cores are based on Arm’s 2015 Cortex-A72 design and clocked at 2.3GHz; it is a 64-bit Armv8-A, non-NUMA processor with floating-point math, support for SIMD, AES, SHA-1, SHA-256 and GCM, and hardware acceleration of the CRC-32 algorithm. AWS VP Peter DeSantis says Graviton can reduce workload cost on an AWS virtual machine by as much as 45 percent.

Faster AI implementation

Last year AWS introduced SageMaker, an end-to-end service that can accelerate the implementation of machine learning models at scale. SageMaker offers three capabilities to free data scientists and developers from heavy labor: zero-setup hosted Jupyter notebook IDEs for data preparation; a distributed model building, training, and validation service; and a model hosting service with HTTPS endpoints for invoking AI models to get real-time inferences.
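For readers curious what invoking a hosted model looks like in practice, here is a minimal sketch using the invoke_endpoint call from the AWS SDK for Python (boto3). The helper name predict, the JSON payload shape, and the endpoint name are assumptions for illustration; the request and response formats depend on the model actually deployed behind the endpoint.

```python
import json


def predict(runtime_client, endpoint_name, features):
    """Send one JSON request to a hosted model endpoint and decode the JSON reply.

    runtime_client is expected to be a boto3 "sagemaker-runtime" client.
    The {"features": ...} payload shape is a placeholder, not a SageMaker standard.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"features": features}),
    )
    # The response body is a stream; read it fully, then parse it.
    return json.loads(response["Body"].read())


# Example usage (requires AWS credentials and a deployed endpoint):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   print(predict(client, "my-demo-endpoint", [1.0, 2.0, 3.0]))  # hypothetical name
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object when no AWS account is available.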

This year AWS has pushed ahead with SageMaker, introducing a series of new capabilities such as a better data labeling tool and a much more powerful compiler.

Amazon SageMaker Ground Truth gives users several options for labeling their data before training their models: Mechanical Turk workers, third-party vendors, or their own employees. Ground Truth learns from these human annotations in real time and then uses machine learning to label the rest of the dataset automatically, saving up to 70 percent of the data labeling cost compared with purely human annotation.

AWS Marketplace for Machine Learning is a new treasure trove of machine learning techniques. Over 150 popular algorithms and models collected from both academia and industry are ready for direct deployment to Amazon SageMaker.

Amazon SageMaker RL is a new reinforcement learning service that provides mainstream reinforcement learning algorithms and supports frameworks such as Intel Coach and Ray RL, as well as simulation environments such as Simulink and MATLAB. It also integrates with AWS RoboMaker, a new robotics service with a simulation platform.

Amazon SageMaker Neo is a new deep learning model compiler that optimizes trained models for deployment on connected devices, with up to 2X improvement in performance.

A toy car for reinforcement learning

AWS also announced AWS DeepRacer, a 1/18th scale fully autonomous model racing car driven by reinforcement learning models trained using Amazon SageMaker. DeepRacer is designed to help developers test their driverless technology and models by pitting their vehicles against those of other developers. Researchers can also train and debug machine learning models in an online simulator and then test them on their cars. DeepRacer is now available for pre-order.

AWS has also set up a special competition to encourage researchers and members of the public alike to learn more about reinforcement learning by getting hands-on with AWS DeepRacer. AWS CEO Andy Jassy calls it “the world’s first global autonomous racing league, open to everyone.” The 1/18th scale fully autonomous race car competition announcement comes amid speculation that Amazon might be interested in the field of self-driving vehicles.

Also announced at AWS re:Invent was a new family of AI services aimed at developers who are not machine learning experts, including Amazon Textract, Amazon Comprehend Medical, Amazon Personalize, and Amazon Forecast.

AWS is sending a strong signal with this week’s announcements and releases: the company wants to consolidate its cloud dominance. While AWS remains the No. 1 public cloud vendor, its advantage is shrinking as Microsoft, IBM, Google, and Alibaba play catch-up. Microsoft’s 2018 Q1 cloud revenue topped Amazon’s; Google Cloud last week replaced its CEO with a former Oracle executive; and Alibaba recently announced its first major restructuring, upgrading its cloud business unit to the Alibaba Cloud Intelligence business unit.

Cloud computing is steadily becoming one of the main business areas for the world’s tech giants, and each and every one of them can be expected to continue working hard to find ways to outmaneuver their rivals.


Journalist: Synced Editorial Team | Editor: Michael Sarazen

