AWS Innovate Online Conference

“Putting Machine Learning into the Hands of Developers”

Darren Broderick (DBro)
LibertyIT
7 min read · Feb 27, 2020

Conference
https://aws.amazon.com/events/aws-innovate

Agenda
https://s3-ap-southeast-1.amazonaws.com/mkt-pdf/Innovate-Agenda-PDF.pdf

Last week I attended a completely free online AWS conference on ML & AI; the links are provided above. There were 15 parallel sessions across 6 tracks of topics between 10am and 2pm, so you could only watch a maximum of 5 talks live.

The rest of this post is my notes from the different ML / AI talks I virtually attended, starting with the opening keynote.

The conference dashboard to help select your tracks (We sadly didn’t get our own avatars)
The opening keynote looked at the ‘big picture’ of machine learning, a good starting point for beginners like myself.

Machine Learning Flow

Live Drawn ML flow diagram
SageMaker covers a ‘majority share’ of the ML flow

With AWS completely focused on selling SageMaker as a ‘one-stop shop’, almost every talk I attended explained in great detail how to run your own notebooks in SageMaker and which new features would be added in the near future.

Hyper-parameter optimisation came up a lot as the main influence on training a good ML model: it helps avoid ‘overfitting’ and achieve good ‘convergence’.
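To make that idea concrete for myself, here is a minimal sketch of a hyper-parameter search in plain Python. It is not SageMaker code: the toy dataset, the `train` and `mse` helpers and the candidate learning rates are all made up for illustration. Each candidate is trained on one slice of the data and scored on a held-out validation slice, which is how you would spot a setting that has not converged or that only looks good on the training data.

```python
import random

def train(learning_rate, epochs, train_data):
    """Fit y = w * x by gradient descent and return the learned weight."""
    w = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in train_data) / len(train_data)
        w -= learning_rate * grad
    return w

def mse(w, data):
    """Mean squared error of the model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

# Hypothetical toy data: y is roughly 3 * x plus a little noise.
rng = random.Random(0)
points = [(k / 10, 3.0 * (k / 10) + rng.gauss(0, 0.1)) for k in range(1, 41)]
rng.shuffle(points)
train_data, val_data = points[:30], points[30:]

# The hyper-parameter being searched: the learning rate.
candidates = [0.0001, 0.001, 0.01, 0.1]
results = {lr: mse(train(lr, 50, train_data), val_data) for lr in candidates}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"learning_rate={lr}: validation loss={loss:.4f}")
print("best learning rate:", best_lr)
```

The smallest learning rates barely move the weight in 50 epochs, so their validation loss stays high; that is the ‘convergence’ problem in miniature.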

Developers should invest time in learning to debug the model instead of leaving it in a black box for hours or days; this saves time and money and gets the desired result quicker.

Neuron was introduced to increase deep learning support

More about Neuron can be found on GitHub: http://github.com/aws/aws-neuron-sdk

TL;DR
Neuron enables deep learning inference using AWS Inferentia (custom-designed machine learning chips).
It is deployed and run on EC2 Inf1 instances.

Neo was another new feature introduced. It converts the framework-specific functions and operations for TensorFlow, MXNet, and PyTorch into a single compiled executable that can be run anywhere. Neo compiles and generates the required software code automatically.

Neo delivers compilation as a service

Neo (Model Optimisation) -> Parses model -> Optimises graph -> Optimises tensors -> Generates code

Getting started with machine learning and improve productivity in a fully integrated development environment

SageMaker is the key to machine learning training within AWS land, as this service gives you a managed Jupyter Notebook, and it can grab labeled data from S3.

You pay for what you use, and training a model will take a lot of experimentation and careful planning.

SageMaker gives the whole ML lifecycle together

You constantly need to compare and debug previous attempts. This will require a lot of troubleshooting and looking for bottlenecks that are not obvious, especially for beginners.

As the real world changes, so does the data we consume. If we are not careful, models can become ‘stale’ over time. Models need to be re-trained to include updated datasets, but in a way that doesn’t change the existing behaviour or even break the model being used.

SageMaker Studio is designed to help developers in these areas, from experimentation through to production.

Studio is now offered by AWS

Studio organises ‘trials’ on your behalf to shape your data experiments, then provides comparison and metric leaderboard tables (e.g. min/max/standard deviation). ‘Loss metric diagrams’ are implemented using Python SDKs.
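For a feel of what such a leaderboard boils down to, here is a small sketch in plain Python. It does not use the Studio or SageMaker SDKs; the trial names and accuracy values are invented, and `summarise` is a hypothetical helper. It computes the min, max and standard deviation per trial and ranks the trials by their best score.

```python
from statistics import stdev

# Hypothetical validation accuracies recorded over three runs of each trial.
trials = {
    "trial-a": [0.71, 0.78, 0.80],
    "trial-b": [0.69, 0.82, 0.88],
    "trial-c": [0.75, 0.76, 0.77],
}

def summarise(name, values):
    """Build one leaderboard row with min, max and sample standard deviation."""
    return {
        "trial": name,
        "min": min(values),
        "max": max(values),
        "std": stdev(values),
    }

# Rank trials by their best (max) score, highest first.
leaderboard = sorted(
    (summarise(name, values) for name, values in trials.items()),
    key=lambda row: row["max"],
    reverse=True,
)

for row in leaderboard:
    print(f'{row["trial"]}: min={row["min"]:.2f} max={row["max"]:.2f} std={row["std"]:.3f}')
```

A high max with a large standard deviation (like the made-up trial-b here) is a hint that the result may not be stable from run to run.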

This update aims to get the developer to good convergence quickly. You can have 100s of rules running alongside your jobs and receive alerts that identify, for example, jobs that should be stopped. The debugger can also be extended if the developer wants to add their own rules to get the model to a deployable state.
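As a rough illustration of what one of these rules might check, here is a sketch in plain Python. It is not the SageMaker debugger API: the function name, the `patience` and `min_delta` parameters and the sample loss values are my own assumptions. The rule flags a job whose loss has stopped improving, which is the kind of job you would want an alert for.

```python
def loss_not_decreasing(losses, patience=3, min_delta=1e-3):
    """Return True when the last `patience` losses did not improve on the
    best earlier loss by at least `min_delta`."""
    if len(losses) <= patience:
        # Not enough history yet to judge the job.
        return False
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_recent > best_before - min_delta

# Hypothetical loss curves from two training jobs.
healthy_job = [1.0, 0.8, 0.6, 0.5, 0.4]
stalled_job = [1.0, 0.5, 0.5, 0.51, 0.5]

for name, losses in [("healthy", healthy_job), ("stalled", stalled_job)]:
    if loss_not_decreasing(losses):
        print(f"{name}: loss has stalled, consider stopping this job")
    else:
        print(f"{name}: still improving")
```

A custom rule of your own would follow the same shape: take the values recorded during training, return a yes/no answer, and let the alerting decide what to do with it.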

SageMaker Autopilot can take raw data and make predictions, which is useful for fraud detection, for example.

Autopilot source code is available in the SageMaker notebooks and can be tweaked; it is recommended to start with the hyper-parameters to get a feel for the code.

Autopilot flow => Raw Data, Target Column prediction, create model, visible in notebook to select best model, deploy

ML Inference

Machine learning uses statistical algorithms that learn from existing data, a process called training, in order to make decisions about new data, a process called inference. During training, patterns and relationships in the data are identified to build a model.

Inference is the stage in which a trained model is used to make predictions on test samples. It consists of a forward pass similar to the one used in training. Unlike training, it doesn’t include a backward pass to compute the error and update weights.
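The difference is easy to see in a tiny example. Below is a minimal sketch, assuming a one-input linear model and a made-up dataset that follows y = 2x + 1; the function names and the learning rate are my own choices. Training runs a forward pass and then a backward pass that updates the weights, while inference only calls the forward pass.

```python
def forward(w, b, x):
    """Forward pass: compute the prediction for one input."""
    return w * x + b

def train_step(w, b, x, y, lr=0.05):
    """One training step: forward pass, then backward pass and weight update."""
    prediction = forward(w, b, x)
    error = prediction - y
    # Backward pass: gradients of 0.5 * error**2 with respect to w and b.
    grad_w = error * x
    grad_b = error
    return w - lr * grad_w, b - lr * grad_b

# Hypothetical training data generated from y = 2x + 1.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.0, 0.0
for _ in range(2000):
    for x, y in data:
        w, b = train_step(w, b, x, y)

# Inference: forward pass only, no error computed and no weights changed.
prediction = forward(w, b, 4.0)
print(f"w={w:.3f}, b={b:.3f}, prediction for x=4: {prediction:.3f}")
```

Because inference skips the backward pass, it needs far less compute per sample than training does.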

How does Amazon Elastic Inference work with SageMaker?
You can prototype deployments with notebooks in local mode, then scale at hosted endpoints with low-cost Elastic Inference acceleration.

TensorFlow → Uses IAM-based authentication and auto-discovers accelerators. It’s available via Deep Learning AMIs, as a download from S3, or automatically through SageMaker.

AWS DeepRacer

Of course there was a huge section on DeepRacer and details of a virtual race.

https://console.aws.amazon.com/deepracer/home?region=us-east-1#races/arn%3Aaws%3Adeepracer%3A%3A324010867108%3Aleaderboard%2FAWS-DeepRacer-Innovate-Challenge

AWS-hosted community virtual race, with the top 3 places receiving AWS credits

Meeting security and compliance objectives when using Amazon SageMaker

The talk started by creating a basic default SageMaker instance connected to S3 and first looked at setting up your own notebook. Pretty default stuff, but I snipped the high-level detail of the dashboard afterwards, which might be handy to compare against.

Also make sure you have the correct IAM permissions for your S3 bucket.

Go to https://sagemaker-workshop.com to follow along with a practice example.

AI / ML BlackJack Challenge

400,000 images were generated from an initial set of 52 cards. They were blurred, skewed, distorted and given background layers to generate more samples.
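To put those numbers in perspective, here is a quick sketch in plain Python of how 52 source images could fan out into roughly 400,000. The talk did not share its augmentation settings, so the parameter names and ranges below (`blur_radius`, `skew_degrees`, `background_id`) are purely my own assumptions; the sketch only generates augmentation ‘recipes’, not actual images.

```python
import random

CARDS = 52
TARGET_IMAGES = 400_000

# How many augmented variants each card needs to reach the target.
variants_per_card = TARGET_IMAGES // CARDS
print("variants per card:", variants_per_card)

def random_recipe(rng):
    """Pick one hypothetical combination of augmentation settings."""
    return {
        "blur_radius": rng.uniform(0.0, 3.0),      # assumed range, in pixels
        "skew_degrees": rng.uniform(-15.0, 15.0),  # assumed range
        "background_id": rng.randrange(20),        # assumed 20 backgrounds
    }

# One recipe per variant of a single card; a seed keeps the run repeatable.
rng = random.Random(42)
recipes = [random_recipe(rng) for _ in range(variants_per_card)]
print("recipes generated for one card:", len(recipes))
```

So each card needs several thousand randomised variants, which is why the transformations are generated automatically rather than photographed by hand.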

A GoPro HERO7 connected to a Raspberry Pi was used to study the cards over a green felt poker board.

Built in SageMaker on a P3dn instance, running for 8 hours.

Used a built-in ‘single shot’ object detector to box the cards.

Building machine learning workflows with Kubernetes and Amazon SageMaker

The main focus of a data scientist is to get a model that solves a business problem into the real world; once there, it can feed off data updates.

What does it take to make a ML model?

First collect and prepare training data, decide on an algorithm and model training process, and decide what predictions are needed. After a model is fully trained it needs to be integrated with your application and, finally, it has to be scalable.

It takes large optimisation efforts and teams of experts to achieve a single goal.

Kubernetes is open source software that allows you to deploy and manage containerised applications at scale. It manages clusters of Amazon EC2 compute instances and runs containers on those instances with processes for deployment, maintenance, and scaling. You can run any type of containerised application using the same toolset on-premises and in the cloud.

Kubernetes and Amazon SageMaker working together

Machine learning is more than just the model. The ML workflow consists of sourcing and preparing data, building ML models, training and evaluating these models, deploying them to production, and ongoing post-production monitoring. Amazon SageMaker is AWS’s offering to build, train, deploy, and maintain models.

https://aws.amazon.com/blogs/machine-learning/introducing-amazon-sagemaker-operators-for-kubernetes/

The AWS Machine Learning Cert was referenced after every talk, so it could be a good avenue for folks looking to learn in an organised fashion.
Other AI/ML areas to explore include Computer Vision and DeepComposer; it’s always easier to learn if you’re having fun!
I don’t know the costs of DeepComposer yet, so be careful to do some research first.

Further Reading

If you are looking for further reading, I purchased a selection of Data Science books from:
https://www.humblebundle.com/software/data-science-and-machine-learning-software?hmb_source=humble_home&hmb_medium=product_tile&hmb_campaign=mosaic_section_2_layout_index_2_layout_type_twos_tile_index_1_c_datascienceandmachinelearning_softwarebundle

If you are looking for copies I will gladly share mine, so just send me a tweet @IAM_dbro.

Thank you.
