IllumiDesk Leverages AWS SageMaker to Automate AI Pipelines

Greg Werner
Published in IllumiDesk
3 min read · Dec 6, 2017

Every year AWS’s re:Invent conference keeps us on the edge of our seats. It’s amazing how many new products and features they can roll out, especially now that AWS has grown into a multi-billion-dollar business.

What are AI pipelines anyway?

Much as developers have embraced DevOps, data scientists are adopting the same practices to automate the tasks associated with fetching, cleaning, and transforming data, training models, and deploying trained models to production. Along the way, it is also important to test models against pre-defined performance metrics and, using techniques such as A/B testing, to make sure the best-performing model is always the one available to end users.
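To make the testing step concrete, here is a minimal sketch of a promotion gate in Python. The metric names, the threshold values, and the `EvalResult`, `meets_thresholds`, and `choose_model` helpers are all hypothetical and chosen only for illustration; a real pipeline would use whatever metrics your team has agreed on.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical evaluation summary for one model."""
    accuracy: float
    latency_ms: float


def meets_thresholds(result, min_accuracy=0.90, max_latency_ms=100.0):
    """Check a model against pre-defined performance metrics (assumed values)."""
    return result.accuracy >= min_accuracy and result.latency_ms <= max_latency_ms


def choose_model(current, candidate, **thresholds):
    """Pick "candidate" only if it passes the gate and beats the current model."""
    if meets_thresholds(candidate, **thresholds) and candidate.accuracy > current.accuracy:
        return "candidate"
    return "current"


if __name__ == "__main__":
    current = EvalResult(accuracy=0.91, latency_ms=80.0)
    candidate = EvalResult(accuracy=0.93, latency_ms=90.0)
    print(choose_model(current, candidate))
```

A gate like this would typically run after training and before deployment, so a model that misses its targets never reaches end users.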

There are some new buzzwords to describe this process. In addition to classic DevOps terms such as Continuous Integration (CI) and Continuous Deployment (CD), data science DevOps has introduced new terms such as Continuous Training (CT). As the name implies, CT deals with how models are automatically updated with new training data.
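One simple way to picture CT is as a rule that decides when enough new data has arrived to justify retraining. The sketch below is only an assumption about how such a rule might look; the `should_retrain` name and the 10% growth threshold are hypothetical.

```python
def should_retrain(new_rows, rows_at_last_training, min_growth=0.10):
    """Return True when new data has grown enough to justify retraining.

    min_growth is an assumed policy: retrain once the new rows amount to
    at least 10% of the data the current model was trained on.
    """
    if rows_at_last_training <= 0:
        # No previous training data: retrain as soon as any data exists.
        return new_rows > 0
    return new_rows / rows_at_last_training >= min_growth


if __name__ == "__main__":
    print(should_retrain(new_rows=1500, rows_at_last_training=10000))
```

In practice the trigger could just as well be a schedule or a drop in a monitored metric; the point is that the decision is automated rather than manual.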

In short, a DevOps AI pipeline looks something like this:

[Figure: ML / DL / AI DevOps]

Enter AWS SageMaker

AWS SageMaker simplifies the process of creating jobs to retrain machine learning and deep learning models, checking results with notebook servers, and finally deploying trained models to resilient, production-grade environments. Best of all, SageMaker has a great API if you need to set things up programmatically.
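As a rough idea of what setting things up programmatically can look like, the sketch below builds a request for the `create_training_job` operation of the boto3 SageMaker client. This is not IllumiDesk's implementation: the `build_training_job_request` and `submit` helpers, the job name, the image URI, the role ARN, the S3 paths, and the instance settings are all placeholder assumptions that you would replace with your own values.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri,
                               instance_type="ml.m4.xlarge"):
    """Assemble keyword arguments for SageMaker's create_training_job call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # container holding the algorithm
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,                 # IAM role SageMaker assumes
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,            # assumed size, adjust as needed
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def submit(request):
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed
    return boto3.client("sagemaker").create_training_job(**request)


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    request = build_training_job_request(
        job_name="example-retrain-job",
        image_uri="<your-training-image-uri>",
        role_arn="<your-sagemaker-role-arn>",
        train_s3_uri="s3://<your-bucket>/train/",
        output_s3_uri="s3://<your-bucket>/output/",
    )
    print(request["TrainingJobName"])
```

Keeping the request builder separate from the call that submits it makes the configuration easy to review and test without touching an AWS account.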

Additionally, the SageMaker team has re-written some of the more common ML algorithms, such as linear regression and clustering algorithms, and has significantly increased their performance, a win in everyone’s book.

How Does IllumiDesk Use SageMaker?

When data scientists work in teams (which is most of the time), it’s helpful to have a central, version-controlled file system to manage project files. Additionally, not all users need or want to log into the AWS Console to manage user servers, create training pipelines, and so on. Even if they access these resources programmatically, decision overload can hinder performance. IllumiDesk provides an abstraction on top of AWS SageMaker (among other services) so users can quickly set up their automated machine learning and deep learning workflows, manage project collaborators, and view files centrally regardless of which server is used within the project, among many other features.

In short, we have AWS SageMaker as one of the services under the hood to make sure you have a battle-tested and production-grade data science pipeline management tool without the headache!

Closing Thoughts

We would rather leverage products and services that have already been built and tested than try to ‘roll our own’. Only if something does not exist (a tall order these days) do we look at developing it ourselves. We also like to enhance existing services that may have some gaps. For example, SageMaker (as of this writing) does not directly allow you to deploy models with AWS Lambda, which can offer significant cost savings. IllumiDesk has abstracted away these configuration and support complexities, saving you time and effort!

Would you like to know more? Register for free today, no credit card required! Once you are in our application, feel free to send us a note and we’ll get back to you as soon as possible.
