Managing IAM Roles for Your Cloud Compute Service for Third-Party Tools

Derek Flint
@auger
Aug 2, 2018

There are many great SaaS products and tools available for developers and data scientists. The best ones integrate with the cloud computing service you already use for maximum flexibility. For machine learning, and automated machine learning in particular, you will want to train your models in an environment that can scale dynamically to fit your needs.

To scale CPU, memory and storage for a data science SaaS tool, you generally need to give that tool access to the compute and storage capabilities of your cloud computing account. There are several guidelines to keep in mind when giving any SaaS data science tool (Auger.AI or otherwise) the ability to use your cloud account for scalable jobs. In this post I focus on integrating with AWS, using Auger as an example, and share current best practices for keeping your account safe and secure.

Avoid Static Credentials

Avoid ever giving a third-party service static credentials. Static credentials such as long-lived access keys do not expire on their own, so anyone who obtains them keeps access to your account until you revoke them. The preferred way to grant access is to create AWS Identity and Access Management (IAM) roles. With IAM roles, you can securely control access to AWS services and resources.

You should create a unique IAM role for each Third-Party service you are using. The benefits are:

  1. If the role is abused, you can easily audit and remove the role without affecting other services
  2. You can fine-tune the policy permissions attached to the role
  3. Third-Party services use temporary security credentials that expire, usually within one hour, reducing the risk if they are leaked
  4. It’s easier to manage Third-Party services by creating roles that are namespaced by service and tailored to each one
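
To make the third point concrete, here is a minimal sketch of how a service could exchange an IAM role for short-lived credentials through the AWS STS AssumeRole call. The helper name `get_temporary_credentials`, the role ARN, and the session name are hypothetical, and `sts_client` is assumed to be a boto3 STS client. This is an illustration, not a description of how Auger itself is implemented.

```python
def get_temporary_credentials(sts_client, role_arn, session_name, duration_seconds=3600):
    """Assume an IAM role and return its short-lived credentials.

    sts_client is assumed to be a boto3 STS client, e.g. boto3.client("sts").
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration_seconds,  # 3600 seconds = one hour
    )
    creds = response["Credentials"]
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
        "expiration": creds["Expiration"],  # the credentials stop working after this time
    }


# Hypothetical usage (requires boto3 and permission to assume the role):
#   import boto3
#   creds = get_temporary_credentials(
#       boto3.client("sts"),
#       "arn:aws:iam::123456789012:role/example-third-party-role",
#       "example-session",
#   )
```

Because the returned credentials carry an expiration time, a leaked copy stops working on its own, unlike a static access key.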

Third-Party IAM Roles

When you create a Third-Party IAM Role, you define a trust relationship that specifies which third party (e.g. Auger) is trusted to assume the role. This relationship is defined using the AWS account number of the vendor.

AWS ensures that the IAM role can only be assumed from the AWS account listed in the trust relationship. In our example, if anyone other than Auger attempts to assume the role, AWS will prevent it.
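
As a rough sketch of what such a trust relationship looks like, the snippet below builds a trust policy document that names a single AWS account as the principal allowed to call `sts:AssumeRole`. The function name `build_trust_policy` and the account ID `123456789012` are placeholders for illustration; use the account number your vendor actually publishes.

```python
import json


def build_trust_policy(vendor_account_id):
    """Return a trust policy that lets one AWS account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only principals from this account may assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{vendor_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder account ID, not a real vendor account.
trust_policy = build_trust_policy("123456789012")
print(json.dumps(trust_policy, indent=2))
```

The printed JSON is the kind of document you would attach as the role's trust relationship when creating it.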

IAM Policies

IAM Policies are JSON documents that describe the activities that can or cannot be done with the IAM role.

Here is a snippet of the Auger Access Policy:

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:HeadBucket",
        "s3:ListAllMyBuckets"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Resource": [
        "arn:aws:s3:::auger-acmeorg-g1fxrf",
        "arn:aws:s3:::auger-acmeorg-g1fxrf/*"
      ],
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ]
    }
  ],
  "Version": "2012-10-17"
}

Here we show the permissions for accessing a bucket in S3. The first statement permits GetBucketLocation, HeadBucket and ListAllMyBuckets, which let Auger list and locate buckets in your account as part of creating the bucket that stores your machine learning AutoML runs. The second statement grants the wildcard action “s3:*” only on the bucket Auger creates and the objects inside it, which permits all read and write actions on that bucket and nothing else.
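
If you want to reproduce this pattern for another bucket, the sketch below generates the same two-statement shape from a bucket name. The function name `build_bucket_policy` is hypothetical, and the action list simply mirrors the snippet above rather than any official template.

```python
import json


def build_bucket_policy(bucket_name):
    """Build a policy mirroring the snippet above for a single bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Account-wide, but limited to a few specific actions.
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:HeadBucket",
                    "s3:ListAllMyBuckets",
                ],
                "Resource": ["*"],
            },
            {
                # All S3 actions, but only on this bucket and its objects.
                "Effect": "Allow",
                "Action": ["s3:*"],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
        ],
    }


policy = build_bucket_policy("auger-acmeorg-g1fxrf")
print(json.dumps(policy, indent=2))
```

Note that two resource entries are needed: the bare bucket ARN covers bucket-level actions, while the `/*` form covers the objects stored in the bucket.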

It’s important to note that Auger does not directly store your data. It remains with you in your AWS account.

Auger AutoML Platform

Auger is an automated machine learning platform that connects to your favorite cloud service. Auger is Kubernetes native, so it is designed from the ground up to scale with your needs. This makes it possible to train thousands of models in parallel and obtain an ordered leaderboard of the best-performing models. You can then deploy your model of choice as a production endpoint for real-time use in a cost-effective manner.

Specifically with AWS, Auger uses EC2 instances to scale clusters and S3 for persistent storage of historical machine learning runs. AWS Lambda is used to serve real-time predictions so that top-performing models can be deployed and readily used in production.

Conclusion

It’s important to understand the policy behind the IAM role you create. You should use the tools from AWS or your cloud computing vendor to give appropriately fine-grained and restricted access to third parties. If you have any reservations about giving access to a third-party vendor, reach out to them with your questions.
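
One simple way to start reviewing a vendor's policy is to look for statements that combine a wildcard action with a wildcard resource. The helper below is a deliberately simplistic sketch with a hypothetical name, `find_broad_statements`. It ignores conditions, `NotAction`, and partial wildcards such as `s3:Get*`, so treat it as a first pass rather than a complete audit.

```python
def find_broad_statements(policy):
    """Return Allow statements pairing a wildcard action with Resource "*"."""

    def as_list(value):
        # IAM allows a single string or a list for Action and Resource.
        return [value] if isinstance(value, str) else list(value)

    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        has_wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if has_wildcard_action and "*" in resources:
            flagged.append(statement)
    return flagged


# The policy shape discussed in this post: specific actions on "*",
# and "s3:*" only on one bucket. Neither statement is flagged.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:ListAllMyBuckets"], "Resource": ["*"]},
        {
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": ["arn:aws:s3:::auger-acmeorg-g1fxrf/*"],
        },
    ],
}

# An overly broad policy for comparison.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": {"Effect": "Allow", "Action": "*", "Resource": "*"},
}

print(len(find_broad_statements(scoped_policy)))
print(len(find_broad_statements(broad_policy)))
```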

Next, learn more about the steps required to set up an IAM role on AWS.

Derek Flint (@auger): Engineer, Data Science Practitioner and Automated Machine Learning Advocate