The 3 main reasons why Amazon S3 buckets are misconfigured

Karthik Subbarao

Every time I hear about a data breach caused by a misconfigured Amazon S3 bucket, I wonder why and how companies and individuals allow this to happen. Then, when I revisit the experiences and lessons of my own cloud journey, it becomes clear. In most cases, “misconfigurations” are indeed “mis”-configurations: mistakes, not intentional choices. In this post, I have tried to consolidate and simplify the main reasons for such misconfigurations.

The moral of this story in one sentence:

The cloud is a very powerful platform, but it should be handled responsibly. Knowing what you are doing helps!

Easy access to resources from the internet

Amazon S3 is one of the most widely used AWS services. It makes it easy to store and retrieve files, is designed for 99.999999999% (11 nines) durability so that you are extremely unlikely to lose your data, and scales to practically any amount of storage. One of the major yet simple use cases for S3 is sharing files with other users and/or services. These users or services often need access to the files over the internet. The easiest way to provide it is to enable public access to the bucket that stores the files. Enabling public access means that anyone on the internet can fetch these files! However, for some reason, it might feel like this is OK. (I think this is because we have to log in to create and configure a bucket, which gives us the feeling that whatever we do in the cloud is also somehow secure.)

So now everyone is happy that they can access the files (unfortunately, so are the hackers!). Please note that it is perfectly fine to enable public access to a bucket if the files stored in it are not confidential. However, if the files are intended only for a specific audience, then you need a more secure way of providing access to them. We will look into this in a future blog post.
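To make the risk concrete, here is a minimal Python sketch of what "public access" can look like in a bucket policy, and how one might spot it. The bucket name `example-bucket` and the helpers `is_public_statement` and `find_public_statements` are hypothetical names chosen for illustration. The check is deliberately simplified and conservative: it flags every `Allow` statement with a wildcard principal, even if a `Condition` narrows it, and it does not look at ACLs or account-level settings, which also influence whether a bucket is really public.

```python
import json

# Hypothetical example policy; "example-bucket" is a made-up bucket name.
PUBLIC_READ_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",  # "*" means anyone on the internet
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}


def is_public_statement(statement):
    """Return True if an Allow statement names everyone ("*") as principal.

    Simplified assumption: Condition blocks are ignored, so a statement
    that is restricted by a condition is still reported.
    """
    if statement.get("Effect") != "Allow":
        return False
    principal = statement.get("Principal")
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def find_public_statements(policy):
    """Collect every statement in the policy that grants access to everyone."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    return [s for s in statements if is_public_statement(s)]


if __name__ == "__main__":
    for stmt in find_public_statements(PUBLIC_READ_POLICY):
        print("Public statement:", json.dumps(stmt.get("Sid", "<no Sid>")))
```

If a check like this reports anything for a bucket that holds confidential files, that is the moment to stop and reconsider how access is granted.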

Easy access to resources from other AWS services

AWS, or any cloud for that matter, has a rather sophisticated setup for authentication and authorization. I use the term “sophisticated” instead of “complicated” because I think the authentication and authorization mechanism is necessary and good, but not easy to understand and apply.

When we store files on S3 and want to access them from other AWS services such as Lambda or EC2, there are multiple ways to do it. The most common approach is either to attach an appropriate bucket policy to the bucket, or to create an IAM policy and attach it to the corresponding service role. Since these approaches are a bit more sophisticated (for someone who has just started using the cloud) than making the S3 bucket public, one might end up making the bucket publicly accessible. (You might be thinking: why would anyone access my bucket?) So now your AWS service can access the files (provided it can reach the internet), and so can the entire world. (Not sure about your boss, but the hackers will definitely love you!)
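The two approaches mentioned above can be sketched as plain policy documents. The following Python snippet only builds the JSON; it does not call AWS. The bucket name `example-bucket`, the role ARN, the account ID `123456789012`, and both function names are hypothetical placeholders, and the chosen actions (`s3:GetObject`, `s3:ListBucket`) are an assumption about what a read-only service would need.

```python
import json


def build_role_read_policy(bucket_name):
    """Identity-based IAM policy, meant to be attached to a service role
    (for example a Lambda execution role). It has no Principal element
    because the principal is the role it is attached to."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


def build_bucket_policy_for_role(bucket_name, role_arn):
    """Resource-based bucket policy that names one specific role as the
    principal instead of the wildcard "*"."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical role ARN, for illustration only.
    role_arn = "arn:aws:iam::123456789012:role/example-lambda-role"
    print(json.dumps(build_role_read_policy("example-bucket"), indent=2))
    print(json.dumps(build_bucket_policy_for_role("example-bucket", role_arn), indent=2))
```

When the service and the bucket live in the same AWS account, one of the two documents is generally enough. Either way, the point is that the grant names a specific role rather than the whole world.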

Ignorance

The most common reason, which also underlies the other two reasons described above, is not knowing what you are doing. Let’s walk through the journey of a new cloud user, which might help us understand the reasons behind misconfigurations.

Let’s say Mr. X has decided to start using the cloud for his AI startup. He has built a really cool AI app, running on an on-premises server. His app processes images and videos, and he needs a place to store them because they require a lot of storage. As of now, he has only test images and videos, which are not confidential. One of his friends suggests Amazon S3. Mr. X has no idea about the cloud, but he takes his friend’s advice and decides to use S3.

In order to use cloud services, he first needs to create an account. So he signs up at AWS, provides his credit card details, and suddenly he has access to one of the largest tech infrastructures in the world. To be more secure, he also enables multi-factor authentication. He works through some online tutorials, feels confident in his cloud skills, and goes ahead and creates an S3 bucket and uploads all the files. Now, when he tries to access these files from his app, it does not work! He googles the issue (Access Denied) and finds that one way to access his files over the internet is to enable public access. So he does it. Now he can access the files from his app. He is super happy!

Fast forward six months: his app is now live and real customers are using it. Files (including confidential ones) are being uploaded to the AI app, which in turn stores them in the S3 bucket. The same old publicly accessible S3 bucket! I believe you know where I am going with this…

I hope the fictional example above gives us a clear picture of how one might end up misconfiguring an S3 bucket and paying a high price for a seemingly innocent mistake. Therefore, it is very important to get a deeper understanding of the cloud and its services and/or hire a cloud expert.

Thanks for reading! For more such insights into Cloud, K8s and DevOps, follow or add me on LinkedIn.

Karthik Subbarao

Specialist Solutions Architect @ Databricks | Helping customers implement Data & AI solutions | All opinions are my own