S3: Crispy on the outside, tender on the inside. Image credit: Susan Patterson

How to download files that others put in your AWS S3 bucket

At Artificial Industry, we ingest some periodic data from the Vanad Aloha call center platform. Every 5 minutes, CSV files are uploaded to a bucket we own. But how do we grant access to another team to upload to our bucket? I was stumped on this for over a week until I found an important gotcha. In a hurry? Skip to the TL;DR.

Granting access

Permissions for your bucket and its contents can be written in JSON and stored in the Bucket Policy:

Bucket policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/their-user"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}

Where

  • Principal is the user or account that will be granted access.
  • 123456789012 is the account ID of the team uploading files to our bucket.
  • Action lists the operations they are allowed to perform.
  • Resource indicates which bucket the rules apply to.
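
To make the mapping between the JSON keys and the list above concrete, here is a minimal Python sketch that reads a policy document and reports, per statement, who may do what on which resource. The helper name describe_policy is hypothetical, and the sketch assumes every statement names its principal under the "AWS" key, as in the example policy.

```python
import json

# A policy shaped like the example above (account ID and names are placeholders).
POLICY_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:user/their-user"},
      "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}
"""


def describe_policy(policy_json):
    """Return one human-readable line per statement in a bucket policy."""
    lines = []
    for statement in json.loads(policy_json)["Statement"]:
        # Assumption: the principal is given as {"AWS": "<arn>"}.
        principal = statement["Principal"]["AWS"]
        actions = statement["Action"]
        # "Action" may be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        lines.append(
            f'{statement["Effect"]}: {principal} may {", ".join(actions)} '
            f'on {statement["Resource"]}'
        )
    return lines


for line in describe_policy(POLICY_JSON):
    print(line)
```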

Policies for files (Objects in S3-speak) in a bucket are placed in the same bucket policy as policies for the bucket itself:

Adding Object statements to the bucket policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/their-user"
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::my-bucket"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:user/their-user"
      },
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}

Notice the trailing slash and wildcard in the Resource value. This tells AWS we are defining rules for all objects in the bucket. The rule can be made more specific by using a value such as arn:aws:s3:::my-bucket/my-folder/*. Also note that the other team cannot read objects in your bucket (except for the ones they upload themselves). You may add s3:GetObject to the Object actions if desired.
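
The Resource pattern for objects can be sketched in Python as follows. This is an illustration only: the helper names object_resource_arn and upload_statement are hypothetical, and the account ID, user, bucket and folder are the placeholder values used in this article.

```python
import json


def object_resource_arn(bucket, prefix=""):
    """Build the Resource ARN matching every object under an optional prefix."""
    prefix = prefix.strip("/")
    if prefix:
        return f"arn:aws:s3:::{bucket}/{prefix}/*"
    # No prefix: the rule covers all objects in the bucket.
    return f"arn:aws:s3:::{bucket}/*"


def upload_statement(principal_arn, bucket, prefix=""):
    """Statement letting principal_arn upload objects and set their ACLs."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": principal_arn},
        "Action": ["s3:PutObject", "s3:PutObjectAcl"],
        "Resource": object_resource_arn(bucket, prefix),
    }


statement = upload_statement(
    "arn:aws:iam::123456789012:user/their-user", "my-bucket", "my-folder"
)
print(json.dumps(statement, indent=2))
```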

Apply the bucket policy to your bucket by visiting the S3 Management Console, clicking your bucket, selecting the Permissions tab, and then clicking the Bucket Policy button.
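
If you would rather not click through the console, the same policy can probably be attached from code. The sketch below assumes a boto3 S3 client created with credentials of the bucket-owning account and uses its put_bucket_policy call; the helper name apply_bucket_policy and the POLICY variable are hypothetical.

```python
import json

# Placeholder policy; in practice this would be the full policy shown earlier.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/their-user"},
            "Action": ["s3:PutObject", "s3:PutObjectAcl"],
            "Resource": "arn:aws:s3:::my-bucket/*",
        }
    ],
}


def apply_bucket_policy(client, bucket_name, policy):
    """Serialize a policy dict and attach it to the bucket.

    `client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    policy_json = json.dumps(policy)
    # put_bucket_policy replaces any policy already attached to the bucket.
    client.put_bucket_policy(Bucket=bucket_name, Policy=policy_json)
    return policy_json


# Usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# apply_bucket_policy(boto3.client("s3"), "my-bucket", POLICY)
```

Because the call replaces the existing policy as a whole, any statements already on the bucket need to be included in the document you send.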

File ownership

So, someone else puts a file in your bucket. Who decides what happens to the file? You pay the bills for the bucket, so surely you can do whatever you want with it? Wrong! As long as ACLs are enabled on the bucket (newer buckets can turn them off through the Object Ownership setting), an object is owned by the account which uploads it. The bucket owner can always delete objects in their bucket to keep costs under control, but that’s it. Trying to download someone else’s file will simply result in a stone cold Access Denied error.
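
From the bucket owner's side, the refusal shows up as an error on the download call. A minimal sketch, assuming a boto3 S3 client is passed in and that S3 reports the refusal with the error code AccessDenied; the helper name try_download is hypothetical:

```python
def try_download(client, bucket_name, key):
    """Return the object's bytes, or None if S3 refuses with AccessDenied."""
    try:
        response = client.get_object(Bucket=bucket_name, Key=key)
    except client.exceptions.ClientError as error:
        # The error code is carried in the parsed response of the exception.
        if error.response["Error"]["Code"] == "AccessDenied":
            return None
        # Anything else (missing key, throttling, ...) is not our case here.
        raise
    return response["Body"].read()
```

With a real client this would be called as try_download(boto3.client('s3'), 'my-bucket', 'exports/data.csv'). A None result for a file you can see in the bucket listing is a strong hint that the uploader still owns it.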

Confused? The AWS business graphics artists feel your pain.

Allowing them to allow you to download their files

Just like buckets, objects can have their own policies. They are called ACLs though, for Access Control Lists. ACLs appear to be an older, XML-based format that has largely been superseded by the JSON policies, but the term ACL lives on in APIs and documentation.

If you must, you can still use ACLs for buckets, too

The Object ACL allows the Object Owner (the other team uploading objects to your bucket) to grant you, the Bucket owner, full control over their objects. Notice the extra s3:PutObjectAcl action we added to the bucket policy? That’s what this article is all about. It allows them to attach an ACL to the objects they upload. To do this, they have to:

  • Put the objects of interest in our bucket
  • Add an ACL to the objects (which will grant us full access to those objects)

Uploading the files

Luckily, doing that is super easy. Both steps can be done at once by passing an extra argument to the upload call. Here is an example using boto3, the AWS library for Python:

import boto3

client = boto3.client('s3')
local_file_path = '/home/me/data.csv'
bucket_name = 'my-bucket'
bucket_file_path = 'exports/data.csv'

# The canned ACL gives the bucket owner full control over the uploaded object.
client.upload_file(
    local_file_path,
    bucket_name,
    bucket_file_path,
    ExtraArgs={'ACL': 'bucket-owner-full-control'}
)

Here, bucket_file_path is the path in the bucket (the key) to which the file is uploaded. If you want to upload string contents instead of a file on disk, see the documentation for put_object. To apply the ACL to an existing file, see put_object_acl.
And, if you must know, the boto library was named after a freshwater dolphin native to the Amazon river [1].
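
The two alternatives mentioned above, put_object and put_object_acl, could look roughly like this. It is a sketch under the assumption that a boto3 S3 client with the uploading team's credentials is passed in; the helper names upload_text and grant_owner_full_control are hypothetical.

```python
def upload_text(client, bucket_name, key, text):
    """Upload a string as an object and hand the bucket owner full control."""
    return client.put_object(
        Bucket=bucket_name,
        Key=key,
        Body=text.encode("utf-8"),
        ACL="bucket-owner-full-control",
    )


def grant_owner_full_control(client, bucket_name, key):
    """Set the same canned ACL on an object that was uploaded earlier."""
    return client.put_object_acl(
        Bucket=bucket_name,
        Key=key,
        ACL="bucket-owner-full-control",
    )


# Usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("s3")
# upload_text(client, "my-bucket", "exports/data.csv", "id,calls\n1,42\n")
# grant_owner_full_control(client, "my-bucket", "exports/older-data.csv")
```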

TL;DR

  • When someone else uploads a file to your bucket, they have to grant you permission to download it.
  • To do that, you need to grant them permission to set a file’s Access Control List. Add a statement to the bucket’s Policy:
[...]
{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::123456789012:user/their-user"
  },
  "Action": [
    "s3:PutObject",
    "s3:PutObjectAcl"
  ],
  "Resource": "arn:aws:s3:::my-bucket/*"
}
  • They need to set the bucket-owner-full-control ACL on files they upload:
client.upload_file(
    local_file_path,
    bucket_name,
    bucket_file_path,
    ExtraArgs={'ACL': 'bucket-owner-full-control'}
)



About Artificial Industry: We help entrepreneurs change the world by turning their ideas quickly and efficiently into successful online businesses. We do this by creating (data) prototypes and MVPs for our clients.