Control AWS S3 using Boto3

Introduction :

SivaraamTK
featurepreneur
10 min read · Jul 24, 2022


Boto3 is the Python SDK for AWS. It lets us do the same things we do in the AWS Console, and more, in a faster, repeatable, and automated way. It allows you to create, update, and delete AWS resources directly from your Python scripts. I have written a detailed article about the Boto3 module and more ways to use it; be sure to check it out before reading this one.

One of the core components of AWS is Amazon Simple Storage Service (Amazon S3), the object storage service offered by AWS. With its impressive availability and durability, it has become the standard way to store videos, images, and data, and it is commonly used for data analytics applications, machine learning, websites, and much more. You can combine S3 with other services to build highly scalable applications.

Using the Boto3 library with Amazon Simple Storage Service (S3) allows you to create, update, and delete S3 Buckets, Objects, S3 Bucket Policies, and many more from Python programs or scripts with ease.

Prerequisites :

  • AWS Account Credentials (Access key, Secret key)
  • IAM User with full access to S3
  • Install AWS CLI and configure
  • python3
  • Install boto3

Installation :

To install the AWS CLI, run the install command for your platform in your terminal (for example, `pip install awscli` installs version 1 of the CLI).

Similarly, to install Boto3, run `pip install boto3` in your terminal.

Configuration :

To configure the AWS environment, run `aws configure` in your terminal.

This command prompts you for the information needed to connect to your AWS account. For the Access key and Secret key, enter the AWS Access Key ID and AWS Secret Access Key of the IAM User with the required permissions. For the Default region name, enter the region of the bucket you want to access. If you haven’t created a bucket yet, use “us-east-1”. For the Default output format, enter “json”.

Alternatively, you can also pass this information as parameters to boto3.client().
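
As a minimal sketch of that approach, the helper below (a hypothetical name, not part of Boto3) collects the keyword arguments that boto3.client() accepts for credentials and region. It assumes the keys live in the standard AWS environment variables rather than in the script itself.

```python
import os


def s3_client_kwargs(env=os.environ):
    """Build keyword arguments for boto3.client("s3", ...) from environment variables."""
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        # Fall back to us-east-1 when no default region is set.
        "region_name": env.get("AWS_DEFAULT_REGION", "us-east-1"),
    }


# Usage (requires boto3 and the variables above to be set):
#   import boto3
#   s3_client = boto3.client("s3", **s3_client_kwargs())
```

Boto3 also reads these environment variables on its own, so passing them explicitly is mainly useful when you need to switch between several sets of credentials.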

**NOTE: Storing your AWS credentials in your scripts is not secure, and you should never do it. You can set them as environment variables or load them from a `.env` file, but even keeping the AWS Access and Secret Keys in a plain-text file is not very secure. A better option is to keep them in an encrypted store, for example, aws-vault.

Create an S3 bucket using Boto3 :

To create an Amazon S3 Bucket using the Boto3 library, you can use the create_bucket() method of either the S3 client or the S3 resource.

**NOTE: Every Amazon S3 Bucket must have a unique name. Moreover, this name must be unique across all AWS accounts and customers.

Creating S3 Bucket using Boto3 client:
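
A minimal sketch of the client call, assuming placeholder bucket and region names. The client is passed in as an argument so the function is easy to test; create_bucket_in_region is a hypothetical wrapper name.

```python
def create_bucket_in_region(s3_client, bucket_name, region):
    """Create a bucket in the given region using an S3 client."""
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be sent as a LocationConstraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_client.create_bucket(**params)


# Usage (requires boto3 and configured credentials; names are placeholders):
#   import boto3
#   client = boto3.client("s3", region_name="eu-west-1")
#   create_bucket_in_region(client, "my-example-bucket-name", "eu-west-1")
```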

**NOTE: To avoid various exceptions while working with the Amazon S3 service, we strongly recommend you define a specific AWS Region for both the Boto3 client and the S3 Bucket configuration.

Similarly, you can use the Boto3 resource to create an Amazon S3 bucket:
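
The resource variant might look like this. It is a sketch under the same assumptions (placeholder names, resource object passed in), and it returns the Bucket resource that create_bucket() gives back.

```python
def create_bucket_with_resource(s3_resource, bucket_name, region):
    """Create a bucket through the S3 resource and return the Bucket object."""
    params = {"Bucket": bucket_name}
    # As with the client, us-east-1 must not be sent as a LocationConstraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_resource.create_bucket(**params)


# Usage (requires boto3 and configured credentials; names are placeholders):
#   import boto3
#   resource = boto3.resource("s3", region_name="eu-west-1")
#   bucket = create_bucket_with_resource(resource, "my-example-bucket-name", "eu-west-1")
```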

Listing Amazon S3 Buckets using Boto3 :

There are two ways of listing Amazon S3 Buckets: with the S3 client or with the S3 resource.

Here’s an example of listing existing S3 Buckets using the S3 client:
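
A short sketch, assuming an S3 client is passed in. list_buckets() returns a dictionary whose "Buckets" entry holds one record per bucket.

```python
def list_bucket_names(s3_client):
    """Return the names of all buckets visible to the client's credentials."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]


# Usage (requires boto3 and configured credentials):
#   import boto3
#   print(list_bucket_names(boto3.client("s3")))
```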

Here’s an example of listing existing S3 Buckets using the S3 resource:
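
With the resource, the same listing can be sketched through the buckets collection (the resource object is assumed to be passed in):

```python
def list_bucket_names_with_resource(s3_resource):
    """Return bucket names by iterating the resource's buckets collection."""
    return [bucket.name for bucket in s3_resource.buckets.all()]


# Usage (requires boto3 and configured credentials):
#   import boto3
#   print(list_bucket_names_with_resource(boto3.resource("s3")))
```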

Deleting Amazon S3 Bucket using Boto3 :

There are two possible ways of deleting an Amazon S3 Bucket using the Boto3 library: with the S3 client or with the S3 resource.

Here’s an example of deleting the Amazon S3 bucket using the Boto3 client:
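
A minimal sketch, assuming the bucket is already empty and the client is passed in (the bucket name is a placeholder):

```python
def delete_bucket(s3_client, bucket_name):
    """Delete an (empty) bucket using the S3 client."""
    return s3_client.delete_bucket(Bucket=bucket_name)


# Usage (requires boto3 and configured credentials):
#   import boto3
#   delete_bucket(boto3.client("s3"), "my-example-bucket-name")
```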

Here’s an example of deleting the Amazon S3 bucket using the Boto3 resource:
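
The resource version goes through a Bucket object. This is again only a sketch with a placeholder name, and it assumes the bucket is empty.

```python
def delete_bucket_with_resource(s3_resource, bucket_name):
    """Delete an (empty) bucket through its Bucket resource."""
    s3_resource.Bucket(bucket_name).delete()


# Usage (requires boto3 and configured credentials):
#   import boto3
#   delete_bucket_with_resource(boto3.resource("s3"), "my-example-bucket-name")
```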

Deleting non-empty S3 Bucket using Boto3 :

To delete a non-empty S3 Bucket using the Boto3 library, you first have to empty it. Otherwise, the delete call fails with a BucketNotEmpty error, which Boto3 raises as a ClientError. The cleanup operation requires deleting all S3 Bucket objects and their versions:
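
One way to sketch the cleanup uses the batch delete() actions on the Bucket resource's collections. The function name is hypothetical, and the bucket object is assumed to come from boto3.resource("s3").Bucket(name).

```python
def empty_and_delete_bucket(bucket):
    """Delete every object version and object in `bucket`, then the bucket itself."""
    # Remove all object versions and delete markers first.
    bucket.object_versions.delete()
    # Remove any objects that are still listed (a no-op if nothing is left).
    bucket.objects.all().delete()
    # The bucket is now empty and can be deleted.
    bucket.delete()


# Usage (requires boto3 and configured credentials; this is destructive):
#   import boto3
#   empty_and_delete_bucket(boto3.resource("s3").Bucket("my-example-bucket-name"))
```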

Uploading a file to S3 Bucket using Boto3 :

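The Boto3 library has two ways for uploading files and objects into an S3 Bucket: upload_file(), which uploads a file from the local filesystem, and upload_fileobj(), which uploads a binary file-like object.

The upload_file() method requires the following arguments (named Filename, Bucket, and Key in the Boto3 API):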

  • file_name — filename on the local filesystem
  • bucket_name — the name of the S3 bucket
  • object_name — the name (key) of the uploaded object in the bucket (usually the same as file_name)

Here’s an example of uploading a file to an S3 Bucket:
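
A minimal sketch along those lines. The S3 client is passed in as the first argument (an assumption made here for testability), and the file and bucket names in the usage comment are placeholders.

```python
import pathlib

# Directory containing this script (falls back to the working directory in a REPL).
BASE_DIR = pathlib.Path(__file__).resolve().parent if "__file__" in globals() else pathlib.Path.cwd()


def upload_files(s3_client, file_name, bucket, object_name=None, extra_args=None):
    """Upload a file located next to this script; object_name defaults to file_name."""
    if object_name is None:
        object_name = file_name
    s3_client.upload_file(str(BASE_DIR / file_name), bucket, object_name, ExtraArgs=extra_args)


# Usage (requires boto3 and configured credentials):
#   import boto3
#   upload_files(boto3.client("s3"), "data.txt", "my-example-bucket-name")
```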

We’re using the pathlib module to get the script location path and save it to the BASE_DIR variable. Then, we’re creating the upload_files() function, which is responsible for calling the S3 client and uploading the file.

Uploading multiple files to the S3 bucket :

To upload multiple files to the Amazon S3 bucket, you can use the glob() function from the glob module. It returns all file paths that match a given pattern as a Python list, so you can select files with a search pattern that contains a wildcard character:
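
For instance, a pattern such as "*.txt" selects every text file in a directory. The sketch below (hypothetical function name, client passed in) uploads each match under its base file name.

```python
import glob
import os


def upload_matching_files(s3_client, pattern, bucket):
    """Upload every local file matching a glob pattern, keyed by its base name."""
    uploaded = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue  # glob can also match directories; skip them
        key = os.path.basename(path)
        s3_client.upload_file(path, bucket, key)
        uploaded.append(key)
    return uploaded


# Usage (requires boto3 and configured credentials):
#   import boto3
#   upload_matching_files(boto3.client("s3"), "reports/*.txt", "my-example-bucket-name")
```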

Uploading generated file object data to S3 Bucket using Boto3 :

If you need to upload file object data to the Amazon S3 Bucket, you can use the upload_fileobj() method. This method is useful when you generate file content in memory and want to upload it to S3 without saving it on the file system.

**NOTE: the upload_fileobj() method requires opening a file in binary mode.

Here’s an example of uploading a generated file to the S3 Bucket:
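
A sketch that generates the content in memory with io.BytesIO, which already behaves like a file opened in binary mode. The function name and the sample key are assumptions for illustration.

```python
import io


def upload_generated_text(s3_client, bucket, key, text):
    """Encode text in memory and upload it without touching the local file system."""
    buffer = io.BytesIO(text.encode("utf-8"))  # binary file-like object
    s3_client.upload_fileobj(buffer, bucket, key)


# Usage (requires boto3 and configured credentials):
#   import boto3
#   upload_generated_text(boto3.client("s3"), "my-example-bucket-name", "hello.txt", "Hello, S3!")
```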

Enabling S3 Server-Side Encryption (SSE-S3) for uploaded objects :

You can use S3 Server-Side Encryption (SSE-S3) to protect your data in Amazon S3. We will use server-side encryption with S3-managed keys, which uses the AES-256 algorithm:
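
With upload_file(), the encryption setting is passed through ExtraArgs. A minimal sketch, assuming placeholder names and a client supplied by the caller:

```python
def upload_encrypted(s3_client, file_path, bucket, key):
    """Upload a file and ask S3 to encrypt it at rest with SSE-S3 (AES-256)."""
    s3_client.upload_file(
        file_path,
        bucket,
        key,
        ExtraArgs={"ServerSideEncryption": "AES256"},
    )


# Usage (requires boto3 and configured credentials):
#   import boto3
#   upload_encrypted(boto3.client("s3"), "report.csv", "my-example-bucket-name", "report.csv")
```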

Getting a list of files from S3 Bucket:

The most convenient way to get a list of files from an S3 Bucket using Boto3 is the objects.all() method of the Bucket resource:
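
A short sketch, assuming a Bucket resource is passed in. Each item in the collection is an object summary with a key attribute.

```python
def list_object_keys(bucket):
    """Return the key of every object in a boto3 Bucket resource."""
    return [obj.key for obj in bucket.objects.all()]


# Usage (requires boto3 and configured credentials):
#   import boto3
#   print(list_object_keys(boto3.resource("s3").Bucket("my-example-bucket-name")))
```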

Otherwise, we can use the list_objects_v2() method of the S3 client:
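
Because a single list_objects_v2() response is limited to 1,000 keys, the sketch below goes through the client's paginator. The function name is hypothetical and the client is passed in.

```python
def list_keys(s3_client, bucket_name):
    """Collect all object keys in a bucket using the list_objects_v2 paginator."""
    keys = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # "Contents" is absent when a page (or the whole bucket) has no objects.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


# Usage (requires boto3 and configured credentials):
#   import boto3
#   print(list_keys(boto3.client("s3"), "my-example-bucket-name"))
```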

Filtering results of S3 list operation using Boto3 :

If you need a list of S3 objects whose keys start with a specific prefix, you can use the .filter() method:
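
A minimal sketch with the Bucket resource's objects collection; the prefix in the usage comment is a placeholder.

```python
def list_keys_with_prefix(bucket, prefix):
    """Return the keys in a boto3 Bucket resource that start with `prefix`."""
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]


# Usage (requires boto3 and configured credentials):
#   import boto3
#   bucket = boto3.resource("s3").Bucket("my-example-bucket-name")
#   print(list_keys_with_prefix(bucket, "logs/"))
```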

Downloading file object from S3 Bucket :

You can use the download_file() method to download the S3 object to your local file system:
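
A sketch of the call, assuming a client passed in and placeholder names. Note that the argument order is bucket, key, then the local file name.

```python
def download_object(s3_client, bucket_name, key, local_path):
    """Download one S3 object to a local file and return the local path."""
    s3_client.download_file(bucket_name, key, local_path)
    return local_path


# Usage (requires boto3 and configured credentials):
#   import boto3
#   download_object(boto3.client("s3"), "my-example-bucket-name", "data.txt", "data_copy.txt")
```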

Reading files from the S3 bucket into memory :
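
One way to read an object straight into memory, without writing it to disk, is the S3 client's get_object() call. The sketch below assumes a UTF-8 text object and a client supplied by the caller; the function name is hypothetical.

```python
def read_object_text(s3_client, bucket_name, key, encoding="utf-8"):
    """Fetch an S3 object and return its content as a string."""
    response = s3_client.get_object(Bucket=bucket_name, Key=key)
    # response["Body"] is a streaming object; read() pulls the whole payload into memory.
    return response["Body"].read().decode(encoding)


# Usage (requires boto3 and configured credentials):
#   import boto3
#   text = read_object_text(boto3.client("s3"), "my-example-bucket-name", "data.txt")
```

For large objects it may be better to read the body in chunks instead of calling read() once.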

Deleting S3 objects using Boto3 :

To delete an object from Amazon S3 Bucket, you need to call the delete() method of the object instance representing that object:
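
A minimal sketch, assuming the S3 resource is passed in and the names are placeholders:

```python
def delete_object(s3_resource, bucket_name, key):
    """Delete a single object through its Object resource."""
    s3_resource.Object(bucket_name, key).delete()


# Usage (requires boto3 and configured credentials):
#   import boto3
#   delete_object(boto3.resource("s3"), "my-example-bucket-name", "data.txt")
```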

Renaming S3 file object using Boto3:

There’s no single API call to rename an S3 object. So, to rename an S3 object, you need to copy it to a new object with a new name and then delete the old object:
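
The copy-then-delete sequence can be sketched with the Object resource's copy_from() and delete() methods. rename_object is a hypothetical helper, and the keys in the usage comment are placeholders.

```python
def rename_object(s3_resource, bucket_name, old_key, new_key):
    """Rename an object by copying it to new_key and deleting old_key."""
    # Step 1: server-side copy into the new key.
    s3_resource.Object(bucket_name, new_key).copy_from(
        CopySource={"Bucket": bucket_name, "Key": old_key}
    )
    # Step 2: remove the original only after the copy call has returned.
    s3_resource.Object(bucket_name, old_key).delete()


# Usage (requires boto3 and configured credentials):
#   import boto3
#   rename_object(boto3.resource("s3"), "my-example-bucket-name", "old.txt", "new.txt")
```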

Copying file objects between S3 buckets using Boto3:

To copy file objects between S3 buckets using Boto3, you can use the copy_from() method.
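
As a sketch, copy_from() is called on the destination object and pointed at the source through CopySource. The function name and the bucket names are assumptions for illustration.

```python
def copy_object(s3_resource, src_bucket, src_key, dst_bucket, dst_key=None):
    """Copy an object to another bucket, keeping the key unless dst_key is given."""
    if dst_key is None:
        dst_key = src_key
    s3_resource.Object(dst_bucket, dst_key).copy_from(
        CopySource={"Bucket": src_bucket, "Key": src_key}
    )
    return dst_key


# Usage (requires boto3 and configured credentials):
#   import boto3
#   copy_object(boto3.resource("s3"), "source-bucket", "data.txt", "destination-bucket")
```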

Creating S3 Bucket Policy using Boto3 :

To specify requirements, conditions, or restrictions for accessing the Amazon S3 Bucket, you have to use Amazon S3 Bucket Policies.

Let’s use the Boto3 library to attach a policy to the S3 bucket:
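
As an illustration, the sketch below attaches a policy that denies any request not made over HTTPS. This particular policy is an assumption chosen for the example; put_bucket_policy() itself just expects the policy document as a JSON string.

```python
import json


def build_https_only_policy(bucket_name):
    """Return an example policy document that denies non-HTTPS access to the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


def put_policy(s3_client, bucket_name, policy):
    """Attach a policy (given as a dict) to the bucket."""
    s3_client.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))


# Usage (requires boto3 and configured credentials):
#   import boto3
#   name = "my-example-bucket-name"
#   put_policy(boto3.client("s3"), name, build_https_only_policy(name))
```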

Deleting S3 Bucket Policy using Boto3 :

To delete the S3 Bucket Policy, you can use the delete_bucket_policy() method of the S3 client:
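
A minimal sketch, assuming the client is passed in and the bucket name is a placeholder:

```python
def delete_policy(s3_client, bucket_name):
    """Remove the bucket policy attached to the bucket."""
    return s3_client.delete_bucket_policy(Bucket=bucket_name)


# Usage (requires boto3 and configured credentials):
#   import boto3
#   delete_policy(boto3.client("s3"), "my-example-bucket-name")
```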

Generating S3 presigned URL using Boto3:

If you need to share files from a non-public Amazon S3 Bucket without granting the end user access to AWS APIs, you can create a pre-signed URL to the Bucket Object:

The S3 client’s generate_presigned_url() method accepts the following parameters:

  • ClientMethod (string) — The Boto3 S3 client method to presign for
  • Params (dict) — The parameters to pass to the ClientMethod
  • ExpiresIn (int) — The number of seconds the presigned URL is valid for. By default, the presigned URL expires in an hour (3600 seconds)
  • HttpMethod (string) — The HTTP method to use for the generated URL. By default, the HTTP method is whatever is used in the method’s model
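
Putting these parameters together, a download link could be sketched as follows. The function name is hypothetical, and "get_object" is chosen as the ClientMethod because the goal here is sharing a file for download.

```python
def presigned_download_url(s3_client, bucket_name, key, expires_in=3600):
    """Return a time-limited URL that lets anyone holding it download the object."""
    return s3_client.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": bucket_name, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )


# Usage (requires boto3 and configured credentials):
#   import boto3
#   url = presigned_download_url(boto3.client("s3"), "my-example-bucket-name", "data.txt", 600)
```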

Enabling S3 Bucket versioning using Boto3 :

S3 Bucket versioning allows you to keep track of the S3 Bucket object’s versions over time. It also safeguards against accidental object deletion. By default, Boto3 retrieves the most recent version of a versioned object on request. Every version is stored in full, so an object takes up the combined size of all its versions; e.g., a 2MB file with 5 versions will take up 10MB of storage. To enable versioning for the S3 Bucket, you can call the enable() method of the bucket’s BucketVersioning resource:
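
A minimal sketch using the S3 resource's BucketVersioning object. The wrapper name enable_versioning is an assumption, and the resource is passed in by the caller.

```python
def enable_versioning(s3_resource, bucket_name):
    """Turn on versioning for the bucket and return its BucketVersioning resource."""
    versioning = s3_resource.BucketVersioning(bucket_name)
    versioning.enable()
    return versioning


# Usage (requires boto3 and configured credentials):
#   import boto3
#   versioning = enable_versioning(boto3.resource("s3"), "my-example-bucket-name")
#   print(versioning.status)
```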

Conclusion :

One of the things I always wished I knew before working on S3 using Boto3 is that S3 is object storage: it doesn’t have a real directory structure. The “/” in an object key is rather cosmetic; it is an ordinary part of the key that tools such as the AWS Console use as a delimiter to simulate a simple file system, so a “folder” is really just a shared key prefix. If you wish to explore more functionalities of Boto3 for S3, check the Boto3 S3 documentation. And I guess that’s all for now. !HAPPY-CODING!

