Get keys inside an S3 bucket at the subfolder level: Python

Using the Prefix parameter in boto3, we will extract all the keys of an S3 bucket at the subfolder level.

Aman Ranjan Verma
Towards Data Engineering


Welcome to our blog post on how to retrieve keys within a bucket at the subfolder level using Python. In this tutorial, we will show you how to use the Python boto3 library to access the contents of a bucket on Amazon S3, including all subfolders within the bucket.


Amazon S3 is a popular cloud storage service offered by Amazon Web Services (AWS). It allows users to store and retrieve data from anywhere on the internet, making it a good fit for storing and organizing large amounts of data. If you are working with S3 from Python, boto3, the AWS SDK for Python, is the library to use.

By the end of this tutorial, you will have a good understanding of how to retrieve keys for files within a specific subfolder or all subfolders within an S3 bucket using Python and the boto3 library. Let’s get started!

Example: 1

How do we get all keys inside the bucket if the number of objects is ≤ 1000?

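import boto3

client = boto3.client('s3')
# A single list_objects_v2 call returns at most 1000 objects.
response = client.list_objects_v2(Bucket='mybucket')
# 'Contents' is missing when the bucket is empty, so default to an empty list.
for content in response.get('Contents', []):
    print(content['Key'])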
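
The single call above stops at 1000 objects and looks at the whole bucket. For larger buckets, or when only one subfolder matters, one option is to combine boto3's paginator for list_objects_v2 with the Prefix parameter. The snippet below is a minimal sketch under those assumptions, not necessarily the approach the rest of this post takes; the helper name iter_keys and the bucket and prefix values are hypothetical.

```python
def iter_keys(client, bucket, prefix=""):
    """Yield every object key in `bucket` whose key starts with `prefix`.

    `client` is expected to be a boto3 S3 client (boto3.client('s3')).
    `iter_keys` is a hypothetical helper name used for illustration.
    """
    # The paginator sends follow-up requests with the continuation token,
    # so results are not capped at the 1000 keys of a single response.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when nothing matches the prefix.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage (needs AWS credentials; bucket and prefix are placeholders):
#   import boto3
#   client = boto3.client("s3")
#   for key in iter_keys(client, "mybucket", prefix="folder/subfolder/"):
#       print(key)
```

S3 has no real folders: a "subfolder" is just a shared key prefix, so passing a prefix such as folder/subfolder/ returns every key below that level, including keys in deeper subfolders.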
