Beginner’s Guide to AWS: How to Run a Python Script Stored in S3 on EC2

Amelia Tang
Jul 3, 2023


Running Python scripts on our local machines is often convenient and straightforward. However, there are instances where our machines may lack the necessary computing power or storage capacity. In such cases, leveraging cloud services becomes a viable solution.

Recently, I encountered a situation where I needed to run a web scraping Python script that would take hours to complete. To overcome this challenge, I opted to run the script on an AWS EC2 instance.

In this guide, I will provide a concise demonstration on how to upload a Python script to an S3 bucket, execute the script on an EC2 instance, and save the output file back to the S3 bucket for further analysis.


Step 1: Create an S3 Bucket

First, after logging into the AWS Console, search for “IAM.”

IAM stands for Identity and Access Management; it manages access to AWS resources. Before we can create a bucket from code, we need an IAM user whose access keys boto3 can use.

Search for IAM in AWS Console (Image by Author)

Then, click on “Add users” to create a new user.

Add Users in IAM (Image by Author)

To grant the user full access to S3, select “Attach policies directly,” search for “AmazonS3FullAccess,” and attach that policy to the user.

Set permissions in IAM (image by author)
Permissions Policies in IAM (Image by Author)

We can confirm that the new user has been successfully created. Next, we need access keys so boto3 can authenticate as this user. To begin, click on the newly created user.

IAM Users (Image by Author)

Under the “Security credentials” tab, we can find “Create access key.”

Create Access Key in IAM (Image by Author)

Click on “Create access key,” then download the access key ID and secret access key as a .csv file to your local machine. Keep this file handy; we will use both values in the next step.

Access Keys in IAM (Image by Author)

Next, we import useful packages in our local Jupyter Notebook or Jupyter Lab and set the access key and the secret access key as constants.

import boto3

AWS_ACCESS_KEY_ID = 'your-access-key-id'          # paste the access key ID from the .csv you saved in the previous step
AWS_SECRET_ACCESS_KEY = 'your-secret-access-key'  # paste the secret access key from the same file
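
Alternatively, rather than pasting the keys in, you could read them straight from the downloaded file. A minimal sketch, assuming the CSV uses AWS’s usual “Access key ID” and “Secret access key” column headers (adjust the names and file name if yours differ):

import csv

with open('your_accessKeys.csv') as f:  # the .csv downloaded from IAM; your file name may differ
    row = next(csv.DictReader(f))

AWS_ACCESS_KEY_ID = row['Access key ID']          # adjust the header names to match your file
AWS_SECRET_ACCESS_KEY = row['Secret access key']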

Next, we can create the S3 bucket using boto3. (You can also create an S3 bucket through the console without writing code, but we won’t delve into that process here.)

s3_client = boto3.client(
    's3',
    region_name='us-west-2',  # choose the region you want; 'us-west-2' is just an example
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)

location = {'LocationConstraint': 'us-west-2'}

s3_client.create_bucket(
    Bucket='your-own-bucket-name',  # the name of the bucket you want to create
    CreateBucketConfiguration=location,
)
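
To verify from code that the bucket was created, one option is head_bucket, which raises a ClientError when the bucket doesn’t exist or isn’t accessible:

from botocore.exceptions import ClientError

try:
    s3_client.head_bucket(Bucket='your-own-bucket-name')
    print('Bucket is ready.')
except ClientError as e:
    print('Bucket not accessible:', e)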

Step 2: Upload the Python Script

After creating the bucket, we can now upload the Python script:

file_name = 'your-python-script.py'
bucket_name = 'your-own-bucket-name'
object_name = file_name  # set your own object name, or keep the same name as the local file

with open(file_name, 'rb') as f:
    s3_client.upload_fileobj(f, bucket_name, object_name)

Back in the AWS console, navigating to the S3 bucket, we should see the uploaded script. Instead of the coding approach, as the root user or a user with appropriate access, you can also simply click the orange “Upload” button in the upper-right corner of the bucket page.

Upload Files to S3 (Image by Author)
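
If you prefer to verify the upload from code rather than the console, listing the bucket’s contents works too:

response = s3_client.list_objects_v2(Bucket=bucket_name)
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])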

Step 3: Grant EC2 Instance Access to S3 Bucket

First, we need to set up an EC2 instance running Amazon Linux 2023 using the free-tier settings/configurations. The specific configuration of the EC2 instance can be tailored to your requirements, but we won’t delve into the details of that process here.

To grant the EC2 instance access to all the files in the S3 bucket, we navigate to IAM once again and create a role with full S3 access. Alternatively, you can scope the role’s policy to a specific S3 bucket; for the purpose of this demo, we grant full access.

Create Roles in IAM (Image by Author)

When creating the role, we select the appropriate configurations, considering that we will be assigning this role to an EC2 instance.

Trusted Entity Type in IAM (Image by Author)

We also grant full S3 access by attaching the “AmazonS3FullAccess” permission policy.

IAM Role Permissions (Image by Author)

Add a name and description for the new role, then create it:

Name and Description of IAM Role (Image by Author)
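
For reference, the same role setup can also be scripted with boto3. A minimal sketch, assuming credentials with IAM permissions (the demo user above only has S3 access, so it cannot create roles) and the placeholder role name ec2-s3-full-access; note that attaching a role to EC2 requires an instance profile, which the console creates for you behind the scenes:

import json
import boto3

iam = boto3.client('iam')  # must run under credentials that have IAM permissions

# Trust policy that lets EC2 instances assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName='ec2-s3-full-access',  # placeholder name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
    Description='Allows EC2 instances to access S3',
)

# Grant the role full S3 access via the AWS-managed policy
iam.attach_role_policy(
    RoleName='ec2-s3-full-access',
    PolicyArn='arn:aws:iam::aws:policy/AmazonS3FullAccess',
)

# The console creates an instance profile automatically; in code we do it ourselves
iam.create_instance_profile(InstanceProfileName='ec2-s3-full-access')
iam.add_role_to_instance_profile(
    InstanceProfileName='ec2-s3-full-access',
    RoleName='ec2-s3-full-access',
)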

Then, we go back to the running EC2 instance and click “Actions” → “Security” → “Modify IAM role.”

EC2 Modify IAM role (Image by Author)

Click on the arrow, select the IAM role we just created, and then click “Update IAM role.” The EC2 instance now has the necessary access to the S3 bucket.

EC2 Update IAM Role (Image by Author)
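
This step can also be scripted. A sketch with boto3, assuming the instance profile from the role sketch above and a placeholder instance ID:

import boto3

ec2 = boto3.client('ec2', region_name='us-west-2')  # the region where your instance runs

ec2.associate_iam_instance_profile(
    IamInstanceProfile={'Name': 'ec2-s3-full-access'},  # instance profile name (placeholder)
    InstanceId='i-0123456789abcdef0',                   # replace with your instance's ID
)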

Step 4: Running the Python Script in EC2 using the AWS console

First, we need to run the following commands after connecting to the running EC2 instance to perform updates and install the necessary packages required to run the Python script.

Please note that these commands provided are for demonstration purposes. Depending on the specific script you would like to run, you may need to use different commands.

To become the root user:

sudo su 

Run updates (on Amazon Linux 2023, yum is an alias for dnf):

sudo yum update

To check whether Python 3 is already installed (Amazon Linux 2023 ships with it):

python3 --version

Install pip:

yum install python3-pip

Then we can install required packages using pip:

python3 -m pip install package-name

Then, we need to install the AWS CLI.

The installation commands differ by operating system; follow the AWS documentation for the OS you chose for your EC2 instance. Mine is Linux:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

To run the Python Script, we need to first copy the script from the S3 bucket using:

aws s3 cp s3://your-own-bucket-name/your-python-script.py ./your-python-script.py

Please note that in the second path, the dot (.) represents the current directory and should not be omitted.

Then, we can finally execute the Python script:

python3 your-python-script.py

If the Python script generates an output file that needs to be copied to the S3 bucket, you can use the following command as an example, assuming the file format is .csv:

aws s3 cp ./output_file.csv s3://your-own-bucket-name/output_file.csv
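
Back on your local machine, you can then pull the output down from S3 for analysis, for example with boto3:

s3_client.download_file('your-own-bucket-name', 'output_file.csv', 'output_file.csv')  # (bucket, key, local file name)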

Conclusions

Congratulations! You’ve reached the end of this demo. I hope that sharing my personal learning experience with you has been helpful. Best of luck on your journey of learning Data Science! It’s truly a captivating field that will surely keep us engaged and fascinated.

References:

Install or update the latest version of the AWS CLI, AWS Documentation, https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html

Mohammed Murtuza Qureshi, Working with AWS S3 Buckets using Python & boto3 [MOOC], Coursera, https://www.coursera.org/projects/working-with-aws-s3-buckets-using-python-boto3
