Pro-Tip: Store Sensitive Data in Amazon S3 (and retrieve it from within a Django application)

Deep Space Program
Dec 28, 2018


This article will cover how to store sensitive information in an Amazon S3 bucket and retrieve it at runtime within a Django application, running on an EC2 instance deployed via Amazon Elastic Beanstalk.

Why?

Basic security practices have come a long way in the last decade of web development. It is standard practice to keep sensitive data out of source code at all costs. Yet, I have personally consulted for a number of companies who continue to store secrets in their source, making their data available to anyone with even temporary access to their repositories. Putting your own company’s information at risk is obviously frowned upon. But storing sensitive information such as database passwords, secret keys, etc. in places that are visible to potentially harmful third parties is blatant negligence, and can jeopardize the privacy and safety of users. Perpetuating careless practices surrounding sensitive data storage puts all of us at risk in the long run.

In light of this, we’re here to discuss how to store sensitive information in an Amazon S3 bucket and retrieve it from within your application. While this method is not perfect, it’s certainly an improvement on the all-too-common practice of storing secrets in environment variables––which are discoverable from the environment management console, and can be viewed by any user that has permission to describe configuration settings on your environment. It’s also possible that these variables appear within your instance logs, depending on which platform you’ve chosen to run your application. Long story short, they’re not safe.

Let’s get started.

How?

1. Upload Information to Amazon S3

Our first step is to create an Amazon S3 bucket for holding the files that store our application-specific secrets. You may choose to implement this storage mechanism across several projects, so we recommend naming your bucket something akin to “yourcompanyname-app-config”, where yourcompanyname is…drumroll please…your company’s name.

If you’re not familiar with how to create an Amazon S3 bucket, you should probably pause here for a moment, and go read about them.

Once you have your bucket, open a new file in your favorite text editor. This is where we’re going to store the sensitive information we’d like to keep out of environment variables but need within our application. We store our information in a single JSON dictionary, within a file that we aptly name “appname-app-config.json”. We will ultimately 1) retrieve this .json file from S3 when our application is deployed, 2) parse the file’s contents, and 3) use our variables where they’re needed within our application.

IMPORTANT: We create this .json file in the root directory of our project. BUT we make sure to add it to the .gitignore (and all the other necessary “ignore” files). This ensures that the file is NEVER committed to source and the secrets are never exposed.
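
For reference, the corresponding .gitignore entry is just the filename itself:

# .gitignore
appname-app-config.json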

Below is an example file. Its JSON dictionary contains some of the necessary information for connecting to a PostgreSQL database via Django. If you have experience with PostgreSQL and Django, these variables will be self-explanatory. If not, you’re welcome to go check out Django’s documentation on using a PostgreSQL database.

# appname-app-config.json
{
  "DB_NAME": "your_database_name",
  "DB_ENGINE": "django.db.backends.postgresql",
  "DB_USER_NAME": "your_user_name",
  "DB_PASSWORD": "your_password",
  "DB_HOST": "your_database_host"
}

So, to summarize the above process in a few simple steps:

Create an Amazon S3 bucket

  1. Open the Amazon S3 console.
  2. Choose Create Bucket.
  3. Type a Bucket Name, and then choose a Region.
  4. Choose Create.
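
If you’d rather script this step than click through the console, here’s a minimal boto3 sketch under a couple of assumptions: your AWS credentials are already configured, and the bucket name and region below are placeholders you’ll swap for your own.

# A minimal boto3 sketch for creating the bucket programmatically.
# The bucket name and region are placeholders; replace them with your own.
import boto3

s3 = boto3.client("s3", region_name="us-west-2")
s3.create_bucket(
    Bucket="yourcompanyname-app-config",
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)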

Create a JSON file locally with your application’s secrets

  1. touch appname-app-config.json
  2. Add a JSON dictionary to the file & store your secrets as key-value pairs.
# appname-app-config.json
{
  "SOME_KEY": "some_value",
  "SOME_OTHER_KEY": "some_other_value"
}

Upload your JSON file to the newly created S3 bucket

  1. Open the bucket via the console, and then choose Upload.
  2. Follow the prompts to upload the file.
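
Likewise, if you’d prefer to upload the file from a script rather than through the console, a boto3 sketch along these lines should work (the file, bucket, and key names are placeholders):

# Upload the local secrets file to the bucket created above.
# File, bucket, and key names are placeholders; adjust them to your setup.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    "appname-app-config.json",     # local file
    "yourcompanyname-app-config",  # bucket
    "appname-app-config.json",     # object key
)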

Now, let’s see how to access this file and its information from within your Django application.

2. Retrieve Secrets From Within Your Application

In order to retrieve the JSON file and parse the dictionary it contains, you’ll first need to manage some permissions. Per Amazon’s documentation:

“Your account owns the file and has permission to manage it, but IAM users and roles do not unless you grant them access explicitly. Grant the instances in your Elastic Beanstalk environment access to the file by adding a policy to the instance profile.”

The default instance profile for an environment created by Elastic Beanstalk is aws-elasticbeanstalk-ec2-role. If you’re unsure of what your instance profile is named, you can find it on the Configuration page in the environment management console.
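
If you’d rather look the instance profile up programmatically, something like the following boto3 sketch should do it; the application and environment names are placeholders, and this assumes your local credentials are allowed to describe the environment.

# Look up the IamInstanceProfile option on an Elastic Beanstalk environment.
# ApplicationName and EnvironmentName are placeholders for your own values.
import boto3

eb = boto3.client("elasticbeanstalk", region_name="us-west-2")
response = eb.describe_configuration_settings(
    ApplicationName="appname",
    EnvironmentName="appname-env",
)
for option in response["ConfigurationSettings"][0]["OptionSettings"]:
    if option["OptionName"] == "IamInstanceProfile":
        print(option.get("Value"))  # e.g. "aws-elasticbeanstalk-ec2-role"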

In order to add permissions to the instance profile, you’ll need to complete the following steps:

  1. Open the IAM console.
  2. Choose Roles.
  3. Choose aws-elasticbeanstalk-ec2-role.
  4. Under Inline Policies, choose Create Role Policy. Choose Custom Policy.
  5. Add a policy that allows the instance to retrieve the file.
# An IAM Role Policy which grants your application "Get" access to the
# JSON file containing your application secrets.
# IMPORTANT: Replace the bucket and object names with the names of your
# particular bucket and object.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "database",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::my-secret-bucket/appname-app-config.json"
      ]
    }
  ]
}

6. This last step differs depending on the configuration of your Elastic Beanstalk environment. Continue reading below…

Fork in the Road

At this stage, the path ahead forks depending on how your application is deployed to Elastic Beanstalk. If you’ve configured your application with a Docker Environment or a Multicontainer Docker Environment, then you’ll follow the steps immediately below. Otherwise, you’ll want to skip ahead to the section titled ‘Option 2: Non-Docker Environment’ and proceed from there.

Option 1: Docker & Multicontainer Docker Environment

If you’ve configured your application with one or multiple Docker containers, you’ll need to follow a slightly different pattern than if you had configured it using a default Elastic Beanstalk platform.

When using Docker, our application runs within an isolated container. Therefore, we can’t rely on the standard process, which uses a configuration file located within the .ebextensions directory to tell Elastic Beanstalk to download the file at deploy time. Rather, in order to have access to the JSON file from within the container’s filesystem, we’re going to install the AWS CLI onto our Docker image and use it from within our Docker environment to download the file.

What you’ll need to do is install the AWS CLI via your Python package manager. We use Pipenv.

pipenv install awscli

Now, assuming you’ve set up your Docker image to install all the packages specified in your Pipfile (or your requirements.txt if you’re using pip), you can be sure that the AWS CLI is available within your Python environment in your image. Next, we edit our production Docker ‘entrypoint.sh’ script to download the JSON file from S3 to a local file in the ‘tmp’ directory, so that it’s available to our application at runtime.

# ~/entrypoint.prod.sh
...
aws s3 cp s3://my-secret-bucket/appname-app-config.json ./tmp/appname-app-config.json
...

If you followed the steps in the previous section and added the policy as instructed, this command should run successfully when the container is started. The AWS CLI will automatically recognize the instance profile of the EC2 instance on which it’s running and download the file from S3 without a hitch.
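
As an aside, if you’d rather not install the AWS CLI in your image at all, the same download can be done in Python with boto3, which resolves credentials from the instance profile just like the CLI does. This is simply an alternative sketch under the same assumptions (placeholder bucket and key names, and the ./tmp destination our settings file reads from):

# Alternative to the `aws s3 cp` entrypoint step: fetch the secrets file with boto3.
# Credentials are resolved from the EC2 instance profile, as with the CLI.
import os
import boto3

os.makedirs("./tmp", exist_ok=True)  # ensure the destination directory exists
s3 = boto3.client("s3")
s3.download_file(
    "my-secret-bucket",               # bucket (placeholder)
    "appname-app-config.json",        # object key (placeholder)
    "./tmp/appname-app-config.json",  # local destination read by our settings
)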

You can now skip ahead to the final section ‘Retrieving Secrets Within Django’.

Option 2: Non-Docker Environment

If you don’t have one already, create an .ebextensions directory in the root directory of your project. Add a configuration file to this directory which tells Elastic Beanstalk to download the file from Amazon S3 during deployment.

# ~/appname/.ebextensions/app.config
Resources:
  AWSEBAutoScalingGroup:
    Metadata:
      AWS::CloudFormation::Authentication:
        S3Auth:
          type: "s3"
          buckets: ["my-secret-bucket"]
          roleName: "aws-elasticbeanstalk-ec2-role"

files:
  "/tmp/appname-app-config.json":
    mode: "000644"
    owner: root
    group: root
    authentication: "S3Auth"
    source: https://s3-us-west-2.amazonaws.com/my-secret-bucket/appname-app-config.json

What this configuration file accomplishes is twofold. First, using the Resources key, we add an authentication method to our environment's Auto Scaling group metadata that Elastic Beanstalk can use to access Amazon S3. Second, the files key tells Elastic Beanstalk to download the file from Amazon S3 and store it locally in the /tmp/ directory.

If you’ve configured the permissions correctly, your deployment will succeed and the file will be downloaded to all of the instances in your environment. If not, the deployment will fail.

Retrieving Secrets Within Django

Regardless of which route you followed above, as the final step, you’ll need a way to retrieve the secret values from the JSON file from within your Django application. What we do at Deep Space Program is something like:

# /appname/settings/production.py
...
import json
...
# Parse the JSON file and retrieve our settings.
SETTINGS = None
with open('./tmp/appname-app-config.json') as f:
    SETTINGS = json.load(f)

# Use the settings to connect to our DB:
DATABASES = {
    'default': {
        'ENGINE': SETTINGS["DB_ENGINE"],
        'NAME': SETTINGS["DB_NAME"],
        'USER': SETTINGS["DB_USER_NAME"],
        'PASSWORD': SETTINGS["DB_PASSWORD"],
        'HOST': SETTINGS["DB_HOST"],
        'PORT': '5432',
    }
}

We open the configuration file and parse its contents using the built-in json module. This allows us to use the respective settings to establish a connection to our database.
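
If you want this step to fail a little more loudly, a small helper along the following lines can raise a clear error when the file is missing or malformed, rather than surfacing later as a confusing KeyError. This is our own defensive variation, not part of the setup described above; the path matches the download location used earlier.

# A hypothetical, more defensive loader for the secrets file.
# CONFIG_PATH matches the download location used earlier; adjust as needed.
import json
from django.core.exceptions import ImproperlyConfigured

CONFIG_PATH = "./tmp/appname-app-config.json"

def load_app_config(path=CONFIG_PATH):
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        raise ImproperlyConfigured(f"Secrets file not found at {path}")
    except json.JSONDecodeError as exc:
        raise ImproperlyConfigured(f"Secrets file at {path} is not valid JSON: {exc}")

SETTINGS = load_app_config()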

Congratulations, you’ve finished!

While this method isn’t foolproof, it’s certainly a step above storing your sensitive information in environment variables (and clearly better than inserting them directly into your source).

Comments & Questions

If you have any questions, please don’t hesitate to reach out to someone from our crew. You can get in touch via our email (hq@deepspaceprogram.com) or our studio’s contact page.

We live and breathe technology at Deep Space Program and are always looking to share and learn from other developers, hobbyists, or whoever may want to share their love for technology.


Deep Space Program

A stealthish technology & design studio located in the Nevada desert. We help startups start up. Get in touch with us at hq@deepspaceprogram.com.