This is How I Reduced My CloudFront Bills by 80%

If you are using S3 and CloudFront to host your content and noticed that your bills are increasing, read this!

Aymen El Amri
Jun 9, 2018 · 13 min read

Ask Yourself: “What kind of content did I deploy to CloudFront? How are my users using it?”

These are questions you should ask. Sometimes just asking helps you find great solutions.

Serving Private Content Through CloudFront

This is the case when, for example, only your mobile application or your web application should be able to access your content. In this case, you have 2 choices:

  • signed URL
  • or signed cookie

import time

import boto.cloudfront

# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be defined beforehand
my_connection = boto.cloudfront.CloudFrontConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
my_distributions = my_connection.get_all_distributions()
my_distribution = my_distributions[0].get_distribution()

my_key_pair_id = "my_key_pair_id"
my_private_key_file = "my_private_key_file.pem"

# The signed URL expires 60000 seconds (about 16.7 hours) from now
time_to_expire = int(time.time()) + 60000
my_url = "http://<my_cloudfront_id>.cloudfront.net/my_resource_file"
my_signed_url = my_distribution.create_signed_url(my_url, my_key_pair_id, time_to_expire, private_key_file=my_private_key_file)
print(my_signed_url)

How Long Do Objects Stay in a CloudFront Edge Cache?

In many cases, some files in your S3 bucket are rarely or never updated. This is why you should ask yourself this question: how long do objects stay in a CloudFront edge cache?

Adding Headers to Your S3 Objects To Control Cache

Say we have a video in the header of a static website, and we will never update it. Why not add these two headers to that object on S3:

You need to invalidate your CloudFront cache after modifying metadata.
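
The header values themselves are not reproduced above. As a minimal sketch, assuming the two headers meant are the standard HTTP `Cache-Control` and `Expires` headers, the helper below (a hypothetical name, not part of any AWS SDK) builds values for an object that should stay cached for a given number of seconds:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires values for a long-lived object.

    Assumption: these are the two headers intended by the article.
    """
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # HTTP dates must be expressed in GMT
        "Expires": format_datetime(expires, usegmt=True),
    }


one_year = 365 * 24 * 60 * 60
headers = cache_headers(one_year, now=datetime(2018, 6, 9, tzinfo=timezone.utc))
print(headers)
```

These values can then be set as object metadata on S3, through the console or the SDK of your choice.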

More Caching

Some AWS users, including me, use S3 to host a static website and CloudFront for two reasons:

  • SSL certificate for a website in HTTPS

Sync Your Content The Right Way

When you want to synchronize a local directory with a remote S3 bucket, you usually need to execute the following command (or its equivalent using the SDK):

aws s3 sync . s3://my_website.com --acl public-read
aws s3 sync --delete . s3://my_website.com --acl public-read

S3: Choosing The Best Naming Strategy = More Performance

This is directly related to performance and not to costs, but in some cases you may decide to create a CloudFront distribution because you think it will solve your performance problem!

examplebucket/2013-26-05-15-00-00/cust1234234/photo1.jpg
examplebucket/2013-26-05-15-00-00/cust3857422/photo2.jpg
examplebucket/2013-26-05-15-00-00/cust1248473/photo2.jpg
examplebucket/2013-26-05-15-00-00/cust8474937/photo2.jpg
examplebucket/2013-26-05-15-00-00/cust1248473/photo3.jpg
...
examplebucket/2013-26-05-15-00-01/cust1248473/photo4.jpg
examplebucket/2013-26-05-15-00-01/cust1248473/photo5.jpg
examplebucket/2013-26-05-15-00-01/cust1248473/photo6.jpg
examplebucket/2013-26-05-15-00-01/cust1248473/photo7.jpg
...

examplebucket/232a-2013-26-05-15-00-00/cust1234234/photo1.jpg
examplebucket/7b54-2013-26-05-15-00-00/cust3857422/photo2.jpg
examplebucket/921c-2013-26-05-15-00-00/cust1248473/photo2.jpg
examplebucket/ba65-2013-26-05-15-00-00/cust8474937/photo2.jpg
examplebucket/8761-2013-26-05-15-00-00/cust1248473/photo3.jpg
examplebucket/2e4f-2013-26-05-15-00-01/cust1248473/photo4.jpg
examplebucket/9810-2013-26-05-15-00-01/cust1248473/photo5.jpg
examplebucket/7e34-2013-26-05-15-00-01/cust1248473/photo6.jpg
examplebucket/c34a-2013-26-05-15-00-01/cust1248473/photo7.jpg
...

Amazon S3 maintains an index of object key names in each AWS region. Object keys are stored in UTF-8 binary ordering across multiple partitions in the index. The key name dictates which partition the key is stored in.

Using a sequential prefix, such as time stamp or an alphabetical sequence, increases the likelihood that Amazon S3 will target a specific partition for a large number of your keys, overwhelming the I/O capacity of the partition. (from the official documentation)
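
One way to avoid a sequential prefix is to put a few characters of a hash in front of each key name, as in the second group of example keys. The sketch below is only an illustration: the function name and the choice of 4 hex characters from an MD5 digest are assumptions, not something S3 requires.

```python
import hashlib


def add_hash_prefix(key, length=4):
    """Prepend the first `length` hex characters of the key's MD5 digest.

    The prefix is derived from the key itself, so it can always be
    recomputed when the object has to be fetched later.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:length]}-{key}"


original_key = "2013-26-05-15-00-00/cust1234234/photo1.jpg"
prefixed_key = add_hash_prefix(original_key)
print(prefixed_key)
```

The trade-off is that keys are no longer grouped by date, so listing all objects for a given time stamp becomes harder.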

Picking The Right Region

Choosing the right region may improve your performance and reduce your costs. Take a look at the AWS pricing table and check the price of data transfer in and out and the price per request. Depending on where your users consume your content from, you may find a better combination.

Do You Need HTTPS?

In most cases, I would say yes, but whenever you don’t really need it, use HTTP. We all know that HTTPS is safer and more standardized, and that everybody will sooner or later use SSL over HTTP, but according to the AWS pricing table, the price per 10,000 requests is cheaper for HTTP.

  • 0.0075 USD for 10,000 requests over HTTP
  • 0.0100 USD for 10,000 requests over HTTPS
HTTP: (500/10000) * 0.0075 * 50000 * 30 * 12 = 6750 USD
HTTPS: (500/10000) * 0.01 * 50000 * 30 * 12 = 9000 USD
9000 - 6750 = 2250
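
The same arithmetic can be reproduced in a few lines of Python. The article does not say what 500 and 50000 stand for; this sketch assumes 500 requests per user per day and 50,000 users, over a year of 12 months of 30 days. The function and parameter names are hypothetical.

```python
def yearly_request_cost(requests_per_user_per_day, users, price_per_10k, days=30 * 12):
    """Yearly request cost in USD, mirroring the calculation above.

    Assumption: 500 is requests per user per day and 50000 is the
    number of users; the original figures are not labeled.
    """
    return requests_per_user_per_day / 10_000 * price_per_10k * users * days


http_cost = yearly_request_cost(500, 50_000, 0.0075)
https_cost = yearly_request_cost(500, 50_000, 0.01)
print(round(http_cost), round(https_cost), round(https_cost - http_cost))
```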

Security

When “script kiddies” get access to your credentials, they can use your resources with no limit on what you will be billed, so make sure to secure your account and credentials.

Delete Your Root Account Access Keys

One of the best practices for securing access to your resources is deleting the root user’s access keys (the root user itself cannot be deleted). Root is the user who has full administrator access. Don’t generate credentials with administrator access, and try to restrict your users’ rights and access to AWS resources.

Activate MFA for your AWS account.

You have many choices:

  • Virtual MFA Device
  • Hardware Key Fob MFA Device
  • Hardware Display Card MFA Device
  • SMS MFA Device (Preview)
  • Hardware Key Fob
  • MFA Device for

Rotate your AWS keys.

When you change your access keys (access key ID and secret access key) on a regular schedule, you will shorten the period an access key is active and therefore reduce the impact on your billing if they are compromised.

aws iam list-access-keys --user-name user
{
    "AccessKeyMetadata": [
        {
            "UserName": "user",
            "AccessKeyId": "BBBCCCCDDDDEEEE",
            "Status": "Active",
            "CreateDate": "2018-05-31T23:07:29Z"
        }
    ]
}
aws iam create-access-key --user-name user
{
    "AccessKey": {
        "UserName": "user",
        "AccessKeyId": "FFFFGGGGHHHHIIII",
        "Status": "Active",
        "SecretAccessKey": "xxxxx/xxxxxx/xxxxx",
        "CreateDate": "2018-06-05T20:07:05.344Z"
    }
}
aws iam list-access-keys --user-name user
{
    "AccessKeyMetadata": [
        {
            "UserName": "user",
            "Status": "Active",
            "CreateDate": "2018-05-31T23:07:29Z",
            "AccessKeyId": "BBBCCCCDDDDEEEE"
        },
        {
            "UserName": "user",
            "Status": "Active",
            "CreateDate": "2018-06-05T20:07:05.344Z",
            "AccessKeyId": "FFFFGGGGHHHHIIII"
        }
    ]
}
aws iam update-access-key --access-key-id BBBCCCCDDDDEEEE --status Inactive --user-name user
aws iam delete-access-key --access-key-id BBBCCCCDDDDEEEE --user-name user
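
To decide which keys are due for rotation, the JSON returned by `aws iam list-access-keys` can be filtered by age. This is a minimal sketch: the 90-day threshold and the function names are assumptions for illustration, not an AWS recommendation taken from this article.

```python
from datetime import datetime, timedelta, timezone


def parse_aws_time(value):
    """Parse a CreateDate such as 2018-05-31T23:07:29Z into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def keys_to_rotate(access_key_metadata, max_age_days=90, now=None):
    """Return the IDs of active keys older than max_age_days."""
    now = now or datetime.now(timezone.utc)
    limit = timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in access_key_metadata
        if key["Status"] == "Active"
        and now - parse_aws_time(key["CreateDate"]) > limit
    ]


metadata = [
    {"UserName": "user", "AccessKeyId": "BBBCCCCDDDDEEEE",
     "Status": "Active", "CreateDate": "2018-05-31T23:07:29Z"},
    {"UserName": "user", "AccessKeyId": "FFFFGGGGHHHHIIII",
     "Status": "Active", "CreateDate": "2018-06-05T20:07:05.344Z"},
]
reference_time = datetime(2018, 9, 1, tzinfo=timezone.utc)
old_keys = keys_to_rotate(metadata, max_age_days=90, now=reference_time)
print(old_keys)
```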

Geographic Restrictions

Another option to reduce costs, not widely used but useful in some cases, is restricting the usage of your CloudFront files to certain countries. You can either whitelist or blacklist a list of countries.
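
In the CloudFront API, this setting lives in the `Restrictions` part of the distribution configuration. The sketch below only builds that fragment as a Python dictionary; the helper name and the example country codes are assumptions, and applying the fragment to a real distribution is left to the SDK or console you use.

```python
def geo_restriction(countries, mode="whitelist"):
    """Build the Restrictions fragment of a CloudFront distribution config.

    `countries` are two-letter ISO 3166-1 country codes.
    """
    if mode not in ("whitelist", "blacklist"):
        raise ValueError("mode must be 'whitelist' or 'blacklist'")
    codes = sorted({code.upper() for code in countries})
    return {
        "GeoRestriction": {
            "RestrictionType": mode,
            "Quantity": len(codes),  # must match the number of items
            "Items": codes,
        }
    }


restrictions = geo_restriction(["fr", "DE", "fr"])
print(restrictions)
```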

The Problem of Hotlinking and Controlling the Access to Your Files Using AWS WAF

When you host a static website using S3 and CloudFront, or when you set up a CloudFront distribution to be consumed by your WordPress blog or any other web app, your static files, like images and videos, are visible to the public through URLs like:

https://static.your-website.com/assets/streaming/video.mp4
<video width="320" height="240" controls>
<source src="https://static.your-website.com/assets/streaming/video.mp4" type="video/mp4">
</video>

Analyzing your Bucket Usage

A good way to analyze who is using your S3 files is to activate logging. Start by creating a bucket for logging:

aws s3 mb s3://logs-zae45z4a5e4zr

Analyzing Your CloudFront Logs

Activating your CloudFront usage logs is not only good for marketing, but can also help you to identify and understand how and from where your objects are being consumed.

#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version fle-status fle-encrypted-fields
2018-06-05 21:46:22 FRA50 33836 51.18.131.134 GET xxxxx.cloudfront.net /learning.jpeg 200 - Mozilla/5.0%2520(compatible;%2520) - - Hit skU3yiThb71WWW7HENzD9WB5Fzhn_gn-NNVpi4F6VV9uF_mpr7xOuw== website.com https 240 0.007 - TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/1.1 - -
  • time
  • x-edge-location
  • ip
  • method (get/post..)
  • Host
  • Referer
  • User-Agent
  • URI
  • Cookie
  • x-edge-request-id
  • x-host-header
  • x-forwarded-for
  • x-edge-response-result-type
  • ssl-cipher
  • Protocol version
  • fle-status
  • fle-encrypted-fields
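
A log line becomes much easier to work with once each value is paired with its field name. The following sketch parses the sample entry shown above by reading the `#Fields:` header; the function name is hypothetical, and it assumes that values never contain raw whitespace (CloudFront URL-encodes it, as the User-Agent in the sample shows).

```python
def parse_cloudfront_log(lines):
    """Yield one dict per log entry, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split()))


SAMPLE_LOG = (
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
    "cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) "
    "x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes "
    "time-taken x-forwarded-for ssl-protocol ssl-cipher "
    "x-edge-response-result-type cs-protocol-version fle-status "
    "fle-encrypted-fields\n"
    "2018-06-05 21:46:22 FRA50 33836 51.18.131.134 GET xxxxx.cloudfront.net "
    "/learning.jpeg 200 - Mozilla/5.0%2520(compatible;%2520) - - Hit "
    "skU3yiThb71WWW7HENzD9WB5Fzhn_gn-NNVpi4F6VV9uF_mpr7xOuw== website.com "
    "https 240 0.007 - TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/1.1 - -\n"
)

records = list(parse_cloudfront_log(SAMPLE_LOG.splitlines()))
for record in records:
    print(record["c-ip"], record["cs-uri-stem"], record["sc-bytes"], record["x-edge-result-type"])
```

From there, grouping records by `c-ip` or `cs(Referer)` and summing `sc-bytes` shows who consumes the most bandwidth.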

A Quick Way to Download Your S3/CF Logs

If you followed the steps above, you can download your S3 logs to your local machine using:

aws s3 cp --recursive s3://logs-zafdf54sdfsdr/ .
cat $(ls -tr) > logs.txt

Using Compression

The cost of CloudFront data transfer is calculated based on the total amount of data served, so serving compressed files is less expensive than serving uncompressed files.
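
To get a rough idea of what compression saves on a given file, you can compare its size before and after gzip locally. This is only a sketch with a made-up, highly repetitive CSS payload; real files will compress less dramatically.

```python
import gzip


def gzip_saving(payload):
    """Return the fraction of bytes saved by gzip-compressing `payload`."""
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Hypothetical stylesheet content, repeated to simulate a larger file
css = b"body { margin: 0; padding: 0; }\n" * 200
saving = gzip_saving(css)
print(f"{saving:.0%} smaller after gzip")
```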

Content-Type
application/eot
application/x-otf
application/font
application/x-perl
application/font-sfnt
application/x-ttf
application/javascript
application/json
application/opentype
application/otf
application/opentype
application/pkcs7-mime
application/truetype
application/ttf
application/vnd.ms-fontobject
application/xhtml+xml
application/xml
application/xml+rss
application/x-font-opentype
application/x-font-truetype
application/x-mpegurl
application/x-javascript
application/x-opentype
application/x-httpd-cgi
application/x-font-ttf
font/eot
font/ttf
font/otf
font/opentype
image/svg+xml
text/css
text/csv
text/html
text/javascript
text/js
text/plain
text/richtext
text/tab-separated-values
text/xml
text/x-script
text/x-component
text/x-java-source
Accept-Encoding: gzip
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>Authorization</AllowedHeader>
    <AllowedHeader>Content-Length</AllowedHeader>
</CORSRule>
</CORSConfiguration>
curl -H "Accept-Encoding: gzip" -I http://<your_resource>

X-Cache (Hit vs Miss), ETag and Headers

When I troubleshoot my CloudFront distribution, I don’t use many tools; curl alone was enough to understand and resolve some problems.

When X-Cache reports a hit, it means that the object was served from a CloudFront edge cache, and when it reports a miss, it means that CloudFront had to fetch the requested file from S3 (and not from its edges) to serve you.

The ETag header can help in debugging and troubleshooting since it identifies a specific version of a resource. This is the definition given by Mozilla:

The ETag HTTP response header is an identifier for a specific version of a resource. It allows caches to be more efficient, and saves bandwidth, as a web server does not need to send a full response if the content has not changed. On the other side, if the content has changed, etags are useful to help prevent simultaneous updates of a resource from overwriting each other ("mid-air collisions").

“Via” is another header that CloudFront uses, and it can also be helpful.

The Via general header is added by proxies, both forward and reverse proxies, and can appear in the request headers and the response headers. It is used for tracking message forwards, avoiding request loops, and identifying the protocol capabilities of senders along the request/response chain.

Note that some problems and optimizations need a good understanding of HTTP headers. AWS replaces the headers that you don’t configure with default values, which are listed in the CloudFront documentation.
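
When checking many URLs, reading the headers by eye gets tedious. The sketch below parses the text printed by `curl -I` and tells whether the response came from an edge cache. The sample response is hypothetical (including the ETag and Via values), and the helper names are assumptions.

```python
def parse_headers(raw):
    """Parse `curl -I` style output into a dict with lowercase header names."""
    headers = {}
    for line in raw.splitlines()[1:]:  # skip the HTTP status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def served_from_edge_cache(headers):
    """True when X-Cache reports a hit (for example "Hit from cloudfront")."""
    x_cache = headers.get("x-cache", "").lower()
    return x_cache.startswith(("hit", "refreshhit"))


# Hypothetical response, for illustration only
SAMPLE_RESPONSE = """HTTP/1.1 200 OK
Content-Type: image/jpeg
ETag: "9b2cf535f27731c974343645a3985328"
X-Cache: Hit from cloudfront
Via: 1.1 0123456789abcdef.cloudfront.net (CloudFront)
"""

parsed = parse_headers(SAMPLE_RESPONSE)
print(parsed["x-cache"], served_from_edge_cache(parsed))
```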

The Origin Access Identity

Using an Origin Access Identity allows you to restrict access to your Amazon S3 content. When you create a CloudFront web distribution, you should create an “Origin”. This is the step in which you should configure the restriction.

To require that users always access your Amazon S3 content using CloudFront URLs, you assign a special CloudFront user — an origin access identity — to your origin. You can either create a new origin access identity or reuse an existing one (Reusing an existing identity is recommended for the common use case).
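
Once the origin access identity exists, the bucket policy has to grant it read access so that the objects no longer need to be public. As a sketch, the function below builds such a policy document; the bucket name and the identity ID are placeholders, and the function name is hypothetical.

```python
import json


def oai_bucket_policy(bucket, oai_id):
    """Build a bucket policy that lets one origin access identity read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOriginAccessIdentityRead",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/"
                           f"CloudFront Origin Access Identity {oai_id}"
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Placeholder values: replace with your own bucket and identity ID
policy = oai_bucket_policy("my_website.com", "E2EXAMPLE1234")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Remember that the policy only restricts access if the objects are not also readable through a public ACL.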

Some Quick Tips

  • Delete unused files: you pay for bandwidth, sure, but you also pay for storage
  • Use the S3 lifecycle feature when needed
  • Clean up your incomplete multipart uploads
  • Compress data before sending it to S3 (CSS, HTML, JS ..)
  • Set up a billing alert
  • Use the CloudFront monitoring dashboards: they are not the greatest monitoring tool I have used, but they can be helpful
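
Two of these tips, lifecycle rules and cleaning up incomplete multipart uploads, can be combined in a single lifecycle configuration. The sketch below builds one as a Python dictionary in the shape expected by boto3's `put_bucket_lifecycle_configuration`; the rule IDs, the `logs/` prefix and the day counts are assumptions chosen for illustration.

```python
def cleanup_lifecycle(abort_after_days=7, expire_prefix="logs/", expire_after_days=30):
    """Build an S3 lifecycle configuration with two clean-up rules."""
    return {
        "Rules": [
            {
                # Remove the parts of uploads that were started but never completed
                "ID": "abort-incomplete-multipart-uploads",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_after_days},
            },
            {
                # Delete objects under the given prefix once they are old enough
                "ID": "expire-old-objects",
                "Filter": {"Prefix": expire_prefix},
                "Status": "Enabled",
                "Expiration": {"Days": expire_after_days},
            },
        ]
    }


lifecycle = cleanup_lifecycle()
print(lifecycle)
```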


Faun

The Must-Read Publication for Aspiring Developers & DevOps Enthusiasts

Written by Aymen El Amri

Cloud & DevOps, Entrepreneur, Tech Author, Founder/CEO of www.eralabs.io & www.faun.dev. About me: www.aymenelamri.com
