Photo by Taylor Vick on Unsplash

Save cloud money #2 S3

Ruslan Dautov
Published in HPCD Lab
4 min read · Sep 28, 2019


AWS S3 — Simple Storage Service

AWS provides several types of storage classes with different conditions:

  • S3 Standard
  • S3 Intelligent-Tiering
  • S3 Standard Infrequent Access (IA)
  • S3 One Zone — Infrequent Access (S3 One Zone-IA)
  • S3 Glacier
  • S3 Glacier Deep Archive

Each of these products should be used in a different scenario. It does not matter so much how much data you have; what matters more is how often you need this data and where it will be processed. For instance, backups may be needed once a year, whereas some PDF files may be needed more than 1,000 times a day, so depending on this access frequency it is better to store the data in different products. You can find out about AWS S3 pricing here.
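
As a sketch, the storage class can be chosen per object at upload time with the --storage-class flag of the AWS CLI (the bucket, prefixes and file names below are placeholders):

# Frequently accessed files: S3 Standard is the default, no flag needed
aws s3 cp report.pdf s3://<bucket>/docs/report.pdf
# Rarely accessed yearly backup: send it straight to a colder, cheaper class
aws s3 cp backup.tar.gz s3://<bucket>/backups/backup.tar.gz --storage-class DEEP_ARCHIVE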

7% wasted

According to RightScale, 7% of cloud bills are wasted on unused cloud storage. This can happen in various use cases, so today we offer several pieces of advice on how to save this money.

Compress

Compress existing data so that it takes up less space. You can compress data locally and then upload it, or do the same on an AWS EC2 instance.

#From local to S3
gzip -c big_file | aws s3 cp - s3://bucket/foldername/big_file.gz
#From S3 to local
aws s3 cp s3://bucket/foldername/big_file.gz - | gunzip -c ...

Of course, you can use any common compression tool such as bzip2, gzip, pixz, pbzip2, xz, pigz and more.
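
For large files, a parallel compressor such as pigz can use all CPU cores of the EC2 instance while streaming to S3. A minimal sketch, with bucket and file names as placeholders:

# Compress on all cores and stream the result straight to S3
pigz -9 -c big_file | aws s3 cp - s3://bucket/foldername/big_file.gz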

Backups

Using S3 as backup storage is very convenient, but the increased costs may not make you happy. By default, S3 does not remove any objects from storage, so you will have to do the “garbage collection” yourself.

The obvious solution is to remove duplicates. If S3 is used as a repository for automated backups, you can configure the backup job to delete obsolete copies as it runs. With a bit of scripting, you can create backups on different schedules (for example weekly, monthly and yearly) and give each schedule its own retention period, so that weekly copies are kept for a week, monthly copies for a month and yearly copies for a year before being deleted and replaced with the latest version.
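
One way to sketch this, assuming a layout where each schedule gets its own prefix (bucket name and prefixes are placeholders), is to upload copies under dated paths so retention can later be set per prefix:

# Weekly, monthly and yearly copies go under separate prefixes
aws s3 cp backup.tar.gz s3://<bucket>/weekly/$(date +%F)/backup.tar.gz
aws s3 cp backup.tar.gz s3://<bucket>/monthly/$(date +%Y-%m)/backup.tar.gz
aws s3 cp backup.tar.gz s3://<bucket>/yearly/$(date +%Y)/backup.tar.gz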

S3 also has a built-in feature for deleting objects by age (lifecycle expiration rules), which is very convenient for backups. With a little configuration, you will keep only the backups from the last 30 days; the old ones erase themselves.

Warning: a prefix specified as “/” does not mean the top level of the bucket. Check which objects a rule actually matches before enabling deletion, otherwise there can be surprises.
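
A minimal sketch of such a 30-day rule, assuming backups live under a backups/ prefix (bucket name, rule ID and prefix are placeholders):

# Contents of lifecycle.json: expire objects under backups/ after 30 days
{
  "Rules": [
    {
      "ID": "expire-old-backups",
      "Filter": { "Prefix": "backups/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }
  ]
}

# Apply the rule to the bucket
aws s3api put-bucket-lifecycle-configuration --bucket <bucket> \
    --lifecycle-configuration file://lifecycle.json

Using an explicit prefix avoids the ambiguity around “/” mentioned above and also lets you keep different expiration periods for, say, weekly, monthly and yearly prefixes.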

Versioning

S3 has a feature called versioning. It can be used to preserve, retrieve and restore earlier versions of every object you store in your AWS S3 bucket. If you don’t need it but it is turned on, it is economically painful. Deleting old versions of files through the GUI can be a little complicated.
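
A sketch of doing the same from the CLI instead of the GUI (bucket name, key and version id are placeholders):

# Check whether versioning is enabled on the bucket
aws s3api get-bucket-versioning --bucket <bucket>
# List the non-current versions you are still paying for
aws s3api list-object-versions --bucket <bucket> \
    --query 'Versions[?IsLatest==`false`].[Key,VersionId]' --output text
# Delete one specific old version
aws s3api delete-object --bucket <bucket> --key <key> --version-id <version-id>

A lifecycle rule with NoncurrentVersionExpiration can automate the same cleanup.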

Replication

S3 replication copies objects from one bucket to another, either within the same region (Same-Region Replication, SRR) or between regions (Cross-Region Replication, CRR). Replication is needed for two main reasons: to minimize latency by keeping data closer to the cluster that processes it, and for compliance. However, these two replication types have different charging models.

Replication within the same region does not incur data transfer charges, but cross-region replication is charged at rates similar to Data Transfer pricing.

If you want to replicate 4,000 GB from S3 in the Asia Pacific (Hong Kong) region to US West (Oregon), it will cost 4,000 GB * $0.09 USD/GB = $360 USD.

You can avoid this charge if you upload the data to S3 in both regions at the same time. In that case, the data for both regions is charged as inbound (“in”) transfer, which is free.
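
A minimal sketch, assuming one bucket per region (both bucket names are placeholders):

# Upload the same file to buckets in two different regions; inbound transfer is free
aws s3 cp big_file.gz s3://<bucket-in-oregon>/big_file.gz
aws s3 cp big_file.gz s3://<bucket-in-hong-kong>/big_file.gz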

The Right Region

Choosing the right region may improve your performance and reduce your costs. At the moment (September 2019) the most expensive regions are Hong Kong and São Paulo.
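
To check where an existing bucket lives, or to create a new one in a cheaper region, a small sketch (bucket name and region are placeholders):

# Find out which region an existing bucket is in
aws s3api get-bucket-location --bucket <bucket>
# Create a new bucket in a chosen region, e.g. US West (Oregon)
aws s3 mb s3://<bucket> --region us-west-2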

Several useful commands for working with S3 from the CLI:

# List all buckets
aws s3 ls

# List content of a bucket
aws s3 ls s3://<bucket>
# Create a bucket
aws s3 mb s3://<bucket>

# Copy into bucket
aws s3 cp <path> s3://<bucket>
# Copy from bucket
aws s3 cp s3://<bucket> <path>

# Move within a bucket
aws s3 mv s3://<bucket>/<src> s3://<bucket>/<dest>
# Remove empty bucket
aws s3 rb s3://<bucket>

# Remove object from bucket
aws s3 rm s3://<bucket>/<path>

Read our previous posts on saving cloud money here:

About HPCD Lab

HPCD Lab (https://hpcdlab.com) is a multi-cloud SaaS platform which helps companies save money on their cloud usage (AWS, Azure, Google Cloud, Alibaba Cloud). The platform analyzes your bills and cloud infrastructure usage, after which its machine learning system automatically prepares recommendations for your engineers and managers.
