How to save some $$$ using Amazon Web Services?

Published in Fandom Engineering, Sep 25, 2020

Is using the AWS cloud cool? Definitely! What is not so cool about these kinds of services is how limited the ways are to keep track of costs and to avoid surprising bills at the end of the month. On-premise, you do not have to worry about such matters, but you give up the benefits of the cloud: flexibility, auto-scaling, short time-to-market, robustness, easy research and development, and so on.
Then the end of the month comes, and after receiving the bill you can think of only two things: how did I burn so much money on this, and how do I make the next bill less painful? Answer the six questions below and find out how much you can save.

Am I still using it?

It may sound obvious, but you should start by reviewing all the services you are using. Maybe you spun up an EMR cluster two months ago to run some calculations and forgot to terminate it? EC2 instances are even harder to track, because some of them belong to EMR clusters, and if you don’t use tagging (which I strongly encourage), it is hard to say whether a given instance is still needed.
Other services do not make it any easier: obsolete Redshift clusters, RDS databases, Kinesis streams, Lambdas. Be prepared to wade through the thicket.
And don’t forget about regions: there are plenty of them, and the console lists resources such as EC2 instances for one region at a time, not for all of them at once.
However, there is still hope. AWS provides AWS Config, which helps you analyze the resources that are in use. The Billing tools are also pretty handy: there you can review all the services you have paid for.
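One way to script such a review is to walk every region you care about and list the running EC2 instances that lack a given tag. The following is a minimal sketch, not a definitive tool: the function name, the `owner` tag key, and the `client_for_region` factory are hypothetical choices made for illustration. The calls on the client (`get_paginator("describe_instances")` with an `instance-state-name` filter) are intended to match the boto3 EC2 client.

```python
from typing import Callable, Iterable, List, Tuple


def find_untagged_running_instances(
    regions: Iterable[str],
    client_for_region: Callable[[str], object],
    required_tag: str = "owner",  # hypothetical tag key, use your own convention
) -> List[Tuple[str, str]]:
    """Return (region, instance_id) for running instances missing `required_tag`."""
    findings = []
    only_running = [{"Name": "instance-state-name", "Values": ["running"]}]
    for region in regions:
        ec2 = client_for_region(region)
        paginator = ec2.get_paginator("describe_instances")
        for page in paginator.paginate(Filters=only_running):
            for reservation in page["Reservations"]:
                for instance in reservation["Instances"]:
                    # Instances without any tags have no "Tags" key at all.
                    tag_keys = {tag["Key"] for tag in instance.get("Tags", [])}
                    if required_tag not in tag_keys:
                        findings.append((region, instance["InstanceId"]))
    return findings
```

With boto3 installed and credentials configured, it could be called as `find_untagged_running_instances(["us-east-1", "eu-west-1"], lambda r: boto3.client("ec2", region_name=r))`. Passing the client factory in, rather than creating clients inside the function, keeps the logic testable without touching a real account.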

I am sure you have regretted updating your software at least once in your life: a blue screen, a device slow-down, or at worst a bricked device is what happens when the provider does not test the patch properly.
AWS is a different story. Providing its services to over a million enterprise users (according to Amazon), it cannot afford failures. With this confidence, I encourage you to use the latest version of any service that lets you do so.
EMR and EC2 are the kings in this area. Each new generation brings two big pros: optimization and lower cost. Strange as it may sound, the newest generation of hardware is, in most cases, cheaper than the previous one. It is usually better equipped in terms of RAM, storage, and CPU, and newer EMR releases also support newer versions of Spark and Hadoop.
What is the catch, then? You have to check that your code runs on the newer generation. With easy and fast cluster spin-up, this won’t take much of your precious time.

Do I still need this data?

S3 is one of the biggest players among AWS services. I am sure you have heard about its different storage options but were a bit afraid to move your cherished files somewhere else. In my opinion, you are right to have doubts.
First of all, we have three major types of storage: S3 Standard, Infrequent Access (IA), and Glacier.
Starting with the last one, you should use it only if you are 99% sure that you won’t ever need this data (disaster backup). Storage there is extra cheap, but retrieving data takes time and money.
Infrequent Access sits in the middle: less expensive than Standard, and less troublesome to retrieve data from than Glacier. When you are sure that you will use the data stored in IA once a month or less often, you should go for it. However, be careful if you access this data through an API or have Athena tables created on top of it. In that case, every query touching a given file generates retrieval costs, which can quickly exceed what you would have paid for Standard storage.
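The "once a month" rule of thumb can be checked with a little arithmetic. The sketch below compares the monthly cost of Standard and IA for a given amount of stored and retrieved data. The three prices are assumptions for illustration only (roughly in line with published us-east-1 list prices at the time of writing, but check the current pricing page), and the model deliberately ignores request charges and IA's minimum object size and minimum storage duration.

```python
# Assumed prices in USD, for illustration only.
STANDARD_STORAGE = 0.023  # per GB-month
IA_STORAGE = 0.0125       # per GB-month
IA_RETRIEVAL = 0.01       # per GB retrieved


def monthly_cost(stored_gb, retrieved_gb, storage_price, retrieval_price=0.0):
    """Storage plus retrieval cost for one month."""
    return stored_gb * storage_price + retrieved_gb * retrieval_price


def break_even_read_ratio(standard_storage, ia_storage, ia_retrieval):
    """Share of the dataset read per month at which IA costs the same as Standard."""
    return (standard_storage - ia_storage) / ia_retrieval


if __name__ == "__main__":
    stored = 1000  # GB
    for retrieved in (500, 3000):
        standard = monthly_cost(stored, retrieved, STANDARD_STORAGE)
        ia = monthly_cost(stored, retrieved, IA_STORAGE, IA_RETRIEVAL)
        print(f"read {retrieved} GB: Standard ${standard:.2f}, IA ${ia:.2f}")
    ratio = break_even_read_ratio(STANDARD_STORAGE, IA_STORAGE, IA_RETRIEVAL)
    print(f"break-even read ratio: {ratio:.2f}")
```

Under these assumed prices the break-even ratio is roughly 1.05, meaning IA stops paying off once you read a bit more than the whole dataset each month. An Athena table that scans the same files several times a month crosses that line quickly.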

Can I replace it?

Although keeping everything in the cloud with one provider is (obviously) preferred, it is not always the most economical option. Maybe you or your company already have some on-premise servers you can use? Of course, auto-scaling is not an option there, but you do not always need it.
Hosting your own SQL database server, message broker, Kubernetes servers, or simple HDD storage can save you some money. The other option is to rely on more than one cloud provider, since some services may be cheaper at one than at another.
However, think (and calculate) twice before, for example, moving storage somewhere else, because providers often charge extra for moving data out of their infrastructure.
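The "calculate twice" part can be as simple as a payback estimate: how many months of savings it takes to recover the one-time transfer-out fee. This is a rough sketch under stated assumptions; the function name is hypothetical, and the numbers in the usage note are made up for illustration, not quoted prices.

```python
import math


def egress_payback_months(data_gb, egress_price_per_gb, monthly_saving):
    """Months until a one-time transfer-out fee is recovered by monthly savings."""
    if monthly_saving <= 0:
        return math.inf  # the move never pays for itself
    return data_gb * egress_price_per_gb / monthly_saving
```

For example, moving 10,000 GB at an assumed $0.09 per GB costs about $900 up front. If the new home saves $50 a month, `egress_payback_months(10_000, 0.09, 50)` comes out at about 18 months before the move starts to pay off.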

Do I need this power?

Generally, if people are not sure how many resources they need, they go with the best setup they can afford. However, analyzing usage can bring quite substantial savings. CloudWatch is a convenient tool for checking whether your EMR cluster uses all its resources or whether your DynamoDB tables are over-provisioned. The cloud is not always about scaling up; sometimes scaling down is a true blessing. It is very likely that you are burning money on stuff you do not even use.
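A usage check like this can be automated once the metrics are fetched. The sketch below assumes datapoints shaped like the `Datapoints` list that CloudWatch's `get_metric_statistics` returns when `Average` and `Maximum` statistics are requested (for example for the `CPUUtilization` metric in the `AWS/EC2` namespace). The function name and both thresholds are hypothetical and should be tuned to your workload.

```python
from typing import Dict, List


def looks_over_provisioned(
    datapoints: List[Dict[str, float]],
    avg_threshold: float = 20.0,   # assumed cut-off for mean utilization, in percent
    peak_threshold: float = 50.0,  # assumed cut-off for the highest peak, in percent
) -> bool:
    """Flag a resource whose utilization stays low on average and at peak."""
    if not datapoints:
        return False  # no data, no verdict
    mean_of_averages = sum(p["Average"] for p in datapoints) / len(datapoints)
    highest_peak = max(p["Maximum"] for p in datapoints)
    return mean_of_averages < avg_threshold and highest_peak < peak_threshold
```

Checking the peak as well as the average is a deliberate choice: a machine that idles most of the day but saturates during a nightly batch job is not a safe candidate for scaling down.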

Can I pay nothing?

Last but not least: the free tier. For most of its services, Amazon offers a free option, which is usually the simplest set-up. Although it cannot handle big traffic or usage, it is a great way to test things before bringing them to production. 750 hours of EC2 per month, 5 GB on S3, or 1M free Lambda requests per month is more than enough to play with or to help a start-up take off. And remember, offers such as the EC2 and S3 ones are free only for the first 12 months.
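It helps to translate the 750 hours into instances. A 31-day month has 744 hours, so the allowance covers one eligible instance running around the clock, and a second one is billed almost in full. A tiny sketch of that arithmetic (the function name is hypothetical, and it assumes every instance runs for the whole month):

```python
def billable_instance_hours(instances, hours_in_month=744, free_hours=750):
    """Instance-hours left to pay for after the monthly free allowance."""
    return max(0, instances * hours_in_month - free_hours)
```

With one instance nothing is billable; with two, 2 × 744 − 750 = 738 hours are.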

Conclusion

As the Peter Parker principle says: “With great power comes great responsibility.” The cloud is an almost infinite source of resources, and using it cautiously won’t expose you to an infinite bill.

Originally published at https://dev.fandom.com.