Expedia Group Technology – Software

Save cloud resources == save 💰 and save trees 🌳

Planting a seed 🌱

Charles Gerber
Expedia Group Technology


An image of a plant growing out of a keyboard
Photo by Charles Gerber

Nowadays EVERYTHING is in the cloud ☁️, even the servers that run the applications we developers craft with our hands 🤲.

Lately at Expedia Group™ we have been optimizing our AWS configurations to save money 💰, or I should say to "stop wasting money" on memory/CPU/storage that we don't need or use.

As developers we often talk about clean code and complex architectures, but rarely about costs 💰, and even less about "green" efficiency 🌳. I work in an "eco-friendly" team (very proud of them 💪), so we gathered to look at the data, find solutions, and measure ⚖️ the impact of our applications 🏭 on the environment 🌎.

Philosoraptor meme: "Humans cut down trees, make paper, then write 'save trees' on them."

Let me share with you our journey to save some trees 🌳

Context first

Here is our stack and the points we might focus on during our crusade:

  • Our team "owns" 9 applications
  • AWS
  • Amazon EKS
  • Elasticsearch (with Kibana) for logs

Where should we look? 👀

Inception meme: too much data to analyze, we need to go deeper
  • Where do we consume the most energy?
  • What kind of AWS resources do we own?
  • What about EKS? How many containers do we have in each region and stage? Is our memory usage ratio good?
  • Can we lower our impact on Elasticsearch as well? How many logs are we producing? How are we using them in general? Are they relevant? Could we reduce them without sacrificing our monitoring?

Tip 💡: We built a Kibana dashboard to follow the size of the logs we produce by application. It helped us realize where we had to focus the most and also helped us track our progress 📉.

Kibana logs production dashboard
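For reference, here is a minimal sketch of the kind of query that can power such a dashboard, assuming logs are shipped to Elasticsearch with an application field and an @timestamp field. The endpoint, index pattern, and field names are placeholders, and a size-based view would sum a byte-size field instead of counting documents:

# Hedged sketch: count log events per application over the last 7 days.
# The endpoint, index pattern, and field names are assumptions -- adapt
# them to your own logging setup.
import requests

ES_URL = "https://my-elasticsearch:9200"   # hypothetical endpoint
INDEX_PATTERN = "logs-*"                   # hypothetical index pattern

query = {
    "size": 0,
    "query": {"range": {"@timestamp": {"gte": "now-7d/d"}}},
    "aggs": {
        "per_application": {
            "terms": {"field": "application.keyword", "size": 20}
        }
    },
}

response = requests.post(f"{ES_URL}/{INDEX_PATTERN}/_search", json=query, timeout=30)
response.raise_for_status()

for bucket in response.json()["aggregations"]["per_application"]["buckets"]:
    print(f'{bucket["key"]}: {bucket["doc_count"]} log events in the last 7 days')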

Collect the data 🚚

AWS

Our team uses many AWS products, including EKS, Elasticsearch, DynamoDB, KMS, Lambda, S3, Secrets Manager, and SQS. However, most of our costs came from EKS and Elasticsearch.
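To check that split for yourself, a hedged sketch using the AWS Cost Explorer API (via boto3) can break a month's bill down by service; the dates below are illustrative and credentials are assumed to be already configured:

# Hedged sketch: group one month of AWS cost by service with Cost Explorer.
import boto3

ce = boto3.client("ce", region_name="us-east-1")  # Cost Explorer is served from us-east-1

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2021-01-01", "End": "2021-02-01"},  # illustrative month
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{service}: ${amount:,.2f}")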

EKS

We had the following data:

  • 9 applications over 2 regions (i.e. eu-west-1, us-west-2) and 4 stages (i.e. alpha, beta, gamma, prod)
  • 43.5 cores requested
  • 190 GB memory requested

Average memory usage (the percentage of the requested memory that we actually use; the closer to 100%, the better):

  • 10% in lab
  • 17% in prod
Pie charts of % memory used in lab and prod
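In other words, the metric is simply memory in use divided by memory requested. A tiny sketch, with illustrative numbers rather than our real per-pod figures:

# Usage can be read from `kubectl top pods` (or cAdvisor metrics); the request
# comes from each container's resources.requests.memory in the pod spec.
def memory_usage_pct(used_gib: float, requested_gib: float) -> float:
    """Percentage of the requested memory that is actually used."""
    return 100.0 * used_gib / requested_gib

print(memory_usage_pct(used_gib=0.4, requested_gib=4.0))    # 10.0 -> a lab-like ratio
print(memory_usage_pct(used_gib=0.68, requested_gib=4.0))   # 17.0 -> a prod-like ratio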

Elasticsearch

Here is the quantity of logs we produce for our applications:

  • 2GB weekly in lab
  • 50GB weekly in prod

We realized several things:

  1. Our AWS consumption was not so bad overall, but we had some "test" resources that could be deleted
  2. Clearly we were underusing our EKS containers in terms of memory
  3. Alpha and Gamma stages were left alive after new release deployments
  4. We did not need to keep logs for GET.base.version
  5. There was a problem with logging in one of our applications

Action plans 👷

Yoda meme: "Listen… we need a plan"

We decided to all gather in a room and update all our apps at once! As stated above, we also created dashboards to follow up on the optimizations.

To reduce our footprint, we made modifications in this order 👣:

AWS (excluding EKS and Elasticsearch)

  • Delete unused, left-behind "test" resources in order to save storage

EKS

  • Delete the alpha and gamma stages after successful deployments and/or automated acceptance tests: alpha and gamma are "just" used for acceptance and deployment testing, but they were left alive after new releases were deployed, so we decided to "kill" them automatically in our pipelines (see the sketch after this list)
  • Divide memory requests by 4 in lab
  • Divide memory requests by 2 in prod
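Here is a rough sketch of what those two actions can look like with the official Kubernetes Python client. The namespace, deployment and container names and the new request value are made-up placeholders; our real pipelines do the equivalent with their own tooling:

# Hedged sketch: tear down a temporary stage and shrink a memory request.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside the cluster

def delete_test_stage(namespace: str) -> None:
    """Delete a temporary stage (e.g. alpha/gamma) once acceptance tests pass."""
    client.CoreV1Api().delete_namespace(name=namespace)

def shrink_memory_request(namespace: str, deployment: str, container: str,
                          new_request: str) -> None:
    """Patch a deployment so its container requests less memory."""
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container,
                            "resources": {"requests": {"memory": new_request}},
                        }
                    ]
                }
            }
        }
    }
    client.AppsV1Api().patch_namespaced_deployment(
        name=deployment, namespace=namespace, body=patch
    )

# Hypothetical usage:
# delete_test_stage("my-app-gamma")
# shrink_memory_request("my-app-prod", "my-app", "my-app", "512Mi")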

Elasticsearch

  • Stop logging GET.base.version: those logs are not necessary for our monitoring and represent 1.5 GB a week (a filtering sketch follows this list)
  • Remove unnecessary RabbitMQ logs
  • Remove a log of a big payload emitted many times in one of our apps: this saved around 40 GB a week! We discovered that in one of our apps we were logging a large payload that we didn't need to, for the following reasons:
  1. About 80% of the time we filtered the message out anyway (the message received from the queue didn't concern our application)
  2. Another application (belonging to another team) was already logging this payload before sending it to us
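To give an idea of what "stop logging X" can look like, here is a hedged sketch using Python's standard logging filters; our services rely on their own logging stack, and the marker and size threshold below are purely illustrative:

# Hedged sketch: drop noisy log events before they ever reach Elasticsearch.
import logging

class DropNoisyLogs(logging.Filter):
    """Discard health-check style events and oversized payload dumps."""
    NOISY_MARKERS = ("GET.base.version",)  # illustrative marker

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        if any(marker in message for marker in self.NOISY_MARKERS):
            return False            # no value for monitoring: drop it
        if len(message) > 10_000:   # crude guard against huge payload dumps
            return False
        return True

logger = logging.getLogger("my-app")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())
logger.addFilter(DropNoisyLogs())

logger.info("GET.base.version 200 2ms")     # filtered out
logger.info("booking processed in 120 ms")  # kept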

Results 📣

The impact on the AWS resources we own was relatively small. However, the results for EKS and Elasticsearch were striking:

For EKS:

  • Now requesting 29 cores instead of 43.5
  • Now requesting 44 GB of memory instead of 190 GB

Thus, we increased our average memory usage from:

  • 10% to 23% in lab
  • 17% to 25% in prod

(We could probably lower the memory requests a little more in the future, but we need to keep some safety margin.)

For Elasticsearch:

  • Now consuming 0.3 GB weekly in lab instead of 2 GB: 85% saved
  • Now consuming 3 GB weekly in prod instead of 50 GB: 94% saved

👮 Of course, we took care not to harm our debugging capabilities.

Chuck Norris meme: "I don't use debuggers. I stare down the code until it confesses"

How much money 💰 are we saving per week?

On EKS, we estimate that we are saving 75% of the cost PER WEEK!! 💰

How much energy ⚡️ are we saving per month?

In total, we saved 531 kWh ⚡️ per month*

How much CO2 🏭 are we saving in a year?

We are saving approximately 2712 kg of CO2 emissions 🏭 a year, or around 3 tons!! **

How many trees 🌳 are we saving in a year?

Here is a math problem for you 👩‍🏫:

Xavier 🧑🏻‍💻 is a developer at Egencia®. He optimized all his applications to reduce their CO2 emissions by 2712 kg. Knowing that one tree 🌳 absorbs approximately 5 kg of CO2 ☁️ a year… how many trees 🌳 are Xavier and his team "saving" each year?

You have 2 minutes!!

The answer is here (you just need to count):

🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 🌳 … (542 🌳 in total)

That's right: the work (CO2 absorption) of 542 trees 🌳 saved EVERY year!

Captain Planet: The power is yours!

Notes

* One EKS node is 64 GiB, 16 vCPU, ~0.5–0.75 kWh (e.g. m5.4xlarge)

** AWS average power mix carbon intensity (2015, excl. carbon offsets): 393 g/kWh (cf. https://aws.amazon.com/about-aws/sustainability/).
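For transparency, here is the back-of-the-envelope shape of the calculation behind the figures above; the inputs are the ones quoted in this post, and the published numbers also rely on internal measurements and rounding, so treat this as a sketch rather than the exact derivation:

# Back-of-the-envelope sketch of the savings math, using the figures from this post.
MONTHLY_KWH_SAVED = 531               # energy estimate above
CARBON_INTENSITY_KG_PER_KWH = 0.393   # AWS average power mix (2015, excl. offsets)
KG_CO2_PER_TREE_PER_YEAR = 5          # rough figure used in the math problem

yearly_kwh_saved = MONTHLY_KWH_SAVED * 12
yearly_co2_kg = yearly_kwh_saved * CARBON_INTENSITY_KG_PER_KWH  # roughly 2.5-3 t/year
trees_saved = 2712 / KG_CO2_PER_TREE_PER_YEAR                   # ~542 trees

print(f"~{yearly_kwh_saved:.0f} kWh and ~{yearly_co2_kg:.0f} kg CO2 per year")
print(f"~{trees_saved:.0f} trees 'saved' per year")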

Also us-west-2 and eu-west-1 regions are supposed to be carbon neutral, but AWS is quite shady about how those emissions are compensated (cf. Greenpeace ClickClean 2017 Report).


Stay safe and wash your hands! 🦠
