How TrueCar Saves 40% on AWS with EC2 Reserved Instances

Driven by Code
Jul 9 · 9 min read

By: David Wang

Since all of our development and production workloads run in AWS, costs can get out of control very quickly if left unchecked. We took a look at our monthly bill and it became clear that EC2 instances made up the bulk of our costs. Luckily, we were able to take advantage of reserved instances (RIs) and reduce our costs significantly without having to modify our resource usage. In this post, we’ll introduce the concept of RIs, describe how we manage them at TrueCar, and raise some important issues to keep in mind.

Reserved Instance Overview

Reserved instances (RIs) are pricing commitments for AWS resources rather than specific machines. In exchange for your commitment to pay for a fixed term, RIs offer a significant discount (35–75%) over the pricing for normal, on-demand usage.

It’s similar to purchasing a membership for a museum.

You pay a large up-front fee and, in return, you get free or discounted museum tickets. If you go frequently enough, the cost of the membership will be less than if you had paid for individual tickets every time you went. Similarly, the goal is to purchase RIs in such a way that your RI payment ends up being less than the regular on-demand price.

RIs exist for several AWS products, including EC2, RDS, Redshift, and Elasticache. This article will focus specifically on EC2 RIs.

RI Configurations

RIs have plenty of configuration options. We’ll discuss each aspect in turn.

Instance type

With the exception of the Linux platform, an RI can only cover running instances of exactly the same type and size. For example, a Windows m5.2xlarge RI can only cover a running Windows m5.2xlarge, and not a Windows m5.xlarge or m5.4xlarge.

With Linux platforms, region-scoped RIs are applied within an instance family based on a normalization factor. Each instance size has its own factor: a t2.large has a normalization factor of 4, and a t2.xlarge has double that, a normalization factor of 8.

For example, let’s say you have an m5.4xlarge RI, which has a normalization factor of 32. This could cover:

  • One m5.4xlarge (32 units) instance
  • Two m5.2xlarge (2 x 16 units) instances
  • Four m5.xlarge (4 x 8 units) instances
  • One m5.2xlarge (16 units) and two m5.xlarge (2 x 8 units) instances
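The same arithmetic can be sketched in a few lines of Python, shown here with m5 sizes. This is only an illustration, not a tool we run: the function names are made up for this example, and the table lists only the smaller sizes.

```python
# Normalization factors per instance size (a subset of the sizes AWS publishes).
NORMALIZATION_FACTORS = {
    "nano": 0.25, "micro": 0.5, "small": 1, "medium": 2,
    "large": 4, "xlarge": 8, "2xlarge": 16, "4xlarge": 32,
}


def units(instance_type):
    """Return the normalization units for a type such as 'm5.2xlarge'."""
    _family, size = instance_type.split(".")
    return NORMALIZATION_FACTORS[size]


def is_fully_covered(reservation, running):
    """True if the running instances fit within one reservation's units.

    Size flexibility only applies within a single instance family, so a
    running instance from another family is never covered.
    """
    family = reservation.split(".")[0]
    if any(instance.split(".")[0] != family for instance in running):
        return False
    return sum(units(instance) for instance in running) <= units(reservation)


# One 2xlarge (16) plus two xlarge (2 x 8) uses exactly the 32 units of a 4xlarge.
print(is_fully_covered("m5.4xlarge", ["m5.2xlarge", "m5.xlarge", "m5.xlarge"]))
```

Anything beyond the reservation's units simply runs at the on-demand rate.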

One useful policy we have in place is to standardize the Linux instance types we use. If we run m5 instance types for most of our workloads, we don’t have to worry too much about the specific sizes of the reservations, as long as we manage the normalization factors.

Instance region

RIs can be purchased for a specific availability zone or for a whole region. We run almost all of our workloads in the us-west-2 region, so we purchase all of our RIs scoped to that region. RIs scoped to availability zones (AZs) have less flexibility and can only cover instances run in that specific AZ. According to AWS, having a zonal RI provides a capacity reservation in the specified AZ. Since we haven’t run into issues with AWS running out of EC2 capacity in our AZs, and our systems are designed with high availability in mind, we opted to use the more flexible region-scoped RIs.

An RI can change its scope between the region and an AZ at any point. However, RIs cannot be transferred between regions.

RI offering class

RIs come in two offering classes: standard and convertible. Here are the main differences.

Standard

  • Can be modified but cannot be converted
  • Can be sold on the RI marketplace

Convertible

  • Can be modified and converted
  • Cannot be sold on the RI marketplace

Modifications and conversions are different when it comes to RIs. A modification includes changing:

  • Availability zone
  • Scope (region vs. AZ)
  • Network platform
  • Instance size (within the same family)

Conversions include all of the above, but also allow you to convert between instance families.

For example, changing a reservation from one m5.4xlarge to two m5.2xlarge instances is a modification, which you can accomplish with both RI offerings. Changing a reservation from one m5.4xlarge to one c5.4xlarge is a conversion, which can only be done with a convertible RI.

Typically the standard class has 5–10% more savings than the convertible. We like to use convertible RIs for their flexibility: if we decide to change our instance types, we can easily convert the unused RIs we have. With standard RIs, if your needs change, the only way out is to sell them on the marketplace.

Although convertible RIs offer less savings, they offer much more flexibility. One caveat, however, is that you cannot exchange a convertible RI for one of lower value: each exchange has to be to a configuration with a higher cost. For example, at the time of this writing, we have a reservation for two m3.2xlarge instances. If we wanted to exchange it for the t3.large type, the exchange would come to 19 instances and require another up-front payment of about $244.

RI term

With EC2 RIs, you commit to paying for instances for either one year or three years. Three-year commitments carry the biggest discounts, usually 10–30% higher than one-year commitments, at the price of being locked in for longer.

In most cases, a three-year commitment offers significant savings over a one-year commitment. With the addition of convertible RIs, significant savings can be achieved while still having flexibility should your workloads change.

Payment option

There are three main payment options:

  1. All Up-front — Pay all of the commitment cost up front and don’t pay a monthly fee (results in the most cost savings)
  2. Partial Up-front — Pay some of the up-front cost, with the remaining cost being charged as a fixed monthly cost (medium cost savings)
  3. No Up-front — Pay nothing up front, and pay the rest of the cost as monthly payments (lower but still significant cost savings)

Here’s a quick example of the cash flows for the different payment schedules. This is just for illustrative purposes — the actual amortization schedule will depend on your company’s cash flow policies and any applicable regulations.

For a one-year convertible r5.4xlarge EC2 RI purchase:

Planning for RI Purchases

Once we understood the RI options, our next step was to plan out and execute our purchases.

Determine need and break-even points

Predicting RI need is one of the more difficult parts of the process. We use convertible RIs so that we can change instance types when we need to, but each conversion costs money. To help reduce the number of conversions, we set a few standard instance types across the company. Linux RIs are applied based on their normalization factor, so those purchases are easier to plan: as long as our instances stay in the same family, we don’t have to worry about converting or modifying.

As an additional note, RIs cannot be moved across regions, so keep that in mind when predicting your RI need.

When looking at intended usage, another important factor is the break-even point — the amount of time before an RI costs less than paying on demand. This is pretty straightforward to calculate. For example:

Monthly cost of an on-demand instance vs. a reserved instance.

If a one-year RI is purchased and only used for a few months, it ends up more expensive than paying on demand. In this particular case, the break-even point is between eight and nine months: from that point on, the total cost of the RI is less than what the same usage would have cost on demand. If you expect to use the instance at least up to the break-even point, a reserved instance is the cheaper choice, even if it will not be utilized 100% of the time.

To calculate this break-even point, take the term and multiply it by one minus the savings percentage.

This is the cost table for the RI from the graph above. For the partial up-front option, multiply the term (12 months, or 365 days) by one minus the 31% savings, that is, by 69%. The result is 252 days, or approximately 8.3 months, which matches the graph. As long as instances run longer than this, the RIs will cost less than on-demand.
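As a quick check of that arithmetic, here is a minimal Python sketch of the break-even rule. The function name is ours for illustration; the 31% savings figure is the partial up-front number quoted above.

```python
def break_even_days(term_days, savings_fraction):
    """Days of usage after which an RI is cheaper than paying on demand.

    The RI costs the same as term_days * (1 - savings_fraction) days of
    on-demand usage, so that is the amount of usage needed to break even.
    """
    if not 0 <= savings_fraction < 1:
        raise ValueError("savings_fraction must be in [0, 1)")
    return term_days * (1 - savings_fraction)


days = break_even_days(365, 0.31)
months = days / 365 * 12
print(f"{days:.0f} days, about {months:.1f} months")  # 252 days, about 8.3 months
```

The same function works for a three-year term by passing 1095 days and the matching savings percentage.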

We look at all of our EC2s, converse with different teams to determine their computing needs, then buy RIs accordingly.

Purchase RIs and monitor usage

We work with our accounting and finance departments to make sure they are aware of the RIs we are purchasing, so that the purchases are properly accounted for in our budget.

Once the RIs have been purchased, we monitor their usage and convert when we see a need to. We use two primary tools to accomplish this: AWS Cost Explorer and our in-house RI Converter.

Monitoring RI usage

Through AWS Cost Explorer, we can see how our RIs are being utilized. We use two main views: RI coverage, which shows how much of your running instance usage is covered by RIs, and RI utilization, which shows how much of your purchased RIs is actually being used. The goal is to keep RI coverage as high as possible while making sure RI utilization doesn’t dip too low. We strive for about 90% coverage on long-running instances and 100% utilization.
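To make the two metrics concrete, here is a small Python sketch of how they relate to instance hours. The function names and the sample hours are hypothetical; only the 90% and 100% targets come from our own practice described above.

```python
def ri_coverage(reserved_hours_used, total_running_hours):
    """Fraction of running instance hours that were covered by RIs."""
    if total_running_hours == 0:
        return 0.0
    return reserved_hours_used / total_running_hours


def ri_utilization(reserved_hours_used, reserved_hours_purchased):
    """Fraction of purchased RI hours that were actually used."""
    if reserved_hours_purchased == 0:
        return 0.0
    return reserved_hours_used / reserved_hours_purchased


def check_targets(coverage, utilization, min_coverage=0.90, min_utilization=1.00):
    """Return a warning for each metric that falls below its target."""
    warnings = []
    if coverage < min_coverage:
        warnings.append("coverage below target: more usage could be reserved")
    if utilization < min_utilization:
        warnings.append("utilization below target: some RIs are sitting idle")
    return warnings


# Hypothetical month: 1,000 running hours, 950 RI hours purchased, 900 of them used.
coverage = ri_coverage(900, 1000)
utilization = ri_utilization(900, 950)
print(check_targets(coverage, utilization))
```

Low coverage means money left on the table; low utilization means money already spent on reservations nobody is using.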

Reservation utilization shows us how many of our RIs we are using.

Reservation coverage shows us how many of our running on-demand instances are covered by RIs.

Instead of having to check this tool every day, we have a Slack alert that notifies us of our coverage.

This information comes directly from the AWS Cost Explorer API. We use a Lambda function written in Python 3.7. This is an example of an API call:
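A minimal sketch of what such a call can look like with boto3 follows. It is not our production Lambda: the helper names, the seven-day window, and the message format are assumptions for illustration, and posting the message to Slack is left out. The two client methods, get_reservation_coverage and get_reservation_utilization, are part of the Cost Explorer ("ce") client.

```python
import datetime


def coverage_and_utilization(ce_client, days=7, today=None):
    """Fetch overall RI coverage and utilization percentages for a time window."""
    today = today or datetime.date.today()
    period = {
        "Start": (today - datetime.timedelta(days=days)).isoformat(),
        "End": today.isoformat(),
    }
    coverage = ce_client.get_reservation_coverage(TimePeriod=period)
    utilization = ce_client.get_reservation_utilization(TimePeriod=period)
    # Cost Explorer returns the percentages as strings.
    return (
        float(coverage["Total"]["CoverageHours"]["CoverageHoursPercentage"]),
        float(utilization["Total"]["UtilizationPercentage"]),
    )


def format_alert(coverage_pct, utilization_pct):
    """Build the text of a chat alert (hypothetical format)."""
    return f"RI coverage: {coverage_pct:.1f}% | RI utilization: {utilization_pct:.1f}%"


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be exercised without AWS.
    import boto3

    ce_client = boto3.client("ce")
    return format_alert(*coverage_and_utilization(ce_client))
```

As written, the totals span every service that has reservations; the request would need a Filter on the SERVICE dimension to restrict them to EC2.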

Converting RIs

In cases where our users change their instance types, or where we overestimated our EC2 demands, we end up not fully using our RIs. With convertible RIs, we can manually switch the instance types. We automated this exchange process, which we will talk about in a later article.

There are several caveats to converting RIs:

  • Only convertible type RIs can be converted.
  • No up-front RIs can be exchanged for any other payment type.
  • All up-front and partial up-front RIs can be exchanged for each other, but not for no up-fronts.
  • The term (one-year or three-year) must be the same for the destination RI type.
  • Convertible RIs can only be exchanged for an RI of greater value, not less, so each exchange will incur a cost.

The last point is important. This means every conversion requires an additional payment. It is important to keep your conversions to a minimum. For more detailed information on exchanges, please see this AWS article.
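The caveats listed above can be written down as a simple pre-check. The sketch below encodes the rules exactly as this article states them; the RI class, its field names, and the single dollar "value" field are hypothetical simplifications, and AWS remains the authority on whether a given exchange is allowed.

```python
from dataclasses import dataclass


@dataclass
class RI:
    offering_class: str  # "standard" or "convertible"
    payment_option: str  # "all_upfront", "partial_upfront", or "no_upfront"
    term_years: int      # 1 or 3
    value: float         # total value of the reservation (hypothetical, in dollars)


def exchange_problems(source, target):
    """Return the reasons an exchange would be rejected (empty list if none)."""
    problems = []
    if source.offering_class != "convertible" or target.offering_class != "convertible":
        problems.append("only convertible RIs can be exchanged")
    if source.payment_option != "no_upfront" and target.payment_option == "no_upfront":
        problems.append("up-front RIs cannot be exchanged for no up-front")
    if source.term_years != target.term_years:
        problems.append("term must be the same")
    if target.value < source.value:
        problems.append("target must not be of lower value")
    return problems


source = RI("convertible", "partial_upfront", 1, 1000.0)
target = RI("convertible", "no_upfront", 3, 900.0)
print(exchange_problems(source, target))
```

Returning every problem at once, rather than stopping at the first, makes it easier to see how far a proposed exchange is from being valid.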

If a conversion is necessary, it is worthwhile to cycle through the different conversion options to determine which one would cost the least. We have an automated solution for this, which, as mentioned, will be discussed in another article.
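The idea of cycling through the options can be sketched as follows. This is not our RI Converter, only a simplified model: it assumes each reservation can be summarized by a single dollar value, whereas AWS computes the real exchange values and the extra payment from current prices and the remaining term. The function name and all numbers are hypothetical.

```python
import math


def cheapest_exchange(source_value, candidates):
    """Pick the target type with the smallest extra payment.

    candidates maps an instance type to the value of ONE target reservation.
    For each type we take the smallest count whose total value is at least
    the source value, since an exchange to a lower value is not allowed.
    """
    best = None
    for instance_type, unit_value in candidates.items():
        count = math.ceil(source_value / unit_value)
        extra_payment = count * unit_value - source_value
        if best is None or extra_payment < best[2]:
            best = (instance_type, count, extra_payment)
    return best


# Hypothetical values: a reservation worth $1,000 and three candidate targets.
options = {"type-a": 300, "type-b": 450, "type-c": 260}
print(cheapest_exchange(1000, options))  # ('type-c', 4, 40)
```

In this example, four reservations of the third type overshoot the source value by the least, so that exchange needs the smallest additional payment.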

A Penny Saved

By using EC2 RIs, we were able to save over 40% on our AWS costs for running resources in the past year, without modifying usage patterns.

When looking to reduce AWS costs, it’s best to first take a look at the biggest spenders. For a lot of companies, EC2 cost will be high. Compared to other cost saving measures, purchasing RIs requires little to no modification of current running workloads and thus is easily implemented. However, it does require having a good plan in place and somewhat accurate forecasts of future demand.

We engage in other cost optimization efforts, but reserved instances are definitely our biggest and easiest win. We’re able to put the extra cash toward additional cloud resources, developers, and other ways to improve the way we do business.
