Cloud Spanner scales as you grow. Scaling up an instance lets it handle increased traffic and larger data sizes, so businesses can focus their time on what needs their attention and reduce the cost of maintaining databases.
When to scale up
When you anticipate an increase in traffic, it is good to scale up your instance before the event. Scaling up is just a matter of increasing the number of nodes in your instance. (While "scaling up" typically refers to adding more resources to one node, in this blog I'm using it to mean increasing the number of nodes, and "scaling down" to mean decreasing the number of nodes.) You can increase your instance size in one shot, with no need to break the increase into stages, but we recommend that you budget 30 minutes of rebalancing time for every doubling of your instance size. For example, if you start with 100 nodes and need to scale up to 1000 nodes, you can make a single change to the instance size, but you should do this at least 2 hours before your event: going from 100 to 1000 nodes is between three and four doublings, so budget for four. During scale-ups, there may be a slight increase in your tail latencies. Hence, we recommend that you scale up (and down) your instance during times of low traffic.
If you notice that your CPU utilization is above the recommended threshold, it is time to scale your instance up. To check your utilization:
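To make the 30-minutes-per-doubling rule concrete, here is a minimal sketch that estimates how far ahead of an event to scale up. The function name and the choice to round a partial doubling up to a full one are my assumptions for illustration; the rounding matches the 100-to-1000-node example above but is not an official formula.

```python
# Rule of thumb from this post: budget 30 minutes per doubling of instance size.
REBALANCE_MINUTES_PER_DOUBLING = 30


def scale_up_lead_time_minutes(current_nodes: int, target_nodes: int) -> int:
    """Estimate the rebalancing time to budget before an event.

    Assumption: a partial doubling is counted as a full one.
    """
    if current_nodes <= 0 or target_nodes <= 0:
        raise ValueError("node counts must be positive")

    doublings = 0
    size = current_nodes
    # Count how many times the instance must double to reach the target.
    while size < target_nodes:
        size *= 2
        doublings += 1
    return doublings * REBALANCE_MINUTES_PER_DOUBLING


print(scale_up_lead_time_minutes(100, 1000))  # 120 minutes, i.e. 2 hours
```

Treat the result as a floor rather than a guarantee, and still try to schedule the change during low traffic.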
Step 1: Go to Cloud Console -> Spanner
Step 2: Click on the instance whose CPU utilization you want to check
Step 3: Click on Monitoring
When deciding whether to scale your instance, look at the per-instance CPU utilization, not the per-database utilization. (Note that my instance, jerene-test-instance, is selected instead of my database, gti.)
Step 4: Review your 24-hour rolling average CPU utilization and your CPU utilization - High priority. Ensure that both are below their respective red lines. Note that these thresholds differ between regional and multi-region instances.
How much to scale up
The number of nodes to add will vary based on your use case. It is important to first establish a baseline for your typical utilization. Cloud Spanner scales linearly, so you can use that baseline to estimate how many nodes you will need. For example, if you have an instance with 15 nodes currently at 80% CPU utilization and you would like to reduce that to 60%, you need roughly 15 × 80 / 60 = 20 nodes, so scaling your instance up by 5 nodes is a good starting point.
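The linear-scaling estimate can be written as a small helper. This is a back-of-the-envelope sketch, not a Cloud Spanner API: the function name is hypothetical, and it assumes the workload stays the same while the node count changes.

```python
import math


def nodes_for_target_utilization(
    current_nodes: int, current_cpu_pct: float, target_cpu_pct: float
) -> int:
    """Estimate the node count needed to reach a target CPU utilization.

    Assumes utilization scales inversely with the number of nodes.
    """
    if current_nodes <= 0:
        raise ValueError("current_nodes must be positive")
    if not (0 < target_cpu_pct <= 100) or current_cpu_pct < 0:
        raise ValueError("CPU percentages must be within (0, 100]")
    # Total work is proportional to nodes * utilization; keep it constant.
    return math.ceil(current_nodes * current_cpu_pct / target_cpu_pct)


needed = nodes_for_target_utilization(15, 80, 60)
print(needed, needed - 15)  # 20 nodes in total, i.e. 5 more than today
```

Use the result as a starting point and confirm against the monitoring graphs after the change settles.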
Issues with underprovisioning an instance
Unlike usage-based databases, which reject queries when you set a limit below your current traffic, Cloud Spanner allows requests to complete even when the instance is over the recommended CPU threshold. However, requests may take longer to complete. When the instance is severely underprovisioned, requests may exceed their deadlines and time out. Hence, we recommend that you stay below the recommended thresholds for your instance type (regional or multi-region).
When will scaling up not be beneficial
If you are experiencing hotspotting because a particular key (either a base table key or an index key) is being accessed extremely frequently, resulting in increased latencies when reading or writing to this key, scaling up your instance will not help. This is because a key resides in just one split, which is on one node (and its replicas), and adding more nodes will not change this. Instead, consider some alternatives listed here. So if you are seeing increased latencies (especially tail latencies) with low CPU utilization, take a look at resolving possible hotspots.
Scaling down an instance reduces the amount of idle resources and can save you money. Here are some things to note when scaling down an instance.
When to Scale Down
Unlike scaling up, scaling down an instance involves unloading database splits from the servers that are currently serving them and moving them to other servers in your instance. Traffic that was previously sent to those servers is redirected to the servers that now host these database splits. Hence, you should expect an increase in tail latencies when a scale-down occurs.
How Much to Scale Down
As with scaling up, because Cloud Spanner scales linearly, you can estimate the number of nodes you can remove from your baseline workload. For example, if you currently have 10 nodes with an overall CPU utilization of 30% and you scale down to 5 nodes, you can expect an overall CPU utilization of roughly 60% in most cases. Unlike scaling up, we recommend that you scale down in stages so as not to disturb your live application traffic. Scaling down is also best done at times of low traffic. Workloads respond to scale-down events differently, so to get an accurate sense of how yours responds, we recommend that you scale down your instance by a maximum of 10% of its nodes the first time. If the impact is within acceptable ranges, you can gradually increase the percentage of nodes that you remove in each step.
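Here is the same linear estimate turned around for scale-downs: given a smaller node count, what utilization should you expect? This is a rough sketch with a hypothetical function name, and it assumes the workload stays constant during the change.

```python
def projected_cpu_pct(
    current_nodes: int, current_cpu_pct: float, new_nodes: int
) -> float:
    """Project CPU utilization after changing the node count.

    Assumes utilization scales inversely with the number of nodes.
    """
    if current_nodes <= 0 or new_nodes <= 0:
        raise ValueError("node counts must be positive")
    return current_nodes * current_cpu_pct / new_nodes


# The example from the text: halving a 10-node instance running at 30%.
print(projected_cpu_pct(10, 30, 5))  # 60.0

# A cautious first step of at most 10% of the nodes: 10 -> 9.
print(round(projected_cpu_pct(10, 30, 9), 1))  # 33.3
```

If the projected value lands above the recommended threshold for your instance type, the target is too aggressive.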
How Often to Scale Down
We recommend waiting 30 minutes between scale-downs, and never less than 10 minutes. This allows the database splits to move to their new servers before the next wave of scale-down happens. We do not recommend scaling your instance up and down every few minutes. It is also important to note that Cloud Spanner currently charges by the hour, so scaling your instance up and down multiple times every hour may not provide any cost savings.
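Putting the staging and waiting guidance together, a scale-down plan might look like the sketch below. The function name, the fixed 10% step cap, and the rule of removing at least one node per step are my assumptions for illustration. In practice you might raise the step size once you have seen how your workload responds.

```python
def staged_scale_down_plan(
    current_nodes: int,
    target_nodes: int,
    max_step_pct: int = 10,   # assumed cap: remove at most 10% of nodes per step
    wait_minutes: int = 30,   # recommended wait between scale-downs
) -> list[tuple[int, int]]:
    """Return (minutes_from_start, node_count) pairs for each step."""
    if not (0 < target_nodes <= current_nodes):
        raise ValueError("target must be positive and not above current")
    if wait_minutes < 10:
        raise ValueError("wait at least 10 minutes between scale-downs")

    plan = []
    nodes = current_nodes
    elapsed = 0
    while nodes > target_nodes:
        # Integer arithmetic avoids float rounding; always remove at least 1.
        step = max(1, nodes * max_step_pct // 100)
        nodes = max(target_nodes, nodes - step)
        plan.append((elapsed, nodes))
        elapsed += wait_minutes
    return plan


print(staged_scale_down_plan(10, 5))
# [(0, 9), (30, 8), (60, 7), (90, 6), (120, 5)]
```

Each entry is a point at which you would change the instance size and then watch latencies before taking the next step.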
Tools and Automation
Cloud Spanner’s open source ecosystem has an autoscaler that you can deploy to automatically scale instances up and down based on CPU utilization. This open source autoscaler is now generally available. It is an ecosystem solution, not native Cloud Spanner functionality.