Scaling your application on AWS
Scaling is hard, period.
Most of the time we are tasked with scaling a mature application that was built a few years ago. The design assumptions made back then to facilitate fast feature rollout are no longer valid, and more often than not they hinder the application's scalability and performance. As a result, we need to find creative ways to scale the application, or perform ‘brain surgery’ and rewrite its core to make it faster and serve our millions of clients. I’ve been part of this process many times.
Those who are fortunate enough to have an opportunity to build a platform/API/framework from scratch should design for fast feature rollout while also thinking about scale and performance. I understand the primary goal is to get something out there, fast. But very quickly that ‘something’ evolves into a giant pile of spaghetti, or at least something you’re not very proud of. Trust me, it’s going to bite you later.
Here are some basic notes and principles based on my experience, discussions with experts, and various AWS sessions I’ve attended. It’s all common sense really.
- Design lightweight, de-coupled components as much as possible. This gives you the flexibility to move them onto individual servers later if necessary.
- If you’re designing an API, keep it stateless, and keep calls as light as possible. Don’t design calls that ask for the sun and the moon.
- As a general rule of thumb, offload heavy processing to separate asynchronous processes.
- Proxies, caches, and CDNs are your friends. Use them to reduce load on your API/web servers.
- Most often, the first crippling point is your database. Make sure you leverage read replicas so you distribute the load; writes can still go to a master. When you start having trouble with writes, it’s time to think about sharding.
- Have monitoring in place to measure your throughput and performance. You don’t want your customers to be the ones telling you your site/API is down. Your monitors should tell you first, so that you can do something about it.
- Periodically run tests that push your servers to the limit, so that you know where they would break.
- Bake performance testing into your build verification process, and run these tests before you deploy, because a simple change can sometimes break performance. It doesn’t have to be painful; just something that measures your basic and critical flows.
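The read-replica point above can be made concrete with a small sketch. The `ReplicaRouter` below sends writes to the master and round-robins reads across replicas; the class and the endpoint names are hypothetical stand-ins for whatever connection handles your database driver gives you.

```python
from itertools import cycle

class ReplicaRouter:
    """Route writes to the master and spread reads across replicas.

    A minimal sketch: the endpoint strings here are placeholders for
    real connection handles from your DB driver's pool.
    """

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = cycle(replicas)  # round-robin over read replicas

    def endpoint_for(self, statement):
        # Anything that mutates state must go to the master.
        verb = statement.lstrip().split(None, 1)[0].upper()
        if verb in ("INSERT", "UPDATE", "DELETE"):
            return self.master
        return next(self._replicas)

router = ReplicaRouter("master.db.internal",
                       ["replica-1.db.internal", "replica-2.db.internal"])
print(router.endpoint_for("SELECT * FROM users"))   # a read replica
print(router.endpoint_for("UPDATE users SET a=1"))  # the master
```

In practice many ORMs and proxies (for example, connection-pooling middleware) do this routing for you; the point is that the application code stays oblivious to how many replicas exist.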
Here are some screenshots and notes from a “Scaling Up to Your First 10 Million Users” AWS session I attended that I thought was pretty great!
1. Auto scaling is not a silver bullet. You need to design your application well first, and there’s some pre-work you can do upfront.
2. At the most basic level, scale horizontally by adding servers in different availability zones behind your ELB.
3. Have master/slave databases in different availability zones.
4. As you grow, start adding those read replica databases.
5. Move popular reads (static data, or data that doesn’t change much) into caches, ElastiCache in this case.
6. When you get to this point, you’re ready to add auto scaling. Define min/max pool sizes, and define CloudWatch metrics to drive scaling. An auto scaling group that spans availability zones is the recommendation.
7. Start using S3 and CDNs (CloudFront in this case) for static assets. Your architecture should look like this now.
8. Move compute-intensive processes into their own workers. Here’s where the loose coupling helps.
Your architecture starts to look like this (one availability zone has been omitted due to space):
9. At this point, you’ve done some good things. The next phase of changes is most often at the data tier: database federation, sharding, and in some cases NoSQL (for dynamic data).
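Step 5’s read caching usually follows the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache with a TTL. Here’s a minimal sketch, using a plain dict where a real deployment would use a Redis or Memcached client pointed at ElastiCache; `fetch_user_from_db` is a hypothetical stand-in for your actual query.

```python
import time

cache = {}         # stands in for a Redis/Memcached client in ElastiCache
TTL_SECONDS = 300  # keep entries for five minutes (illustrative value)

def fetch_user_from_db(user_id):
    # Hypothetical placeholder for the real (expensive) database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    """Cache-aside read: try the cache first, fill it on a miss."""
    entry = cache.get(user_id)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value          # cache hit
        del cache[user_id]        # expired; fall through to the DB
    value = fetch_user_from_db(user_id)
    cache[user_id] = (value, time.time() + TTL_SECONDS)
    return value
```

The TTL matters: it bounds how stale a popular read can get, which is exactly why this works best for data that doesn’t change much.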
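Step 6’s CloudWatch-driven auto scaling boils down to threshold rules clamped to the min/max pool sizes you define. The sketch below mirrors that logic in plain Python; the thresholds (scale out above 70% CPU, scale in below 30%) are illustrative numbers, not AWS defaults.

```python
def desired_capacity(current, avg_cpu, minimum=2, maximum=10,
                     scale_out_at=70.0, scale_in_at=30.0):
    """Return the new instance count for an auto scaling group.

    Mirrors a simple CloudWatch-alarm policy: add an instance when
    average CPU crosses the high threshold, remove one below the low
    threshold, and always stay within the min/max pool.
    """
    if avg_cpu >= scale_out_at:
        current += 1
    elif avg_cpu <= scale_in_at:
        current -= 1
    return max(minimum, min(maximum, current))
```

The clamp to min/max is the part people forget when reasoning about auto scaling: the group never shrinks below the floor you set, even when the metrics say it could.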
These are the things I found useful. I’m always looking for additions, so comments are welcome.
Credit to Joel Williams, AWS