Building Infrastructure to Serve Millions of Users on AWS: 10K Users

Jun 1, 2024

Infrastructure to serve 10K Users.

The series includes:

Lesson 0: Install WordPress on EC2 and create RDS + S3
Lesson 1 (1k users): package EC2 and set up a load balancer and Auto Scaling Group
Lesson 2 (10k users): create a database replica and connect master and replica
Lesson 3 (100k users): implement a cache system by creating ElastiCache
Lesson 4 (more than 500k users): implement an HTTP cache system

In the previous article, we established a foundation to support 1,000 users. In this article, we will enhance the infrastructure we built there so that it can serve 10,000 users effectively.

Problems with Previous Infrastructure

The current infrastructure is stable and can handle thousands of users. However, if the number of users increases to tens of thousands, problems will appear, particularly at the database layer. This is because we have only applied scaling at the application layer, while the database layer has not been scaled yet. Let’s consider the following scenario.

Our infrastructure consists of an EC2 instance and a database. Currently, the number of users is around a few hundred, and everything is running smoothly.

As the number of users increases to thousands, a single EC2 instance can no longer handle the load. However, since we have implemented an Auto Scaling Group, the number of EC2 instances will automatically expand. At this stage, everything still runs fine.

The problem arises when the number of users reaches tens of thousands. At this point, the database layer becomes the bottleneck, because it has not been scaled along with the application layer. Every additional EC2 instance sends its queries to the same single database, which cannot handle the increased load, leading to performance issues and potential system instability.
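
The scenario above can be sketched as a small back-of-the-envelope model. All numbers here (requests per second per instance, queries per request, database capacity) are hypothetical and chosen only to illustrate the trend: the application tier grows with the load, while the single database has a fixed ceiling.

```python
import math

# All numbers below are hypothetical, chosen only to illustrate the trend.
PER_INSTANCE_RPS = 100      # requests/second one EC2 instance can serve
QUERIES_PER_REQUEST = 20    # SQL queries issued per page request
DB_MAX_QPS = 20_000         # queries/second the single database can sustain


def app_instances_needed(requests_per_sec: float) -> int:
    """Instances the Auto Scaling Group would need for this load."""
    return max(1, math.ceil(requests_per_sec / PER_INSTANCE_RPS))


def db_utilization(requests_per_sec: float) -> float:
    """Fraction of database capacity consumed; above 1.0 means overloaded."""
    return requests_per_sec * QUERIES_PER_REQUEST / DB_MAX_QPS


for rps in (50, 500, 5000):
    print(f"{rps} req/s -> {app_instances_needed(rps)} instance(s), "
          f"database at {db_utilization(rps):.0%} of capacity")
```

Under these assumed numbers, the Auto Scaling Group keeps up at every step, but the database is asked for five times more than it can deliver at the highest load.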

To address this problem, we need to scale the database layer.

Scale the database

To scale the database layer, we have two methods: vertical scaling and horizontal scaling. However, scaling the database is not as straightforward as scaling the application layer.

Vertical scaling

Vertical scaling involves expanding the database by adding more CPU and memory resources:

On-premises data center: if the database is deployed in an on-premises data center, expanding it requires contacting the IT organization that owns the data center. This involves sending the request, waiting for confirmation, and then waiting for IT to perform the actual expansion. The whole process can take anywhere from a day to a few weeks.
Cloud: if the database is deployed in the cloud, the expansion process is much simpler, usually just a change of instance class. However, this convenience comes at a cost. More powerful database instances are more expensive, potentially costing 2 to 3 times more than an on-premises deployment, and a single instance can only grow so far.

Horizontal Scaling

Expanding the database using horizontal scaling is more complex. There are two main approaches to horizontal scaling:

Partitioning: divides a large database into smaller databases and distributes the data across them, so each one holds only part of the data.
Replication: adds additional read-only databases, called replicas. Data is synchronized from the primary database to the replicas, so each replica holds a full copy and can serve read queries.
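
To make the difference between the two approaches concrete, here is a minimal sketch of where rows end up in each case. The function names and the CRC32-based shard choice are illustrative assumptions, not how any particular database implements partitioning or replication.

```python
from __future__ import annotations

import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a row key to one shard using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def partition(rows: list[str], shard_count: int) -> dict[int, list[str]]:
    """Partitioning: each row is stored on exactly one shard."""
    shards: dict[int, list[str]] = {i: [] for i in range(shard_count)}
    for row in rows:
        shards[shard_for(row, shard_count)].append(row)
    return shards


def replicate(rows: list[str], replica_count: int) -> dict[str, list[str]]:
    """Replication: the primary and every replica hold a full copy."""
    nodes = {"primary": list(rows)}
    for i in range(replica_count):
        nodes[f"replica-{i + 1}"] = list(rows)
    return nodes


rows = [f"post-{i}" for i in range(10)]
print({shard: len(items) for shard, items in partition(rows, 3).items()})
print({node: len(items) for node, items in replicate(rows, 1).items()})
```

With partitioning, the application has to know which shard holds a given row. With replication, any node can answer a read, which is why it needs far fewer application changes.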

In the case of WordPress, we will use replication, because it is easy to set up in the cloud.

Adding Replica for AWS RDS

Expanding the database through replication on AWS is straightforward. Open the AWS RDS Console (see the previous article for how the database was created).

Select the database, then choose Actions -> Add reader.

Fill in the following details:

DB Instance Identifier: mysql-instance-02
Instance Configuration -> Burstable classes -> db.t3.small
Connectivity -> Publicly Accessible
Keep the other parameters at their default settings

Wait for AWS to create the RDS Replica.
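
For readers who prefer scripting over the console, the same step might look roughly like the sketch below, using the third-party boto3 SDK. It rests on several assumptions that the article does not confirm: that the database is an Aurora MySQL cluster (suggested by the cluster endpoints used later), that the cluster identifier is mysql-cluster, and that the region is ap-southeast-1. Adjust these to match your own setup.

```python
def reader_params(cluster_id: str = "mysql-cluster") -> dict:
    """Parameters matching the console form above.

    Assumptions: cluster_id and the Engine value are guesses based on the
    endpoint hostnames; they are not stated in the article.
    """
    return {
        "DBInstanceIdentifier": "mysql-instance-02",
        "DBClusterIdentifier": cluster_id,
        "DBInstanceClass": "db.t3.small",
        "Engine": "aurora-mysql",      # assumed engine
        "PubliclyAccessible": True,
    }


def add_reader(region: str = "ap-southeast-1") -> None:
    """Create the reader and block until it is available."""
    import boto3  # third-party SDK, imported lazily so the sketch loads without it

    rds = boto3.client("rds", region_name=region)
    params = reader_params()
    rds.create_db_instance(**params)
    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier=params["DBInstanceIdentifier"]
    )


if __name__ == "__main__":
    print(reader_params())  # call add_reader() only with valid AWS credentials
```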

After successfully creating the AWS RDS Replica, the next step is to configure WordPress to use the RDS Master and RDS Replica correctly.

Update WordPress Configuration

To achieve this, we do not need to write custom code or rely on libraries specific to the programming language used. Instead, we will utilize the HyperDB Plugin, which is specifically designed for WordPress:

Navigate to the WordPress Plugins section
Search for the HyperDB plugin
Install the plugin to enable it for your WordPress site

The next step is to access the EC2 instance that is hosting your WordPress site. Navigate to the /var/www/html directory and create a new file named db-config.php with the following content:

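<?php
/**
 * Register the primary (writer) endpoint with HyperDB.
 * It accepts writes and can also serve reads.
 */
$wpdb->add_database(array(
    'host' => 'mysql-cluster.cluster-cz2tvtl6lcep.ap-southeast-1.rds.amazonaws.com',
    'user' => 'devopsvn',
    'password' => 'devopsvn',
    'name' => 'devopsvn',
    'write' => 1,
    'read' => 1,
));

/**
 * Register the replica (reader) endpoint with HyperDB.
 * 'write' => 0 is set explicitly because HyperDB treats a server
 * as writable when the key is omitted.
 */
$wpdb->add_database(array(
    'host' => 'mysql-cluster.cluster-ro-cz2tvtl6lcep.ap-southeast-1.rds.amazonaws.com',
    'user' => 'devopsvn',
    'password' => 'devopsvn',
    'name' => 'devopsvn',
    'write' => 0,
    'read' => 1,
));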

Replace the host values with your own endpoints. You can find the primary (writer) and replica (reader) endpoints in the RDS console.
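
The effect of the read and write flags can be illustrated with a short sketch. This is not HyperDB's actual code, only a simplified model of the routing idea, and it assumes the replica is registered as read-only (write set to 0). The shortened host names and the query classification rule are illustrative assumptions.

```python
import re

# Mirrors the two add_database() entries; host names shortened for readability.
SERVERS = [
    {"host": "writer-endpoint", "read": 1, "write": 1},
    {"host": "reader-endpoint", "read": 1, "write": 0},
]

# Simplified rule: statements starting with these keywords only read data.
_READ_ONLY = re.compile(r"^\s*(SELECT|SHOW|DESCRIBE|EXPLAIN)\b", re.IGNORECASE)


def is_write_query(sql: str) -> bool:
    """Treat everything that is not an obvious read as a write."""
    return _READ_ONLY.match(sql) is None


def candidate_hosts(sql: str, servers: list = SERVERS) -> list:
    """Return the hosts allowed to run this statement."""
    flag = "write" if is_write_query(sql) else "read"
    return [server["host"] for server in servers if server[flag] > 0]


print(candidate_hosts("SELECT * FROM wp_posts"))
print(candidate_hosts("UPDATE wp_options SET option_value = 'x'"))
```

Reads can go to either endpoint, while writes are only ever sent to the writer, so the replica takes read traffic off the primary.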

The final step is to move the db.php file from the HyperDB plugin directory to the wp-content directory. Run the following command from /var/www/html:

sudo mv wp-content/plugins/hyperdb/db.php wp-content

To check whether the replica mysql-instance-02 is receiving connections, follow these steps:

Refresh the WordPress site several times to generate read queries
Navigate to the Monitoring section of the MySQL cluster in the RDS console
Check the connection metrics for mysql-instance-02

If mysql-instance-02 shows active connections in the Monitoring section, you have successfully configured the HyperDB plugin and the RDS replica.
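
The same check can be scripted. The sketch below reads the DatabaseConnections metric that RDS publishes to CloudWatch, using the third-party boto3 SDK. The region, the 15-minute window, and the helper names are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def connection_metric_request(instance_id: str, end: datetime, minutes: int = 15) -> dict:
    """Build a CloudWatch query for the replica's connection count."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "DatabaseConnections",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Average"],
    }


def replica_in_use(datapoints: list) -> bool:
    """True if any datapoint shows at least one open connection."""
    return any(point.get("Average", 0) > 0 for point in datapoints)


def check_replica(instance_id: str = "mysql-instance-02",
                  region: str = "ap-southeast-1") -> bool:
    """Query CloudWatch; requires AWS credentials and the boto3 package."""
    import boto3  # third-party SDK, imported lazily

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    request = connection_metric_request(instance_id, datetime.now(timezone.utc))
    response = cloudwatch.get_metric_statistics(**request)
    return replica_in_use(response["Datapoints"])
```

Calling check_replica() after refreshing the site a few times should return True once HyperDB is sending reads to the replica.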

To ensure that all EC2 instances created by the Auto Scaling Group have the HyperDB configuration, package the current EC2 instance into a new Amazon Machine Image (AMI) and update the Auto Scaling Group to use it. The previous article covers the steps.

Conclusion

We have learned how to scale the infrastructure to serve tens of thousands of users. In the next article, we will learn how to expand the current infrastructure to serve hundreds of thousands of users.