Hello World! in the light of scalability
This is my first Medium post, and hopefully it will be a box-office hit, just like Sam Mendes's debut movie American Beauty was! Lights, camera, action…
What this post is about
We will write an application server that echoes Hello World! every time a request comes in. Later we will scale the application using the nginx web server to serve a thousand requests per second!
Who should and who shouldn’t read this post?
- A junior developer looking for your first hands-on experience with a scalable system? Read on.
- Always wanted to implement a simple multithreaded server yourself? There’s a simple example for you.
- Too lazy to set up a Linux environment for your first scalable system because you are a Windows user? I am just like you.
- Want to know about load testing with JMeter? Keep reading.
- A veteran of highly scalable systems? Run, save your time.
How to utilize this post
Take this post as a bird's-eye view of a real-life scalable system. First reproduce the system this post describes on your machine, then gradually improve it so that it can serve millions of requests per second, just like real-world systems.
Hello World! Once again
The following is an 87-line C++ program that works as an application server; all it does is send a Hello World! response to the client when a request comes in. This is by no means optimized code, but it works. The code is fairly self-explanatory and is a modified version of the client-server example provided by the Windows Dev Center, so if you find it difficult to understand, or are interested in learning about Windows socket programming, you can get all the explanations there.
#undef UNICODE
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // for strlen
#include <thread>
// Need to link with Ws2_32.lib
#pragma comment (lib, "Ws2_32.lib")
// #pragma comment (lib, "Mswsock.lib")
// Port id where this application server listens
#define DEFAULT_PORT "27020"
// Number of threads working to send responses
#define TOTAL_THREAD 4
//Response from this application server; the last two digits of the port id follow "Hello World"
//Note: the header block must end with a blank line ("\r\n\r\n") before the body
const char* Response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 14\r\n\r\nHello World20!";
//Method to send the response to the client
void SendResponse(SOCKET ClientSocket)
{
//send the response
send(ClientSocket, Response, (int)strlen(Response), 0);
// shutdown the connection since we're done
shutdown(ClientSocket, SD_SEND);
//close ClientSocket
closesocket(ClientSocket);
}
int __cdecl main(void)
{
//In the following block we declare some built-in structures that we will need
//-------------------------------------
WSADATA wsaData;
SOCKET ListenSocket = INVALID_SOCKET;
struct addrinfo *result = NULL;
struct addrinfo hints;
//-------------------------------------
// Initialize Winsock
WSAStartup(MAKEWORD(2, 2), &wsaData);
//Initialize the structure we declared above
ZeroMemory(&hints, sizeof(hints));
hints.ai_family = AF_INET;
hints.ai_socktype = SOCK_STREAM;
hints.ai_protocol = IPPROTO_TCP;
hints.ai_flags = AI_PASSIVE;
// Resolve the server address and port; this application runs on localhost:27020
getaddrinfo(NULL, DEFAULT_PORT, &hints, &result);
// Create a SOCKET for connecting to server
ListenSocket = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
// Setup the TCP listening socket
bind(ListenSocket, result->ai_addr, (int)result->ai_addrlen);
//We can free the structure as it will not be required
freeaddrinfo(result);
//Keep listening on the port specified
listen(ListenSocket, SOMAXCONN);
//Threads that will send the response
std::thread Worker[TOTAL_THREAD];
int iThreadNo = 0;
while (1)
{
iThreadNo++;
iThreadNo = iThreadNo % TOTAL_THREAD;
Worker[iThreadNo] = std::thread(SendResponse,accept(ListenSocket, NULL, NULL));
Worker[iThreadNo].detach();
}
// No longer need server socket
closesocket(ListenSocket);
// cleanup
WSACleanup();
return 0;
}
After running the application, send a request from your browser to http://localhost:27020/ to see a response from your server. Congratulations, you have implemented your hello world application server successfully.
Scaling the application
Now suppose your hello world application is a huge success and the clients of your application server are growing rapidly. One thing you can do to scale the application is add more RAM or buy a better processor for faster computation. This is called vertical scaling. But eventually you will hit the ceiling, because there is a limit to how much hardware you can add to a single machine. More importantly, what if your single machine dies for any reason? A power outage, a hard disk failure; anything can happen. Suddenly none of your happy customers will get a Hello World! response (google "single point of failure" to learn more). What we need to do instead is use multiple machines, with one application server running on each, so that every server gets its own resources and there is no single point of failure. This is called horizontal scaling.
To keep things simple, we will run multiple application servers on the same machine by changing the port. Simply change the DEFAULT_PORT value in the program above and build it; the new server will listen on the port you specified. Also change the last two digits of the Response string so that your new app echoes a different port id when localhost:new_port gets hit. Our new system still suffers from a single point of failure, since all the application servers run on a single machine, but you get the idea of how to extend it to multiple machines.
Introducing Nginx:
Nginx is a popular web server that can be used for many purposes (load balancing, reverse proxying; google the terms if you are unfamiliar). To put it simply, the application server (our hello world application) is where the business logic is computed and the response is generated; the web server (in this case nginx) sits between your application server and the client to manage the request/response cycle between them.
Ideally web servers run on a separate machine from the application servers, but again, for simplicity, we will run nginx on the same machine, just listening on a different port. Download nginx for Windows and replace the nginx.conf file content with the following.
worker_processes auto;

events {
    worker_connections 1024;
    multi_accept on;
}

http {
    include mime.types;
    default_type application/octet-stream;
    keepalive_requests 1000;
    proxy_next_upstream error timeout;
    sendfile on;
    keepalive_timeout 128;

    upstream testlb {
        # IP address:port where our application servers run
        server 127.0.0.1:27015 max_fails=0;
        server 127.0.0.1:27016 max_fails=0;
        server 127.0.0.1:27017 max_fails=0;
        server 127.0.0.1:27018 max_fails=0;
        server 127.0.0.1:27019 max_fails=0;
    }

    server {
        # listens on port 80; backlog sets the size of the request queue
        listen 80 backlog=2000;
        # IP where nginx runs
        server_name 127.0.0.1;

        location / {
            proxy_pass http://testlb/;
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
}
There are hundreds of ways you can configure nginx to get the best out of it according to your needs. In this configuration file I set up nginx to forward client requests to five application servers running on ports 27015 through 27019. I strongly encourage you to learn more about configuring nginx, as it will prove very helpful when you design highly scalable systems with it. Our system would look like the following if we used multiple machines to host the application servers.
Load testing
We will use a popular testing tool called JMeter to send 1000 requests per second to nginx. Download JMeter and configure it in the following way.
- Run jmeter.bat, located in the bin folder, from cmd
- Create a new test from File->New
- Right click Test Plan->Add->Threads(User)->Thread Group
- In the Thread Group, change the number of threads to 1000 and keep the rest the same
- Right click Thread Group->Add->Config element->Http Request Defaults
- In Http Request Defaults, set the server name to 127.0.0.1 and the port to 80
- Right click Thread Group->Add->Sampler->Http request
- Right click Thread Group->Add->Listener->View Results Tree
- Start the test by clicking the green play button
I tested on a machine with 4GB of RAM and a dual-core Core i3 processor. Almost every time, the requests completed within 1 second; in a few cases some requests threw errors, but I am confident that with some time spent tuning the nginx configuration the problem can be fixed.
What’s next:
- Try using multiple machines, and tweak your server code to compute something harder. See how that compares to using a single machine.
- Use multiple nginx instances on different machines to overcome the single point of failure.
- Spend time optimizing the nginx configuration file for the best performance under heavy load, from ten thousand to a million requests per second.