Setting aside the past year, I’ve spent many years doing web development. In high school I created a startup in PHP that scaled worldwide, until it crashed because I violated my server’s Terms of Service. Whoops. Sold the domain and called it quits. Also, I’ll never touch PHP again. Ugh.
This past year, I’ve been doing two things. One is working full-time as an iOS Developer, which I have a ton of passion for. I consider it my “front-end” expertise. On the backend, both at my day job and at home, I write Node.JS.
At work, these are typically small scripts that pull data from some other service’s API and pipe it to the iOS apps via some RESTful (or not) API. I really learned to love node during this time, especially with NPM. I love NPM so much that my friend and I came up with a drinking game. The rules go a little like this:
- Think of a noun, say it out loud
- Someone else looks on NPM for <noun>.JS; if they find it, you take a drink.
Very fun game, plan to get smashed ☺
LOAD BALANCING YOUR APP
Where to start. Hm. Well, hopefully you’re decently familiar with Node.JS, but you’ve never had to load balance. However, this article is fairly generic, so if you’re using Python or some other stack, this might work for you still!
So what is load balancing?
There are two main types of load balancing you might want to do with your Node.JS website: balancing across CPU cores, and balancing across different physical servers. Both follow the same principle, but the architecture (where things like your database, redis, etc. live) will be different. For my architecture, I decided to keep things on one physical server, but since I have 8 physical cores, I decided to load balance across 5 of them, leaving the rest for other things such as redis, MS MongoMyPostOracleSQL Server*, and the OS.
Side bar: Each node process can consume up to 1.5GB of RAM. Ensure you have 1.5*n GB of RAM as a safety measure. In my case, we have 16GB per physical server. Typically, I never see anything above 150 MB per process.
Assuming you have an Express.JS app (3.x; I haven’t upgraded to 4.x yet), do something like this:
(I have turned this into a proper NPM module, here: https://www.npmjs.com/package/portbalance)
Hopefully you’re familiar with the app.listen part, and if you are, you’ve probably seen EADDRINUSE errors. This means something is already listening on that port on your server. I’ve written an easy solution to port management, where each process scans for an open port in the defined range (up to 5). This way, if a process fails or crashes, it will find the port it needs when it restarts. You aren’t mapping a certain process number to a certain port. If you spawn too many processes, they simply exit, since they aren’t needed.
Now we have multiple node processes running on multiple CPU cores, and each process is listening on its own port, ranging from 8000 to 8004. Obviously, we don’t want to serve traffic off of port 8000, much less off of five different ports. This is where, in my opinion, the most critical part of any Node website comes into play: the reverse proxy. A reverse proxy is a middleman between the browser and our server: it listens on the standard port 80 and hands off requests in a (load-)balanced fashion to each of our processes. Our server replies to the load balancer, and the load balancer sends the response back to the client.
Side note: Do not run your app server on port 80! Ports 1–1023 can only** be listened on by a privileged user with elevated permissions. Put your system administrator hat on and understand that our code shouldn’t be given elevated privileges, that’s how vulnerabilities are born!
Nginx is what I use. It’s free, fast, and has the right features for our needs. Apache2 has more features, but it is much more bloated, and nginx seems to be a hit in the Node community anyways. Nginx is also extremely well battle-tested and pen-tested, so I feel confident in saying: allow it to listen on port 80.
So you’ve installed nginx. Depending on your flavor of OS, the configuration files will be somewhere; go find them. Mine were in /etc/nginx/sites-available/ (Debian 7, Wheezy). These are the configurations that tell nginx what to do, such as serve PHP, reverse proxy, perform SSL termination, etc. Delete the default file and create a file with something similar to this:
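Mine looked roughly like this (a sketch; swap in your own domain and ports):

```nginx
upstream appserver {
    server 127.0.0.1:8000;
    server 127.0.0.1:8001;
    server 127.0.0.1:8002;
    server 127.0.0.1:8003;
    server 127.0.0.1:8004;
}

server {
    listen 80;
    server_name yourawesomewebsite.com;  # placeholder domain

    location / {
        proxy_pass http://appserver;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```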
In the first 7 lines, I’m defining where the different processes will live. If they reside on a different server, you could use IP addresses or domains. Since I’m keeping this on the same server, I’m just changing the port number, and using localhost.
The rest is fairly boilerplate: I’m listening on port 80, giving it a domain name, and proxying requests to the appserver pool (lines 1–7). By default, this does round-robin. Most likely that will be good enough for your needs, but if you want to explore other options, feel free to visit nginx’s documentation on load balancing: http://nginx.org/en/docs/http/load_balancing.html
Once you get that done, fire up your node processes and try to login or something. Refresh a few times.
You haven’t graduated pirate school yet.
Eegh! Well, we have a problem: you’re using sessions or something similar to keep your user logged in. The problem is that sessions are bound (local) to the process, which makes your app STATE-FUL. What is a state-ful app?
Being state-ful means that a process holds local variables in reference to the user; in this example, the user’s session. A process needs to be able to gather all the required data to handle the request based on a small seed. Typically, this is a cookie, but it can be other things such as your IP address, a username, blah blah. This small seed unlocks the rest of the data to allow ANY process to fulfill your request.
You CAN NOT load balance a state-ful app! It causes weird issues, like login sessions only working 20% of the time, because the other processes don’t know about your session, so they act like you’re logged out. How do we fix this?
We need to keep things like session data, or anything else that would make our app state-ful, in redis. I’m not going to dive into how to connect to redis, as there are plenty of tutorials out there, but if you use Express.JS, here is a good one: http://blog.modulus.io/nodejs-and-express-sessions
In an express app, if you need to keep any data on a user, put it in a session or in the database. It’s the easiest way to keep your app state-less.
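To make that failure mode concrete, here’s a toy sketch: two “processes” are simulated as plain objects, and a shared object stands in for redis (no real redis involved):

```javascript
// Each "process" keeps sessions in a local object unless it's
// handed a shared store (our stand-in for redis).
function makeProcess(sharedStore) {
  var localSessions = {}; // state-FUL: lives inside this process only
  var store = sharedStore || localSessions;
  return {
    login: function (cookie, user) {
      store[cookie] = { user: user };
    },
    whoAmI: function (cookie) {
      var session = store[cookie];
      return session ? session.user : null; // null = "you look logged out"
    }
  };
}

// State-ful: each process has its own sessions
var a = makeProcess(null);
var b = makeProcess(null);
a.login('cookie123', 'alice');
console.log(a.whoAmI('cookie123')); // 'alice'
console.log(b.whoAmI('cookie123')); // null: b never heard of this session!

// State-less: both processes read the same store
var shared = {};
var c = makeProcess(shared);
var d = makeProcess(shared);
c.login('cookie123', 'alice');
console.log(d.whoAmI('cookie123')); // 'alice': any process can answer
```

The load balancer might send your login to process a and your next page load to process b; unless the session lives in a shared store, b treats you as logged out.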
So… redis is like a database, right?
NO. Data in redis should be expirable! It should NOT be mission-critical data. Yes, redis can persist to disk, but do not depend on it! Redis is also kept 100% IN MEMORY. That is what makes it so fast, but you’d quickly run out of memory if you used it as a database. Probably. Just don’t do it.
Side note: Your database must be accessed from all processes.
Side note: Use query transactions if your DBMS supports them. It will save you headaches with data integrity later on.
Ok so I’m a state-less pirate now, can I please deploy?
Go for it. Have fun keeping 5 processes up 24/7/365. Maybe set up a text alert for when one goes dow— ok, I couldn’t finish, lol. We need a process manager to keep our processes alive, because node can (and should) crash often. If you’re using screen or tmux to keep your processes alive: listen up!
So you have your code on the production server, using git (…right??), etc. You have PM2 installed, and configured with their keymetrics (below) system. How do we fire up our server?
- pm2 start server.js -f
- pm2 start server.js -f
- pm2 start server.js -f
- pm2 start server.js -f
- pm2 start server.js -f
One for each process we want to keep alive, so 5 in this case. After 30 seconds or so, use
- pm2 list
Remember how we had each process claim its own port up top? The ports fill in automatically as processes spawn, so we don’t need to pass any parameters or anything silly.
PM2 will ensure that your processes stay alive. If the app crashes, it will restart automagically.
The people who made PM2 also made KeyMetrics. They’re in a public beta, which is absolutely awesome yet buggy. It is very easy to connect your server to their website and get running, and their SDK is very neat. I’m not sure if it’s going to cost a fortune in the future, but it’s free at the moment.
Note how you can view that there are 4 processes running on the right hand side. The RAM usage is cumulative for all 4 processes. This is for our staging environment, so things are configured a little bit differently than production.
I highly recommend checking it out. It will take you 5 minutes to setup.
How do I test load balancing?
Using ab! ab (ApacheBench) is a command-line tool created by the Apache project. Typically, I use something like:
- ab -n 10000 -c 200 http://yourawesomewebsite.com/someURL
Which performs 10,000 HTTP requests, 200 concurrently at any given time. Once it’s finished testing it will give you a report. I recommend trying with only one server in the nginx configuration, and adding one between tests. This allows you to see how much performance you will gain as your user base grows. I’ve tested my site with around 4,000 concurrent users and it holds up like a champ.
If you’re not seeing any performance enhancements at all, make sure the load balancing is working: type
- pm2 logs all
On the server, then perform the ab test again. If you log HTTP data, you should see requests hitting the various processes, like so:
I’ve noticed roughly an 80% speedup with every process I add to the load balancer, as long as I have the CPU cores to support it.
I’ve load balanced and now I’ve found a bug I need to fix!
Load balancing with nginx adds a very neat feature: if nginx detects that a process is no longer responding, it marks it as failed for a certain timeout (depending on your configuration; around 10 seconds by default). This gives us the ability to take processes down individually and upgrade them. So go ahead and fix your code locally, push it to git, and do the following. And by “do the following” I mean write a script to do this for you:
- Go to the app’s code base and pull the latest version you want to deploy.
- Restart each process one at a time: restart one, wait for it to come back up, then move on to the next, so nginx routes around whichever one is down.
- Perform an ab test during this. Watch what happens.
If you did the ab test correctly, you will notice that all requests were served, with zero downtime, and your code was updated! That’s an extra benefit of load balancing: as long as one process is alive, your requests should be fulfilled.
- Performing an ab test during this is a good way to prove that you are achieving zero downtime. You don’t have to crank the settings up too high. Ensure that your requests succeeded and you’re not receiving non-200 responses. I wouldn’t do this for live code pushes to production; it’s just for testing the concept.
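Such a deploy script could look roughly like this (a sketch; the process ids, the pause length, and the pm2 invocation are assumptions you’d adapt to your own setup):

```shell
#!/bin/sh
# Rolling restart: restart processes one at a time so nginx routes
# around whichever one is down while it comes back up.

rolling_restart() {
  # $1 = command to run per process (normally "pm2 restart"),
  # remaining arguments = the process ids to cycle through
  cmd="$1"; shift
  for id in "$@"; do
    $cmd "$id" || return 1
    sleep "${RESTART_PAUSE:-5}"  # let it come back before touching the next
  done
}

# Real usage, after pulling the latest code in your app's directory:
#   git pull && rolling_restart "pm2 restart" 0 1 2 3 4
```

The sleep is the crude part: a fancier version would poll the process’s port until it responds before moving on.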
Holy crap, I wrote a blog! I hope you’ve enjoyed it. If you have any notes or spot any errors, please email me at email@example.com.
Thanks so much for reading! I plan on writing about some other topics, some lengthy, some short. If you want me to talk about something, send me an email and if I’m familiar with the topic I’ll give it a go.
- *I don’t care about DBs so please never ask me which I prefer
- **There are ways to do it as a regular user… but I wouldn’t.
- Thanks to /u/nschubach for pointing out a mistake, ports 1–1023 are privileged, not 1–1000.