Performance test nodejs socket.io with Artillery

Published in

deqode

6 min readApr 10, 2017

We at TechRacers need to develop a node application that primarily has real time features. You can think of it as a service that notifies half a million clients for a particular event. The requirement is that, there should not be a delay more than 1–2 seconds in event notification.

We have chosen to use socket.io for this. Integration was easy & quick but we had to make sure whether this will be able to handle a certain number of connections on a 1GB 1Core S3 instance running Ubuntu 14.04.4 LTS. We used this small server for POC only. With this POC, we can be certain that we are not on the wrong path. As this was a small machine, we targeted this to handle at least 1300 connections and application should be able to broadcast a thousand characters string to all the 1300 open connections with a propagation delay of no more than 300 ms.

We configured Nginx with HTTP Upgrade mechanism to upgrade the connection from HTTP to WebSocket and started node socket.io application in production using PM2. As soon as everything is started up and running, CPU was ideal and overall memory consumption was around 357MB, this included OS, Nginx, and Redis as well. Node socket app used only around 88MB of memory.

Initial Node app memory consumption

Initial system memory consumption

We used Artillery for opening multiple socket connections. Artillery provides support engine for socket.io, we only have to specify the URL of the websocket server and write the test scenarios.

An example scenario would look like:-

We define one load phase, which will last 360 seconds with 5 new virtual users(arriving every second (on average) and second load phase 3600 with an arrival rate of 0.5 (one virtual user every two seconds). We defined second test phase of 3600 duration because we do not want to finish the test, artillery drops all the connections once the test is over. Keeping test running with the negligible arrival rate of virtual users will give us enough time to test broadcasting propagation delay.

For every virtual user, the flow will be that, artillery will open a socket.io connection and will emit MT5TerminalData as clientKey on “join_room” channel and will wait for one hour ({“think”: 3600}). By this, we will be having too many open connections although no data will be flowing inbound or outbound.

On receiving the MT5TerminalData on “join_room” channel, the socket.io server will attach the connection to room MT5TerminalData. In our scenario, we are making all virtual users join the same room(MT5TerminalData). Please see below the serverside socket.io code for more details.

Server side socket.io code:-

We ran the artillery scenario and reached 1800 connections easily with a 4Mbps connection.

While the connections were increasing, we analyzed server resources. System memory consumption increased gradually.

Line Plots of resources:-

Initial overall system memory utilization was around 360MB when everything was ideal it increased to 493MB when 1800 connections were open. (493–360)/1800 = 0.0739
So for keeping a single connection open and alive 0.0739MB of memory is required.

Initial Node app memory consumption increased to 152MB from initial 90MB.
(152–90)/1800 = 0.034
So for a single connection 0.034MB memory is increased.

From the above analysis, we can see that overall system memory consumption for a single connection is around 0.0739MB and Node app memory consumption is around 0.034MB. Both are having a large difference, the reason is that all the connection states are stored in Redis as Redis was running on the same server, overall system memory consumption for a single connection was equal to (Redis memory + Node app memory). We used redis adapter for socket.io so that we can easily sacale the app horizontally. Around 0.039MB data is stored in Redis for every connection.

Below is a line plot of CPU usage when virtual users were arriving at the rate of 5/second, we saw a peak 7.9% of CPU usage when to be reached more than 1600 connections. This line plot may not be that accurate as the plot is based on some random snapshots. The peak CPU usage was only seen when we connections were arriving at the high rate and this is not a problem.

After seeing a peak CPU usage of 7.9%, we were curious to see how our application behaves when no new virtual users arrival rate is negligible(one user every 2 seconds, the second phase of the test) and 1800 connections are open. We found that for maintaining around 1800 open connection no data flowing max CPU usage was no more than 1.2%.

Our application was performing very well and with very few resources we were able to open 1800 connections. Now we have to broadcast data to all these connections and check the how much time it took. So we created a simple HTML page and in that, we will connect to the same socket.io server and will join the same room on which 1800 connection are present. After that, we will broadcast a 1000 chars string along with the server timestamp from the server. Our browser will be listening to all the broadcast from the server in “MT5TerminalData” room so when we will broadcast data from server our browser will receive the data and js code emit the same data back to the server on a new socket channel. We can not compare the server timestamp with browser current time because clocks of server and browser can not be synchronized with accuracy better way is that we send the timestamp and data back to the server and divide the delay of roundtrip by 2.

Also, another problem is that Javascript timestamps are inaccurate and network state cannot be same for all transmission so for our benchmarking we will take multiple samples when all 1800 connections are open and will find a mean. After taking samples we will disconnect all the connections and will take samples with only one connection.