More horsepower to dotnet core — from redBus

Background: refer to the previous post here. We moved to dotnet core (1.1) and were quite happy with the overall performance. One important thing we wanted to improve was throughput: we wanted to see if we could handle more traffic with fewer servers. We knew 2.0 was available at the time of our migration but did not go live with it as it was still in beta. Once 2.0 was released, we quickly put a team together to do the migration.

Why was it required? We were still facing some issues during our peak load, and the only way to circumvent them at that time was to add more instances. We observed the following:

  • For a c4xlarge (default Ubuntu configuration), we had tested with a peak load of 2K requests/min. Any increase in the number of requests caused latency and ELB 5XX errors to shoot up. Looking at the system metrics, we saw that the system thread count was quite high. We played around with a few configurations such as max_concurrent_conn, max_threads and min_threads, but in the end we were unable to push our application beyond 2K req/min. This had a catastrophic effect, as the system went down before the auto-scaling rules could kick in [AWS ELBs remove instances from the load balancer when the health check criteria are not met]. We adjusted the auto-scaling rules to ensure the application always remained healthy, but this resulted in a larger number of servers.

So, what did we do? We knew dotnet core could offer much more, so we took the dotnet core 2.0 route. 2.0 provides quite a good number of advantages. This article provides information in greater detail.

Two of the Kestrel properties we set are quite interesting here. The Kestrel server limits are well explained here. Do take a look at these two properties:

  1. MaxConcurrentConnections
  2. MaxConcurrentUpgradedConnections
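For readers who have not set these limits before, here is a minimal sketch of how the two properties can be configured in an ASP.NET Core 2.0 Program.cs. The value 1000 is only a placeholder and not the number we run in production, and the Startup class is a hypothetical stub that exists just to make the sample complete.

```csharp
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;

// Hypothetical stub application: answers every request with "ok".
public class Startup
{
    public void Configure(IApplicationBuilder app)
    {
        app.Run(context => context.Response.WriteAsync("ok"));
    }
}

public class Program
{
    public static void Main(string[] args) => BuildWebHost(args).Run();

    public static IWebHost BuildWebHost(string[] args) =>
        WebHost.CreateDefaultBuilder(args)
            .UseKestrel(options =>
            {
                // Placeholder values (assumption): tune against your own load tests.
                // Both limits are unlimited (null) by default.
                options.Limits.MaxConcurrentConnections = 1000;
                options.Limits.MaxConcurrentUpgradedConnections = 1000;
            })
            .UseStartup<Startup>()
            .Build();
}
```

MaxConcurrentConnections caps the number of open TCP connections, while MaxConcurrentUpgradedConnections is a separate cap for connections that have been upgraded to another protocol, for example WebSockets.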

Tweaking these helped us to some extent, but we still had some problems around high thread_wait during peak load. We observed that many modules of our code were not async, which obviously caused the increase in thread_wait. ASP.NET Core supports async/await. This article provides an in-depth explanation of how this works.
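To make the difference concrete, here is a small sketch contrasting a blocking call with an awaited one. It is not taken from our code base: FetchFareAsync is a hypothetical stand-in for any downstream call (database, cache, HTTP), and Task.Delay simulates the I/O latency.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Hypothetical downstream call. While the delay is pending,
    // no thread is held; the method resumes when the "I/O" completes.
    public static async Task<int> FetchFareAsync(int routeId)
    {
        await Task.Delay(100); // simulated I/O latency
        return routeId * 10;
    }

    // Blocking style: .Result parks the calling thread until the task finishes.
    // Under load, many parked threads show up as high thread wait.
    public static int FetchFareBlocking(int routeId)
    {
        return FetchFareAsync(routeId).Result;
    }

    // Async style: the calls run concurrently and the caller's thread
    // is released at the await instead of waiting idle.
    public static async Task<int[]> FetchAllAsync(int[] routeIds)
    {
        return await Task.WhenAll(routeIds.Select(id => FetchFareAsync(id)));
    }

    public static void Main()
    {
        // Blocking here is acceptable only because this is a console entry point.
        int[] fares = FetchAllAsync(new[] { 1, 2, 3 }).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(",", fares)); // prints "10,20,30"
    }
}
```

In a controller, the same idea means returning a Task from the action and awaiting every I/O call all the way down, rather than calling .Result or .Wait() somewhere in the chain.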

Finally, we fine-tuned the /etc/sysctl.conf file at the OS level.

After these changes, along with async/await on ASP.NET Core 2.0, we were able to handle as much as 11K to 12K req/minute per server. This was a huge jump from 1.5K req/minute per server, i.e. about a 7 times improvement in our load handling capacity.

Conclusion:

We believe dotnet core is heading in the right direction and we are excited about this. We plan to move some of our other microservices onto this stack.

Refer: blue is 2.0 (latest config) and orange is 1.1 (old config)

  • We were able to achieve 5X more throughput.

Number of requests on our Load Balancer

Number of instances

  • We saw a decrease in the latency as well.