Container Networking — Part 2 & Cloud Foundry Performance

In Part 1, I described Container Networking on Cloud Foundry (PWS/PCF) and how to easily migrate your existing apps to benefit from it.

While the advantages are obvious, without a benchmark you might wonder whether you should adopt Container Networking yet. This second part of the blog compares the performance with and without Container Networking.

I used the same app as in Part 1; the source code is available on GitHub. I’m running both apps on Pivotal Web Services (PWS). Since the GitHub repo has Container Networking enabled, I updated it to run without it by applying the following two steps (commands sketched after the list):

  1. Removing the direct access with cf remove-access travel-client travel-service --protocol tcp --port 8080
  2. Adding a route on the travel-service
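
For reference, here is a minimal sketch of both steps using the cf CLI. The route hostname and the shared PWS domain cfapps.io are my assumptions, not taken from the repo; adjust them to your own setup:

```sh
# Step 1: remove the direct container-to-container access policy
cf remove-access travel-client travel-service --protocol tcp --port 8080

# Step 2: expose travel-service through the router instead
# (hostname and domain are assumed; adjust to your own route)
cf map-route travel-service cfapps.io --hostname travel-service
```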

To run the tests, I used Locust.io with a few hundred users for about 3 minutes per run. As a reminder, the goal of the tests is to estimate the performance gain from using Container Networking.
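
The locustfile below is a minimal sketch of such a test, not my exact script; the class name and host URL are assumptions:

```python
# locustfile.py -- minimal load test for the endpoint measured below
from locust import HttpUser, between, task

class TravelClientUser(HttpUser):
    # Each simulated user pauses 1-2 seconds between requests
    wait_time = between(1, 2)

    @task
    def get_destinations(self):
        # travel-client endpoint that calls travel-service internally
        self.client.get("/destinations")
```

Run it with something like `locust -f locustfile.py --host https://travel-client.cfapps.io` (hypothetical host) and stop after about 3 minutes.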

Here are the results (in ms) WITHOUT Container Networking:

Name               # fails      Avg  Min  Max  Median
------------------------------------------------------
GET /destinations  0 (0.00%)    135   52  588     100

Here are the results (in ms) WITH Container Networking enabled:

Name               # fails      Avg  Min  Max  Median
------------------------------------------------------
GET /destinations  0 (0.00%)    100   53  498      76

As you can see, with Container Networking enabled, the average response time between travel-client and travel-service drops by about 35 ms (135 ms down to 100 ms). In a microservices architecture, where a single user request can fan out into several internal calls, that gain is significant, and you should definitely use c2c (container-to-container) networking.

Performance on Pivotal Web Services Cloud Foundry

If you wonder how Cloud Foundry performs in production, here are some real benchmarks from one of my critical microservices in production. The results come from the New Relic APM tool.

Weekly Response time on PWS

This table reports the weekly number of requests along with the average response time. It is really interesting to observe that the more requests my microservice receives, the lower the response time gets. This is the same app with only 2 instances running: no autoscaling and no deployments with performance improvements.

When a service itself responds in under 10 ms, network latency between services dominates the total request time, so being able to cut that latency by about 30 ms is a great improvement.