Astonishing Performance of .NET 5: More Data

Alex Yakunin
Nov 20 · 4 min read

On the day .NET 5 was released, I shared a single screenshot showing how much faster .NET 5 is relative to .NET Core 3.1. I promised to share more data later, and here it is.

1. Fusion’s Caching Test — running on Ubuntu + Intel and AMD CPUs

Code: https://github.com/servicetitan/Stl.Fusion.Samples , see the “Caching Sample” section there.

Overall, these tests stress an API endpoint that fetches a small object from a SQL database; the only differences are how it is exposed (locally or via HTTP) and whether Fusion is used to provide transparent caching.

My initial test was performed on Windows and an AMD CPU, but almost every production service runs on Linux nowadays, so I decided to fix this and add an Intel-based system to the test.

[Images: charts with the Caching Test results]

You might notice that the web service speedup is actually lower than in my initial post. Most likely, that’s because the .NET Core 3.1 version of the code used this time was built with the newest (5.0.0) versions of the NuGet packages. In other words, it compares just the runtimes and factors out the libraries.

2. YetAnotherStupidBenchmark — SIMD intrinsics, tight loops, IO

Code: https://github.com/alexyakunin/YetAnotherStupidBenchmark

Originally written as an attempt to prove or disprove the thesis that C# could be nearly as efficient as C++ on “data crunching” problems, this simple benchmark compares C# and C++ efficiency on a “scan, decode, and aggregate” kind of task.

The results were produced as follows:

  • Run each test 6 times — for each framework and CPU
  • Take the best result among all 6 runs
  • Compute the speedup as a geometric mean of (NET31_Time / NET5_Time) across all tested systems.
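
The procedure above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script used to produce the charts: the function names, the dictionary layout, and the timings in the usage note are all hypothetical.

```python
from math import prod


def best_time(runs):
    """Return the best (lowest) time among the repeated runs of one test."""
    return min(runs)


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def speedup(net31_runs, net5_runs):
    """Geometric mean of NET31_Time / NET5_Time across all tested systems.

    Both arguments map a system name to the list of timings measured on it
    (hypothetical layout, chosen for this sketch only).
    """
    ratios = [
        best_time(net31_runs[system]) / best_time(net5_runs[system])
        for system in net31_runs
    ]
    return geometric_mean(ratios)
```

For example, `speedup({"intel": [105, 101, 103]}, {"intel": [99, 100, 102]})` takes the best run per framework (101 and 99 in this made-up data) and returns their ratio. The geometric mean is a common choice for averaging ratios because a 2x speedup on one system and a 2x slowdown on another cancel out to exactly 1.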

[Images: charts with the YetAnotherStupidBenchmark results]

Surprisingly, .NET 5 manages to speed up even heavily optimized code by up to 7% in comparison to .NET Core 3.1.

You may also notice that:

  • The ~1.5-year-old results on the same Core i7 8700K were 95ms for C++ and 101ms for C#, i.e. they are now ~5% worse for both. An impact of the Spectre and Meltdown mitigations?
  • .NET 5 gets extremely close to C++ on Core i7: it produces the same 99ms on the SIMD version of the test, but relying on an async pipeline reader rather than a memory-mapped file.
  • But there is a significant gap between C++ (74ms) and .NET (93ms) on the same test on Ryzen Threadripper. And since it’s almost identical SIMD code, I’m unsure what to blame here. If you know, please share it in the comments.

3. GCBurn

Code: https://github.com/alexyakunin/GCBurn

Originally written to compare .NET and Go garbage collection and peak allocation performance. I’ll focus only on allocations here, because all other metrics are pretty similar (though you’re welcome to compare the raw output).

[Images: charts with the GCBurn allocation results]

Notes on this test:

  • Interestingly, .NET 5 is faster on burst allocations and smaller heaps, but slower on larger heaps — though not dramatically.
  • I had to exclude the 75% RAM static set result from the chart: the 110M ops/s for .NET Core 3.1 there is way off from everything else, and I didn’t have enough time to find out why. My guess is that an extra full GC was somehow triggered during this test (it takes ~5s on a ~50GB heap).

It’s worth saying that GCBurn is definitely the least useful of these tests: it runs code you’ll hardly ever see in production. Its goal is to turn the memory allocator and the GC into a bottleneck to measure their efficiency, but in reality this never happens. Moreover, even if you do hit, e.g., the memory allocator’s performance limit, you have a number of workarounds to address this.

Besides that, GCBurn doesn’t use pinned objects, so it completely disregards one of the key improvements in the .NET 5 GC.

So is the .NET 5 GC better or worse? One more chart illustrates how big the difference between the stress test and reality can be:

[Image: chart of RAM usage for the service migrated to .NET 5]

This is how RAM usage changed for the only service (a tiny one) we have migrated to .NET 5 so far. As you can see, it dropped by more than 2x. This is certainly quite motivating, so I hope to write another post with more production data soon.

You can find the raw data produced by these tests in this Google Sheet; the raw output of GCBurn and YetAnotherStupidBenchmark can be found in their repositories, see the “/results” folder there.

P.S. If you love reading about .NET and performance, check out Fusion — my bold attempt to revolutionize the way we implement caching and real-time UI updates.

The Startup

Medium's largest active publication, followed by +730K people. Follow to join our community.

Written by Alex Yakunin, CTO @ ServiceTitan.com, creator of https://github.com/servicetitan/Stl.Fusion
