Evaluating a Refactor, Part I

Stephen Darlington
Trimble Maps Engineering Blog
Aug 30, 2021

We’ve talked a lot on this blog about getting started with various technologies and different ways to architect your code or your solution. We’ve also talked about testing and measuring performance. But one thing we do a lot as software engineers is refactor. As a professional engineer, it is your responsibility to make your product’s code better each time you touch it. Code that isn’t continuously improved begins to rot. You can get a sense of whether the code is easier to read by, well, reading it. But how do you know if the code performs better? That’s where benchmarking comes in.

(Image caption: They’re both fruit.)

Benchmarking is the act of running a computer program multiple times in order to assess its relative performance. Common metrics include execution time and memory usage. One big challenge in benchmarking is that code takes longer to run when the system is cold than when it’s warm, so a single timed run can be misleading. So how do we benchmark, and how do we account for the variations in our systems? A handy NuGet package called BenchmarkDotNet takes much of the guesswork out of it.
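To make the cold-versus-warm problem concrete, here is a minimal hand-rolled sketch of the kind of timing loop you would otherwise have to write yourself. It is not how BenchmarkDotNet works internally; the class name NaiveBenchmark, the method MeasureNanoseconds, and the warmup and iteration counts are all assumptions chosen for illustration.

```csharp
using System;
using System.Diagnostics;

public static class NaiveBenchmark
{
    // Times `action` by hand: some untimed warmup calls first, then the
    // average duration of `iterations` timed calls, in nanoseconds.
    // The default counts are arbitrary illustration values.
    public static double MeasureNanoseconds(Action action, int warmup = 1_000, int iterations = 100_000)
    {
        for (int i = 0; i < warmup; i++)
        {
            action(); // untimed: gives the JIT and caches a chance to warm up
        }

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        // Milliseconds -> nanoseconds, averaged over all timed calls.
        return stopwatch.Elapsed.TotalMilliseconds * 1_000_000.0 / iterations;
    }

    // The workload being timed: build a 100-character string by concatenation.
    public static string BuildWithConcat()
    {
        var output = string.Empty;
        for (int i = 0; i < 100; i++)
        {
            output += "a";
        }
        return output;
    }

    public static void Main()
    {
        double ns = MeasureNanoseconds(() => BuildWithConcat());
        Console.WriteLine($"Concat: roughly {ns:F1} ns per call");
    }
}
```

Even this small loop leaves open questions: how many warmup calls are enough, how many iterations give a stable average, and how much noise is in the result. Those are the decisions a benchmarking library makes for you.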

Getting Your Hands on Benchmarks

If you’ve worked with .NET for any length of time, you know how expensive it is to work with strings. The string class is immutable; each time you manipulate a string, a new string is allocated and the old one is left for the garbage collector to clean up. This is why Microsoft has added StringBuilder, string interpolation, and other methods of working with strings. For the purposes of our benchmark, we’re going to compare basic string concatenation against StringBuilder.

To get started, create a new Console Application in Visual Studio or VS Code. Next, install the BenchmarkDotNet NuGet package and you’re ready to start benchmarking.

The first thing we’re going to do is create a class called MyTestClass which will hold the methods that we want to test. Next, we’ll create a method to do classic string manipulation. (I would NEVER write code like this in real life, so it feels wrong to do this.)

    [Benchmark]
    public string GiveMeAnA()
    {
        var output = string.Empty;
        for (int i = 0; i < 100; i++)
        {
            output += "a";
        }
        return output;
    }

Take note of the Benchmark attribute added to our method. This tells BenchmarkDotNet that this is a method we want to measure.

The next thing we’re going to do is create a second method to do the same logic using a StringBuilder.

    [Benchmark]
    public string GiveMeAnAWithStringBuilder()
    {
        var output = new StringBuilder();
        for (int i = 0; i < 100; i++)
        {
            output.Append('a');
        }
        return output.ToString();
    }

Finally, in the Main() of our console app, we tell BenchmarkDotNet to execute all the tagged methods in our class:

    static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<MyTestClass>();
    }

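Now that we’re all set up, run the application in Release mode (from the command line, dotnet run -c Release) and you’ll get output similar to the following. The exact numbers will vary from machine to machine: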

|                     Method |       Mean |    Error |   StdDev |
|--------------------------- |-----------:|---------:|---------:|
|                  GiveMeAnA | 1,957.9 ns | 16.19 ns | 15.14 ns |
| GiveMeAnAWithStringBuilder |   313.2 ns |  1.74 ns |  1.45 ns |

We can see that StringBuilder performs significantly faster than string manipulation. You could do the math to figure out how much faster, but BenchmarkDotNet can do that for you! All you have to do is tag your primary method with Baseline = true like so:

[Benchmark(Baseline = true)]
public string GiveMeAnA()

Running the benchmarks again will show us the ratio of the baseline performance vs our refactored method (error and standard deviation omitted for length):

|                     Method |       Mean | Ratio |
|--------------------------- |-----------:|------:|
|                  GiveMeAnA | 1,902.1 ns |  1.00 |
| GiveMeAnAWithStringBuilder |   314.6 ns |  0.17 |

On my machine, using StringBuilder takes about 17% of the time that basic string concatenation takes.
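If you want to sanity-check the Ratio column, the arithmetic is easy to sketch by hand. The snippet below simply divides the two means from the table; the class and method names are made up for illustration. As far as I know, BenchmarkDotNet derives its Ratio from the distribution of per-measurement ratios rather than from the two means, so treat this as an approximation that happens to agree here.

```csharp
using System;
using System.Globalization;

public static class RatioMath
{
    // Ratio of a candidate's mean time to the baseline's mean time.
    public static double Ratio(double candidateMeanNs, double baselineMeanNs)
        => candidateMeanNs / baselineMeanNs;

    public static void Main()
    {
        const double baselineNs = 1902.1;  // GiveMeAnA mean from the table
        const double candidateNs = 314.6;  // GiveMeAnAWithStringBuilder mean from the table

        double ratio = Ratio(candidateNs, baselineNs);
        double speedup = 1.0 / ratio;

        // InvariantCulture keeps the decimal separator a period on every machine.
        Console.WriteLine("Ratio:   " + ratio.ToString("F2", CultureInfo.InvariantCulture));          // Ratio:   0.17
        Console.WriteLine("Speedup: " + speedup.ToString("F1", CultureInfo.InvariantCulture) + "x");  // Speedup: 6.0x
    }
}
```

Put the other way around, the StringBuilder version is roughly six times faster in this run.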

Sometimes you want to see how much more or less memory one approach takes vs another. BenchmarkDotNet handles that too with its MemoryDiagnoser attribute. Tagging the class that contains your methods with [MemoryDiagnoser] adds columns to the results that describe the memory usage of your algorithm:

|                     Method |       Mean |  Gen 0 | Allocated |
|--------------------------- |-----------:|-------:|----------:|
|                  GiveMeAnA | 1,918.6 ns | 3.0060 |  12,576 B |
| GiveMeAnAWithStringBuilder |   314.6 ns | 0.1836 |     768 B |

Simple string concatenation allocates 12,576 bytes per operation and incurs about 3 Generation 0 garbage collections per 1,000 operations, while StringBuilder allocates only 768 bytes and triggers fewer than 1 garbage collection per 1,000 operations. Garbage collection is expensive, so not only is StringBuilder significantly faster and lighter on memory, it also triggers far fewer expensive garbage collections.
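The Gen 0 column can be easier to reason about when it is flipped around into "operations per collection". The following sketch does that conversion with the numbers from the table and also compares the allocation sizes; the class and method names are hypothetical helpers, not part of BenchmarkDotNet.

```csharp
using System;
using System.Globalization;

public static class MemoryMath
{
    // The Gen 0 column reports collections per 1,000 operations,
    // so its inverse is how many operations fit between collections.
    public static double OperationsPerCollection(double gen0Per1000Ops)
        => 1000.0 / gen0Per1000Ops;

    public static void Main()
    {
        var inv = CultureInfo.InvariantCulture;

        double concatOps = OperationsPerCollection(3.0060);   // GiveMeAnA
        double builderOps = OperationsPerCollection(0.1836);  // GiveMeAnAWithStringBuilder

        Console.WriteLine("Concat:        one Gen 0 collection every " + concatOps.ToString("F0", inv) + " calls");   // 333
        Console.WriteLine("StringBuilder: one Gen 0 collection every " + builderOps.ToString("F0", inv) + " calls");  // 5447

        // Allocated bytes per operation, taken from the table.
        double allocationShare = 768.0 / 12576.0;
        Console.WriteLine("StringBuilder allocates " + (allocationShare * 100).ToString("F1", inv) + "% of the bytes");  // 6.1%
    }
}
```

In other words, in this run the concatenation version forces a collection roughly every 333 calls, while the StringBuilder version goes about 5,447 calls between collections and allocates about one sixteenth of the memory.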

For reference, here is the full source code of our final solution:

    using System;
    using System.Text;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    namespace benchmark
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Runs every [Benchmark] method in MyTestClass and prints the summary table.
                var summary = BenchmarkRunner.Run<MyTestClass>();
            }
        }

        [MemoryDiagnoser]
        public class MyTestClass
        {
            // Baseline: repeated string concatenation.
            [Benchmark(Baseline = true)]
            public string GiveMeAnA()
            {
                var output = string.Empty;
                for (int i = 0; i < 100; i++)
                {
                    output += "a";
                }
                return output;
            }

            // Refactored version: same result built with a StringBuilder.
            [Benchmark]
            public string GiveMeAnAWithStringBuilder()
            {
                var output = new StringBuilder();
                for (int i = 0; i < 100; i++)
                {
                    output.Append('a');
                }
                return output.ToString();
            }
        }
    }

Conclusion

By benchmarking your project, you can get real numbers that show how much your changes improved your code. With metrics like memory allocation, you can quantify how much less memory a proposed solution will use if you’re trying to solve an issue with memory pressure in production.

BenchmarkDotNet includes many more metrics and allows you to compare execution stats of your algorithm against different versions of .NET Core, .NET Standard, and .NET Framework. We’ve barely scratched the surface, but hopefully you’ve been inspired to dig deeper and see what else it can teach you about your code.
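As a taste of the runtime comparison, here is a hedged sketch that asks BenchmarkDotNet to run the same class on two runtimes by stacking SimpleJob attributes. It assumes the BenchmarkDotNet package is installed, that the project file lists both targets (for example, TargetFrameworks set to netcoreapp3.1;net5.0), and that both runtimes are installed on your machine. The two monikers are just examples; RuntimeMoniker.Net48 would target .NET Framework on Windows.

```csharp
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

// One job per runtime: every benchmark below is run once on each.
[SimpleJob(RuntimeMoniker.NetCoreApp31)]
[SimpleJob(RuntimeMoniker.Net50)]
[MemoryDiagnoser]
public class MyTestClass
{
    [Benchmark(Baseline = true)]
    public string GiveMeAnA()
    {
        var output = string.Empty;
        for (int i = 0; i < 100; i++)
        {
            output += "a";
        }
        return output;
    }

    [Benchmark]
    public string GiveMeAnAWithStringBuilder()
    {
        var output = new StringBuilder();
        for (int i = 0; i < 100; i++)
        {
            output.Append('a');
        }
        return output.ToString();
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        // Run in Release mode; the summary lists each method once per runtime.
        BenchmarkRunner.Run<MyTestClass>();
    }
}
```

The summary table should then show each method once per runtime, which makes it easy to see whether a refactor helps equally everywhere.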

In Part II of this series, we’ll look at how you can evaluate a larger refactor under production load. Until then, happy benchmarking!
