Improving FreeImage.Standard performance with Span<T>

  • Span<T>: represents a contiguous region of memory (managed or unmanaged)
  • ArrayPool<T>: provides a pool of reusable arrays
  • stackalloc: allocates a block of memory on the stack
  • Span<T> is a stack-only type (a ref struct), so you can’t store it in a field or property of a class
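To make the three building blocks above concrete, here is a minimal sketch. The class and method names (SpanBasics, Sum, SumFromStack, SumFromPool) are made up for illustration and are not part of FreeImage.Standard. It assumes C# 7.2 or later, plus the System.Memory package on older frameworks.

```csharp
using System;
using System.Buffers;

// Hypothetical demo class, not part of FreeImage.Standard.
public static class SpanBasics
{
    // A span is a ref struct, so it cannot be stored in a class field;
    // passing it as a parameter is the usual way to hand it around.
    public static int Sum(ReadOnlySpan<byte> data)
    {
        int total = 0;
        foreach (byte b in data)
        {
            total += b;
        }
        return total;
    }

    public static int SumFromStack()
    {
        // stackalloc: a small scratch buffer on the stack, no heap allocation.
        Span<byte> scratch = stackalloc byte[4];
        scratch.Fill(2);
        return Sum(scratch);
    }

    public static int SumFromPool()
    {
        // Rent may return an array larger than requested, so work on a slice.
        byte[] rented = ArrayPool<byte>.Shared.Rent(4);
        try
        {
            Span<byte> window = rented.AsSpan(0, 4);
            window.Fill(3);
            return Sum(window);
        }
        finally
        {
            // Hand the array back; it must not be used after this point.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

The same Sum method accepts stack memory, a pooled array, or a plain byte[], which is the main attraction of writing code against spans.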

Tools

Initial Performance

Basic Improvements

static unsafe uint streamWrite(IntPtr buffer, uint size, uint count, fi_handle handle) {
    Stream stream = handle.GetObject() as Stream;
    uint writeCount = 0;
    // Allocate a new byte[] every time
    byte[] bufferTemp = new byte[size];
    byte* ptr = (byte*)buffer;
    while (writeCount < count) {
        // Copy the unmanaged buffer into the new array,
        // one byte at a time
        for (int i = 0; i < size; i++, ptr++) {
            bufferTemp[i] = *ptr;
        }
        try {
            stream.Write(bufferTemp, 0, bufferTemp.Length);
        } catch {
            return writeCount;
        }
        writeCount++;
    }
    return writeCount;
}

static unsafe uint streamWrite(IntPtr buffer, uint size, uint count, fi_handle handle) {
    Stream stream = handle.GetObject() as Stream;
    int sizeInt = (int)size;
    // Use the shared ArrayPool for byte arrays
    var arrayPool = ArrayPool<byte>.Shared;
    byte* ptr = (byte*)buffer;
    uint writeCount = 0;
    // Rent an existing buffer from the ArrayPool
    byte[] managedBuffer = arrayPool.Rent(sizeInt);
    try {
        while (writeCount < count) {
            // Represent the source (unmanaged) buffer as a
            // ReadOnlySpan<byte>
            var source = new ReadOnlySpan<byte>(ptr, sizeInt);
            ptr += sizeInt;
            // Copy from the source to the managed buffer - rather
            // than doing this one byte at a time, just leave the
            // copy mechanism to the runtime (I am assuming it will
            // do a block copy for performance)
            source.CopyTo(managedBuffer);
            stream.Write(managedBuffer, 0, sizeInt);
            writeCount++;
        }
        return writeCount;
    } finally {
        // Return the rented buffer back to the pool for reuse -
        // do not access managedBuffer again!
        arrayPool.Return(managedBuffer);
    }
}
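To try the pooled copy pattern without FreeImage, a cut-down harness along the following lines may help. It is only a sketch: PooledCopyDemo, WriteBlocks and RoundTrip are hypothetical names, the Stream is passed in directly rather than resolved from an fi_handle, and a pinned managed array stands in for the unmanaged buffer. It needs to be compiled with unsafe code enabled.

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical stand-in for streamWrite, not part of FreeImage.Standard.
public static class PooledCopyDemo
{
    public static unsafe uint WriteBlocks(IntPtr buffer, uint size, uint count, Stream stream)
    {
        int sizeInt = (int)size;
        byte* ptr = (byte*)buffer;
        byte[] managedBuffer = ArrayPool<byte>.Shared.Rent(sizeInt);
        try
        {
            uint writeCount = 0;
            while (writeCount < count)
            {
                // View the next block of the source as a span and copy it in bulk.
                new ReadOnlySpan<byte>(ptr, sizeInt).CopyTo(managedBuffer);
                ptr += sizeInt;
                // Write sizeInt bytes, not managedBuffer.Length:
                // the rented array may be larger than requested.
                stream.Write(managedBuffer, 0, sizeInt);
                writeCount++;
            }
            return writeCount;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(managedBuffer);
        }
    }

    // Pins a managed array so it can play the role of the unmanaged buffer.
    public static unsafe byte[] RoundTrip(byte[] input, uint size, uint count)
    {
        using (var output = new MemoryStream())
        {
            fixed (byte* p = input)
            {
                WriteBlocks((IntPtr)p, size, count, output);
            }
            return output.ToArray();
        }
    }
}
```

Calling PooledCopyDemo.RoundTrip with a six-byte array, a size of 3 and a count of 2 should hand back the same six bytes, which makes for a quick sanity check that the block arithmetic is right.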

Real-World Impact

  • Throughput up by 984 ops/day (+20%)
  • Reduced allocations by 3.2 TB/day (-16%)
  • Reduced Gen0 collections by 79%
  • Reduced Gen1 collections by 25%
  • Reduced number of allocations by 74%
  • System.Byte[] has gone from (by far) the highest source of allocations to the third highest
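The figures above come from a production workload. To see the same effect in miniature, a rough probe like the one below compares the bytes allocated by fresh arrays with those allocated when renting from the pool. AllocationProbe and its methods are hypothetical names, and this is a crude micro-measurement rather than a substitute for a profiler or a proper benchmark harness. It assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread.

```csharp
using System;
using System.Buffers;

// Hypothetical probe, for illustration only.
public static class AllocationProbe
{
    // Bytes allocated on this thread when a new array is created per iteration.
    public static long MeasureFresh(int size, int iterations)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            byte[] buffer = new byte[size];
            GC.KeepAlive(buffer); // keep the allocation observable
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Bytes allocated on this thread when the array is rented and returned.
    public static long MeasurePooled(int size, int iterations)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            byte[] buffer = ArrayPool<byte>.Shared.Rent(size);
            GC.KeepAlive(buffer);
            ArrayPool<byte>.Shared.Return(buffer);
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

With, say, 64 KB buffers and a thousand iterations, the pooled figure should come out far lower than the fresh one, since the pool only has to allocate the array once and then reuses it.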

Conclusion

Additional Resources

Ben Owen

Full time nerd. Professional eater of cake.
