File Streaming Performance in dotnet

A multi-part file streaming upload package.

TL;DR: I released the UploadStream package for dotnet, which improves multi-part file upload performance with ~25% reduced CPU usage (execution time, us) and ~50% less memory impact (gen0 GC).

Recently I became involved in an angular/dotnet project that was uploading photos. It turned out to be performing file uploads via JSON, converting the files to and from base64. In addition to this, several legacy systems required upgrading from file system storage to blob storage, presenting another use case for architecting a file API.
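
To put a number on the base64 conversion, here is a minimal sketch of the size overhead alone. The class and method names are made up for illustration; only Convert.ToBase64String comes from the framework.

```csharp
using System;

// Hypothetical helper, not part of UploadStream.
public static class Base64Overhead
{
    // Base64 emits 4 characters for every 3 input bytes, padding the last group.
    public static long EncodedLength(long byteCount) => 4 * ((byteCount + 2) / 3);

    // Encodes a zero-filled buffer of the given size and reports the real string length.
    public static int MeasuredLength(int byteCount) =>
        Convert.ToBase64String(new byte[byteCount]).Length;
}
```

For a hypothetical 3,000,000-byte photo both methods return 4,000,000 characters, a third more than the raw bytes. A .NET string also stores each character as two bytes, so the encoded payload occupies more memory again until it is decoded.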

In order to design a file API with confidence that it would scale, I was curious about the performance difference between the default method of upload (model binding to IEnumerable<IFormFile>, which buffers the entire stream) and streaming large files as described by the Microsoft documentation. I was also keen to highlight how both compare with uploading and converting from base64.

The first issue I came across was that the code in the Microsoft documentation did not lend itself to a simple implementation. This provided the motivation to rewrite the code and release it as a NuGet package: UploadStream.

As shown in the code, implementation is now relatively straightforward: specify the [DisableFormModelBinding] attribute (which, crazily enough, disables form model binding) and use this.StreamFiles<T>(x => {}) to process the stream via the specified delegate, which also returns a validated model.

I believe this implementation provides full flexibility to process the streams as desired, while still maintaining the typical strongly-typed, generic model binding functionality. An overload, this.StreamFiles(x => {}), is provided if no model binding is required. As this is implemented as an extension method on the controller, the this keyword is required, because calling an extension method from within the type without an explicit receiver is not a supported language feature.
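
For contrast, here is a rough sketch of the two approaches side by side in one controller. The [DisableFormModelBinding] attribute and this.StreamFiles<T> come from the description above. Everything else is an assumption made for illustration: the UploadStream namespace, that StreamFiles is awaitable and hands the delegate an IFormFile-style object, and the UploadModel, FilesController, routes and CountBytesAsync names. Check the package README for the exact signatures.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using UploadStream; // assumed namespace for [DisableFormModelBinding] and StreamFiles

// Hypothetical model: the non-file form fields posted alongside the files.
public class UploadModel
{
    public string Description { get; set; }
}

[Route("api/files")]
public class FilesController : Controller
{
    // Default approach: model binding buffers every file before the action body runs.
    [HttpPost("buffered")]
    public async Task<IActionResult> Buffered(IEnumerable<IFormFile> files)
    {
        long total = 0;
        foreach (var file in files)
            using (var stream = file.OpenReadStream())
                total += await CountBytesAsync(stream);
        return Ok(new { total });
    }

    // Streaming approach: form model binding is disabled and each file part
    // is passed to the delegate instead.
    [HttpPost("stream")]
    [DisableFormModelBinding]
    public async Task<IActionResult> Streamed()
    {
        long total = 0;
        // Assumption: StreamFiles<T> is awaitable and the delegate gets an IFormFile.
        UploadModel model = await this.StreamFiles<UploadModel>(async file =>
        {
            using (var stream = file.OpenReadStream())
                total += await CountBytesAsync(stream);
        });

        if (!ModelState.IsValid)
            return BadRequest(ModelState);
        return Ok(new { model.Description, total });
    }

    // Stand-in for real processing (for example writing to blob storage):
    // drains the stream through a fixed buffer and returns the byte count.
    public static async Task<long> CountBytesAsync(Stream stream)
    {
        var buffer = new byte[81920];
        long total = 0;
        int read;
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            total += read;
        return total;
    }
}
```

The processing code is identical in both actions, so the difference the benchmarks measure comes from how the request body reaches it.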

As for performance: although this streaming method appears to be Microsoft's recommended approach for large files, some performance metrics would be useful, basically answering the question “is it worthwhile implementing for typical photo-sized images?”

I used the awesome BenchmarkDotNet to test a variety of aspects. Note that the use of BenchmarkDotNet is only indicative of performance improvements, and real-world load testing should be performed (let me know the results if you do!).

The following were of interest:

  • Performance of streaming vs default model binding with IFormFile
  • Impact of uploading files as base64
  • Upload performance on a range of file sizes: 6 KB, ~100 KB, ~860 KB, ~6 MB, ~22 MB, ~124 MB
  • Performance of multiple API requests vs one API request with multiple files: how much impact on the system does this have?
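
For readers unfamiliar with BenchmarkDotNet, here is a minimal sketch of how such a comparison can be set up. This is not the benchmark suite behind the results below: it only compares an in-memory copy with a base64 decode, and the class name, method names and sizes are made up for illustration.

```csharp
using System;
using System.IO;
using BenchmarkDotNet.Attributes;

// Illustrative only: not the benchmark code used for this article.
[MemoryDiagnoser] // adds Gen0/Gen1/Gen2 collections per 1000 operations and allocated bytes
public class PayloadBenchmarks
{
    [Params(100_000, 860_000)] // example payload sizes in bytes
    public int FileSizeBytes { get; set; }

    private byte[] _raw;
    private string _base64;

    [GlobalSetup]
    public void Setup()
    {
        _raw = new byte[FileSizeBytes];
        new Random(42).NextBytes(_raw);
        _base64 = Convert.ToBase64String(_raw);
    }

    // Baseline = true makes the other benchmarks report as a ratio of this one.
    [Benchmark(Baseline = true)]
    public long RawCopy()
    {
        using (var source = new MemoryStream(_raw))
        using (var target = new MemoryStream())
        {
            source.CopyTo(target);
            return target.Length;
        }
    }

    [Benchmark]
    public long Base64Decode()
    {
        byte[] decoded = Convert.FromBase64String(_base64);
        return decoded.Length;
    }
}
```

Running it with BenchmarkRunner.Run<PayloadBenchmarks>() from a Release build produces a table with a ratio column relative to the baseline and garbage collection counts per 1000 operations, the same kind of normalised figures quoted in this article.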

Results were great, showing a convergence on ~25% reduced CPU usage (us) and ~50% less memory impact (gen0 garbage collection) for typical image sizes, over 6 MB. For smaller files (10 KB to 1 MB), improvements range from 5% to 20% reduced CPU usage and 5% to 30% reduced memory impact.

As expected, base64 performance was abysmal: ~5x to 20x worse for the majority of file sizes.

streaming performance, results normalised to default IFormFile model binding

Results are normalised against IFormFile. Performance is measured as execution time (us), and memory impact as the number of gen0 garbage collections per 1k ops, highlighting the impact on memory.

Unsurprisingly, base64 also shows heavy heap thrashing, with gen1 and gen2 garbage collections.

Uploading many files separately (load testing 20x API calls in parallel) shows little difference between IFormFile and StreamFiles for small file sizes (~100 KB). Beyond this, the differences become more pronounced: 33% improved performance and 65% less memory impact, while base64 degrades further, to 40x worse than the default IFormFile.

20x API calls, demonstrates further performance improvements when streaming

Uploading 20x files in one call with StreamFiles shows a similar performance improvement to single file uploads, ~25% over IEnumerable<IFormFile>. The memory impact, however, improves to ~60% less than IEnumerable<IFormFile> model binding.

Generally, uploading 20 files in one API call offers a ~2.5x improvement in performance compared to 20x API calls.

20x file uploads in one API call comparison.
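
From the client side, the two request shapes compared above can be sketched with HttpClient's multipart types. The helper names, the "files" field name and the file names are assumptions for illustration, not part of UploadStream.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical client-side helpers for building the two request shapes.
public static class UploadRequests
{
    // One multipart/form-data body carrying every file under the same field name.
    public static MultipartFormDataContent BuildSingleRequest(
        IReadOnlyList<byte[]> files, string fieldName = "files")
    {
        var content = new MultipartFormDataContent();
        for (int i = 0; i < files.Count; i++)
        {
            var part = new ByteArrayContent(files[i]);
            part.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
            content.Add(part, fieldName, $"photo-{i}.jpg");
        }
        return content;
    }

    // One multipart body per file, to be sent as separate (possibly parallel) requests.
    public static List<MultipartFormDataContent> BuildSeparateRequests(
        IReadOnlyList<byte[]> files, string fieldName = "files")
    {
        var requests = new List<MultipartFormDataContent>();
        for (int i = 0; i < files.Count; i++)
            requests.Add(BuildSingleRequest(new[] { files[i] }, fieldName));
        return requests;
    }
}
```

Each body can then be posted with HttpClient.PostAsync. The single-request form pays the per-request overhead once rather than 20 times, which fits the gap reported above.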

Next steps would be to perform load tests on a server to validate/quantify the actual performance impact/improvement.

Detailed results can be found on the GitHub project page.