AWS S3 .NET Client High Memory Usage

Problem discovery

One of the things we do at Codeweavers is help people find their next vehicle. That usually involves customers seeing the vehicle they are buying. I mean, would you buy a car without seeing what it looks like? The application that holds this responsibility is the worst offender for obscene amounts of allocations, time spent in GC, and generally eating RAM like the Cookie Monster eats, well, cookies.

0;chunk-signature=48ebf1394fcc452801d4ccebf0598177c7b31876e3fbcb7f6156213f931b261d
  1. Images are pushed to us
  2. We pull images from an SFTP server
procdump64.exe -ma -64 AWS-S3.exe

Why is it a problem?

The LOH is a region of memory that is collected but, by default, never compacted. As of .NET 4.5.1 compaction is possible on request, but a word of warning: compacting the LOH is expensive, at around 2.3 milliseconds per megabyte. A good rule of thumb is that short-lived objects should never make it onto the LOH.
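If you do need to compact the LOH, it can be requested on demand. The following is a minimal sketch rather than anything we run in production: the class and method names are made up for illustration, and the estimate simply applies the rough 2.3 milliseconds per megabyte figure quoted above.

```csharp
using System;
using System.Runtime;

// Hypothetical helper for illustration; not part of the AWS SDK.
public static class LohCompaction
{
    // Rough cost quoted in the article: around 2.3 ms per MB of LOH.
    private const double MillisecondsPerMegabyte = 2.3;

    // Back-of-the-envelope pause estimate for compacting an LOH of the given size.
    public static double EstimatePauseMilliseconds(double lohMegabytes) =>
        lohMegabytes * MillisecondsPerMegabyte;

    // Asks the GC to compact the LOH during the next blocking gen 2 collection,
    // then forces that collection. The setting resets to Default afterwards.
    public static void CompactOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```

Treat the estimate as a back-of-the-envelope number; the real pause depends on the machine and the state of the heap.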

Introducing the best magic number — 81,920

Thanks to dotTrace we were able to establish exactly what was causing the LOH fragmentation. It also showed us that the fixed cost of 0.3 MB per invocation of PutObject happened inside of the constructor for ChunkedUploadWrapperStream:-

  1. Use System.Buffers to rent two byte[] arrays from a pool of byte[] arrays
  2. Use Microsoft.IO.RecyclableMemoryStream and operate directly on the incoming stream using a pool of streams
  3. Expose DefaultChunkSize so that consumers of the API can set it themselves
  4. Lower DefaultChunkSize to a number that is below the LOH threshold (85,000 bytes)
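To make the first and fourth options a little more concrete, here is a minimal sketch. It is not the change that went into the SDK: `ChunkBuffers`, `IsFreshlyAllocatedOnLoh` and `CopyWithPooledBuffer` are hypothetical names, and it assumes the default LOH threshold of 85,000 bytes. It shows that a freshly allocated 81,920 byte array stays off the LOH, and how a buffer can be rented from `ArrayPool<byte>` instead of being allocated on every call.

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical helper for illustration; not the SDK's implementation.
public static class ChunkBuffers
{
    // Below the 85,000 byte LOH threshold.
    public const int ChunkSize = 81920;

    // LOH objects are reported as the highest generation. This check is only
    // meaningful straight after allocation, because a small array that
    // survives long enough is promoted to gen 2 as well.
    public static bool IsFreshlyAllocatedOnLoh(byte[] buffer) =>
        GC.GetGeneration(buffer) == GC.MaxGeneration;

    // Copies source to destination using a rented buffer, returning bytes copied.
    public static long CopyWithPooledBuffer(Stream source, Stream destination)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(ChunkSize);
        long total = 0;
        try
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
                total += read;
            }
        }
        finally
        {
            // Always hand the buffer back so the next caller can reuse it.
            ArrayPool<byte>.Shared.Return(buffer);
        }
        return total;
    }
}
```

One caveat: `Rent` may hand back an array larger than requested, so a pooled buffer can itself live on the LOH. The difference is that it is allocated once and reused, rather than allocated afresh for every upload.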

Idle hands

Whilst waiting for my pull request to be reviewed I decided to poke around the AWS S3 documentation and I stumbled across the concept of pre-signed URLs. That sounds interesting! Creating V2 of the uploader:-
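As a rough sketch of the idea (not the exact uploader, and with hypothetical names such as `PreSignedUploader`): ask the S3 client for a pre-signed PUT URL, then send the bytes with a plain `HttpClient`, bypassing `PutObject` altogether. The five minute expiry is an assumption, and the sketch needs the AWSSDK.S3 package plus valid credentials to run.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

// Hypothetical uploader for illustration only.
public static class PreSignedUploader
{
    // One shared HttpClient for the lifetime of the process.
    private static readonly HttpClient Http = new HttpClient();

    public static async Task UploadAsync(
        AmazonS3Client s3, string bucketName, string key, Stream content)
    {
        var request = new GetPreSignedUrlRequest
        {
            BucketName = bucketName,
            Key = key,
            Verb = HttpVerb.PUT,
            // Assumed expiry; pick whatever suits your use case.
            Expires = DateTime.UtcNow.AddMinutes(5)
        };

        // Signing happens locally; no request is sent to S3 at this point.
        string url = s3.GetPreSignedURL(request);

        using (var body = new StreamContent(content))
        using (HttpResponseMessage response = await Http.PutAsync(url, body))
        {
            response.EnsureSuccessStatusCode();
        }
    }
}
```

Any header that is part of the signature must be sent with exactly the same value on the PUT, otherwise S3 rejects the request.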

Just one more thing

Thanks to my last article I have learnt what nerd sniping is, something I do to myself quite a lot. At this stage I was feeling giddy about what else could be shaved off, and I was looking squarely at the 0.4 MB remaining on the LOH. Again, dotTrace points us in the direction of the code path causing that 0.4 MB allocation to the LOH:-

https://##bucket_name##.s3.##region_name##.amazonaws.com/##path##/##file_name##?X-Amz-Expires=300&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=##access_key##/20180613/##region_name##/s3/aws4_request&X-Amz-Date=20180613T233349Z&X-Amz-SignedHeaders=host;x-amz-acl&X-Amz-Signature=6bbcb0f802ad86022674e827d574b7a34a00ba76cd1411016c3581ba27fa5450
  1. 2018-03-07: Issue created on the aws-sdk-net repository
  2. 2018-03-13: Pull request sent in
  3. 2018-03-29: Pull request merged
  4. 2018-03-29: New version of AWSSDK.Core released to NuGet

TLDR — Give me the good stuff

Versions of AWSSDK.Core below 3.3.21.19 caused a fixed cost of 0.3 MB per invocation of PutObject on the AWS S3 .NET client. This was rectified in versions 3.3.21.19 and above. For particularly hot code paths, it is worth exploring the use of GetPreSignedURL on the AWS S3 .NET client as that dropped LOH allocations by 98% in our context and use case.

Footnotes

¹ Another reason may be that WinDbg still scares me.
