Did the big allocations of RAM contain pointers, directly or indirectly (actual pointers, strings, interface or function values, maps, or slices)?
If you allocate a giant []byte, []int, etc., or a slice or array of pointerless structs, Go marks the memory as ‘noscan’, so the GC’s mark phase doesn’t have to look inside it. Getting there can require gimmicks like using numeric IDs/offsets instead of pointers, much as if you were implementing an on-disk database, but you can keep using ordinary language constructs to allocate instead of having to go to the OS. (Noscan allocations can still add time to the sweep phase, but I think that’d be low for a few large allocations.)
I’m also curious what the user-facing latency numbers looked like in the version where the GC was spending a lot of time. That’s something you can really only learn from real apps, and I haven’t seen much written about it.