It’s interesting to see the parallels between web development and native graphics. Browsers batch updates to the DOM as well, and any attempt to read back computed data (layout or style values, for instance) forces a synchronous layout (reflow): the pending DOM updates have to complete before the read can return, which can cost performance.
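To make the browser side concrete, here is a rough sketch of why interleaving reads and writes hurts. It does not touch a real DOM: `MockLayout` and `MockBox` are made-up stand-ins that only count how often a read has to flush pending writes, under the simplifying assumption that any write invalidates layout for the whole page.

```typescript
// Hypothetical model, not a browser API: one shared "layout is stale" flag.
class MockLayout {
  dirty = false;
  forcedLayouts = 0;
}

class MockBox {
  private readonly layout: MockLayout;
  private width: number;

  constructor(layout: MockLayout, width: number) {
    this.layout = layout;
    this.width = width;
  }

  // Reading a computed value must first flush any pending writes.
  getWidth(): number {
    if (this.layout.dirty) {
      this.layout.forcedLayouts++;
      this.layout.dirty = false;
    }
    return this.width;
  }

  // Writing is cheap: it only marks the layout as stale.
  setWidth(w: number): void {
    this.width = w;
    this.layout.dirty = true;
  }
}

// Read, write, read, write...: every read after the first hits stale layout.
function interleaved(boxes: MockBox[]): void {
  for (const b of boxes) {
    b.setWidth(b.getWidth() * 2);
  }
}

// All reads first, then all writes: no read ever sees stale layout.
function batched(boxes: MockBox[]): void {
  const reads = boxes.map((b) => ({ box: b, width: b.getWidth() }));
  for (const r of reads) {
    r.box.setWidth(r.width * 2);
  }
}

// Runs one strategy on n fresh boxes and reports the forced flushes.
function countForcedLayouts(run: (boxes: MockBox[]) => void, n: number): number {
  const layout = new MockLayout();
  const boxes = Array.from({ length: n }, () => new MockBox(layout, 10));
  run(boxes);
  return layout.forcedLayouts;
}
```

In this toy model the interleaved loop forces a flush on every read after the first, while the batched version forces none. Real browsers are more nuanced, but the read-then-write ordering is the usual advice.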
GC pauses are the bane of any interactive application. The somewhat-obvious-once-you-think-about-it-but-often-overlooked fact is that GC time scales with the amount of live memory, rather than just the amount of garbage (some GCs in fact don’t care about the amount of garbage at all). Producing garbage is what triggers a collection, though, and with it that live-memory-scaled overhead. Good generational GCs have various tricks and heuristics to deal with this. It’s surprising that Unity doesn’t provide an API that lets you reuse the buffer (perhaps they see this API as meant only for debugging or screenshots, not continuous capture?).
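For what I mean by reusing the buffer, here is a minimal sketch. It is not Unity’s API: `FrameGrabber` and the `Capture` callback are hypothetical names, and the RGBA layout (4 bytes per pixel) is an assumption. The point is only that the allocation happens once, so continuous capture produces no per-frame garbage.

```typescript
// Hypothetical capture source: writes the current frame's pixels into `out`.
type Capture = (out: Uint8Array) => void;

class FrameGrabber {
  // Allocated once up front and overwritten on every grab.
  private readonly buffer: Uint8Array;

  constructor(width: number, height: number) {
    // Assumes 4 bytes per pixel (RGBA).
    this.buffer = new Uint8Array(width * height * 4);
  }

  // Fills the shared buffer in place and returns it; no new allocation.
  grab(capture: Capture): Uint8Array {
    capture(this.buffer);
    return this.buffer;
  }
}

// Usage: two "frames" land in the same underlying buffer.
const grabber = new FrameGrabber(2, 2);
const first = grabber.grab((out) => out.fill(1));
const second = grabber.grab((out) => out.fill(2));
```

The trade-off is that the returned array is only valid until the next `grab`; a caller that wants to keep a frame around has to copy it.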
I’m not familiar with Unity, but what keeps rendering from happening twice, once to the real frame-buffer, and once to the texture that you are reading back?