Optimising memory consumption on a Rails monolith

Abunachar Yeahhia · Published in GoComet · 3 min read · Jun 10, 2024

Around last December, our application servers were using ~70 GB of memory to serve requests at peak hours. As part of the platform team, I started investigating ways to bring down our memory usage and discovered jemalloc.

memory bloat on our application server

In this article, I’ll share the root cause of our memory bloat and how jemalloc affected our overall memory utilisation.

Memory Allocators

Memory allocation in Ruby involves three layers, ordered from high to low level:

  1. The Ruby interpreter manages Ruby objects.
  2. The operating system’s memory allocator library.
  3. The kernel.

Ruby interpreter

On the Ruby side, Ruby organises objects in memory areas called Ruby heap pages. Each Ruby heap page is split into equal-sized slots, and every object, whether it’s a string, hash table, array, class, or anything else, occupies exactly one slot.
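You can see these pages and slots from any Ruby process via GC.stat; a minimal sketch (the key names assume a reasonably recent MRI):

    # Inspect MRI's object heap: pages and the slots inside them.
    stats = GC.stat

    puts "Ruby heap pages:        #{stats[:heap_allocated_pages]}"
    puts "live object slots:      #{stats[:heap_live_slots]}"
    puts "free object slots:      #{stats[:heap_free_slots]}"
    puts "objects allocated ever: #{stats[:total_allocated_objects]}"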

Operating System’s Memory Allocator

The operating system’s memory allocator is a library; on Linux the default one is part of glibc (the C standard library). It has a simple API:

  • Memory is allocated by calling malloc(size). You pass it the number of bytes you want to allocate, and it returns either the address of the allocation or an error.
  • Allocated memory is freed by calling free(address).
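Ruby code almost never calls this API directly, but the fiddle standard library exposes it if you want to see the round trip yourself; a minimal sketch:

    require "fiddle"

    # Ask the C allocator (glibc malloc by default) for 1 KB of raw memory...
    address = Fiddle.malloc(1024)
    puts format("malloc returned address 0x%x", address)

    # ...and give it back when we are done with it.
    Fiddle.free(address)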

Kernel

The kernel can only allocate memory in units of 4 KB. One such 4 KB unit is called a page (not to be confused with Ruby heap pages, which are a separate concept).

The reason for this is complicated, but suffice it to say that all modern kernels have this property.
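You can ask the kernel what page size it uses straight from Ruby; on most Linux machines this prints 4096:

    require "etc"

    # The kernel hands memory to the process in page-sized units.
    puts Etc.sysconf(Etc::SC_PAGESIZE)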

Allocating memory via the kernel also has a significant performance impact, so memory allocators try to minimize the number of kernel calls.

Memory issues due to memory fragmentation

What is memory fragmentation?

memory fragmentation

Imagine the heap as your Lego box. It's where Ruby objects are allocated and stored.

  • Over time, your app creates and destroys objects, leaving behind “holes” in the heap like missing Lego pieces.
  • These holes are small, scattered chunks of memory that can’t be used for larger objects.
  • When your app needs a big block of memory (e.g., to load a large image), it has to scan the fragmented heap, which takes longer, and if no suitable chunk is found the allocator has to request fresh pages from the kernel. That poorly reused memory is what shows up as bloat (a rough way to observe this is sketched below).
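One rough way to see whether memory is being held by the allocator rather than by live Ruby objects is to compare Ruby’s own accounting with the process RSS. This is a Linux-only sketch and only an approximation, not the method behind the numbers in this post:

    require "objspace"

    # What Ruby thinks its live objects occupy, in bytes (an approximation).
    ruby_bytes = ObjectSpace.memsize_of_all

    # What the kernel says the whole process occupies (resident set size, in kB).
    rss_kb = File.read("/proc/self/status")[/VmRSS:\s+(\d+)/, 1].to_i

    puts "live Ruby objects: ~#{ruby_bytes / 1024 / 1024} MB"
    puts "process RSS:       ~#{rss_kb / 1024} MB"
    # A large, growing gap between the two suggests memory is being retained by
    # the allocator (fragmentation and arena overhead) rather than by objects.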

Mitigating memory fragmentation

Reducing arenas
The major cause of fragmentation appears to be the large number of glibc memory arenas created in heavily multi-threaded programs. “Heavily multi-threaded” should sound familiar: that’s Sidekiq. Capping the number of arenas glibc may create, typically by setting the MALLOC_ARENA_MAX environment variable to 2, removes the “heavily multi-threaded” trigger and leads to less bloat.
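glibc reads MALLOC_ARENA_MAX before any Ruby code runs, so it has to be set in the process environment (systemd unit, Dockerfile, Procfile, and so on). As a small safety net, a hypothetical initializer like the one below could warn when the cap is missing; the file name and check are illustrative, not something glibc or Rails provides:

    # config/initializers/malloc_arena_check.rb (hypothetical file name)
    # The variable must already be set when the process starts; this only
    # verifies that it was, and nags in the logs if it was not.
    unless ENV["MALLOC_ARENA_MAX"]
      Rails.logger.warn(
        "MALLOC_ARENA_MAX is not set; glibc may create up to 8 arenas per core " \
        "and bloat memory under heavily multi-threaded workloads like Sidekiq."
      )
    end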

Change memory allocator
Change the default memory allocator from glibc to a different one, such as jemalloc, which is designed to keep fragmentation low in multi-threaded workloads.
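Whichever route you take (building Ruby with --with-jemalloc or preloading libjemalloc via LD_PRELOAD before starting Puma/Sidekiq), it is worth confirming the allocator is really in place. A Linux-only sketch:

    require "rbconfig"

    # Was Ruby itself compiled against jemalloc?
    built_with = RbConfig::CONFIG.values_at("LIBS", "MAINLIBS").join.include?("jemalloc")

    # Or was libjemalloc mapped into the process, e.g. via LD_PRELOAD?
    preloaded = File.read("/proc/self/maps").include?("jemalloc")

    puts "compiled with jemalloc:      #{built_with}"
    puts "jemalloc mapped in process:  #{preloaded}"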

Jemalloc vs glibc

Jemalloc in action

The results were impressive, much better than we expected: after rolling out jemalloc, our peak memory consumption came down from ~70 GB to ~20 GB.
