Precise Method and Constant Invalidation in TruffleRuby

Brandon Fish
Published in graalvm, Aug 30, 2021

Co-authored by Benoit Daloze

In this blog post, we explore precise method and constant invalidation, which reduces the number of deoptimizations in typical Ruby workloads from hundreds to zero and significantly improves startup performance.

When a method is called in Ruby, an expensive method lookup traverses the module hierarchy to find the correct method to call. The module hierarchy includes the receiver’s class, parent classes, and prepended and included modules. A similar search happens for constants, and the following discussion on caching applies to them as well. To avoid these expensive lookups, Ruby implementations use inline caching based on assumptions that classes and methods have not changed.
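The lookup order described above can be inspected from plain Ruby. The following is a minimal sketch; the names Parent, Child, Greeting, and Loud are hypothetical and chosen only for illustration:

```ruby
# Hypothetical module that will be included into Child.
module Greeting
  def hello; "Greeting#hello"; end
end

# Hypothetical module that will be prepended to Child.
module Loud
  def hello; super.upcase; end
end

class Parent
  def hello; "Parent#hello"; end
end

class Child < Parent
  include Greeting
  prepend Loud
end

# Lookup visits prepended modules first, then the class itself,
# then included modules, then the parent class chain.
chain  = Child.ancestors.take(4)
# The first module in that chain defining `hello` owns the method found.
owner  = Child.instance_method(:hello).owner
result = Child.new.hello

puts chain.inspect
puts owner
puts result
```

Here the chain starts with Loud, Child, Greeting, Parent, so a call to hello stops at Loud, whose super call continues the same search from the next module in the chain.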

These method lookups will typically result in the same method entry for a given receiver, assuming that the receiver’s methods have not changed. This means that lookups can be cached based upon the methods in this hierarchy not changing. Methods may be defined, redefined, or removed anywhere in this hierarchy. A method change may or may not need to invalidate a method cache depending on whether the change happens in the hierarchy before or at the previously found cached method. The following illustrates how method lookup works:

Method Lookup for First Call — Per Class Caching

Defining a new method bar in the following figure results in all method lookups in class B and child classes of class B being invalidated, even though they would not need to be (c.foo still calls the same B#foo):

Defining a New Method — Per Class Caching
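The point that defining bar leaves the result of c.foo unchanged can be checked directly. This is a small sketch with hypothetical classes A, B, and C modeled on the names used in the text; it shows only the language-level behavior, not how any implementation caches it:

```ruby
class A
  def foo; "A#foo"; end
end

class B < A
  def foo; "B#foo"; end
end

class C < B
end

c = C.new
owner_before = C.instance_method(:foo).owner
first_call   = c.foo

# Reopen B and add an unrelated method named bar.
class B
  def bar; "B#bar"; end
end

# The lookup of foo still resolves to the same method in B.
owner_after = C.instance_method(:foo).owner
second_call = c.foo

puts owner_before, owner_after, first_call, second_call
```

With per class caching, the change to B would still invalidate the cached lookup of foo, even though the lookup result is identical before and after.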

In CRuby versions before 3.0 and TruffleRuby versions before 21.2, constant and method lookups were cached on the assumption that no method had been redefined, added, or removed in any module traversed by the lookup. So, if any method changed in a module or class, all cached method lookups that traversed that module or class were invalidated.

For instance, defining a new method on Kernel, like the following example, would invalidate all method caches with method lookups that traversed the Kernel module:

# Kernel is a module, so it is reopened with `module`, not `class`.
module Kernel
  def my_new_method
  end
end

It is common in Rails Active Support, for example, to reopen core classes and modules like this when it is loaded. For example, the core Object/Array/Hash classes are modified there. Ideally, we should not invalidate any existing just-in-time compiled (JITed) code when methods are added to existing classes like this.

This is an unfortunate case of unnecessary invalidation: updating modules is common in Ruby code, yet existing method lookups were invalidated even though the added method had no effect on their results. In TruffleRuby, an invalidated method cache harms performance in two ways: compiled code deoptimizes to the interpreter and must be recompiled, and the next call misses the cache and performs a full method lookup.

Method Caching Improvements in TruffleRuby 21.2

To avoid this unnecessary method cache invalidation, TruffleRuby now caches method lookups per method name, which we call “per method name caching”: each module keeps a separate entry for each method name, and that entry can be invalidated independently for lookups. As shown in the following example, a method lookup receives a list of assumptions, one for each ancestor traversed in the lookup chain, for that method name only:

Method Lookup for First Call — Per Method Name Caching

Then, when a new method is added, only the lookups of the exact methods which have been modified will need to be invalidated:

Defining a New Method — Per Method Name Caching
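The idea can be modeled in a few lines of Ruby. This is a toy sketch under stated assumptions, not TruffleRuby's implementation: the names Assumption, ToyModule, lookup, and still_valid? are all invented here, and the ancestor chain is passed in as a plain array.

```ruby
# Toy model only: none of these names come from TruffleRuby.
class Assumption
  def initialize; @valid = true; end
  def valid?; @valid; end
  def invalidate!; @valid = false; end
end

class ToyModule
  attr_reader :name, :methods_table

  def initialize(name)
    @name = name
    @methods_table = {} # method name => body
    # One assumption per method name, created lazily on first lookup.
    @assumptions = Hash.new { |hash, key| hash[key] = Assumption.new }
  end

  def assumption_for(method_name)
    @assumptions[method_name]
  end

  # Defining a method invalidates only the assumption for that name.
  def define(method_name, body)
    @methods_table[method_name] = body
    old = @assumptions.delete(method_name)
    old&.invalidate!
  end
end

# Walk the ancestor chain, collecting one assumption per visited module
# for this method name, and stop at the first module that defines it.
def lookup(ancestors, method_name)
  assumptions = []
  ancestors.each do |mod|
    assumptions << mod.assumption_for(method_name)
    if mod.methods_table.key?(method_name)
      return { owner: mod, assumptions: assumptions }
    end
  end
  { owner: nil, assumptions: assumptions }
end

# A cached lookup stays usable while all of its assumptions are valid.
def still_valid?(cached)
  cached[:assumptions].all?(&:valid?)
end

a = ToyModule.new("A")
b = ToyModule.new("B")
c = ToyModule.new("C")
a.define(:foo, -> { "A#foo" })
b.define(:foo, -> { "B#foo" })
chain = [c, b, a] # ancestors of C, nearest first

cached_foo = lookup(chain, :foo)   # finds B; holds assumptions from C and B

b.define(:bar, -> { "B#bar" })     # different name: cache untouched
valid_after_bar = still_valid?(cached_foo)

c.define(:foo, -> { "C#foo" })     # same name, earlier in the chain
valid_after_c_foo = still_valid?(cached_foo)

puts cached_foo[:owner].name, valid_after_bar, valid_after_c_foo
```

In this model the cached lookup of foo survives the definition of bar and is invalidated only when foo itself is defined in a module that the lookup traversed.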

Challenges

Ruby has many ways of updating methods in a receiver’s module chain, including prepend, include, extend, constant and method redefinition, and method undefinition. This makes it a challenge to ensure that each of these scenarios invalidates the required method caches correctly while not invalidating more than necessary.

Cache Invalidation Examples

The following examples illustrate a few scenarios of invalidating a method cache when modules or methods are changed.

This scenario shows how the method cache is invalidated when a method is defined in the module chain before other definitions:

Invalidation — Adding a New Method to a Child Class
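At the language level, this scenario looks as follows. The sketch uses hypothetical classes B and C; any cached lookup of foo made through C before the change must be discarded, because the result really does differ afterwards:

```ruby
class B
  def foo; "B#foo"; end
end

class C < B
end

c = C.new
before = c.foo # found in B after passing through C

# Define foo in the child class, earlier in the module chain than B#foo.
class C
  def foo; "C#foo"; end
end

after = c.foo # now found in C

puts before, after
```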

This next scenario shows how to invalidate the cache when a module is included earlier in the module chain than a previously used method:

Invalidation — Including a Module
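A small Ruby sketch of the include case, again with hypothetical names M, B, and C. The included module is inserted between C and its superclass B, so it shadows B#foo for instances of C:

```ruby
module M
  def foo; "M#foo"; end
end

class B
  def foo; "B#foo"; end
end

class C < B
end

c = C.new
before = c.foo

# Including M places it right after C in the ancestor chain, before B.
C.include(M)

after = c.foo
chain = C.ancestors.take(3)

puts before, after, chain.inspect
```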

Prepending is another challenging scenario: cached lookups of existing methods in the module being prepended to must be invalidated when the prepended module defines a method with the same name:

Invalidation — Prepending a Module
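The prepend case can be sketched the same way, with hypothetical names M and B. Unlike include, the prepended module is placed before B itself, so it shadows a method that B defines directly:

```ruby
module M
  def foo; "M#foo"; end
end

class B
  def foo; "B#foo"; end
end

b = B.new
before = b.foo

# Prepending M places it in front of B in B's own ancestor chain.
B.prepend(M)

after = b.foo
chain = B.ancestors.take(2)

puts before, after, chain.inspect
```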

The last example shows how invalidation occurs when a module is prepended and then a method with a previously used name is added to a prepended module:

Invalidation — Adding a Method to a Prepended Module

The above scenario has additional complexity since the lookup result for the foo method occurs after the prepended module M in the module chain. To invalidate foo correctly in this case, we track that M is prepended to B and also invalidate method names in B when they are invalidated in M.
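The following sketch reproduces this last scenario in plain Ruby, using the names M and B from the text. It shows only the observable behavior that the invalidation must preserve, not how TruffleRuby tracks the relationship internally:

```ruby
module M
end

class B
  prepend M
  def foo; "B#foo"; end
end

b = B.new
owner_before = B.instance_method(:foo).owner
before = b.foo # found in B, which comes after M in the chain

# Later, a method with the same name is added to the prepended module.
module M
  def foo; "M#foo then " + super; end
end

owner_after = B.instance_method(:foo).owner
after = b.foo # M#foo now runs first and reaches B#foo through super

puts owner_before, before, owner_after, after
```

Because the first lookup ended in B, a cache that only watched modules up to the found method would need to include M, which precedes B; this is why changes to foo in M must also reach lookups that resolved to B.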

These are just a subset of scenarios that might happen to invalidate a method cache. Other scenarios include extending modules, and removing or updating methods.

Drawbacks

The assumption that a method is not changed in TruffleRuby is represented by a Truffle Assumption object. The additional Assumptions consume more memory, since each method now has its own Assumption, whereas previously the entire module had a single Assumption.

Results

We tested against a benchmark which simulates a Rails blog application running the “rails routes” command.

Before
1422 total invalidations. 720 invalidations due to constants and methods being modified.

After
893 total invalidations. 4 invalidations due to constants or methods being modified.

Nearly all invalidations due to constant and method changes were eliminated by the more precise cache invalidation. This improves warmup: the methods and blocks that were previously invalidated no longer need to be recompiled, and they keep running in compiled code instead of falling back to the interpreter during recompilation.

To test a real-world application, Chris Seaton and Maple Ong helped us measure the number of method and constant invalidations on the Shopify Storefront Renderer during startup. Before this optimization, there were 992 invalidations due to methods being modified and 424 invalidations due to constants being modified. With precise invalidation, there are zero invalidations for both methods and constants during the Storefront Renderer startup! This reduced the application’s startup time by about 9% after updating method invalidation and about another 8% after updating constant invalidation, for a total reduction of about 16%.
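One way to read these percentages is that the two reductions compound rather than add. The short calculation below is only a sketch under that assumption: it treats the 8% as measured relative to the startup time after the first improvement, which the measurements above do not state explicitly, and the variable names are invented for illustration.

```ruby
# Assumption: each reduction applies to the startup time remaining
# after the previous step, so the two factors are multiplied.
method_reduction   = 0.09
constant_reduction = 0.08

remaining = (1 - method_reduction) * (1 - constant_reduction)
total_reduction_percent = ((1 - remaining) * 100).round(1)

puts total_reduction_percent
```

Under this reading the combined reduction lands a little above 16%, which is consistent with the rounded total reported above.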

Additional Improvements in the Future

The main driver of improving method caching was to increase how quickly applications can reach peak performance by eliminating any redundant invalidations and recompilations that were occurring.

Another area of exploration could be to reduce the number of assumptions needed by exploring other data structures to handle invalidation.

Method Caching Improvements in CRuby 3.0

In CRuby 3.0, a per-class and per-method cache was added, which is similar to the caching implementation described above for TruffleRuby. A description of how CRuby’s updated method caching works can be found here: https://bugs.ruby-lang.org/issues/16614. That design makes slightly different trade-offs: for instance, it minimizes memory footprint but is less precise and might cause unnecessary cache invalidations.

Conclusion

We found that precise method and constant invalidation is key to Ruby warmup, and the design explained in this blog post is able to reduce the number of invalidations from hundreds to zero while loading an application. There are many edge cases, e.g., when adding a method later to an existing prepended module, but in the end the intuition is simple: we need to invalidate any cached lookup with that method name which looked through the modified module.

For more information about TruffleRuby please visit our website and follow us on Twitter for more updates.
