When we first started working on TruffleRuby, GraalVM’s implementation of Ruby, we looked at what Charles Nutter, the co-lead of JRuby, had to say about implementing Ruby, and specifically implementing Ruby on the JVM.
In his 2012 blog post, So You Want To Optimize Ruby, written a couple of months before we started, he talked about all the things that make Ruby a difficult language to optimise and suggested people might want to tackle these things before making significant performance claims. In the past we've used this list to show that the TruffleRuby project was ready to talk about performance and we’ve shown how much of it we have ticked off.
Some of the items on this list we have already solved, such as supporting C extensions with our novel approach to executing native code, and our work on automatically synchronising shared objects. But there are a couple of things we still need to do here: callcc, continuations and fibers.
A continuation is an object that represents a current point of execution in the program — its call stack, local variables and so on. You can resume executing a given continuation object after it has been created, and you resume with that call stack, local variables and so on, as they were at the point the continuation was created.
callcc is the Kernel method to create a continuation.
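As a rough illustration of what resuming a continuation looks like, here is a minimal sketch for MRI. The method name replay_names and the sample data are made up for this example. The sketch relies on arrays that are mutated in place, so it does not depend on how an implementation treats local variables when a continuation is re-entered.

```ruby
require 'continuation' # callcc is provided by this standard library extension on MRI

# Hypothetical helper: keeps jumping back to the captured continuation
# until every name has been moved from `names` into `seen`.
def replay_names
  names = %w[Freddie Herbie Ron]
  seen = []
  # callcc passes the new continuation to the block; returning it from the
  # block stores it in `cont`.
  cont = callcc { |k| k }
  seen << names.shift
  # Calling the continuation resumes execution at the point where callcc
  # returned, this time returning the argument we pass in.
  cont.call(cont) unless names.empty?
  seen
end

p replay_names
```

Recent versions of MRI print a warning that callcc is obsolete when this runs, which fits the discussion of its status further on.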
A fiber is like a thread, but the expectation is that fibers are implemented by the interpreter rather than by the kernel. This means that they don’t give you parallelism, but they also don’t need a call into the kernel to create them or to switch between them, and they don’t consume precious kernel memory resources as a thread does. You can implement a fiber using a Java thread, which is what TruffleRuby has done historically. This is technically correct, but because of the calls into the kernel its performance characteristics are nothing like what people would expect, to the point where it doesn't really work in practice. You also won’t be able to create many fibers if you use threads to implement them, due to the more limited kernel memory.
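To make the cooperative switching concrete, here is a small sketch using Ruby's core Fiber API. The helper name first_values is made up for this example. Each resume runs the fiber until its next Fiber.yield, and at no point do two pieces of code run in parallel.

```ruby
# Hypothetical helper: collects the first `count` values produced by a fiber.
def first_values(count)
  counter = Fiber.new do
    n = 0
    loop do
      Fiber.yield n # suspend here and hand `n` back to whoever called resume
      n += 1
    end
  end

  # Each resume switches into the fiber until its next Fiber.yield.
  Array.new(count) { counter.resume }
end

p first_values(3) # => [0, 1, 2]
```

The fiber's local variable n survives between resumes, which is exactly the state an implementation has to keep alive for every suspended fiber.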
For example, the Celluloid project has struggled in the past with this strategy of implementing fibers.
Continuations and callcc are perhaps not frequently used features. A lot of people, including Koichi Sasada, think they're a bad idea for many reasons, and we think Matz has also said that he agrees they should be removed. We have only once or twice seen anyone attempting to use them outside of a demo. However, our policy at TruffleRuby is to implement any feature of the Ruby language that at least some developers find useful and productive, in order to avoid the slippery slope of justifying taking shortcuts in order to achieve performance.
Unlike continuations, fibers are used in important parts of the Ruby ecosystem. For example, any time you create an Enumerator with a method using enum_for, or call a method that returns an Enumerator from a call which usually yields to a block, and then use the external iteration methods like next, a fiber is created to keep its state in. You may be surprised that you are using a lot of fibers in your code base without knowing it, due to this feature of Enumerator.
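A short sketch of the pattern described above, where ordinary-looking code ends up needing a fiber. The WordList class and its each_word method are hypothetical names used only for illustration.

```ruby
# Hypothetical class whose iteration method normally yields to a block.
class WordList
  def initialize(*words)
    @words = words
  end

  def each_word
    # With no block given, hand back an Enumerator over this same method.
    return enum_for(:each_word) unless block_given?

    @words.each { |word| yield word }
  end
end

words = WordList.new("alpha", "beta", "gamma").each_word
p words.next # => "alpha"
p words.next # => "beta"
```

Between the two calls to next, each_word is suspended partway through its loop. That suspended state is what the fiber holds.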
We can see the problem with the way JRuby and TruffleRuby currently implement fibers by running code that creates a lot of in-progress enumerators.
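The original listing is not reproduced here, so the following is only a sketch of the kind of program that exercises this, not the authors' benchmark. The method name start_enumerators and the count of 1,000 are assumptions; raise the count to find the limit on a given implementation.

```ruby
# Hypothetical helper: creates `count` enumerators and advances each one once,
# so every enumerator is left suspended mid-iteration.
def start_enumerators(count)
  Array.new(count) do
    enumerator = [1, 2, 3].each # no block given, so this returns an Enumerator
    enumerator.next             # external iteration starts here
    enumerator                  # keep a reference so it stays live
  end
end

live = start_enumerators(1_000)
puts "Created #{live.size} in-progress enumerators"
```

Holding every enumerator in an array matters: if they could be garbage collected, the implementation could reclaim the underlying fibers or threads and the limit would never be reached.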
On my system with default configurations, MRI will happily create around 300,000 concurrent enumerators, while TruffleRuby runs out of Java threads to run the fibers needed at around 5,000 with a generic NoMemoryError message. JRuby fails in the same way and even has a special error message to explain the problem because people may not connect their use of Enumerator with fibers.
Error: Your application demanded too many live threads, perhaps for Fiber or Enumerator.
Ensure your old Fibers and Enumerators are being cleaned up.
We're used to TruffleRuby and JRuby being able to do more, and do it faster, thanks to the JVM, but here we've got an annoying limitation.
In a new experimental branch of TruffleRuby, we can now easily create around two orders of magnitude more concurrent Enumerator objects or fibers: 330,000 on my system. And what's more, TruffleRuby can create them far more quickly than MRI. To do this we're using new technology from the JVM called Project Loom, which implements continuations directly at the JVM level. You can learn more about Loom on their mailing list. On top of implementing continuations, Loom also plans to provide direct support for tail-calls and the more powerful type of continuation needed for callcc, so we will be able to implement both of these in TruffleRuby in the future.
Duncan MacGregor, who has been working with the Loom team and applying the work to TruffleRuby, has a technical blog post which goes into more depth than this announcement. JRuby should be able to use the same technology, even without using Truffle, but we think fibers’ performance on GraalVM will be better than it is on a standard JVM.
You can follow the development of TruffleRuby and how it can leverage the support of Project Loom on GitHub. We’ll continue to solve the challenges of creating a high-performance implementation of Ruby and making it compatible with the ecosystem.