Unravelling Ruby Threads

Mickey Sheridan
6 min read · Mar 21, 2019


Multitasking is a rarely pleasant, but often essential aspect of our lives. When we’re busy, we need to find ways to get more than one thing done within the same space of time. However, too much multitasking is the surest way to lose focus and make mistakes.

The same goes for any program. Computation is often so fast that we may sometimes forget that it takes any time at all. But computers aren’t magical — if you ask too much of them they will slow down. Whether it be a sizable data crunch or a series of requests to an external server, some tasks can bring your program to a halt.

Threading is one solution to this issue. Ruby’s Thread class creates a separate thread of execution that runs alongside the main thread, inside the same process. Instead of running through each task in sequence, threading allows the program to make progress on more than one thing at once!

Let’s see it in action. Compare this:

to this:

They do the same thing, essentially. Each runs #slow_task ten times, and then puts its total run time. The second method, instead of running the tasks sequentially, runs each task in a separate thread. What’s our outcome?
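As a rough sketch of such a pair (not the original listing), the two methods might look like the following. The names `run_sequentially`, `run_threaded`, and `clock` are hypothetical, and `slow_task` is assumed to do nothing but sleep.

```ruby
# Hypothetical stand-in for #slow_task: it does nothing but wait.
def slow_task(seconds = 1)
  sleep(seconds)
end

# Monotonic clock reading, used to measure elapsed time.
def clock
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Runs the task `count` times in sequence and returns the elapsed seconds.
def run_sequentially(count: 10, seconds: 1)
  started = clock
  count.times { slow_task(seconds) }
  clock - started
end

# Starts one thread per task, waits for all of them, and returns the elapsed seconds.
def run_threaded(count: 10, seconds: 1)
  started = clock
  threads = Array.new(count) { Thread.new { slow_task(seconds) } }
  threads.each(&:join)
  clock - started
end

# A shorter wait keeps the demo quick; the defaults use a one-second task.
puts format('sequential: %.2fs', run_sequentially(seconds: 0.1))
puts format('threaded:   %.2fs', run_threaded(seconds: 0.1))
```

Because the sequential version waits for each task in turn, its total grows with the number of tasks, while the threaded version waits for all of them together.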

The threaded method finishes in one-tenth the time! Fantastic!

Getting a thread going is as easy as initializing with a block. But what does this #join method do? Well, since threads run alongside the main thread, there’s a good chance that the main thread will complete too early. When the main thread finishes, the program exits and every other thread is terminated along with it, regardless of completion.

This method initializes three threads and then ends. The first two threads may get the chance to produce their outputs, but since threads do take some amount of processing time to set up, none of these outputs is guaranteed.
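A minimal sketch of that situation, assuming a hypothetical method `start_without_join` whose threads each sleep for half a second in place of real work:

```ruby
# Starts three threads and returns right away, without waiting for any of them.
# The half-second sleep is a hypothetical stand-in for real work.
def start_without_join
  Array.new(3) do |i|
    Thread.new do
      sleep(0.5)
      puts "thread #{i} finished"
    end
  end
end

threads = start_without_join
# The method has already returned, yet its threads are still mid-task.
puts threads.count(&:alive?)
# If the script ends on this line, the unfinished threads are killed
# and their messages are never printed.
```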

#join solves this issue. It blocks the calling thread until the thread it is called on has finished.

Now that the threads have been joined, they are guaranteed to finish before the method completes. Keep in mind that they will still complete their tasks in an unpredictable order (“Thanks to join” may still be printed after “I can take..”).
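Here is a small sketch of the joined version, using a hypothetical method `start_and_join` and made-up thread names. A `Queue` collects the names in the order the threads actually finish.

```ruby
# Starts three threads, waits for all of them, and returns their names
# in completion order.
def start_and_join
  finished = Queue.new # thread-safe, so the threads can push to it freely
  threads = %w[first second third].map do |name|
    Thread.new do
      sleep(rand / 100) # a short, random stand-in for real work
      finished << name
    end
  end
  threads.each(&:join) # block here until every thread has completed
  Array.new(finished.size) { finished.pop }
end

p start_and_join
```

All three names are always present in the result, but their order can change from run to run.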

How long would this method take, roughly? t3 sleeps for 60 seconds. However, its #join call is given an argument of 5. This means that the main thread will give up to 5 seconds for the child thread to finish before continuing on.
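To illustrate the timeout, here is a sketch with a hypothetical helper `wait_with_limit`. It relies on `Thread#join` returning `nil` when the limit runs out and the thread itself when it finishes in time. Shorter durations are used so the demo runs quickly.

```ruby
# Waits at most `limit` seconds for a thread that needs `task_seconds` to finish.
# Returns whether the thread finished in time and how long the caller waited.
def wait_with_limit(task_seconds:, limit:)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  worker = Thread.new { sleep(task_seconds) }
  finished = !worker.join(limit).nil? # join returns nil when the limit runs out
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  worker.kill # tidy up, for this demo only
  [finished, elapsed]
end

finished, elapsed = wait_with_limit(task_seconds: 6, limit: 0.5)
puts "finished: #{finished}, waited about #{elapsed.round(1)}s"
```

The caller's wait is bounded by the limit, not by how long the thread would have needed.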

How does scope work with Threads? It seems like threads would lose a good deal of usefulness if they could only use data declared in their individual block. Thankfully, they can use the data available to their parent:

Both threads can pick apples, as apples was declared before the thread was created. However, only the parent’s loop can pick peaches, as peaches is declared after the thread.
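The scoping rule can be sketched as follows. The method name `pick_fruit` and the quantities are hypothetical, and here the parent waits for the thread before picking so that the result stays stable.

```ruby
def pick_fruit
  apples = 10 # declared before the thread, so the thread's block can see it

  picker = Thread.new do
    apples -= 2
    # `peaches` does not exist yet when this block is defined, so it is not
    # visible here; defined? returns nil instead of raising an error.
    defined?(peaches).nil?
  end

  peaches = 5 # declared after the thread, so only the parent can use it
  peaches -= 1

  peaches_hidden = picker.value # waits for the thread and returns its block's result
  apples -= 1 # the parent picks only after the thread is done
  { apples: apples, peaches: peaches, peaches_hidden_from_thread: peaches_hidden }
end

p pick_fruit
```

This leaves 7 apples and 4 peaches, and confirms that the thread could not see `peaches`.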

What will this method return? Let’s run it five times:

It seems our method does not have a guaranteed output. That’s not good. Why did this happen? The thread takes 2 apples at a time, while the parent only takes one. But the threads aren’t exactly being polite with one another. They execute indifferently to each other’s behaviors. This also demonstrates how unpredictable thread execution order can be, and the slight set-up time a new Thread needs before it starts running.
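A sketch of this kind of unguarded sharing, assuming a hypothetical method `pick_apples_unsafely`. The short random sleeps sit between the check and the pick to make the interleaving easier to observe.

```ruby
def pick_apples_unsafely(total = 100)
  apples = total
  thread_picked = 0
  parent_picked = 0

  picker = Thread.new do
    while apples >= 2
      sleep(rand / 1000) # pause between the check and the pick
      apples -= 2
      thread_picked += 2
    end
  end

  while apples >= 1
    sleep(rand / 1000)
    apples -= 1
    parent_picked += 1
  end

  picker.join
  { thread: thread_picked, parent: parent_picked, left: apples }
end

5.times { p pick_apples_unsafely }
```

The split between the two pickers changes from run to run, and because each side checks the count before the other has finished picking, the number of apples left can even dip below zero.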

This becomes more problematic as we add more common variables:

Threads can share variables, but they do not have any agreed execution order. Solving these issues would require some amount of locking. Ruby’s Mutex class is a good option.
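As one possible sketch of such locking, the check and the pick can be wrapped in a single `Mutex#synchronize` block. The method `pick_apples_safely` and the `take` lambda are hypothetical names.

```ruby
def pick_apples_safely(total = 100)
  apples = total
  lock = Mutex.new

  # Checks and picks in one locked step, so no other thread can slip in between.
  take = lambda do |amount|
    lock.synchronize do
      if apples >= amount
        apples -= amount
        true
      else
        false
      end
    end
  end

  thread_picked = 0
  parent_picked = 0
  picker = Thread.new { thread_picked += 2 while take.call(2) }
  parent_picked += 1 while take.call(1)
  picker.join

  { thread: thread_picked, parent: parent_picked, left: apples }
end

p pick_apples_safely
```

The split between the two pickers still varies, since the lock does not impose an order, but the picks always add up to the starting total and the count never goes negative.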

A question you might be asking at this point: if two threads have access to the same variable, what would happen if they tried to mutate that variable at the exact same time? Isn’t there a threat of trying to assign two separate values for apples, creating some sort of dimensional rift in our program?

Fortunately, there isn’t. Up until now, we haven’t been exactly clear with what “at the same time” really means with Ruby threading. In programming, there is often a good deal of confusion between Parallelism and Concurrency. Parallelism involves using a multi-core CPU to literally perform more than one task at the same time (individual CPUs may look like they’re doing many things at once, but they still must perform operations essentially one at a time). Concurrency, on the other hand, involves a single CPU working through several tasks, switching among them until all have been completed.

If you were cooking a meal sequentially (without concurrency or parallelism), you would have to do each step individually and completely. This means that when you set the water to boil, you wouldn’t be able to do anything but wait. If you were to do it concurrently, however, you could leave the pot and begin chopping vegetables. Finally, if you were to do it with parallelism, you would have another cook aiding you (albeit at the cost of space in your kitchen, and with a higher risk of communication issues).

So how does Ruby do it? The issue is actually a bit of a sore spot for the language. The standard Ruby interpreter (MRI, which has run on the YARV virtual machine since Ruby 1.9) does not run threads in parallel. Further, while threads technically allow for concurrency, it’s only in a fairly restricted way. MRI employs a Global Interpreter Lock (GIL, also called the Global VM Lock), meaning that only one thread can execute Ruby code at any given time. So, even on a multi-core computer, threads cannot run Ruby code in parallel.

At the start of the post, we saw how ten threads could perform a task at the “same time.” Unfortunately, there has been some amount of innocent deception at play. The slow task these methods were performing was simply sleeping for a second and then printing its output. This task, like waiting for a pot to boil, benefits significantly from concurrency. What if we replaced the task with an actual computation?

Not as exciting.
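A sketch of that comparison with a CPU-bound task. `busy_task` and `seconds_for` are hypothetical names, and the workload (summing squares) is only an example of a computation that never waits on anything.

```ruby
# Hypothetical CPU-bound replacement for the sleeping task: sums squares.
def busy_task(limit = 300_000)
  (1..limit).reduce(0) { |sum, n| sum + n * n }
end

# Returns how many seconds the given block took to run.
def seconds_for
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

sequential = seconds_for { 10.times { busy_task } }
threaded = seconds_for { Array.new(10) { Thread.new { busy_task } }.each(&:join) }

puts format('sequential: %.2fs', sequential)
puts format('threaded:   %.2fs', threaded)
```

On the standard interpreter the two timings tend to come out close to each other, since only one thread can run Ruby code at a time.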

This is not to say that Threads are useless. Just as concurrency was far faster with the sleeping method, so too is it with other tasks that spend time waiting. The most common case is anything involving input, whether from a user or a server. Without threading, the program must halt until the input is received. Ruby threads can solve these inefficiencies.
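As a sketch of the input-bound case, the example below simulates three server calls with a hypothetical `fake_request` method that just sleeps and returns a string. No real network library is involved.

```ruby
# Hypothetical stand-in for a network call: waits, then returns a reply.
def fake_request(name, delay)
  sleep(delay)
  "#{name}: ok"
end

# Starts every request in its own thread, then collects the replies in order.
def fetch_all(names, delay: 0.2)
  threads = names.map { |name| Thread.new { fake_request(name, delay) } }
  threads.map(&:value) # value joins each thread and returns its block's result
end

p fetch_all(%w[users orders invoices])
# prints ["users: ok", "orders: ok", "invoices: ok"]
```

`Thread#value` is convenient here because it combines the wait with fetching the result, and the replies come back in the order the requests were listed rather than the order they finished.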

If the lack of true multitasking disappoints you, you may want to consider switching your interpreter to something like JRuby, which uses the Java Virtual Machine, and allows parallelism. Otherwise, use threading to close the gaps in your program’s performance, and try not to create any funny bugs.
