The Fun and Magical World of Benchmarking
Great, your Ruby code works, and now you’re thinking about refactoring it. While refactoring, you realize there is another way to write it. Now you’re torn between sticking with the original working code and implementing the new version. It’s probably best not to mess with code that works. However, deep within you, you’re tantalized by the thought that the new code might be faster. Should you remain steadfast with your original code, or should you venture into the unknown?
Ok, that was overly dramatic, but fear not, for Benchmark is here!
What is Benchmark, you might ask?
Benchmark is a built-in Ruby module that lets you measure how long your code takes to execute, which is useful in several ways. For one, as in the overdramatized example above, coders often realize there are multiple ways to implement a method or solution, but deciding whether the alternative is worth implementing can be quite daunting. If there is no significant difference in performance, let’s not tamper with code that works perfectly fine. In certain situations, however, the alternative can give your application a tremendous performance boost.
Another reason benchmarking will matter even more in the future is that the amount of data in the world keeps growing while our processing power is limited. I’m not going to go over the details, but you can read about it in my previous post. Lastly, it’s fun! At least, I personally think it’s interesting to see how fast or slow your code is running.
Now that you know the benefits of benchmarking, let’s get right into it without further ado.
Since the Benchmark module is part of Ruby’s standard library, you won’t need to install any gems; you only need to require it.
Now, let’s play around with the #measure method, which takes a block of code and returns the time it took to run.
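Here is a minimal sketch of a #measure call. The workload (summing the integers from 1 to 1,000,000) is just an assumption picked for illustration, so feel free to swap in whatever code you are curious about.

```ruby
require 'benchmark'

# Hypothetical workload: sum the integers from 1 to 1,000,000.
result = Benchmark.measure do
  (1..1_000_000).reduce(:+)
end

# Printing the result shows the timings on a single line.
puts result
```

The exact numbers will differ from machine to machine and even from run to run.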
You should see output with four numbers, all measured in seconds.
The first number is the user CPU time. This is the time the CPU spent running your code.
The second number is the system CPU time. This is the time the CPU spent running OS kernel code on behalf of your program.
The third number is the user + system CPU time, the combined total of the first two.
The fourth number, shown in parentheses, is the elapsed real time. This is the total time from start to finish.
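If you would rather read the four values by name than by position, the object returned by #measure exposes them as methods. The sketch below uses a made-up workload (a loop of square roots) purely for illustration.

```ruby
require 'benchmark'

# Hypothetical workload, chosen only to burn a little CPU time.
timing = Benchmark.measure { 200_000.times { |i| Math.sqrt(i) } }

puts "user CPU time:   #{timing.utime}"
puts "system CPU time: #{timing.stime}"
# total also counts CPU time of child processes, which is zero here.
puts "user + system:   #{timing.total}"
puts "elapsed real:    #{timing.real}"
```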
While the last number is the most telling, since it covers the time from start to finish, it doesn’t explain where the latency comes from. A slow elapsed time doesn’t necessarily mean the code is slow: time spent waiting on I/O is included in the elapsed time but not in the CPU time. So don’t focus on just one metric; use all the metrics available to figure out which part of the code is faster or slower.
Benchmarking a single piece of code by itself is not very useful. Also, wouldn’t it be nice if there were column labels? So, let’s use the #bm method to compare multiple pieces of code at the same time.
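To see the gap between elapsed time and CPU time for yourself, here is a rough sketch. It assumes that sleep is a fair stand-in for waiting on I/O, and both workloads are invented for the comparison.

```ruby
require 'benchmark'

# Stand-in for I/O: the process waits but does almost no work.
waiting = Benchmark.measure { sleep 0.2 }

# Hypothetical CPU-bound workload for contrast.
working = Benchmark.measure { 300_000.times { |i| Math.sqrt(i) } }

puts "waiting: cpu=#{waiting.total.round(3)}s real=#{waiting.real.round(3)}s"
puts "working: cpu=#{working.total.round(3)}s real=#{working.real.round(3)}s"
```

The waiting block should report a real time of roughly 0.2 seconds with a CPU time close to zero, while the working block’s CPU and real times should sit much closer together.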
These two snippets do essentially the same thing: each prints “Hello World” once on each of 100 million rows. Can you guess which one will be faster?
Wow, the first one was faster than the second by 4.8 seconds, or maybe it’s the other way around. This is kind of confusing without labels, so let’s add some.
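A comparison along these lines might look like the sketch below. The two loop styles (times and upto) are my own assumption, as is the much smaller row count, and the output goes to File::NULL so your terminal isn’t flooded.

```ruby
require 'benchmark'

n = 100_000 # far fewer than 100 million, so this finishes quickly

results = File.open(File::NULL, "w") do |out|
  Benchmark.bm do |x|
    # First candidate: Integer#times
    x.report { n.times { out.puts "Hello World" } }
    # Second candidate: Integer#upto
    x.report { 1.upto(n) { out.puts "Hello World" } }
  end
end
```

Each report call prints one row of timings, in the order the blocks appear.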
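Labels are passed as the first argument to report, and the number given to #bm sets the width of the label column. This is only a sketch: the label names, the loop styles, and the row count are all assumptions for illustration.

```ruby
require 'benchmark'

n = 100_000 # hypothetical row count, kept small on purpose

results = File.open(File::NULL, "w") do |out|
  # 7 is the label column width, wide enough for "times:" and "upto:".
  Benchmark.bm(7) do |x|
    x.report("times:") { n.times { out.puts "Hello World" } }
    x.report("upto:")  { 1.upto(n) { out.puts "Hello World" } }
  end
end
```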
Aha! I was right the first time, but adding labels helped clear things up a bit. However, there’s one final method I want to show you before we wrap things up and that’s the #bmbm method.
Per the Ruby documentation, “The times for some benchmarks depend on the order in which items are run. These differences are due to the cost of memory allocation and garbage collection.” Basically, the #bmbm method runs every block twice to reduce the effect of memory allocation and garbage collection on the results. The first pass is a rehearsal that gets the runtime environment into a stable state, and the second pass produces the times you actually compare.
If you ever find yourself with multiple solutions to a problem, play around with Benchmark. It’s super easy to use, and you can run it right in IRB. It might take you a minute to benchmark your code, but in the grand scheme of things, your app will ultimately run faster.
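Using #bmbm looks almost identical to using #bm. The sketch below compares sort! and sort on a copy of a random array; the array size and the labels are assumptions chosen for illustration.

```ruby
require 'benchmark'

# Hypothetical data set: 200,000 random floats.
array = (1..200_000).map { rand }

results = Benchmark.bmbm do |x|
  # dup keeps each run working on fresh, unsorted data.
  x.report("sort!") { array.dup.sort! }
  x.report("sort")  { array.dup.sort }
end
```

You should see two tables: one for the rehearsal and one for the real run. The second table is the one to read.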
For more information, you can check out the official documentation on how to use the benchmark module.