What to do when your ring buffers are tired of waiting! (Pt.2)

Hosein Ghahramanzadeh
Published in Sahab
4 min read · Apr 6, 2020

Reader: “Pt. 2”?! Where is part one, then? (Checks author’s history…) Nothing here!

That’s true: you are not going to find the first part here. You don’t really need it to understand this part either; however, if you’re interested, it is on my personal blog, iyp.home.blog, which I highly encourage you to follow (duuuuuh!).

So you might ask why I am posting this here. Well, a few of my friends asked me to post here after seeing my blog, and so here we are. My blog is still my main blog, and I will continue posting there, but I will try to post here as well.

So, enough chit-chat; let’s get to it!

However bad this Rona thing has been, it has been good for me in one way: it gave me some spare time. That, combined with the holidays, was enough time to catch up on my promises, one of which was a follow-up on my wait-free ring buffers (a wait-free ring buffer library written primarily with C++ in mind, which supports C++ objects; read further here). I promised to publish the code, which I have, and to provide some benchmarks, which I have also done. Here is the code (feel free to contribute), and the rest of this post explains the benchmarks.

I have written a standalone benchmark, which provides a tool for comparing the library against itself — though I bet you aren’t that interested in that. I have also written a benchmark that compares my library against the rte_ring library shipped with Intel’s DPDK. Take the results with a grain of salt, since rte_rings are not wait-free but merely lockless.

What the benchmark does is instantiate each type of ring buffer — MCMP, MCSP, SCMP, and SCSP — and then run a benchmark on each of them sixteen times under different thread-count configurations. For example, it runs the benchmark on an MCMP ring buffer with two consumers and two producers sixteen consecutive times, then moves on to running the same benchmark with three producers and three consumers on a fresh MCMP ring buffer, and so on. The final result is the average time to process a single element, per run, per thread configuration, per ring buffer type. This benchmark is only possible on Linux and, at this point, is not merged into master.

These benchmarks were performed on an Ubuntu 16.04 virtual machine on an Intel Core i7-6700K system, so I was limited to 8 threads at a time and could not run tests with a greater number of threads (I could, but they probably wouldn’t yield much informative data). So I would love it if you ran these benchmarks on your own systems and shared the results with me. Thank you, and let’s see the results.

For MCMP ring buffers the results look like this. (Note: the notation nCmP, where n and m are integers, denotes the number of consumers and producers; for example, 2C2P means the test was run with two consumer threads and two producer threads. Also note that each data point is an aggregation of sixteen consecutive runs, hence the error bars.)

It is interesting to see that the processing time for a single element seems to stay constant in my implementation, which is kinda expected of a wait-free operation.

Here are the results for SCMP ring buffers:

My take on this is that the reason for the increasing time is the asymmetry between pushers and poppers. I currently do not know how to properly benchmark this, since as the number of pusher threads increases, push attempts increase without a corresponding increase in pop attempts.

That jump at 7P1C on rte_ring’s side is really odd; I have no clue why that is.

MCSP ring buffers’ results are very similar to SCMP’s:

I think the increase here is similar to that of SCMP.

And for SCSP ring buffers the result is very simple:

Seemingly, rte_ring does very well compared to my implementation in the single-threaded case.

So that is it. Feel free to contribute, and contact me if you need help using the library or contributing to it.

