bencher

Emmanuel Odeke
Orijtech Developers
9 min read · Aug 20, 2021

We at Orijtech, Inc. are proud to announce bencher, our continuous benchmarking infrastructure and product. bencher is focused first on the Go programming language but is extensible to others. bencher seamlessly hooks into GitHub, and whenever changes are staged against a base branch, it runs all the affected benchmarks and presents comparisons. bencher's user interface is available at https://bencher.orijtech.com. We are grateful to one of our customers, the Interchain Foundation, for the great opportunity to build and scale high-performance engineering infrastructure for their very diverse and rich Cosmos ecosystem. bencher will eventually be generally available for everyone in the near term, but for starters it is available for the cosmos-sdk.

Introduction:

In the world of software engineering, source code is the medium for expressing and implementing ideas. As that source code evolves to add features, correct logical mistakes (bugs), remove features, deprecate functionality, and so on, the net effect of each change can be an improvement, neutral, or a detriment to overall performance. As projects grow in complexity, with lots of moving parts, these effects on each other are not well understood and cause issues that bog down progress. This means that examining the true effect of changes is hard. To figure out what code to optimize, it is important to get actual data on what code is hot. And because of dependencies, as complex processes interact, performance can degrade in unwieldy ways.

Every code change can essentially be looked at as having the kind of sensitive, far-reaching effect that chaos theory terms the "butterfly effect".

Thinking about all that is a complex process, right?


About us…

At Orijtech Inc, we are high-performance and efficiency hounds.


We are technologists who dig deep; some of us are also Go core committers who keenly observe and improve Go in every aspect, ranging from developer tooling to the compiler, runtime, standard library, networking, and developer relations. Any opportunity to improve efficiency is a challenge we take on!

What is bencher?

bencher is a continuous, holistic benchmarking infrastructure that is seamlessly hooked up to a repository, such as one on GitHub.


Whenever pull requests or commits are made by any developer against the repository, we run the respective benchmarks and then report changes, comparing the results against the base branch, e.g. main or master. These results are linked to the pull request by our bot, orijbot, which posts the announcement.

Clicking the link in that announcement opens bencher's homepage. From there you can visit any of the benchmarks, hop over to another one, and inspect all the changes at once.

Why bencher?

Currently, when developers write code, it is very helpful to have associated benchmarks for those who care about performance. For starters, this post focuses on tooling we've implemented to augment the Go programming language, which we use primarily and which is also used by more than 25% of the Fortune 100 and many companies in the Global 2000, big and small.

Following the guidance for writing Go benchmarks at https://pkg.go.dev/testing#hdr-Benchmarks, benchmarks in Go consist of writing this kind of function:

func BenchmarkXxx(*testing.B)

where Xxx is the name of the benchmark, as seen throughout the Go standard library.
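For a concrete illustration, here is a minimal sketch of such a benchmark; the package and file names are ours, not from a real project:

// demo_test.go
package demo

import (
	"strings"
	"testing"
)

// BenchmarkRepeat times strings.Repeat. The testing framework
// picks b.N so the loop runs long enough to yield a stable ns/op.
func BenchmarkRepeat(b *testing.B) {
	for i := 0; i < b.N; i++ {
		strings.Repeat("a", 64)
	}
}

You run it with go test -bench=. and the framework reports a nanoseconds-per-operation figure for each benchmark.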

Say we were tasked with writing an engine for a compression scheme, and in it we need to count the number of 1s in an unsigned integer: given an integer, iterate through its bits and count how many are set. Our design session will feature a dissection of that process.

After understanding the algorithm and asking questions, we can now go ahead and write the code for it! The code and tests might look like this:
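(The original post embedded the actual code; what follows is a sketch, with package and file names of our own choosing.)

// onescount.go
package bitcount

// OnesCount counts the set bits in x the naive way,
// inspecting one bit per loop iteration.
func OnesCount(x uint64) int {
	n := 0
	for x != 0 {
		n += int(x & 1)
		x >>= 1
	}
	return n
}

// onescount_test.go
package bitcount

import "testing"

func TestOnesCount(t *testing.T) {
	cases := []struct {
		in   uint64
		want int
	}{
		{0, 0},
		{1, 1},
		{0b1011, 3},
		{^uint64(0), 64},
	}
	for _, c := range cases {
		if got := OnesCount(c.in); got != c.want {
			t.Errorf("OnesCount(%#x) = %d; want %d", c.in, got, c.want)
		}
	}
}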

Our tests pass, code review approves, life is all good! We are happy to have helped the database team! We submit our code, and it gets heavily used by the database engine. We even write the benchmarks:
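(Again a sketch along the same lines; the package-level sink variable keeps the compiler from optimizing the call away.)

// onescount_bench_test.go
package bitcount

import "testing"

var sink int

func BenchmarkOnesCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = OnesCount(0xdeadbeefcafebabe)
	}
}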

The database team is even happier, seeing the benchmarks alongside the tests…


Later that week, continuous profiling surfaces that this code is hot… Why is it hot? What's going on? We don't know what to do…

Hello, Darkness my old friend!

At that point we hear of a famed hardcore engineer. We talk to them, and they tell us about a common CPU instruction for counting the ones in an integer, called POPCOUNT, aka POPCNT, and we decide to investigate…

Lucky for us, we find out that the Go standard library's math/bits package has exactly this function, OnesCount64. Ahh, now we know how to take advantage of POPCOUNT/POPCNT!


Sound familiar?

The process above of refining changes might sound familiar to most software developers: someone mentions an esoteric instruction or command that could solve many of their problems. This is why diverse code reviews and tests are important.


However, to make an objective judgement, we have to run a shoot-out and compare the two versions. We can't fix what we can't measure!

To properly compare versions of the code, we need to benchmark an implementation that uses math/bits.OnesCount64:
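(A sketch of the revised implementation; keeping the function name unchanged means the earlier BenchmarkOnesCount exercises it as-is, which matters below.)

// onescount.go, revised: same exported API, now backed by the
// standard library, which compiles down to POPCNT on supporting CPUs.
package bitcount

import "math/bits"

func OnesCount(x uint64) int {
	return bits.OnesCount64(x)
}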

Running the benchmarks produces a timing figure (ns/op) for each one.

But how do we compare the results of the shoot-out?

To do that, we ask around the Go community and hear of benchstat, and thus learn how to compare benchmarks. We save the results for the naive implementation in naive.txt and the results for the sped-up version in stdlib.txt. After some experimenting we learn that, to be compared, benchmarks MUST have the same names. Finally, we do:
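(Concretely, the workflow looks roughly like this, assuming the benchmarks live in the current package:)

# benchmark the naive implementation, 10 runs for statistical power
go test -run='^$' -bench=OnesCount -count=10 . | tee naive.txt

# switch the implementation to math/bits.OnesCount64, then rerun
go test -run='^$' -bench=OnesCount -count=10 . | tee stdlib.txt

# install benchstat and compare the two result files
go install golang.org/x/perf/cmd/benchstat@latest
benchstat naive.txt stdlib.txt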

We finally reach the conclusion that using the standard library's math/bits.OnesCount64 is the best option! Woohoo, we now have very fast code!!

Notice how tedious a process that was? Imagine that for every single change you make, you first have to run the benchmarks for all the affected directories, store the results, run them again for the new changes, and then invoke benchstat… Try this out for yourself and run it a few times manually; you'll notice the strain.

Not only is it error prone, it is also a tedious process. One of our exhibits is this change in the cosmos-sdk in which, over time, code piled up, microbenchmarks looked alright, but in the end the big picture was missed and performance was hurt severely.

Exhibit of consequences of benchmarking inadequacy & fatigue:

One day in February 2021, I stayed up all night writing benchmarks and profiling code, having been asked by Ethan Frey and Billy Rennekamp to examine what could be burdening Cosmos users. From 11 PM to 4:30 AM I was on a hunt with their SDK, reading through their APIs and writing benchmarks, from which I noticed that what was supposed to be fast code took ages to run; benchmarks were even timing out… The problem sounded familiar to Ethan Frey, who messaged me to say there was a similar issue in https://github.com/cosmos/cosmos-sdk/issues/7766. It was at that moment that I dug deep, spun up our continuous profiling infrastructure, and debugged what was up, with lots of time having been wasted by slow code. It was almost morning, and I was very frustrated…

Worn out and tired, it was time for drastic measures…

I read through lots of the code, formed hypotheses, and proved them by examining continuous profiles for an hour or more. After my careful analysis, I posted this at around 6:13 AM: https://github.com/cosmos/cosmos-sdk/issues/7766#issuecomment-786671734


and I fixed it by doing holistic benchmarking, along with continuous profiling, to exorcise the demons that were haunting the code…

I fixed that code in pull request https://github.com/cosmos/cosmos-sdk/pull/8719, which involved carefully analyzing pathological algorithmic behavior and discovering lots of busy work in the quicksort routine, along with a bunch of other unnecessary work. I also sprinkled on some black magic, and eventually that brought the mandatory time to start a node down to 6 minutes, waaay down from 20+ minutes; the pull request includes some of the benchmarks I produced.

The result was smiling customers, very fast code, fewer wasted CPU cycles, and lots of money saved. For context, as of August 20th, 2021, the Cosmos ecosystem has $100B+ in assets and market cap.

Any saved cycles directly benefit the ecosystem and save its validators lots of money…

How does it work?

Let’s dive in!

bencher receives a GitHub event via a webhook. Once we receive that event, we process it: we examine whether any work needs to be done, grab the affected code, compute the set of Go packages affected across the repository's directories, run the respective benchmarks, and then report changes, comparing the results against the base branch, e.g. main or master.
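To make that flow concrete, here is a heavily simplified sketch of such a pipeline in Go. Everything in it (the handler, the changedPackages helper, the flags passed) is illustrative, not bencher's actual code; the real system also checks out the base branch, reruns the benchmarks there, and compares the two with benchstat.

package main

import (
	"log"
	"net/http"
	"os/exec"
	"strings"
)

// changedPackages is a hypothetical stand-in for parsing the GitHub
// webhook payload into the set of import paths whose files changed.
func changedPackages(r *http.Request) map[string]bool {
	return map[string]bool{} // payload parsing elided in this sketch
}

// affectedPackages finds every package in the module whose transitive
// dependencies include a changed package, via a `go list` template.
func affectedPackages(changed map[string]bool) ([]string, error) {
	out, err := exec.Command("go", "list",
		"-f", `{{.ImportPath}} {{join .Deps " "}}`, "./...").Output()
	if err != nil {
		return nil, err
	}
	var affected []string
	for _, line := range strings.Split(strings.TrimSpace(string(out)), "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		pkg := fields[0]
		if changed[pkg] {
			affected = append(affected, pkg)
			continue
		}
		for _, dep := range fields[1:] {
			if changed[dep] {
				affected = append(affected, pkg)
				break
			}
		}
	}
	return affected, nil
}

func handleWebhook(w http.ResponseWriter, r *http.Request) {
	pkgs, err := affectedPackages(changedPackages(r))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	for _, pkg := range pkgs {
		// Run only this package's benchmarks, skipping unit tests.
		out, err := exec.Command("go", "test", "-run=^$", "-bench=.", pkg).CombinedOutput()
		if err != nil {
			log.Printf("bench %s failed: %v", pkg, err)
			continue
		}
		log.Printf("bench %s:\n%s", pkg, out)
	}
}

func main() {
	http.HandleFunc("/webhook", handleWebhook)
	log.Fatal(http.ListenAndServe(":8080", nil))
}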

Architecture diagram:

bencher architecture

bencher’s features:

Ergonomics and complexity reduction:

The cognitive load on developers of sifting through benchmarks, trying to interpret results, and dealing with so much information and menial labour makes many not even care about performance; it requires lots of attention to detail and painstaking skill. But that's why Orijtech builds tools for developers: our goal is to simplify things and democratize the specialized knowledge that we have accumulated from all the work that we do.

With the information above, you now have an idea of why and how we built bencher. You no longer have to sweat about running benchmarks entirely on your own, or about turning off other software to avoid noise and competition from CPU- and memory-greedy programs like web browsers. You don't have to collect benchmark results before and after on your own; you don't have to do anything except look at the results that bencher produces for you. We handle the ergonomics, setup, scaling, reliability, and analysis. That saves you lots of time and rare expertise, and lets you focus on building your product, which saves a whole lot of money; our customers are an exhibit of this effect.


Next steps:

bencher is already available for developers making pull requests to the cosmos-sdk: just make a change and you'll get the notification. If you'd like to get added, please don't hesitate to reach out to us. We have a whole lot of updates and simplifications coming to bencher, and we'll eventually make it generally available as a one-click addition on the GitHub Marketplace.

We are very thankful to our customers at the Interchain Foundation for the great opportunity to serve, to the Go community whom we serve and care for, and to our colleagues and our entire team who helped make this possible. Great work by the folks on Orijtech's team who built bencher with me, primarily Cuong Manh Le, Uzondu Enudeme, and Nathan Dias, and by Derrick Mugenyi and Mike Brian Ndawula, who provided feedback. It is nice to work with you all!

Thank you as well on the customer end: to Billy Rennekamp and Ethan Frey, who raised to us the nebulous slowness in the cosmos-sdk; to Zaki Manian, Robert Zaremba, and Ethan Buchman, who all talked about the slowness in the cosmos-sdk; to Marko Baricevic, who gives us valuable and direct user-experience feedback about utility and pain points; and to everyone else who provided feedback. We are always growing and improving bencher, with many more features to come, but we are pragmatically building it out for the best and easiest user experience and for complexity reduction. If you liked this post, please go ahead and share it, and if you are talented and would like to work on next-generation infrastructure, please send me your resume…

Thank you for your time, attention, and kind consideration, and for reading this far!

Yours faithfully,

Emmanuel T Odeke

Chief Executive Officer

Orijtech Inc.
