How I built the Optimizely C SDK Wrapper using the Go SDK in a week (part 2)

Ola Nordstrom
Engineers @ Optimizely
8 min read · Apr 9, 2020

Here is part 2 of my hack week project, the Optimizely Go SDK C Wrapper. If you haven’t already, I recommend reading part 1. The project itself is available on GitHub and we are looking for your feedback.

https://github.com/optimizely/go-sdk-c-wrapper

Part 1 covered some of the overall design decisions, including settling on a handle to let the library maintain state between function calls. In this post I will show how the SDK can be used in a high-performance web server such as Nginx, measure its performance, and share some additional pointers on interfacing Go with C. Now, why Nginx and not some other C application?

Nginx Background and Motivation

I picked Nginx because it is a well-written, high-performance web server implemented in C. According to Netcraft it is used across 233 million domains today and powers about 26% of all domains on the web. It also forms the basis of newer application routers such as Kong, which are gaining adoption as the industry shifts to containerized microservices running in environments such as Kubernetes. Finally, and probably the most important reason: my own curiosity. I have always wondered about the performance characteristics of Go, and this post explores that question for Optimizely Rollouts, albeit only a little bit.

If you google Golang performance, many in-depth articles show up: everything from measuring heap and stack allocations to discussions of how the runtime has been tuned across Go releases over the years. From time to time, software teams also write about having to migrate from Go to some other language because of performance or the memory-management overhead of the runtime. This is great… for their specific workload. No (sane) product manager or software engineering manager would (or should) ever approve a rewrite of a working system unless there were strong reasons for doing so. Showing that performance is a key bottleneck is one such reason. Thus, measuring whether the Go SDK C Wrapper I have written exhibits poor performance is a factor in determining whether a pure C implementation is necessary or not.

Working Smarter (Doing Less)

In part 1 I discussed the time constraints of hack week projects. To do a true apples-to-apples, C-to-Go comparison I should write a pure C Nginx module that talks to Optimizely and compare it to the wrapper I wrote, but I didn’t and I won’t.

I’m lazy. There, I said it. And stating it is cathartic. I let the computer do the work for me. I also suspect most programmers are a little lazy (or just lazy enough). I could write the code to query feature flags from Optimizely in pure C, and if I dug through Nginx I would likely find it already has good abstractions for making network requests; it is a web server after all, and a damn good one at that.

My hypothesis is that in the end it won’t matter. Go is compiled, and once the library is loaded and mapped into Nginx it will be fast enough that the browser loading http://myNginxServer will never notice; the network will be the bottleneck by orders of magnitude. Conveniently, my hypothesis also aligns with my goal of doing less work (working smarter).

With that said, here is the code that displays either “hello world” or whatever the feature flag’s string variable was set to in app.optimizely.com.

Nginx module calling the Optimizely SDK

If you’re not familiar with C, #if, #else, and #endif are preprocessor directives that tell the preprocessor to include or exclude specific blocks of code before compilation. In this case the preprocessor blocks determine whether the message is fetched from the Optimizely feature flag or simply set to “hello world” when OPTIMIZELY_SDK_ENABLED is set to zero.

You can find all the code in this Gist. Yes, the gist contains much more code, but what’s shared above is the meat of it. The other ~200 lines are boilerplate: code to set up the module for Nginx, code to dynamically load the shared library optimizely-sdk.so and wire up the function pointers so the SDK functions can be called, and finally a Makefile. A lot of error handling has been omitted since this is a proof of concept.
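As an aside, the reason optimizely-sdk.so exposes plain C symbols that the Nginx module can look up is that the wrapper is built with cgo in c-shared mode and marks its entry points with //export. Below is a minimal sketch of that pattern; the function name and signature are illustrative only, not the wrapper’s actual API.

package main

/*
#include <stdlib.h>
*/
import "C"

// optimizely_greeting is a stand-in for one of the wrapper's entry points.
// The //export directive tells cgo to emit a C-callable symbol with this
// exact name in the resulting shared object.
//export optimizely_greeting
func optimizely_greeting(handle C.int) *C.char {
	// A real implementation would look up the client behind this handle and
	// return the feature flag's string variable. C.CString allocates with
	// malloc, so the C caller is responsible for freeing the result.
	return C.CString("hello from Go")
}

// An (empty) main is required because c-shared libraries build from package main.
func main() {}

Building this with go build -buildmode=c-shared -o optimizely-sdk.so produces the shared object plus a generated header, and the exported symbol can then be resolved from the module with dlopen/dlsym.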

The screenshot below shows both variations in the background and the feature flag configuration.

In the far back window is the plain hello world Nginx module. The middle window is showing the Nginx module that has loaded the Optimizely Go C SDK Wrapper. The topmost window is showing the Feature Key along with the custom greeting.

When the Optimizely SDK is not compiled into the module it just returns “hello world”. Surely this will be the fastest and the Optimizely SDK, however well written, will never be able to compete… or will it?

Performance

I used ApacheBench (ab) to gather these numbers on an Amazon Linux AMI EC2 instance; the gist has additional information on the hardware used. The tests were executed with the following command, which issues 1,000,000 requests at a concurrency level of 500.

ab -n 1000000 -c 500 http://172.31.19.13/test

The host running ApacheBench is in the same AWS security group as the Nginx server.

Plain Hello World Module

For the Nginx hello world module, without the Optimizely SDK checking a feature flag, the following was returned.

$ ab -n 1000000 -c 500 http://172.31.19.13/test
. . .
Server Software: nginx/1.16.1
Server Hostname: 172.31.19.13
Server Port: 80

Document Path: /test
Document Length: 13 bytes

Concurrency Level: 500
Time taken for tests: 248.681 seconds
Complete requests: 1000000
Failed requests: 0
Total transferred: 156000000 bytes
HTML transferred: 13000000 bytes
Requests per second: 4021.21 [#/sec] (mean)
Time per request: 124.341 [ms] (mean)
Time per request: 0.249 [ms] (mean, across all concurrent requests)
Transfer rate: 612.61 [Kbytes/sec] received
. . .
Percentage of the requests served within a certain time (ms)
 50%  114
 66%  127
 75%  143
 80%  149
 90%  162
 95%  167
 98%  173
 99%  178
100% 1207 (longest request)

With the SDK

With the Optimizely Go C SDK Wrapper loaded I observed the following.

$ ab -n 1000000 -c 500 http://172.31.19.13:80/test
. . .
Server Software: nginx/1.16.1
Server Hostname: 172.31.19.13
Server Port: 80

Document Path: /test
Document Length: 31 bytes

Concurrency Level: 500
Time taken for tests: 242.083 seconds
Complete requests: 1000000
Failed requests: 0
Total transferred: 174000000 bytes
HTML transferred: 31000000 bytes
Requests per second: 4130.81 [#/sec] (mean)
Time per request: 121.042 [ms] (mean)
Time per request: 0.242 [ms] (mean, across all concurrent requests)
Transfer rate: 701.91 [Kbytes/sec] received
. . .
Percentage of the requests served within a certain time (ms)
 50%  111
 66%  118
 75%  137
 80%  145
 90%  159
 95%  167
 98%  175
 99%  180
100% 1186 (longest request)

The observed memory usage of Nginx was higher when the Optimizely SDK module was loaded, but it remained constant throughout the stress test, which suggests there are no memory leaks in the Nginx module that loads and calls the Optimizely SDK Wrapper. The increased memory usage is expected, since the whole Optimizely SDK library is loaded into Nginx and Go brings its own runtime.

From a performance perspective I cannot tell the difference between the two runs; if anything, the SDK run was marginally faster, which is well within the noise. Thus I postulate that

For feature flagging the Optimizely Go SDK C Wrapper performance is on par with a pure C implementation.

Intuitively this makes sense. The Optimizely SDK fetches the datafile in the background every 60 seconds (by default), caches it, and evaluates feature flags against that cached copy, so no network request is made on the request path. That said, over time the Go garbage collector has to run and do work, which may impact performance. This might be measurable if more advanced features of the SDK are used, but for looking up and applying feature flags it isn’t.

Odds and Ends

The Go SDK C Wrapper, including tests and examples, is only ~1,500 lines. Only the client package has been wrapped, since it is the primary interface to the SDK. The whole Go SDK, by comparison, is ~33,000 lines.
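For readers who have not used the Go SDK directly, the wrapped client package is used roughly as follows. The sketch is based on the SDK’s documented client and entities packages; exact package paths and method names may differ between SDK versions, and the SDK key is a placeholder.

package main

import (
	"fmt"

	"github.com/optimizely/go-sdk/pkg/client"
	"github.com/optimizely/go-sdk/pkg/entities"
)

func main() {
	// Creating the client starts the background datafile polling; feature
	// decisions are then computed from the cached datafile.
	factory := &client.OptimizelyFactory{SDKKey: "YOUR_SDK_KEY"}
	optly, err := factory.Client()
	if err != nil {
		panic(err)
	}

	user := entities.UserContext{ID: "user-1"}
	enabled, _ := optly.IsFeatureEnabled("my_feature", user)
	fmt.Println("feature enabled:", enabled)
}

The C wrapper exposes a thin layer over exactly this interface, with the client instance referenced through the handle described in part 1.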

The Go test framework does not allow cgo (import "C") in _test.go files. I suspect this was an explicit decision by the Go designers to keep tests portable across platforms: tests that depend on raw pointer manipulation are not only unsafe but invite cross-platform portability pain.

Another issue when interfacing Go and C is that Go does not handle packed C structs very well. Further complicating the situation, Go does not allow pointer arithmetic. How, then, does one traverse a list of C structures whose length is only known at runtime, while keeping the code safe and portable? In my case this is the list of user attribute structs that can be passed into the SDK. The answer is that you end up with a gem like the one reproduced below.

Iterating over C structs
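The wrapper’s exact code is in the repository; the sketch below reproduces the pattern with a hypothetical struct layout, so the field names are illustrative.

package main

/*
// Hypothetical layout; the real definition lives in the wrapper's C header.
struct optimizely_user_attribute {
	char *name;
	char *value;
};
*/
import "C"

import "unsafe"

// attributesFromC gives Go a bounded, sliceable view over a C array whose
// length (attrCount) is only known at runtime. Nothing is copied and nothing
// is allocated; the memory stays owned by the C caller.
func attributesFromC(user_attribute_list *C.struct_optimizely_user_attribute, attrCount C.int) []C.struct_optimizely_user_attribute {
	attrList := (*[1 << 30]C.struct_optimizely_user_attribute)(unsafe.Pointer(user_attribute_list))[:attrCount:attrCount]
	return attrList
}

func main() {}

(On Go 1.17 and newer, unsafe.Slice(user_attribute_list, attrCount) expresses the same idea directly, but it did not exist when this wrapper was written.)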

This “obvious” pattern is documented on Stack Overflow (our 21st-century man pages) here, here and here.

To save you from having to read all three posts, take a walk, come back, and reread all three, I will summarize them as follows.

  • The code creates a Go slice named attrList that points to the passed-in list of optimizely_user_attribute C structs.
  • The (*[1 << 30]) cast treats the memory as a pointer to an enormous array of up to 2³⁰ entries. No array of that size is ever allocated; it simply gives Go something it is allowed to slice, with a convenient upper bound since the real length is not known at compile time.
  • The first entry of the slice is the first entry of user_attribute_list, i.e. the memory location of the first optimizely_user_attribute C struct.
  • The slice expression [:attrCount:attrCount] sets both the length and the capacity of the slice to exactly attrCount.
  • The size of each struct depends on how the C compiler packs it on the target platform, but it is known at compile time, which makes traversing the slice, and the underlying list of C structures, safe.

Now breathe in, we’re done!

In Conclusion

The biggest obstacle to overcome when working on this project was the mind-bending needed to switch from Go → C and back again, over and over. The Go grammar puts the name first:

variable_name <type>

while C (and the rest of its descendant languages) has always done

<type> variable_name;

At times my head was spinning.
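A tiny, contrived Go example of the flip, with the equivalent C spelling left in the comments:

package main

import "fmt"

// Go puts the name before the type; the C equivalents are in the comments.
var retryCount int  // C: int retry_count;
var greeting string // C: const char *greeting;

func main() {
	fmt.Println(retryCount, greeting)
}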

The Optimizely Go SDK C Wrapper is ready to be used in your C or C++ project. If it can run inside Nginx without impacting performance, it can run in your software as well. Give it a shot and let us know what you think!

-Ola
