Blog: Inside the mind of a low latency developer (Part 1)

Today our CTO gives us an insight into the mind of a low latency developer

Here at CDRX we are ramping up to launch a crypto exchange in Q1 2019 (with a closed beta starting in Q4 2018). Read more about it and our unique approach to securitised tokens here.

Developing this sort of platform is our home ground and is something we have plenty of experience in. For me personally this is due to spending a large part of my career in one of the fastest markets — inter-bank FX.

Any average programmer can develop an exchange, but as we’ve seen with the equities and FX markets, there comes a time when even small inefficiencies are magnified. Those inefficiencies become dangerously exposed to the black box trading engines of high frequency clients. Ultimately, a poorly written system will turn a booming success into a money pit as market latency falls and sophistication increases.

Interestingly the mindset to develop platforms with the required performance characteristics is not schooled by the Tier 1 banks we have worked in, but instead is driven by the fierce competition between these organisations to be the fastest. It’s rare that (for example) you will have a senior leader telling you to use a certain caching strategy; instead they just know that fastest is best and safest. It’s up to the developers and engineers to achieve this.

If you put aside the controversies, the great success the British cycling team achieved over the last 15 years or so has been nothing short of astounding. In 2004 only the talismanic Chris Hoy and Bradley Wiggins won gold and in 2005 there were no British riders in the Tour de France. As of summer 2018 they have accumulated another 22 Olympic golds and 6 of the last 7 Tour de France wins.

You may wonder where I’m going with this, but bear with me.

There are many factors that contributed to the success of British cycling, but it always seems to come back to their guiding principle of “marginal gains”: the art of tuning every possible aspect of their approach to achieve minute improvements that, in aggregate, deliver that winning extra 1% of performance.

Marginal gains

Let’s dissect this:

Marginal — Minor and not important
Gains — Attain or secure something valuable

These words seem to contradict each other, but it’s really about making those unimportant changes add up to something which is valuable. In a sport where you can win by a thousandth of a second, it is astounding that it hadn’t been done before.

What’s also surprising is how few people actually pursue “significant gains” in their chosen field let alone marginal ones. But both count when it comes to development.

So… how can we apply this to development?

I’ve cherry-picked a few topics to talk about. These may be so obvious that you wonder why I mention them, but just because they seem obvious doesn’t mean that people will automatically do them. Computers have become so powerful these days that these sorts of optimisations are only necessary in extreme or highly sensitive scenarios.

Caveat: clearly many things affect benchmarking, and the examples below use Java, so the JIT compiler can intervene with smart trickery of its own, but the values are so wildly different that the point still stands.
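Given that caveat, a slightly more defensible way to hand-time snippets like the ones below is to warm the code up first (so the JIT has compiled the hot path) and then take the best of several runs. JMH is the proper tool for this kind of work; the `TinyBench` class below and its parameters are purely illustrative, not part of the original benchmarks:

```java
import java.util.function.LongSupplier;

public class TinyBench {
    // Hand-rolled micro-benchmark sketch: warm up, then keep the fastest
    // of several timed runs. For serious measurement use JMH instead.
    public static long bestOfN(LongSupplier task, int warmups, int runs) {
        long sink = 0;
        for (int i = 0; i < warmups; i++) {
            sink += task.getAsLong(); // warm-up iterations, results discarded
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            best = Math.min(best, System.nanoTime() - start);
        }
        // consume sink so the JIT cannot treat the task calls as dead code
        if (sink == Long.MIN_VALUE) System.out.print("");
        return best; // nanoseconds for the fastest run
    }
}
```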

Think carefully about your data structures

I can’t count the number of times I’ve seen arrays or maps misused. Iterating over a map is a great example.

Scenario: Assuming you already have an array and a HashMap of random numbers, each sequentially populated by index or by the equivalent Integer key (respectively)

// A) looping over an array
long total = 0;
for (int i = 0; i < size; i++) {
    total += array[i];
}
// B) iterating over a HashMap
long total2 = 0;
for (Long value : hashmap.values()) {
    total2 += value;
}
// print the totals so the JIT cannot optimise the loops away
System.out.println("output1=" + total);
System.out.println("output2=" + total2);

For 10m records this is taking ~40–45ms for the array, but ~130–135ms for the HashMap.

“So what?” you may ask. In most applications this improvement wouldn’t be worth the effort to change it. Rarely do you find yourself dealing with this kind of volume of records and if you were, most end users wouldn’t even notice a ~90ms lag.

However, the moment you introduce thousands of users and throw a few aggressive API clients (who have tuned their code as much as possible) into the mix, this is the kind of delay that starts costing you money.
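Part of the HashMap penalty, incidentally, comes from boxing: the values live as scattered `Long` objects on the heap, and every read unboxes one, while the array gives contiguous memory and no boxing at all. When your keys really are dense integers (as in the scenario above), the primitive array is the natural structure. A minimal sketch of the two access patterns (the class name `DenseLookup` is mine, not from the original code):

```java
import java.util.Map;

public class DenseLookup {
    // Boxed path: pointer-chasing across the heap plus an unboxing per element.
    public static long sumBoxed(Map<Integer, Long> map) {
        long total = 0;
        for (Long value : map.values()) {
            total += value; // implicit Long -> long unboxing
        }
        return total;
    }

    // Primitive path: contiguous memory, cache-friendly, no boxing.
    public static long sumPrimitive(long[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }
}
```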

Minimise hops

Splitting your applications up into small components can be of great value. It can simplify the testing, maintenance, reliability and — if done right — the scalability of a platform. Unfortunately it has a detrimental effect on latency. As with the data structures example above, this is a delay that’s probably not discernible to an end user, but adding a network hop and the associated serialisation and deserialisation can significantly increase overall latency.

Scenario: comparing the concatenation of a string to itself (using Java again in this example)

  1. Serialising and deserialising the string before appending
  2. Simply appending
//create a trivial string as the input to the test
String testString = "Hello World";
//initialise Google's Gson JSON serialiser/deserialiser
Gson gson = new GsonBuilder().setPrettyPrinting().create();

StringBuilder buf = new StringBuilder();
long start = System.currentTimeMillis();
for (int i = 0; i < size; i++) {
    // TEST 1: serialise/deserialise the string, then append to the buffer
    buf.append(gson.fromJson(gson.toJson(testString), String.class));
}
System.out.println("time1=" + (System.currentTimeMillis() - start));

StringBuilder buf2 = new StringBuilder();
long start2 = System.currentTimeMillis();
for (int i = 0; i < size; i++) {
    // TEST 2: simply append the test string to the buffer
    buf2.append(testString);
}
System.out.println("time2=" + (System.currentTimeMillis() - start2));

// print a slice of each buffer to stop the JIT optimising the loops away;
// the substring avoids flooding the console
System.out.println("output 1=" + buf.substring(5000, 5005));
System.out.println("output 2=" + buf2.substring(5000, 5005));

If I run the above code for 1m records I get ~1050ms for TEST1 and ~35ms for TEST2. This is a staggering difference and is only part of the story as it clearly doesn’t take into consideration any network or even software stack outside the JVM.

Now, it may seem I’m arguing the case for monolithic applications in a low latency environment, but instead I am actually advising a measured approach to deciding if, when and how to split an application up. If it’s for a good reason, such as scalability then the additional latency may well be worth the overhead.

Think about your protocols

I have a confession to make. Don’t hate me, but I’m really not a fan of REST. Yes, it’s ludicrously simple to code, and on the face of it it seems fast enough, but REST requires handling HTTP headers on every request, something that protocols with persistent open sockets simply don’t need.

Others have done exhaustive analysis on this subject (here for example) and the conclusions are clear: even at 100 messages, REST can take over five times as long to process as websockets.

Again, it takes significant scale before the effects are noticeable to end users, but when you scale it up to thousands of messages and place it in extreme scenarios (like an exchange) this can be make or break for a business. Don’t forget I’m focusing on marginal gains here!
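As a back-of-the-envelope illustration of where that overhead comes from, consider the framing bytes alone: an HTTP/1.1 request repeats its headers on every message, whereas a websocket data frame carries only a 2–14 byte header (per RFC 6455). The sketch below only counts bytes; the particular header set, paths and class name are my own assumptions for illustration, not a measurement from a real system:

```java
public class ProtocolOverhead {
    // Bytes on the wire for an illustrative HTTP/1.1 request: a minimal
    // header block repeated per message, plus the body itself.
    public static int httpRequestBytes(String path, String host, int bodyBytes) {
        String headers = "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/json\r\n"
                + "Content-Length: " + bodyBytes + "\r\n"
                + "Connection: keep-alive\r\n\r\n";
        return headers.getBytes().length + bodyBytes;
    }

    // Bytes on the wire for a client-to-server websocket data frame:
    // 2-byte base header, extended payload length where needed,
    // plus the 4-byte masking key required on client frames (RFC 6455).
    public static int websocketFrameBytes(int bodyBytes) {
        int extended = bodyBytes > 65535 ? 8 : bodyBytes > 125 ? 2 : 0;
        return 2 + extended + 4 + bodyBytes;
    }
}
```

For a small 100-byte payload the websocket frame adds single-digit bytes of framing while the HTTP request adds a couple of hundred, and the HTTP headers must also be parsed on every message.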

Wrapping up

By achieving “marginal gains” consistently, CDRX endeavours to be a game changer for crypto exchanges: faster, more stable and more enterprise-grade than the other offerings out there. Thankfully we’ve partnered with some incredibly smart business people, so we have many new business niches we will be evolving into as well. Exciting times ahead!

Please come back for part 2…

Join Us On Telegram:

White Paper: