Performance — a gaze through the functional prism

Nir Rubinstein
AppsFlyer Engineering
7 min read · Sep 26, 2019

I’m a functional programmer. There, I said it.

What do I mean by that?

Over the course of my career as an engineer, I’ve come to identify some phases that we developers go through:

  1. Novice — “I have no idea what I’m actually doing, but I’m giving it an enthusiastic effort”
  2. Opinionated novice — “I’ve been coding for a living for the last X years — trust me when I tell you that without ORM and Dependency Injection all code is crap”
  3. Intermediate — “I’ve just started to know what I don’t know”
  4. Opinionated intermediate — “Hmm… I think I connect more to X than Y because of A, B and C”

After stage 4, all sorts of hell can break loose — you could get promoted to tech lead or manager, or even start your own company.

What usually does happen after stage 4 is that you reach a point in life when you’re old enough and experienced enough (the two tend to go hand in hand) to surround yourself with the technologies you relate to the most.


That being the case, I chose a few years ago to become a functional programmer. I’ve spoken about the “why” in depth in my two previous entries about Clojure, but the fact remains that this was a conscious choice. After a few years of hacking away at functional code, what usually happens is that someone comes along and tells you that functional code is less performant than its non-functional alternative. Why? The reasons vary, but they usually revolve around the “cost” of immutability and some specific design decisions of the language used (such as it being dynamically typed, for example). So, the rest of this post will deal with that question: is performance the sole consideration that we need to take into account when writing new code?

Show Me the Code

Let’s look at the following code snippets:

In Java:

public int sumSquares(int[] rangeOfNums) {
    int sum = 0;
    for (int i : rangeOfNums) {
        int temp = i * i;
        sum += temp;
    }
    return sum;
}

And now in Clojure:

(defn sum-squares [range-of-nums]
  (->> range-of-nums
       (map #(* % %))
       (reduce +)))

I like this example, a lot. The problem solved above is how to sum the square value of every number in a list of numbers. I brazenly stole this example from Luca Bolognese’s introductory talk on F#, and I’ve been using it ever since to illustrate functional thinking.
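For reference, both versions answer the same question; here’s a quick REPL check with the Clojure definition above (the input is just an illustrative one):

(sum-squares [1 2 3 4]) ;; => 30, i.e. 1 + 4 + 9 + 16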

Is there a difference between the solutions?

Well, the first major difference is conceptual — when we, as software engineers, are faced with the simple problem of summing the square value of every element in an array, we simply say: “what’s the problem?”

  1. Use a helper variable
  2. Iterate over the array
  3. Calculate the square value of each item, and
  4. Add it to the helper variable.

Once you’re done — return the helper variable’s value as the answer.

This is basically what we did with the Java example. This is a very imperative and straightforward “first-year computer science” exercise.

On the other hand, if you asked any non-developer the exact same question, they’d just say: “Well, just square all the values and then sum them up”. Which is exactly what is written in the Clojure solution.

What I’m trying to illustrate is that we, as developers, are so steeped in years of schooling and work in imperative and OOP (Object Oriented Programming) paradigms that functional thinking, which is closer to a “day-to-day” way of reasoning, seems hard to grasp at first.

The other, and most significant, difference between these two examples is in the performance tradeoffs. I won’t go into why (there’s enough out there to read through), but what you should always keep in mind is this: the higher the abstraction, the costlier it is in terms of performance.
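To make that tradeoff a bit more concrete, here is a minimal sketch (the names are illustrative) showing that even within Clojure you can give back some of the abstraction when a hot path demands it; the actual gains depend on your data and JVM, so measure before committing:

;; Variant 1: a transducer fuses map and reduce, avoiding the intermediate lazy sequence.
(defn sum-squares-xf [range-of-nums]
  (transduce (map #(* % %)) + 0 range-of-nums))

;; Variant 2: an explicit loop/recur, about as close to the Java for-loop as idiomatic Clojure gets.
(defn sum-squares-loop [range-of-nums]
  (loop [nums (seq range-of-nums)
         sum  0]
    (if nums
      (let [n (first nums)]
        (recur (next nums) (+ sum (* n n))))
      sum)))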

That being the case, why are we seeing a rise in functional programming languages? If most of them are less performant than their OOP/procedural/imperative counterparts, why use them at all?

Even here, at AppsFlyer, being mainly a Clojure shop, I get asked quite often about this hefty cost. I’d like to present my answer through the following points:

  • Code velocity
  • Performance test skew
  • Reliability and maintainability

Code Velocity

The higher the abstraction, the less code you need to write (usually). And this, in short, allows for faster time-to-market. If I can write my next feature faster, it means that my client will be able to enjoy it faster, and I can grow my business more rapidly.

Hence the notion of velocity. The JVM is a very robust and excellent VM. When writing Clojure code, I can accomplish the exact same things that I can in Java, most of the time with fewer lines of code. While this is not a huge accomplishment in itself, the fact that functional languages usually do away with all the “boilerplate” of OO code and allow us to focus on writing the business logic is a huge benefit when it comes to velocity, in my opinion. In my previous posts I went into detail about how FP (Functional Programming) changes the way we reason about the problem space as opposed to OOP; I think this always contributes to the aspect of velocity, at least mentally.
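To make “less boilerplate” a bit more tangible, here is one tiny, illustrative example: counting occurrences of items is a single built-in call over immutable data in Clojure, whereas the idiomatic Java version typically needs an explicit map, a loop and a get-or-default step.

;; The whole program: count how many times each item appears.
(frequencies ["a" "b" "a" "c" "a"])
;; => {"a" 3, "b" 1, "c" 1}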

Performance Testing

It’s true that the code samples above vary greatly in performance, even an order of magnitude or more in favor of the Java code — but that’s testing one single function in a very limited scope.
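If you do want to micro-benchmark a single function like the ones above, use a proper harness rather than a naive timing loop; for example, with the criterium library (assuming it’s on your classpath), which takes care of JIT warm-up and statistical noise:

(require '[criterium.core :refer [quick-bench]])

;; This measures one function, on one input shape, in isolation,
;; which is exactly the "very limited scope" caveat above.
(let [nums (vec (range 1000))]
  (quick-bench (sum-squares nums)))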

At AppsFlyer scale, our service is usually composed of dozens of threads, each doing various tasks, be they CPU-bound or IO-bound. Let’s say we’ve “solved” the performance problem of the specific function written above; we then need to explore how that change impacts the entire service. What happens if your service is actually IO-bound most of the time? Will improving a single CPU-bound function even register? Moreover, a lot of the hard work and complex operations we do in programming revolve around concurrency and parallelism. In this context, FP languages usually provide us with safe and sensible idioms of work, usually via immutability, which ultimately enables us to write complex async mechanisms in a really simple manner.
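As a minimal sketch of what those idioms look like in practice (the names and chunk sizes are illustrative, not code from our services): several futures work over immutable inputs and funnel their results through a single atom, with no locks in sight.

;; Shared, thread-safe accumulator; everything the workers read is immutable.
(def results (atom {}))

(defn process-chunk! [chunk-id nums]
  ;; pure computation over an immutable chunk...
  (let [total (reduce + (map #(* % %) nums))]
    ;; ...followed by one atomic update of shared state
    (swap! results assoc chunk-id total)))

(let [chunks (partition-all 1000 (range 10000))
      tasks  (doall
               (map-indexed
                 (fn [i chunk] (future (process-chunk! i chunk)))
                 chunks))]
  (run! deref tasks)  ;; wait for all futures to finish
  (count @results))   ;; => 10 chunks processed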

When trying to solve the same problem in an OOP language, the margin for error when writing that same concurrent/parallel code is much wider. This also validates the first point regarding velocity, but I’d like to re-emphasize that testing only specific parts of the service can distort our benchmarking. While we can improve issue X by an order of magnitude, it might have close-to-zero impact on the rest of the service.

Reliability & Maintenance

Reshef, AppsFlyer’s co-founder and CTO, likes to say that the code we write “lives” most of its “life” in production — not in our IDE.

That being the case, the ease of changing the code and revisiting it in the future is essential to our velocity. Fewer lines of code mean less to read and change. Also, working within safe idioms of parallelism and concurrency in our code base will allow us to maintain it better.

I’d like to look at this through the following quote from Michael Feathers: “FP makes code understandable by minimizing moving parts”. I agree with this statement wholeheartedly. Since our functional code has fewer moving parts, it’s usually more reliable, because there are fewer things that can break. As a direct result, if things do break, it’s usually easier to find the problem. And finding the problem is usually 90% of the way to solving it. This “fewer moving parts” notion lends itself directly to a more reliable and maintainable code base.
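One small way this plays out day to day: when a pure function misbehaves, the failing input is the whole story, so you can replay it at the REPL with nothing else to reconstruct (a quick sketch using the earlier sum-squares):

;; No hidden state, no setup: the inputs below are everything the function sees.
(sum-squares [])      ;; => 0
(sum-squares [3 -4])  ;; => 25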

Final Thoughts

So what point am I really trying to get across in this post? Well, as always, it’s more of a way of thought than an actual call to action.

I encourage you to embrace the notion that less performant code is not always bad, nor is it always good.

It’s really just one more aspect we have to take into account when writing our code.

The decision making matrix can be complex at times, and there is no universally true or correct answer.

If performance were the only consideration when building complex systems, we’d still be writing in Assembly (or, hopefully, at least in C). If the only consideration were readability and velocity, we’d all be writing in some higher-level abstraction.

The fact that there is still, in 2019, a plethora of programming languages and paradigms just goes to show that there is a diversity of considerations to take into account when building systems, whether it’s the business impact, thinking about the system as a whole, or other considerations specific to our engineering organization and architecture.

There is no “Silver Bullet”, not in databases, not in programming languages, not in readability vs. performance, and certainly not in real life. That’s why, when taking all of this into account, we need to make a case by case decision.

Don’t be afraid to write less performant code if it serves your organization better. If you’ve hit a brick wall in terms of performance and/or cost, be sure to profile the hell out of the problem before you commit to a solution.

Also keep in mind that you should not be afraid to mix paradigms: maybe you write high-level application code and low-level infrastructure code, and integrate the two. A lot of the time, Clojure libraries “fall back” to “lower level” Java code in order to benefit from its performance. The answers are plentiful and out there; just take into consideration that writing code is about more than the writing itself, it’s about the whole life of that code. This is the type of holistic approach and thinking that will benefit you the most as a developer.
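As a hedged sketch of what that “falling back” can look like in your own code (illustrative names, not a prescription): the same sum of squares, written against a primitive int array with type hints, trading the high-level sequence abstraction for speed in one isolated function while the rest of the code base stays high level.

;; Operates directly on a primitive int array; the type hints let the compiler
;; avoid reflection and boxing in this one hot spot.
(defn sum-squares-fast ^long [^ints range-of-nums]
  (areduce range-of-nums i acc 0
           (+ acc (* (aget range-of-nums i) (aget range-of-nums i)))))

;; (sum-squares-fast (int-array [1 2 3 4])) ;; => 30

Where exactly that line between high-level and low-level sits is, as with everything above, a case-by-case decision.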
