How Capchase leverages profiling to improve performance in Elixir 😎

Lucas Silva
Published in Capchase Tech · 7 min read · Mar 7, 2023

Elixir is a functional programming language for building scalable, highly resilient applications. At Capchase we use it together with Oban to run jobs that fetch data from multiple sources. Elixir is a very resource-light language by default: we run hundreds of thousands of jobs a day and still average 5% CPU usage across 3 nodes, with the biggest spikes at 30%.

That’s why it was really strange when, a few months ago, after some feature releases, our CPU usage sat close to 90% at all times. We sprang into action immediately, using all available profiling tools to find out which functions were being called the most and what percentage of all calls they accounted for.

Let’s simulate this scenario today and see how we can use profiling tools to find hot spots and other optimization opportunities in our code!

Assumptions: Elixir is already installed and ready to use! (Check the official installation instructions if that’s not the case.)

First steps

We need to create a new project. Let’s use the mix tool for that by running

mix new profiling_example

Let’s also add the Jason library. To do that, open the mix.exs file and add this line to the deps:

{:jason, "~> 1.4"}

After adding this line we need to run mix deps.get to update the dependencies locally.
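For reference, after the change the deps function in mix.exs should look roughly like this (the exact list and version requirements in your project may differ):

```elixir
# mix.exs
defp deps do
  [
    # Jason, a JSON encoder/decoder for Elixir
    {:jason, "~> 1.4"}
  ]
end
```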

Functionality

Let’s create a program that mostly creates random spam and then stores it by sending it to an API.

Inside a file called spam_producer.ex we can create a SpamProducer module and start by defining some functions.

First, the spam function, which just creates a random map with randomly sized string fields.

def spam() do
  field_1_size = :rand.uniform(3_000)
  field_2_size = :rand.uniform(3_000)
  field_3_size = :rand.uniform(3_000)

  %{
    id: DateTime.utc_now() |> DateTime.to_unix(),
    field_1: String.duplicate("a", field_1_size),
    field_2: String.duplicate("b", field_2_size),
    field_3: String.duplicate("c", field_3_size)
  }
end

Next we want a size function so that we can see how big the object we just created is. In this case we just encode the map as a JSON string and then check the byte size of that string.

def size(object) do
  Jason.encode!(object) |> byte_size()
end

Ok, good progress, let’s now do the can_store? function. We want it to return true if the size is less than or equal to 5000 bytes and false otherwise.

def can_store?(object) do
  size(object) <= 5000
end

Almost there. Now we want a store function that will store the object in a DETS table, Erlang’s disk-based term storage (comparable to numpy’s save-to-.npy files, except we can keep appending). We are just simulating the API here instead of actually implementing it. This is also very simple to do in Elixir:

def store(object) do
  :ok =
    :dets.insert(@dets_file_path, {object.id, object.field_1, object.field_2, object.field_3})
end

We also need to open this DETS table before it’s used, so let’s create an open_dets function too. Note that @dets_file_path is a module attribute holding the table name, defined at the top of the module:

@dets_file_path :"./spam_table"

def open_dets() do
  {:ok, _table} = :dets.open_file(@dets_file_path, type: :set)
end

Ok, lastly we need a function to use all of this together, the run function!

def run() do
  open_dets()

  Enum.each(1..10_000, fn _ ->
    spam_object = spam()

    if can_store?(spam_object) do
      store(spam_object)
    end
  end)
end

This function creates 10k objects and stores them in our DETS table whenever the object is at most 5000 bytes when encoded.

Let’s check that what we have so far works! We can do this by running iex -S mix and manually calling SpamProducer.run(). The result is this:

Calling run() in iex

Yup, it’s a little underwhelming… But we can see that indeed we generated a spam_table file that is populated with 10Kb of very important data.
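If you want to double-check the table from iex yourself, DETS can report how many objects it holds (this assumes the :"./spam_table" table name used above):

```elixir
# Open the table (a no-op if it is already open) and count the stored objects
{:ok, table} = :dets.open_file(:"./spam_table", type: :set)
:dets.info(table, :size)
```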

Profiling

Finally, it’s profiling time! Any bets on the worst offenders in our simple code? Let’s find out!

ExProf

First we need to add ExProf to our dependencies in mix.exs, just add this line in the deps:

{:exprof, "~> 0.2.4"}

Second we need to add this at the top of our module:

import ExProf.Macro

After this we can create a new function called profiling_function:

def profiling_function do
  profile do
    SpamProducer.run()
  end
end

This function essentially wraps our previously created SpamProducer.run() in the profile macro.

And lastly, we need a way to actually see the results in a useful form, so let’s create yet another function:

def run_with_exprof do
  {_records, result} = profiling_function()
  result
end

Ok, now we are actually ready to see the results. Let’s run iex -S mix again and call SpamProducer.run_with_exprof()

Calling SpamProducer.run_with_exprof() in iex

I’m only keeping the last lines as they are the most important ones.

We can see that 90% of our time is spent in Jason’s escape_json_chunk 😮

Ok, let’s make sure these results are consistent by using eFlambé as well!

eFlambé

First we need to add eFlambé to our dependencies in mix.exs; just add this line to the deps:

{:eflambe, "~> 0.3.0"}

This one requires less setup; we only need one function, which looks like this:

def run_with_eflambe do
  :eflambe.apply({SpamProducer, :run, []},
    output_format: :brendan_gregg,
    open: :speedscope
  )
end

This function will call eFlambé and ask it to profile the run function from SpamProducer, generating a file in Brendan Gregg’s collapsed-stack format. This file can then be visualized locally in Speedscope or online at https://speedscope.app

After running it using the same strategy as for ExProf, we can check the results:

Speedscope’s Sandwich view of the generated profiling file

Indeed we are able to confirm our suspicions!

In this specific case we were already expecting Jason to be the culprit, but in a real-life scenario it might be harder to identify the problem in your code, since a lot more unrelated calls will show up.
My only suggestion here is to train your eyes to catch anything that looks a little suspicious and dig deeper into those cases. Eventually you will find the real issue!

Optimizing

Ok, now we know what’s not cool about our code: it’s the Jason.encode!() we use in the size function. I believe we can get rid of it, so let’s do that.

def size(object) do
  object |> :erlang.external_size()
end

How about this? The functions are not equivalent, since :erlang.external_size/1 measures the term in Erlang’s external binary format rather than as JSON, but it should still give a pretty close approximation while drastically reducing the amount of computation.
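To get a feel for the difference, here is a quick iex sketch comparing the two measurements on the same map (the two numbers will not match exactly, and will vary with the field sizes):

```elixir
object = %{id: 1, field_1: String.duplicate("a", 100)}

# Size of the JSON text produced by Jason (does the full encoding work)
Jason.encode!(object) |> byte_size()

# Size of the term in Erlang's external term format (no JSON encoding at all)
:erlang.external_size(object)
```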

Let’s check the results using ExProf

Calling updated SpamProducer.run_with_exprof() in iex

Well, I would call this an absolute win: now we spend most of our time creating the actual spam object instead of checking how big it is, and we no longer spend 85% of the calls in a single part of our code.

Let’s see if eFlambé agrees!

Speedscope’s Sandwich view of the new profiling file

eFlambé is basically telling us we no longer spend 85% of the time in Jason’s encoding, so it does agree.

Alright, we just improved our SpamProducer by a lot… Which is great but are there any considerations?

Caveats

While it’s amazing that we get such deep information when using the profilers, that depth can also work against us.

Because we had such a simple application, profiling it was easy, but when you are working with real-life applications you will quickly notice that the reports are almost unreadable in the terminal. I usually pipe the output to a file that I can check later (for example, mix run -e "SpamProducer.run_with_exprof()" > profile.txt).

Another consideration is that many functions are wrappers around other functions, and they will all show up in the reports. This means it’s sometimes hard to tell which function corresponds to which part of the code. I recommend taking a look at eFlambé in these cases: you can see the whole call stack if you use the Time Order visualization.

Focusing on a call and its downstream calls using the Time Order visualization in Speedscope

As you can see, we can select a specific call and get all the downstream calls!

However, eFlambé also has its problems; in particular, it’s a goddamn memory hog! You will notice while using it that it makes your RAM sweat. The generated files are also pretty big, since they contain the whole call stack of your program.

Bonus

Ok, we saw that the program was much more streamlined when using external_size vs doing the Jason.encode!() but can we measure the difference in elapsed time?

We can, by using the :timer module. A function like this will suffice:

def run_with_timer do
  {elapsed_micro_seconds, :ok} = :timer.tc(SpamProducer, :run, [])
  elapsed_seconds = elapsed_micro_seconds / 1_000_000
  IO.puts("Took #{elapsed_seconds}s to run!")
end

The time for running the program using the Jason.encode!() size function was 0.347129s and the same code ran in 0.213262s using the new size function!
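If you would rather compare the two size strategies side by side without swapping the implementation back and forth, :timer.tc/1 also accepts an anonymous function. A minimal sketch, assuming the Jason dependency from earlier:

```elixir
object = SpamProducer.spam()

# Time the JSON-based size check
{json_us, _size} = :timer.tc(fn -> Jason.encode!(object) |> byte_size() end)

# Time the external_size-based check
{ext_us, _size} = :timer.tc(fn -> :erlang.external_size(object) end)

IO.puts("JSON: #{json_us}µs, external_size: #{ext_us}µs")
```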

Conclusion

While this example might seem a little wacky, it’s incredibly close to the problem we had at Capchase.

We were checking size by using Jason.encode!(), and we got rid of it in favor of external_size. The only reason we were able to understand what was happening was that we used profilers!

We went back to using little to no processing power… As it should be!

Lucas Silva
Software Engineer for Capchase, working remotely in Portugal