Optimizing work with nested maps in Elixir

Gaspar Chilingarov
Learn Elixir
Published in
5 min readAug 9, 2017

Summary: Maps are extremely efficient in Erlang and create/update/access performance scales very good up to 100k keys. Access time per every element stored increases just three times from 2 objects in a map to 100k objects in the map. Map.update, Map.update! are efficient in creating/updating maps. For access instead of Map.fetch use case map which is faster. Write most obvious straightforward code and you will get good performance. Manually optimizing creation code can yield extra 10–15% of performance.

Note that X-axis is not linear.

Recently I was writing Elixir code which had to do cache check to see if object is present in cache or not.

Obviously objects are added and removed from cache from time to time and such structure should be updated periodically.

Objects which I’m storing in cache have two keys. Top-level key space is anything between 20 to 80000 keys, and secondary key is usually 1–8 values.

So I had three options to organize addressing object status in cache.

  • two-level nested maps %{primary_key => %{secondary_key => true} }
  • one level map with composite key %{ {primary_key, secondary_key} => true}
  • ETS table with composite key {primary_key, secondary_key}

There is three possible access pattern I need to consider while developing cache:

  • creation time
  • update time
  • access/check time

For every specific task usage patterns will be different, so YMMV. For me important points where:

  • very fast creation
  • updating cache status (updates occur not very often)
  • very fast access/check time

I considered several different ways to create objects

1 — Naive one

Most obvious solution is to reduce over keys lists and update map.

In graph this solution is called “2-level nested map”

top_level_key_list |> Enum.reduce(%{}, fn(top_key, data) ->
nested_key_list |> Enum.reduce(data, fn(nested_key, data) ->
Map.update data, top_key,
%{nested_key => 1},
&(Map.put &1, nested_key, 1)
end)
end)

2 — Optimize nested map build

Updating empty map and populating it with objects may create re-allocations in the map, so you need to optimize for it. In my case nested maps were of size 9 which practically does not require re-allocation. In case of larger nested maps performance impact can be bigger. So optimization is quite obvious — first build key-value list, then convert it to map in single operation, so that map creation function will know map size in advance.

In graph this solution is called “2-level nested map + :maps.from_list optimization”

top_level_key_list |> Enum.reduce(%{}, fn(top_key, data) ->
nested_key_list |> Enum.reduce(data, fn(nested_key, data) ->
Map.update data, top_key,
[{nested_key, 1}],
&([{nested_key, 1} | &1])
end) |>
# after building list, convert it to map
Map.update!(top_key, &(:maps.from_list &1))
end)

3 — Create flat map with composite key

Idea was to build single map with composite key {primary_key, secondary_key} so all objects are under single level in the map. Running Map.update (top-level count*nested level count) turned out to be extremely inefficient and slow. I’ve dropped this method and adopted “build list, convert to map approach”.

In graph this solution is called “flat map + :maps.from_list optimization”

top_level_key_list |> Enum.reduce([], fn(top_key, data) ->
nested_key_list |> Enum.reduce(data, fn(nested_key, data) ->
[{ {top_key, nested_key}, 1} | data]
end)
end) |> :maps.from_list

4 — Create flat map in ETS

This is exactly same way to store data as previous one, but putting data in ETS table. This may be helpful if you need shared access between processes.

In graph this solution is called “ets, flat map”.

table = :ets.new :table, []
top_level_key_list |> Enum.each(fn(top_key) ->
nested_key_list |> Enum.each(fn(nested_key) ->
:ets.insert table, { {top_key, nested_key}, 1}
end)
end)

5-Manually optimize map creation

Using Map.update has few inefficiencies —both initial argument and update function are created, which is unnecessary and it also involves anonymous update function call.

In graph this solution is called “nested map, manual optimized”.

In this case I decided to manually unroll/inline all operations in code:

top_level_key_list |> Enum.reduce(%{}, fn(top_key, data) ->
nested_key_list |> Enum.reduce(data, fn(nested_key, data) ->
case data do
%{^top_key => list} ->
Map.put data, top_key, [{nested_key, 1} | list]
_ ->
Map.put data, top_key, [{nested_key, 1}]
end
end) |> Map.update!(top_key, &(:maps.from_list &1))
end)

Updates and data access

For simulating data update I’ve decided to update half of the keys in the map. For simulating access I’ve decided to check half of the number of the keys in the map, but also replace 50% with keys which do not exist in the map, to have realistic cache hit/miss cases.

Benchmarking results

Creation time with 8 element nested map

Creation time with 32 element nested map

Creation time with 256 element nested map

Creation (8 element nested map) + read access only, no updates

Creation (8 element nested map) + updates + read access

Manual optimization vs Optimized nested map build

Conclusion

  • For purpose of building nested maps just use most obvious Enum.reduce/Map.update solution.
  • Keeping all objects in single map is bad idea, as updates are more costly and creation/access time is practically the same. Manually optimized nested map beats them all :)
  • Use case instead of Map.fetch because latter is slower.
  • If you really want get extra 10–15% performance — write your own map creation code.
  • ETS lags behind on creation and access, but surprisingly comes close to nested map implementations if there are many updates.

About me

I’m Gaspar Chilingarov . I facilitate DevOps transition, help moving legacy applications to cloud and write high-performance Elixir apps.

Need help with your Elixir app or want prototype your next microservice in Elixir? DM me on Twitter or Github.

You can connect with me on Twitter, Facebook, LinkedIn and GitHub.

Found this post useful? Kindly tap the ❤ button below! :) Let’s spread word about Elixir.

--

--

Gaspar Chilingarov
Learn Elixir

I facilitate DevOps transition, help moving legacy applications to the cloud and write high-performance Elixir apps.