Speed up data access in Elixir
Working with data structures was straightforward and predictable in a days of Erlang. Most of the time you were getting exactly what you expect.
Elixir is a different beast. It allows more ways to work with maps, lists and keyword lists. Turns out, performance-wise some of them are radically different than others.
First, some nice graphs of access time depending on structure size and then comes explanation. Raw data and benchmarking code is in the end of the article.
Accessing Maps
There are a few ways to access maps in Elixir.
* Using pattern matching (the fastest)
# %{^key => value} = map
map = %{foo: 1, bar: 2}
%{^:foo => foo_value} = map
Using pattern matching in case
or in function’s clause head is as fast.
* Using Map.fetch
(slower)
Map.fetch
is slower because it not just matches the value, but also creates extra tuple.
map = %{foo: 1, bar: 2}
{:ok, foo_value} = Map.fetch map, :foo
* Using Access
module (much slower)
map = %{foo: 1, bar: 2}
foo_value = map[:foo]
This is the most obscure and non-intuitive behavior in Elixir. If you coming from most other programming languages, you will immediately recognize []
notion as key/index access and most probably use it.
In Elixir []
allow you to do much more, for example addressing and updating all elements in map — I’ll show that in next article. But this comes with huge cost. []
is more than two times slower than pattern matching.
Accessing keyword lists
Erlang and Elixir, in turn, use lists with names tuples to pass around options. Before introduction of maps in Erlang it was only feasible way to have data structure with dynamic properties.
keyword_list = [ {:foo, 1}, {:bar, 2} ]
# or compact form
keyword_list = [ foo: 1, bar: 2 ]
Because this is a list, accessing elements in it requires scan to find necessary element. So access speed is O(n) and you do not expected really use keyword lists with more than couple of elements.
* Using custom functions to speed-up lookups (fastest)
If you know that keyword list you will be using will be limited to 3–4 values, you may want to have own faster lookup function to get values from the keyword list.
def kw_list_access(kw_list, key, default\\nil) do
case kw_list do
[{h, v}|_] -> v
[_, {h, v}|_] -> v
[_, _, {h, v}|_] -> v
[_, _, _, {h, v}|_] -> v
_ -> Keyword.get kw_list, key, default
end
end
* Using Keyword.get
(slow)
This is about 2 times slower than using custom function.
keyword_list = [ foo: 1, bar: 2 ]
foo_value = Keyword.get keyword_list, :foo
You can also supply default value to Keyword.get list, :foo, :default
in case if key is not found in the list.
* Using Access
module (slowest)
The same story as with maps. Access
module is great to provide extra functionality, but it 3.5 times slower compared to custom function and 1.3 times slower compared to Keyword.get
.
keyword_list = [ foo: 1, bar: 2 ]
foo_value = keyword_list[:foo]
Tuples
I will cover tuple access vs maps and lists in other article. Tuples are blazingly fast, but little bit less flexible.
How to benchmark
Clone repository from my github and run
mix bench bench/access_vs_map_fetch_vs_match.exs
Warning: If you are using any modern computer, your OS constantly changes processor speed in order to save energy and keep it cool and quiet. This is great to every day work, but terrible for benchmarks. Before running benchmarks manually set processor to some fixed speed. For example Debian instruction is here.
# show current settings
grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
Number in test name means how much elements were in the tested structure (1,2,3, …). Uncomment other lines for more elements in map/kwlist/etc.
baseline
is just function call without any processing done at all, just to see how much function call costs forbenchfella
.match
is test of%{^key => value}
lookup in mapmap_fetch
is a call toMap.fetch
for map :)kw_get_unroll
- manually unrolled function for lookup in keyword lists up to 4 elements big, then switch to Keyword.getkw.get
- is a call toKeyword.get
on keyword listaccess_kw
is a 'kw_list[index]` callaccess_map
is a 'map[index]` call
So you can see that access_kw1
is at least 2 times slower than manual kw_get_unroll1
or match1
.
I know about flaws of microbenchmarks, but I also did optimization in real code based on this input and had significant speed ups. Access
module is nice, but (very) slow. case map
or matching in function head are the fastest.
Raw data
## AccessVsMapFetchVsMatch
benchmark name iterations average time
baseline2 100000000 0.04 µs/op
baseline128 100000000 0.04 µs/op
baseline8 100000000 0.04 µs/op
baseline4 100000000 0.04 µs/op
baseline3 100000000 0.04 µs/op
baseline1 100000000 0.04 µs/op
baseline16 100000000 0.04 µs/op
baseline32 100000000 0.04 µs/op
baseline64 100000000 0.04 µs/op
kw_get_unroll1 100000000 0.07 µs/op
match1 100000000 0.07 µs/op
map_fetch1 100000000 0.09 µs/op
match2 10000000 0.11 µs/op
kw_get_unroll2 10000000 0.11 µs/op
kw_get1 10000000 0.13 µs/op
kw_get_unroll3 10000000 0.14 µs/op
match3 10000000 0.14 µs/op
access_map1 10000000 0.15 µs/op
kw_get_unroll4 10000000 0.16 µs/op
map_fetch2 10000000 0.16 µs/op
access_kw1 10000000 0.17 µs/op
match4 10000000 0.17 µs/op
map_fetch3 10000000 0.22 µs/op
kw_get2 10000000 0.24 µs/op
kw_get_unroll8 10000000 0.26 µs/op
map_fetch4 10000000 0.28 µs/op
access_map2 10000000 0.28 µs/op
kw_get3 10000000 0.32 µs/op
match8 10000000 0.33 µs/op
access_kw2 10000000 0.39 µs/op
access_map3 10000000 0.39 µs/op
kw_get_unroll16 10000000 0.45 µs/op
kw_get4 10000000 0.45 µs/op
access_kw3 10000000 0.46 µs/op
access_map4 10000000 0.51 µs/op
map_fetch8 10000000 0.53 µs/op
access_kw4 10000000 0.71 µs/op
match16 10000000 0.76 µs/op
kw_get_unroll32 10000000 0.85 µs/op
kw_get8 10000000 0.89 µs/op
access_map8 1000000 1.04 µs/op
map_fetch16 1000000 1.14 µs/op
access_kw8 1000000 1.44 µs/op
kw_get_unroll64 1000000 1.66 µs/op
kw_get16 1000000 1.79 µs/op
match32 1000000 2.08 µs/op
access_map16 1000000 2.72 µs/op
access_kw16 500000 2.81 µs/op
map_fetch32 1000000 2.81 µs/op
kw_get_unroll128 500000 3.20 µs/op
kw_get32 500000 4.49 µs/op
match64 500000 5.20 µs/op
access_kw32 500000 6.20 µs/op
map_fetch64 500000 6.66 µs/op
access_map32 500000 6.97 µs/op
match128 200000 8.70 µs/op
map_fetch128 100000 11.49 µs/op
kw_get64 100000 11.52 µs/op
access_map64 100000 12.01 µs/op
access_kw64 100000 15.42 µs/op
access_map128 100000 22.48 µs/op
kw_get128 50000 31.74 µs/op
access_kw128 50000 37.11 µs/op