Benchmarking avro and fastavro using pytest-benchmark, tox and matplotlib

Abrar Sheikh
4 min readMay 22, 2018


Ever wondered how you can benchmark a block of Python code to see how fast it runs, how to choose between multiple libraries that do the same thing when you're not sure which one is faster, or which Python interpreter is most performant for your application? Fantastic, you have come to the right place. In this blog, I will introduce you to a set of tools that can help you achieve exactly that.

I work on the distributed systems team at Yelp, which is responsible for building the streaming data pipeline systems that deliver real-time data to our sales, analytics, and data science teams. Any observations or claims that I make here are my own and by no means represent Yelp Engineering.

I am going to take 2 Python implementations of Apache Avro (avro and fastavro), compare their encoders and decoders against each other in terms of performance on various Python interpreters, and finally draw conclusions as to what works best.

Let's begin…

The following code uses pytest-benchmark, an extension for pytest.

  • Ensure that pypy, pypy3, py27, py35 and py36 are installed on your operating system before running tox.
  • tox is responsible for running pytest once for each Python interpreter. This is declared in the envlist variable of tox.ini.
  • Running pytest with -m "benchmark" runs only the benchmark tests.
  • Passing --benchmark-json=.benchmark-{basepython} to pytest tells pytest-benchmark to write the benchmark results to a file. Later I will use these files to compare results. For each run of tox, {basepython} evaluates to the version of the Python interpreter used in that run.
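The setup described above can be sketched as a tox configuration roughly like the following. This is a minimal sketch, not the post's actual file; the tests/ path and the dependency list are assumptions:

```ini
; tox.ini -- minimal sketch; test path and deps are assumptions
[tox]
envlist = pypy,pypy3,py27,py35,py36

[testenv]
deps =
    pytest
    pytest-benchmark
    avro
    fastavro
commands =
    pytest -m "benchmark" --benchmark-json=.benchmark-{basepython} tests/
```

Running `tox` then executes the benchmark suite once per interpreter in envlist, leaving one .benchmark-* JSON file per run.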

By default, pytest-benchmark displays the results on the console along with the rest of the pytest test suite output. It looks something like this:

[Benchmark result tables for pypy, pypy3, py27, py35 and py36]
  • There is a lot of information in those tables, but the column that truly captures the pulse of performance is OPS (Kops/s), which indicates the number of operations (in thousands) performed each second.
  • benchmark.pedantic gives control over rounds, iterations and warmup_rounds.
  • I have categorized my results into 2 logical groups for ease of interpretation and comparison. This is achieved by the use of @pytest.mark.benchmark(group='...').
  • The encoders group lists the benchmark results for the fastavro schemaless_writer and the avro writer.
  • The decoders group lists the benchmark results for the fastavro schemaless_reader and the avro reader.

This is fine for starters, but it gets tedious when looking at 5 such sets of groups, one for each Python interpreter, and drawing conclusions about which library performs better across Python versions is not obvious from them. So I wrote a simple Python script that plots the results from the .benchmark-{basepython}* files to do this job.

Plotting results from pytest-benchmark. Code inspired from
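A plotting script along these lines does the job. This is my own sketch rather than the original script; the file-name glob, labels and output file are assumptions, while the JSON fields follow pytest-benchmark's output format:

```python
# plot_benchmarks.py -- sketch of a results plotter; file names, labels
# and figure layout are assumptions, not the author's original script.
import glob
import json
from collections import defaultdict


def load_ops(path):
    """Extract {(group, test_name): ops} from one pytest-benchmark JSON file."""
    with open(path) as f:
        data = json.load(f)
    return {
        (b["group"], b["name"]): b["stats"]["ops"]
        for b in data["benchmarks"]
    }


def plot_all(pattern=".benchmark-*"):
    # imported lazily so load_ops stays usable without matplotlib installed
    import matplotlib.pyplot as plt

    # results[group][test_name] -> list of (interpreter, ops) points
    results = defaultdict(lambda: defaultdict(list))
    for path in sorted(glob.glob(pattern)):
        interpreter = path.split(".benchmark-")[-1]
        for (group, name), ops in load_ops(path).items():
            results[group][name].append((interpreter, ops))

    fig, axes = plt.subplots(1, len(results),
                             figsize=(6 * len(results), 4), squeeze=False)
    for ax, (group, tests) in zip(axes[0], sorted(results.items())):
        for name, points in sorted(tests.items()):
            labels, ops = zip(*points)
            ax.plot(labels, ops, marker="o", label=name)
        ax.set_title(group)
        ax.set_ylabel("operations / second")
        ax.legend()
    fig.savefig("benchmarks.png")


if __name__ == "__main__":
    plot_all()
```

With the five JSON files in place, this produces one panel per group, with one line per library plotted across the interpreters.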


  • fastavro is many times faster than avro, at least on all the CPython interpreters.
  • The general performance of pypy3 is much better than that of all the other Python interpreters.
  • avro is much faster than fastavro on pypy, which is really surprising.

If you are building a Python application at scale that serializes and deserializes Avro, then fastavro running on pypy3 is the most performant combination.


I want to give a shout-out to Scott Belden and Ryan for their insights, which resulted in me writing this blog.