Faster Elixir Logger part 2
Like with Part 1 you can click on the links throughout the article to get more context.
Coming back to the issue, when I had a free time during a weekend, I setup up trying to test sending messages directly to the process running the console.
The send is just a simple way of sending things to the drv process. The protocol for what you are sending is well documented. The second self() should be a monitor but I hadn’t set up anything to handle that.
So I was looking at a couple of nice libraries for benchmarking the code but first I ran the code and saw that there wasn’t really a difference in getting the code to the console and one call is blocking while my call wasn’t blocking so it didn’t make sense to have to libraries since they are more focused on the runtime and not other things that I was looking for.
So I ended up using /usr/bin/time --verbose since it gives me everything I needed.
Now what to run through the test suite I The Complete Works of William Shakespeare, by the The Project Gutenberg. It was big enough for me to find real difference between the calls and it was fast enough so I wouldn’t be bored with all the test I would have to do :)
I setup a simple Elixir application, so I could make my test reproducible.
First thing I did was just running the entire document right through the commands which didn’t really give interesting results so I ended up splitting it on the new line and running the commands in parallel.
The command I run for the out of the box way of doing it:
The command I used to test the new way of doing it:
/usr/bin/time --verbose mix test test/test_io_test.exs:21
You can run these test yourself and see the difference. The difference I was seeing was less System time, CPU %, and memory.
Given these results I wanted to do a patch for Elixirs Logger since it was worth doing this kind of tricks there and keep the general safe behaviour in IO for now. So this simple patch was the thing I came up with.
I had of course forgotten to include some code to deal with unicode that Jose was quick to spot :)
I ran some tests and made sure I was handling all the results and pushed the changes. It was then merged.
You can test it out for yourself by running these commands and compare the results. You have to use the git version as the comparison and I have the elixir git repo in the same place as the test_io. When everything has showed up I just press Ctrl-c twice, since I haven’t found a better way of running mix from the new repo.
For the unmodified Logger, the version you have on you computer is close enough to the version before modification but if you can always checkout and build d42fbd0ca0753b002637555e61815820a47c1293:
/usr/bin/time --verbose iex -S mix test test/test_io_test.exs:31
For the new Logger:
/usr/bin/time --verbose ../elixir/bin/iex -S mix test test/test_io_test.exs:31
But of course that wasn’t the end of the line there were some messages that were unhandled and some possible issues with overwhelming the GenServer receiving the messages like you can see in the discussion on the pull request.
So what was the main reason why IO is using more cpu. It is the same thing that we were discussing now is that with the system like it was in place we were making sure that the receiving process didn’t get overwhelmed with all the requests so for every request Elixir makes sure that it gets there. It collects the messages in Logger that haven’t gone through yet. So we know if we are doing something wrong. But that comes with a cost, about one and a half times more cpu time.
So in Part 3 I will go over the improvements that both José and James came up with.