Elixir GenServer call vs cast
GenServer.call is not just for returning a response
I have been working on a side project using Elixir with a friend who is new to the language. He has some functional programming experience, but none using the Erlang/Elixir actor model. It was tricky at times to explain the fact that a single actor in this world can do only one thing at a given time even though you often hear about the BEAM VM and its abilities to handle concurrency. The reality is that multiple actors are often needed to get things done.
Somewhat related to this are the concepts of cast and call in a GenServer. At first glance, when building a system on OTP fundamentals, it is easy to think the following:
It says in the docs that cast is asynchronous. Since I do not need a reply in this situation, the cast will be better than a call. Asynchronous code seems cool. Synchronous code doesn’t seem cool. I want to be cool.
Unfortunately, treating a cast as the superior option can turn into a nasty problem as you scale out. I’ve found it much safer to start with a call, which lets me process heavy workloads at a consistent rate without worrying about overloading parts of my system. Having said that, a cast is perfectly fine to use in the right scenario. However, if your workload fluctuates heavily and can spike at times, it is best to keep things under control with a call, or some other form of backpressure, so that clients cannot overload parts of your system.
The official learning guide on the elixir-lang website has a small section on the same topic. Regarding the usage of a call, that documentation says:
This (call) should be the default choice as waiting for the server reply is a useful backpressure mechanism.
TL;DR: The decision between cast and call is about more than whether you need a reply. When in doubt, use a call.
Hopefully the silly analogy below sheds some light on this subject.
This is Bob. Bob is a GenServer.
Bob really just likes two things in his little world.
He enjoys doing chores and checking his mailbox for new chores to do.
One chore Bob is really good at is doing dishes.
This is Mary. Mary hates doing chores.
She really dislikes doing dishes.
Mary has a lot of other things to deal with and is looking to offload her dirty dishes. She heard about Bob and his chore service.
Whenever Mary needs dishes done, she sends a message to Bob’s chore service mailbox.
She is going to send the message to Bob using a call, which lets Bob know that Mary is expecting a message when the work is done.
Mary now will sit and wait for Bob’s reply. Mary can do nothing else until that plate comes back clean.
When Bob receives the message telling him to wash a dish, he gets started right away.
Until that point, he’s been standing at his mailbox just waiting for the chance to wash some dishes.
He really loves it that much.
He will not do anything else until he is done washing.
If a new chore message arrives in Bob’s mailbox while he is doing dishes for Mary, it will just have to wait until he is finished washing.
The only real way Bob could expand his chore operation is to hire more people like himself. But for now, he is fine handling the workload.
Mary is currently his only customer.
When Bob finishes a chore for Mary, he will send a message to her mailbox telling her it is done. This is because she sent the message using a call. This tells Bob to reply to Mary.
When Mary knows the dish is done, she can move on with her day. The call blocks Mary from doing anything else, including adding more work to Bob.
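In code, this exchange might look like the minimal sketch below. The module name `CallBob`, the `wash/2` helper, and the 100 ms sleep are stand-ins I made up for the story; `GenServer.call/2` and the `handle_call/3` callback are the real GenServer pieces.

```elixir
defmodule CallBob do
  use GenServer

  # Client side: this is what Mary runs.
  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  # Blocks Mary until Bob replies (or until the default 5-second timeout).
  def wash(bob, dish) do
    GenServer.call(bob, {:wash, dish})
  end

  # Server side: this is what Bob runs, one message at a time.
  @impl true
  def init(:ok), do: {:ok, 0}

  @impl true
  def handle_call({:wash, dish}, _from, washed_count) do
    # Pretend that washing takes a while (hypothetical duration).
    Process.sleep(100)
    # The :reply tuple is what sends the answer back to Mary.
    {:reply, {:clean, dish}, washed_count + 1}
  end
end

{:ok, bob} = CallBob.start_link()

# Mary is stuck on this line until the plate comes back clean.
{:clean, :plate} = CallBob.wash(bob, :plate)
```

Nothing after the `CallBob.wash/2` line runs in Mary's process until Bob has replied, which is exactly the waiting described above.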
Bob is a great dishwasher and Mary really enjoys using his service, but there are days where the load of dishes is pretty big.
Mary has her own work to do and cannot afford to wait for Bob to finish each dish before she can get to her own work.
She really needs to tell Bob to wash a lot of things and still be able to get her own work done. She can send the chore messages to Bob using a cast to make her life easier.
This allows Mary to move on to other things while Bob does the dishes.
The cast is non-blocking for Mary only. Bob still cannot be interrupted while washing dishes.
Bob still receives the messages in his mailbox the same way as before when she uses cast, but he knows not to bother telling Mary the dish is clean each time. He just goes to his mailbox when he is done and looks to see if there are more dishes to be washed.
Mary trusts Bob just to get it done and that works out great for a while.
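A rough sketch of the cast version is below. Again, `CastBob`, `wash/2`, `washed/1`, and the 50 ms sleep are names and numbers I invented for illustration. The `washed/1` query is a call, included only so we can peek at Bob's progress.

```elixir
defmodule CastBob do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Returns :ok right away; Mary does not wait for the dish.
  def wash(bob, dish), do: GenServer.cast(bob, {:wash, dish})

  # A synchronous query, used here only to check what Bob has finished.
  def washed(bob), do: GenServer.call(bob, :washed)

  @impl true
  def init(:ok), do: {:ok, []}

  @impl true
  def handle_cast({:wash, dish}, washed) do
    # Bob is still busy for the whole wash (hypothetical duration).
    Process.sleep(50)
    # :noreply, because nobody is waiting for an answer.
    {:noreply, [dish | washed]}
  end

  @impl true
  def handle_call(:washed, _from, washed) do
    {:reply, Enum.reverse(washed), washed}
  end
end

{:ok, bob} = CastBob.start_link()

# Both lines return immediately, so Mary can get on with her day.
:ok = CastBob.wash(bob, :plate)
:ok = CastBob.wash(bob, :fork)

# Messages from one sender are handled in order, so this call is only
# answered after Bob has worked through both dishes.
[:plate, :fork] = CastBob.washed(bob)
```

Note that `GenServer.cast/2` returns `:ok` no matter how busy Bob is. That return value says the message was sent, not that the dish is clean.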
Unbeknownst to Mary, Bob has managed to get himself in a little bit of debt and his friends suggested he pick up additional clients for his chore service.
All of the new clients are sending chore messages using cast rather than call. Like Mary, they do not want to wait until Bob is done.
The only issue is that Bob has accepted more work than he can handle in a given day and the workload shows no sign of letting up.
Mary has no idea Bob is struggling. She trusts Bob. Mary continues sending chore messages to Bob.
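You can watch this backlog build up by asking the VM how many messages are sitting in Bob's mailbox. The sketch below assumes a made-up `SlowBob` module that needs 20 ms per dish, and ten imaginary clients who each fire off 100 casts; `Process.info/2` with `:message_queue_len` is the real call for inspecting a mailbox.

```elixir
defmodule SlowBob do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, 0}

  @impl true
  def handle_cast({:wash, _dish}, washed_count) do
    # Hypothetical washing time per dish.
    Process.sleep(20)
    {:noreply, washed_count + 1}
  end
end

{:ok, bob} = SlowBob.start_link()

# Ten clients, 100 dishes each, nobody waiting for anything.
for _client <- 1..10, dish <- 1..100 do
  # Every cast "succeeds" from the sender's point of view.
  :ok = GenServer.cast(bob, {:wash, dish})
end

# Ask the VM how many chores are still sitting unread in Bob's mailbox.
{:message_queue_len, backlog} = Process.info(bob, :message_queue_len)
IO.puts("dishes waiting in Bob's mailbox: #{backlog}")
```

Sending is far cheaper than washing, so nearly all of the dishes are still queued when the loop finishes, and none of the senders has any way to notice.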
After a day or two, Mary starts to get irritated that Bob hasn’t finished all of her dishes yet. She is planning a dinner with friends and has no forks.
(Had she used a standard GenServer.call, it would have timed out long before this, after 5 seconds by default, but just pretend with me.)
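For the curious, here is roughly what that timeout looks like. This is only a sketch: `SleepyBob`, the 200 ms wash, and the 50 ms patience are invented so the example finishes quickly. The third argument to `GenServer.call/3` is the real timeout parameter, in milliseconds.

```elixir
defmodule SleepyBob do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:wash, dish}, _from, state) do
    # Bob takes longer than Mary is willing to wait (hypothetical duration).
    Process.sleep(200)
    {:reply, {:clean, dish}, state}
  end
end

{:ok, bob} = SleepyBob.start_link()

result =
  try do
    # Mary only waits 50 ms instead of the default 5_000 ms.
    GenServer.call(bob, {:wash, :fork}, 50)
  catch
    # A timed-out call exits the caller; here Mary catches it and moves on.
    :exit, {:timeout, _details} -> :gave_up_waiting
  end

:gave_up_waiting = result
```

The timeout only frees Mary. Bob does not know she gave up and will still finish washing the fork.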
Bob is still working 24 hours a day, but he cannot keep up with the rate at which he is receiving messages. He is really trying his best, but this is not going to work for his clients.
Mary stops sending chores to Bob out of frustration.
A couple days later, Bob finally gets caught up with the workload.
He and his customers get together and realize they are all to blame for the slowdown. They all agree the best way to handle this situation, for now, is to send messages to Bob using a call and just wait for each dish to get done.
Using this approach, Bob is able to provide his service to all of his clients. While he washes a dish for Mary, for example, she cannot add any more dishes to his queue, which gives other clients a fair shot at getting a dish washed once he is done with hers. If Mary needs another dish washed, she can send a message after Bob completes her current request.
Think of it this way. If Bob has ten clients and they all rely on a call to interact with him, then at peak Bob can have at most ten dishwashing messages to deal with: the one he is working on and up to nine waiting in his mailbox. This is because after a client sends a call to Bob, they can do nothing else until Bob replies that the dish is done. If we used a cast instead, that number could grow as large as the VM's memory allows, potentially bringing down the entire system.
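The sketch below puts that bound to the test with ten client processes that each wash five dishes, one call at a time. `FairBob`, the 10 ms wash, and the 25 ms pause before peeking at the mailbox are all assumptions of mine, chosen to keep the example short.

```elixir
defmodule FairBob do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def wash(bob, dish), do: GenServer.call(bob, {:wash, dish})

  @impl true
  def init(:ok), do: {:ok, 0}

  @impl true
  def handle_call({:wash, dish}, _from, washed_count) do
    # Hypothetical washing time per dish.
    Process.sleep(10)
    {:reply, {:clean, dish}, washed_count + 1}
  end
end

{:ok, bob} = FairBob.start_link()

# Ten clients; each must wait for a reply before sending the next dish.
clients =
  for client <- 1..10 do
    Task.async(fn ->
      for dish <- 1..5, do: FairBob.wash(bob, {client, dish})
    end)
  end

# Peek at the mailbox while everyone is busy.
Process.sleep(25)
{:message_queue_len, waiting} = Process.info(bob, :message_queue_len)
# Each client has at most one request outstanding, so this is at most 10.
IO.puts("requests waiting in Bob's mailbox: #{waiting}")

# Wait for every client to get all of their dishes back.
results = Enum.map(clients, &Task.await/1)
```

However eager the clients are, the mailbox cannot hold more than one request per client, because each client is parked inside its own call.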
Everyone is mostly happy now, but there are times that Mary still gets sick of waiting on Bob when her message may end up fifth in the mailbox because he is doing work for other clients. But at least she knows now that she can expect her dish to be cleaned within a reasonable amount of time rather than it possibly taking hours or days.
The long-term fix here is to encourage Bob to hire some friends to help in his chore business. Organized under a supervision tree, they could share the workload.
If Bob manages to hire nine other friends and keeps his ten clients, he would know that a client would never have to wait on more than just their dish to be washed. That would probably make his clients happy.
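One way this could look is sketched below: ten identical workers under a supervisor, with each client always routed to the same worker. Everything here is an assumption made for illustration, including the `CrewMember` and `ChoreCrew` names, the integer client ids, and the simple `rem/2` routing. Real pools are often built differently.

```elixir
defmodule CrewMember do
  use GenServer

  # Each worker registers under its own name so clients can find it.
  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  def wash(name, dish), do: GenServer.call(name, {:wash, dish})

  @impl true
  def init(:ok), do: {:ok, 0}

  @impl true
  def handle_call({:wash, dish}, _from, washed_count) do
    {:reply, {:clean, dish}, washed_count + 1}
  end
end

defmodule ChoreCrew do
  use Supervisor

  # Bob plus nine friends (hypothetical crew size).
  @size 10

  def start_link(_opts \\ []) do
    Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  # Route a client (assumed to be a non-negative integer id) to "their" worker.
  def wash(client_id, dish) do
    CrewMember.wash(worker_name(rem(client_id, @size)), dish)
  end

  @impl true
  def init(:ok) do
    children =
      for i <- 0..(@size - 1) do
        # Same module for every worker, so each child needs a unique id.
        Supervisor.child_spec({CrewMember, worker_name(i)}, id: worker_name(i))
      end

    # If one worker crashes, only that worker is restarted.
    Supervisor.init(children, strategy: :one_for_one)
  end

  defp worker_name(i), do: :"crew_member_#{i}"
end

{:ok, _supervisor} = ChoreCrew.start_link()
{:clean, :fork} = ChoreCrew.wash(3, :fork)
```

With ten clients and ten workers, each client still uses a call, but the only dish they ever wait on is their own.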