Thoughts on Push vs Pull Architectures

Apr 1, 2018

I’ve had a few discussions with people lately about the advantages and disadvantages of different service architectures (queuing, REST, RPC, etc.) in a distributed system, and I wanted to get my thoughts into a post both to help clarify my thinking and to get feedback from the wider world.

What I mean by “push” and “pull”

When creating a distributed system, there are two basic architecture patterns you can follow (and they can be mixed in the same system, if you choose).

First, what I call “push” architectures. This is when a client requests work from a server — the work is “pushed” to the server, which has no choice in the matter. This is probably the most common pattern, and the most common examples are requests to a REST API or RPC calls using some sort of RPC system (gRPC being a current favorite).

Next, what I call “pull” architectures. This is when the server requests work, usually not directly from the client, but often through an intermediary. The most common approach here is when there is some kind of work queue that clients enqueue messages on, and the server pulls messages from that queue. One important caveat is that when an intermediate queue service is used, some of the complexities of “push” architectures can leak into these systems in determining where to enqueue a message, as there may be multiple queue instances.

Two types of “pull” architectures — one where servers request work directly, and one where a message queue is used as an intermediary

In a way, it seems like the choice is irrelevant — nothing more than an implementation detail — but there are actually some important trade-offs, which I’d like to go into here.

Push Advantages

The “push” approach tends to be almost a default for several reasons. First, for request-response operations, it usually involves a single connection over which both the request and the response travel. Whether this is an HTTP request/response or some sort of RPC call, there is at least the illusion of one connection that both use (there may be load balancers and other transparent systems in between). Matching the response to the request is as simple as seeing what response comes back over the same connection.

Especially when using HTTP, just about every language in modern use has both a good HTTP client and a good HTTP server, either in its standard library or in well-known, heavily-used external libraries. We have so much experience with HTTP that inserting a load balancer to allow several server instances is a pretty easy exercise.

If we want routing more complicated than blindly sending requests to a load balancer, we can make some decisions on the client side, which gives us a certain distributed ability to make locally-optimal decisions. For example, as a client, I may send more requests to servers that respond faster or have a lower error rate. I may also know the difference between “close” server processes (on the same host or in the same datacenter) versus “far” processes (in a different datacenter in the same region, or in a totally different region), and use that to limit myself to the “closest” healthy server processes.
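To make that a little more concrete, here is a minimal Python sketch of client-side selection. Everything in it is an assumption for illustration: the `Backend` class, the numeric `distance` scale, the 0.5 error-rate cutoff, and the smoothing factor are all made up, and it simply picks the single fastest of the nearest healthy servers rather than spreading requests by weight as a real client probably would.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    address: str                 # hypothetical address of a server process
    distance: int                # assumed scale: 0 = same host ... 3 = other region
    avg_latency_ms: float = 0.0  # smoothed latency observed by this client
    requests: int = 0
    errors: int = 0

    def record(self, latency_ms: float, ok: bool) -> None:
        """Update local statistics after each request."""
        self.requests += 1
        if not ok:
            self.errors += 1
        if self.requests == 1:
            self.avg_latency_ms = latency_ms
        else:
            # Exponentially weighted moving average; 0.2 is an arbitrary choice.
            self.avg_latency_ms = 0.8 * self.avg_latency_ms + 0.2 * latency_ms

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def pick_backend(backends, max_error_rate=0.5):
    """Pick the fastest backend among the closest healthy ones."""
    healthy = [b for b in backends if b.error_rate <= max_error_rate]
    if not healthy:
        raise RuntimeError("no healthy backends")
    nearest = min(b.distance for b in healthy)
    candidates = [b for b in healthy if b.distance == nearest]
    return min(candidates, key=lambda b: b.avg_latency_ms)
```

Note that every client has to carry this bookkeeping itself, which is part of what makes smarter push clients heavier.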

Push Disadvantages

The first disadvantage of a push system is that you need to know where to send your request. You need a hard-coded list of addresses, a set of load balancer endpoints, DNS names that can be used with A or SRV lookups, or a full service discovery system such as Consul, Eureka, or ZooKeeper.

Another issue is load balancing. A naive approach (such as round-robin or random) works fine if the system is lightly loaded compared to its capacity and the back-ends are mostly homogeneous, but those approaches can be brutal when the system is heavily loaded, the requests take different amounts of time to process, or the back-ends are not comparable in performance. More complicated approaches make the client heavier and heavier, which is especially an issue if you have clients written in different languages and need that logic implemented for all of them.
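A toy calculation shows how round-robin can go wrong with mismatched back-ends. This is only a sketch under made-up assumptions: request `costs` and back-end `speeds` are abstract numbers, and the "least loaded" policy assumes the client somehow knows how much work each back-end already has queued, which is exactly the kind of knowledge that makes a client heavier.

```python
from itertools import cycle


def round_robin(costs, speeds):
    """Deal requests out in turn; return total busy time per back-end."""
    busy = [0.0] * len(speeds)
    for cost, i in zip(costs, cycle(range(len(speeds)))):
        busy[i] += cost / speeds[i]
    return busy


def least_loaded(costs, speeds):
    """Send each request to the back-end with the least queued work."""
    busy = [0.0] * len(speeds)
    for cost in costs:
        i = min(range(len(speeds)), key=lambda j: busy[j])
        busy[i] += cost / speeds[i]
    return busy


# Twelve equal requests, one slow back-end and one that is four times faster.
costs = [1.0] * 12
speeds = [1.0, 4.0]
rr = round_robin(costs, speeds)
ll = least_loaded(costs, speeds)
print("round-robin busy time per back-end:", rr)
print("least-loaded busy time per back-end:", ll)
```

With round-robin, the slow back-end gets half of the requests and becomes the bottleneck while the fast one sits idle; the load-aware policy finishes the whole batch sooner.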

Pull Advantages

The biggest advantage of pull systems is that they distribute work to those that can process it. Server/worker processes only request work if they have capacity, and if one gets blocked for some reason, work will naturally go to processes that are still functioning properly. This generally maximizes the throughput you get from a given amount of computing resources. If some kind of queueing system is used as an intermediary, we also don’t have to worry about overloading the back-ends: the queue system can hold messages until some server believes it can take the message. A push system may need to add server logic to avoid processing too many requests at once, and may have limited ability to avoid facing resource exhaustion from too many clients connecting.
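Here is a small in-process sketch of that behavior, using Python's standard `queue` and `threading` modules as a stand-in for a real queue service. The worker names and delays are invented for the example; the point is only that each worker pulls its next job when it is free, so the slow one never accumulates a backlog.

```python
import queue
import threading
import time


def worker(name, jobs, delay, results):
    """Pull jobs one at a time; a worker only asks for more when it is free."""
    while True:
        job = jobs.get()          # blocks until work is available
        if job is None:           # sentinel value: shut down
            jobs.task_done()
            return
        time.sleep(delay)         # stand-in for real processing time
        results.put((name, job))
        jobs.task_done()


def run(num_jobs=20):
    jobs = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=("fast", jobs, 0.001, results)),
        threading.Thread(target=worker, args=("slow", jobs, 0.1, results)),
    ]
    for w in workers:
        w.start()
    for j in range(num_jobs):
        jobs.put(j)
    jobs.join()                   # wait until every job has been processed
    for _ in workers:
        jobs.put(None)            # one sentinel per worker
    for w in workers:
        w.join()
    return [results.get() for _ in range(num_jobs)]


if __name__ == "__main__":
    done = run()
    for name in ("fast", "slow"):
        print(name, sum(1 for n, _ in done if n == name))
```

Nothing in the producer knows which worker is slow; the imbalance in who processes what falls out of the pull model on its own.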

If an intermediary is used, the clients and servers don’t need to know anything about each other. They only need to know how to reach the intermediary/intermediaries, which may be a much simpler problem. The need for service discovery is either simplified or eliminated. The intermediary process can also do a little work — for example, it could fork messages to two different servers.

Pull Disadvantages

The biggest problem with pull systems is probably the request/response communication pattern. It is easy enough to send the request, but how do you get the response? There are several approaches, but they all feel like hacks: declare a one-time-use queue or mailbox for responses, have the client run an HTTP server and send its address along with the request so the response can be delivered as a REST request, or broadcast all responses to all clients and have each client pay attention only to the responses it is waiting on.
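The one-time reply queue approach can be sketched in-process like this. It is an illustration only: the message fields (`payload`, `reply_to`, `correlation_id`) are names I picked, and with a real broker `reply_to` would typically be the name of a temporary queue rather than a Python object passed inside the message.

```python
import queue
import threading
import uuid


def server(requests):
    """Pull requests and send each response to the mailbox named in the request."""
    while True:
        msg = requests.get()
        if msg is None:                       # sentinel value: shut down
            return
        result = msg["payload"].upper()       # stand-in for real work
        msg["reply_to"].put(
            {"correlation_id": msg["correlation_id"], "result": result}
        )


def call(requests, payload, timeout=1.0):
    """Send a request and wait for the matching response."""
    reply_to = queue.Queue(maxsize=1)         # one-time-use mailbox
    correlation_id = str(uuid.uuid4())
    requests.put(
        {"payload": payload, "reply_to": reply_to, "correlation_id": correlation_id}
    )
    reply = reply_to.get(timeout=timeout)     # raises queue.Empty on timeout
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("response does not match request")
    return reply["result"]


if __name__ == "__main__":
    requests = queue.Queue()
    t = threading.Thread(target=server, args=(requests,))
    t.start()
    print(call(requests, "ping"))
    requests.put(None)
    t.join()
```

Even in this tiny form, the client has to manage a mailbox, a correlation ID, and a timeout, all of which a plain HTTP or RPC call gives you for free.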

Another problem with pull is that if you want complicated routing, the complexity is likely to get as bad as or worse than with push. For example, routing to the “nearest” servers may involve posting messages to a queue named for the location of the client, with the servers using some complicated logic to pull preferentially from “local” queues (though you probably want them to also be able to pull from other queues if the servers in those areas are overwhelmed or non-functional for some reason).
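The simplest version of "prefer local, fall back to remote" might look like the sketch below. The region names and the `pull_next` helper are hypothetical, and a real worker would also need to decide how long to wait on the local queue before stealing remote work, which is where the complexity starts to pile up.

```python
import queue


def pull_next(local_region, queues, fallback_order):
    """Try the local queue first, then the other regions in the given order.

    Returns (region, message), or None if every queue is empty.
    """
    order = [local_region] + [r for r in fallback_order if r != local_region]
    for region in order:
        try:
            return region, queues[region].get_nowait()
        except queue.Empty:
            continue
    return None


if __name__ == "__main__":
    # Hypothetical per-region queues, named for where the client enqueued.
    queues = {"us-east": queue.Queue(), "eu-west": queue.Queue()}
    queues["us-east"].put("a")
    queues["eu-west"].put("b")
    for _ in range(3):
        print(pull_next("eu-west", queues, ["us-east", "eu-west"]))
```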

Service Meshes

It’s worth noting that service meshes are starting to get popular now and resolve some of the problems of the push architecture. With a service mesh, like Istio or Linkerd, a lot of the client complexity can live in a sidecar process. This means the clients just need to know how to make a basic request, and the service mesh can handle routing, service discovery, retries, circuit breakers, instrumentation, and more. I don’t have enough experience with this approach yet to tell how great service meshes are, but it’s an area to keep an eye on for sure.

Wrapping It Up

In the end, there are a bunch of trade-offs between the different approaches, and the right one likely depends on the application and how comfortable you are with the different patterns. I know I’m thinking about multi-datacenter applications right now, and the differences between push and pull architectures in that environment are very much on my mind.

As always, I’m glad to hear any feedback!