Is gRPC the Future of Client-Server Communication?

Bleeding Edge Press
Jul 19, 2018

A logical place to start when answering this question is to first talk about what the letters in gRPC mean. It is an acronym after all:

Remote
Procedure
Calls

This is the programming idiom that gRPC presents to application developers.

So what is the “g” in “gRPC?” This technology was created by Google as an open source evolution of its internal RPC technology named Stubby, and Google continues to be the steward of the official open source project. So the “g” is widely thought to stand for “Google.” Google has tried to make it clear, however, that it wants a community around the project, one that collaborates with and accepts contributions and input from developers outside of Google. To that end, the “g” has never officially meant “Google.”

In fact, a new meaning is assigned to the “g” for each release. In August 2016, with the 1.0 release of gRPC, the acronym stood for “gRPC Remote Procedure Calls.” In the numerous minor releases since then, the “g” has gone through equally numerous redefinitions, including good, green, gambit, glamorous, and glorious. This history of what the “g” stands for is documented in the main gRPC repo on GitHub.

Let’s take a look at what gRPC is, starting with some of the principles on which gRPC is built, and also look at what distinguishes gRPC from other RPC systems, and how it compares to other widely-used technologies.

HTTP and REST

HTTP is the protocol that powers the World Wide Web. It stands for Hypertext Transfer Protocol. It’s a text-based request-response protocol. Because it is text-based, the protocol itself is human-readable (data payloads, which can be binary, may not be). It was originally designed for accessing documents across the Internet, but it has become a ubiquitous protocol, not just for web browsers, but for all kinds of computer-to-computer interactions.

Subsequently, HTTP and “web technology” have become the substrate of choice for many kinds of interactions, thanks to commonplace open source software components and a range of hardware components that make it possible to build and deploy HTTP-based systems with ease and at great scale.

REST, which stands for REpresentational State Transfer, is based on HTTP. It defines constraints and conventions on top of HTTP that are intended to provide global interoperability and the potential for scalability. A key architectural constraint in REST is statelessness: application servers do not store client context in between requests. Any client state needed to process a request is included with each request. This enables systems to grow to very large scale: requests can be load balanced across a large pool of servers, and multiple requests from a single client don’t have to be handled by a single server.

The concepts used to define REST APIs focus on managing “documents” or “entities.” REST’s primitives are the various HTTP methods: GET, PUT, DELETE, POST, PATCH, etc. These methods map, more or less, to CRUD operations on these entities. (CRUD stands for Create, Read, Update, and Delete: the set of actions needed to work with data.) A REST API defines naming conventions so that clients can construct URLs to identify particular documents or entities. It also defines semantics for each of the methods when applied to a particular resource or document. Request and response payloads are used to transmit document contents.

This is a very straightforward model for a service that exposes a database-like or filesystem-like resource: you can use HTTP GET requests to list documents or retrieve document contents, and you can use PUT, PATCH, and DELETE requests to create new documents or to modify or delete existing ones. When a service has more complicated and less document-centric operations, mapping them to REST can be a bit more challenging.
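As a concrete illustration of this mapping, the Go sketch below issues requests against an imaginary `/documents` resource. The host, paths, and payload are hypothetical; the point is simply how the HTTP methods line up with CRUD operations.

```go
package main

import (
	"bytes"
	"log"
	"net/http"
)

func main() {
	base := "https://api.example.com/documents" // hypothetical REST resource

	// Read: GET retrieves an existing document by its identifier.
	resp, err := http.Get(base + "/123")
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	log.Println("GET status:", resp.Status)

	// Create/Update: PUT places a document's contents at a known URL.
	body := bytes.NewBufferString(`{"title": "Updated title"}`)
	req, err := http.NewRequest(http.MethodPut, base+"/123", body)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")
	if resp, err = http.DefaultClient.Do(req); err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	log.Println("PUT status:", resp.Status)

	// Delete: DELETE removes the document.
	req, err = http.NewRequest(http.MethodDelete, base+"/123", nil)
	if err != nil {
		log.Fatal(err)
	}
	if resp, err = http.DefaultClient.Do(req); err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	log.Println("DELETE status:", resp.Status)
}
```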

There are numerous RPC systems that are built on top of REST. Some leverage the full flexibility of REST, such as Swagger and JAX-RS (the Java API for RESTful web services). Swagger is a cross-platform and language-agnostic technology whose IDL, the Swagger spec, describes rules for how a set of operations is represented in HTTP requests and responses. The spec describes the HTTP method, patterns for URI paths, and even the request and response structures for each operation.

Code generation tools exist to create client stubs and server interfaces from a Swagger spec. JAX-RS, on the other hand, is Java-only, and its IDL is Java itself. A Java interface is used to define the operations, with annotations that control how its methods, arguments, and return values are mapped to the HTTP protocol. Servers provide implementations of the interface, and a client stub is a runtime-generated implementation of that interface, which uses reflection to intercept method calls and transform them into outgoing HTTP requests.

Other RPC systems are also based on HTTP. From a certain point of view, they can be seen as narrow subsets of REST: they still adhere to its architectural principles, but impose significant constraints and conventions on resource naming, methods, and the encoding of documents in order to simplify the implementation of both clients and servers. Such systems were created during the web’s “adolescent era,” including XML-RPC and SOAP. As the popularity of XML declined, different approaches to content encoding came into favor, such as JSON-RPC. In fact, gRPC finds itself in this category.

RPC systems

Here we cover the traits typically shared by RPC systems. Some systems provide only a subset of these traits, while mature systems provide them all:

  1. Because it’s RPC, the programming model is of course procedure calls: the networking aspect of the technology is abstracted away from application code, making a remote, out-of-process network call look almost like a normal in-process function call.
  2. There is a way to define the interfaces as the names and signatures of the procedures that can be invoked, along with the data types that are exchanged as arguments and return values. For RPC systems that are language-agnostic (i.e., those that can be used with multiple programming languages), the interface is typically defined in an Interface Definition Language, or IDL for short. IDLs can describe the shape of data and interfaces but cannot express business logic.
  3. The RPC systems often include code generation tools for transforming the interface descriptions into usable libraries. And they include a runtime library that handles the transport protocol details and provides an ABI for the generated code to hook into that transport implementation. Some systems rely more heavily on runtime reflection and less on code generation. And this can even vary from one programming language to another for the various implementations of a single RPC system.
  4. Unlike REST, these systems typically do not expose all of the flexibility of HTTP. Some eschew HTTP completely, opting for a custom binary TCP protocol. Those that do use HTTP as a transport tend to have rigid conventions for mapping RPCs to HTTP requests, which often cannot be customized. The details of what the HTTP request looks like are meant to be an implementation detail encapsulated in the system’s transport implementation.

gRPC

The first thing to note is that the architecture of gRPC is layered:

  • The lowest layer is the transport: gRPC uses HTTP/2 as its transport protocol. HTTP/2 provides the same basic semantics as HTTP 1.1 (the version with which nearly all developers are familiar), but aims to be more efficient and more secure. The new features in HTTP/2 that are most obvious at first glance are (1) that it can multiplex many parallel requests over the same network connection and (2) that it allows full-duplex bidirectional communication. We’ll learn more about HTTP/2 and the ways it differs from and improves on HTTP 1.1 later in the book.
  • The next layer is the channel: This is a thin abstraction over the transport. The channel defines calling conventions and implements the mapping of an RPC onto the underlying transport. At this layer, a gRPC call consists of a client-provided service name and method name, optional request metadata (key-value pairs), and zero or more request messages. A call is completed when the server provides optional response header metadata, zero or more response messages, and response trailer metadata. The trailer metadata indicates the final disposition of the call: whether it was a success or a failure. At this layer, there is no knowledge of interface constraints, data types, or message encoding. A message is just a sequence of zero or more bytes. A call may have any number of request and response messages.
  • The last layer is the stub: The stub layer is where interface constraints and data types are defined. Does a method accept exactly one request message or a stream of request messages? What kind of data is in each response message and how is it encoded? The answers to these questions are provided by the stub. The stub marries the IDL-defined interfaces to a channel. The stub code is generated from the IDL. The channel layer provides the ABI that these generated stubs use. (A short Go sketch of the channel and stub layers follows this list.)
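To make the channel and stub layers concrete, here is a minimal Go sketch. It assumes a hypothetical Greeter service with a SayHello method whose code has already been generated into a package imported here as `pb`; the service name, message types, package path, and server address are all assumptions for illustration, not anything defined by gRPC itself.

```go
package main

import (
	"context"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	pb "example.com/hello/gen" // hypothetical package generated from a .proto file
)

func main() {
	// Channel layer: a ClientConn is gRPC's channel abstraction. It knows how to
	// map calls onto HTTP/2, but knows nothing about message types or encodings.
	conn, err := grpc.Dial("localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("failed to create channel: %v", err)
	}
	defer conn.Close()

	// The channel can be used directly: the caller supplies the fully qualified
	// service/method name plus request and response messages, nothing more.
	req := &pb.HelloRequest{Name: "world"}
	resp := &pb.HelloReply{}
	if err := conn.Invoke(context.Background(), "/hello.Greeter/SayHello", req, resp); err != nil {
		log.Fatalf("channel-level call failed: %v", err)
	}

	// Stub layer: the generated stub wraps the same channel, adding the
	// IDL-defined interface constraints and data types.
	client := pb.NewGreeterClient(conn)
	reply, err := client.SayHello(context.Background(), &pb.HelloRequest{Name: "world"})
	if err != nil {
		log.Fatalf("stub-level call failed: %v", err)
	}
	log.Println(reply.GetMessage())
}
```

The point of the contrast is that the channel-level call deals only in a method name and opaque messages, while the generated stub adds the type-safe, IDL-derived interface on top of the very same channel.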

Protocol Buffers

Another key component of gRPC is a technology called Protocol Buffers. Protocol Buffers, or “protobufs” for short, is an IDL for describing services, methods, and messages. A compiler turns the IDL into generated code for a wide variety of programming languages, and runtime libraries are provided for each of those supported languages.

It is important to note that Protocol Buffers have a role only in the last layer in the list above: the stub. The lower layers of gRPC, the channel and the transport, are IDL-agnostic. This makes it possible to use any IDL with gRPC (though the core gRPC libraries only provide tools for using protobufs). You can even find unofficial open source implementations of the stub layer that use other formats and IDLs, such as FlatBuffers and MessagePack.
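For readers who have not seen protobufs before, here is a rough sketch of the relationship between the IDL and the generated code, using the same hypothetical Greeter service as the earlier sketch. The `.proto` contents appear as a comment for context, and the `pb` package and its types are assumed to have been produced by the protobuf compiler; every name here is illustrative.

```go
package main

/*
A hypothetical hello.proto might look like this:

    syntax = "proto3";
    package hello;

    service Greeter {
      rpc SayHello(HelloRequest) returns (HelloReply);
    }

    message HelloRequest { string name = 1; }
    message HelloReply   { string message = 1; }

Running the protobuf compiler (with the Go and gRPC plugins) over this file
produces message structs, a client stub, and a server interface.
*/

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"

	pb "example.com/hello/gen" // hypothetical generated package
)

func main() {
	// Generated message types are plain structs...
	msg := &pb.HelloRequest{Name: "world"}

	// ...and the runtime library knows how to turn them into the compact
	// protobuf wire format and back again.
	data, err := proto.Marshal(msg)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("encoded %d bytes\n", len(data))

	decoded := &pb.HelloRequest{}
	if err := proto.Unmarshal(data, decoded); err != nil {
		log.Fatal(err)
	}
	fmt.Println("round-tripped name:", decoded.GetName())
}
```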

Other RPC Systems

There are too many other RPC systems to realistically review them all, but the table below enumerates the more well-known systems. Some are older technologies that are no longer widely in use (or they are in use primarily in legacy and enterprise systems). Some are newer technologies, like gRPC. They are listed oldest to newest, showing the history and evolution of RPC.

Some notable innovations of RPC systems over the years are the “distributed object” paradigm, Cap’n Proto’s “promise pipelining,” and gRPC’s streaming.

To describe a distributed object system, we first need to talk about a key constraint in non-distributed-object RPC systems: the operations that a server exposes and the implementations to which they are bound are fixed. When a server starts up, its service implementations are registered to be exposed by the network server. In an OOP paradigm, this amounts to the service implementations being singletons.

By contrast, in a distributed object system, the server can dynamically allocate new objects at runtime whose methods are also exposed via RPC. The existence of these objects can be communicated to clients as RPC return values. A typical RPC result might be a value that is serialized to bytes and transmitted to the client. But with distributed objects, the system might just serialize and transmit “handles,” not the actual values. These handles are then used to create new stubs that are returned to the client application as the RPC result. Calling methods on these stubs in turn issues RPCs, which result in invoking methods on the underlying instances in the server.

Cap’n Proto’s “promise pipelining” allows clients to pipeline many requests without waiting on a server response, even when the arguments to one RPC depend on the result of an earlier one! It does this by allocating a promise “handle” for each RPC in the client. So, even before the server has received the first request, the client has a handle to the first response. It can then refer to that promise in a subsequent request and send that request to the server.

The server is responsible for resolving these handles: once the result for the first RPC is computed, the server machinery can bind it to the handle in the second RPC and then begin computing a result for the second RPC. This feature is an optimization to decrease the latency for multi-step operations, because it can greatly reduce the amount of time waiting on messages to transit the network. gRPC’s streaming is covered below.

Streaming

The closest siblings to gRPC are Thrift, an Apache project, and Twirp, open sourced by Twitch. But neither of these includes support for streaming. gRPC, on the other hand, was designed with streaming in mind from the outset.

Streaming allows a request or response to be arbitrarily large, which is useful for operations that upload or download huge amounts of information. Most RPC systems require that the arguments of an RPC be represented as data structures in memory, which are then serialized to bytes and sent over the network.

When a very large amount of data must be exchanged, this can mean significant memory pressure on both the client process and the server process. And it means that operations must typically impose hard limits on the size of request and response messages, to prevent resource exhaustion. Streaming alleviates this by allowing the request or response to be an arbitrarily long sequence of messages. The cumulative total size of a request or response stream may be incredibly large, but clients and servers do not need to store the entire stream in memory. Instead, they can operate on a subset of data, even as little as just one message at a time.

Not only does gRPC support streaming, but it also supports full-duplex bidirectional streams. Bidirectional means that the client can use a stream to upload an arbitrary amount of request data and the server can use a stream to send back an arbitrary amount of response data, all in the same RPC. The novel part is the “full-duplex” part. Most request-response protocols, including HTTP 1.1, are “half-duplex”: they support bidirectional communication (HTTP 1.1 even supports bidirectional streaming), but the two directions cannot be used at the same time. A request must first be fully uploaded before the server begins responding; only after the client is done transmitting can the server reply with its full response. gRPC is built on HTTP/2, which explicitly supports full-duplex streams, meaning the client can upload request data at the same time the server is sending back response data. This is very powerful, and it eliminates the need for things like WebSockets, an extension of HTTP 1.1 that allows full-duplex communication over an HTTP 1.1 connection. Thanks to streaming, applications can build very sophisticated conversational protocols on top of gRPC.
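As a sketch of what a full-duplex stream looks like in application code, here is a hypothetical server-side handler in Go. It assumes a bidirectional streaming method declared in the IDL, something like `rpc Chat(stream ChatMessage) returns (stream ChatMessage)`, with the corresponding code generated into a `pb` package; the names are illustrative only.

```go
package chat

import (
	"io"

	pb "example.com/chat/gen" // hypothetical generated package
)

// chatServer implements the generated server interface for the hypothetical
// bidirectional-streaming Chat method.
type chatServer struct {
	pb.UnimplementedChatServiceServer
}

// Chat receives messages from the client for as long as the client keeps
// sending, and can send responses back at any time, on the same RPC.
func (s *chatServer) Chat(stream pb.ChatService_ChatServer) error {
	for {
		msg, err := stream.Recv()
		if err == io.EOF {
			// The client has finished sending; end the RPC successfully.
			return nil
		}
		if err != nil {
			return err
		}
		// Neither side ever holds the whole conversation in memory: each
		// message is handled as it arrives. Here we simply echo it back.
		if err := stream.Send(&pb.ChatMessage{Text: "ack: " + msg.GetText()}); err != nil {
			return err
		}
	}
}
```

A client holds the other end of the same stream and can call its own Send and Recv concurrently, with both directions active at once.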

Where to use gRPC?

So now you know that gRPC is a request-response protocol for RPC, with support for streaming, that uses Protocol Buffers to define interfaces and messages. But what about where or why you would use gRPC, or what programming languages can be used with it?

The “where” is pretty easy: you can leverage gRPC almost anywhere you have two computers communicating over a network:

  1. Microservices: gRPC shines as a way to connect servers in service-oriented environments. One of the original problems its predecessor, Stubby, aimed to solve was wiring together microservices. It is well-suited for a wide variety of arenas: from medium and large enterprise systems all the way to “web-scale” eCommerce and SaaS offerings.
  2. Client-Server Applications: gRPC works just as well in client-server applications, where the client application runs on desktop or mobile devices. It uses HTTP/2, which improves on HTTP 1.1 in both latency and network utilization. This means you get improved response times and longer battery life.
  3. Integrations and APIs: gRPC is also a way to offer APIs over the Internet, for integrating applications with services from third-party providers. As an example, many of Google’s Cloud APIs are exposed via gRPC. This is an alternative to REST+JSON, but it does not have to be mutually exclusive. There are tools for easily exposing gRPC services over REST+JSON, such as `grpc-gateway`.
  4. Browser-based Web Applications: The last big area superficially seems like a poor fit. JavaScript code running in a browser cannot directly utilize gRPC, because gRPC strictly requires HTTP/2 and browser XHRs cannot provide it. However, as mentioned above, there are tools for exposing your gRPC APIs as REST+JSON, where they can then be easily consumed by browser clients.

For each of the above situations where you might use gRPC, there are alternatives. In fact, REST and JSON are a sort of de facto standard for all of these situations. So why would you use gRPC instead? There are several dimensions along which gRPC wins out over the others, particularly over REST and JSON:

1. Performance/Efficiency: HTTP 1.1 is a verbose protocol and JSON is a very verbose message format. They are great for human-readability, but less so for computer-readability, requiring a good deal of string parsing. HTTP 1.1 also has a severe limitation on how a single connection can be used for multiple requests: all responses must be sent back in the order in which the corresponding requests were received. So clients that use pipelining will see head-of-line blocking delays: later responses, even if they were computed quickly, must wait for earlier responses to be computed and transmitted before they can be sent. And the other alternative, using a connection for only one request at a time and then using a pool of connections to issue parallel requests, consumes more resources in both clients and servers, as well as in any proxies and load balancers in between.

HTTP/2 and Protocol Buffers do not have these problems. HTTP/2 is much less verbose, thanks largely to header compression. And it supports multiplexing many requests over a single connection. Protocol Buffers, unlike JSON, were designed to be both compact on the wire and efficient for computers to parse.

The result is that gRPC can reduce resource usage and deliver lower response times compared to using REST and JSON. This also means reduced network usage and longer battery life for clients running on mobile devices.
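As a rough, hedged illustration of the encoding difference, the sketch below serializes the same hypothetical generated message used in the earlier sketches with both the binary protobuf codec and protobuf’s JSON mapping, then prints the sizes. The exact numbers depend entirely on the message contents, so treat this as a way to measure the difference rather than as a benchmark result.

```go
package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/encoding/protojson"
	"google.golang.org/protobuf/proto"

	pb "example.com/hello/gen" // hypothetical generated package
)

func main() {
	msg := &pb.HelloRequest{Name: "world"}

	// Binary protobuf encoding: compact field tags and varints, no field names.
	bin, err := proto.Marshal(msg)
	if err != nil {
		log.Fatal(err)
	}

	// JSON mapping of the same message: human-readable, but it carries field
	// names and quoting on the wire and requires string parsing to decode.
	jsn, err := protojson.Marshal(msg)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("protobuf: %d bytes, JSON: %d bytes\n", len(bin), len(jsn))
}
```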

2. Productive Programming Model: The programming model with gRPC is simple to understand and leads to developer productivity. Defining interfaces and canonical message formats in an IDL means a lot of boilerplate code is auto-generated.

Forget the days of manually wiring up server handlers based on URI paths and then manually marshalling paths, query string parameters, and request and response bodies. Similarly, forget the days of manually creating HTTP request objects in client code, with all of the same marshalling overhead as on the server side.

While there are myriad tools and libraries that can alleviate this burden for REST+JSON (including a great number of home-grown, proprietary solutions within organizations and projects), they tend to vary significantly from one programming language to another, possibly even from one project to another. And in some cases they are incomplete: one library, for example, may address one aspect (reducing boilerplate in servers) but fail to address others (a common definition of interfaces and message schemas, or reducing boilerplate in clients). gRPC is thorough: it addresses all of these concerns, and it does so in a way that is consistent across numerous programming languages. If you write code in multiple languages and/or in a polyglot environment, this is particularly compelling.
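To give a feel for how little wiring is left to the developer, here is a hedged Go sketch of a server for the hypothetical Greeter service used in the earlier sketches. The `pb` package, the Greeter names, and the listen address are assumptions; the point is that routing, marshalling, and HTTP details are handled entirely by generated and runtime code.

```go
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"

	pb "example.com/hello/gen" // hypothetical generated package
)

// greeterServer implements the generated service interface. This method body
// is essentially the only hand-written code: no URL routing, no parsing of
// query strings, no manual (de)serialization of request or response bodies.
type greeterServer struct {
	pb.UnimplementedGreeterServer
}

func (s *greeterServer) SayHello(ctx context.Context, req *pb.HelloRequest) (*pb.HelloReply, error) {
	return &pb.HelloReply{Message: "Hello, " + req.GetName()}, nil
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	s := grpc.NewServer()
	pb.RegisterGreeterServer(s, &greeterServer{}) // generated registration function
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```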

3. Streaming: One of gRPC’s “killer features” is full-duplex bidirectional streaming. While the great majority of RPCs will be simple “unary” operations (a single, simple request and a single response), there are often cases where something more sophisticated is called for. Whether it’s affinity (often for performance reasons), server-push facilities for sending notifications, or something more complicated, it can be done using gRPC streams.

Historically, this required writing a custom TCP-based protocol. This can be a good solution in the datacenter, but is much less realistic when clients and servers are separated by a WAN or the Internet: open source proxy/load balancing software and hardware load balancers aren’t as effective when they don’t understand the protocol. Even in the datacenter, where you might have a fairly homogeneous set of microservices that can all speak the protocol, developing a custom protocol and client and server libraries is not trivial.

It is possible to do some of these sophisticated things with HTTP 1.1. The most flexible and widely-supported approach involves using WebSockets, which basically let you create a custom TCP-based protocol and tunnel it over HTTP connections. But this still comes with the cost of inventing the protocol and implementing custom clients and servers.

4. Security: gRPC was designed with security in mind. It is of course possible to use HTTP/2, and thus gRPC, in an insecure way with plain-text connections. But when using TLS (Transport Layer Security, sometimes still called SSL), HTTP/2 is more strict than HTTP 1.1: it allows only TLS 1.2 or higher; numerous cipher suites are blacklisted because they provide inadequate security; and compression and renegotiation are disabled. This greatly reduces the surface area of TLS vulnerabilities to which HTTP/2 connections are subjected.
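As a small illustration, enabling TLS for a gRPC client in Go is largely a matter of supplying transport credentials when the channel is created. The host name below is hypothetical, and this is a minimal sketch rather than a complete, production-ready configuration.

```go
package main

import (
	"crypto/tls"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	// Use the system's root CAs to verify the server, and require at least
	// TLS 1.2 (HTTP/2 would reject anything older anyway).
	creds := credentials.NewTLS(&tls.Config{MinVersion: tls.VersionTLS12})

	// grpc.WithTransportCredentials makes this an encrypted HTTP/2 connection
	// instead of a plain-text one. The host name here is hypothetical.
	conn, err := grpc.Dial("api.example.com:443", grpc.WithTransportCredentials(creds))
	if err != nil {
		log.Fatalf("failed to dial: %v", err)
	}
	defer conn.Close()

	// Generated stubs created from this connection now make their calls
	// over TLS with no further changes to application code.
}
```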

Now that we know where gRPC is well-suited and why it is a good fit, the last question is, “What programming languages can you use with gRPC?” The following table shows the languages supported by the core gRPC and Protocol Buffers projects. In addition to these officially supported languages, you can also find unofficial open source gRPC libraries for other languages (such as Rust, Erlang, and TypeScript to name just a few).

  • The term “reflection support” refers to being able to use Protocol Buffer descriptors to define and interact with message types at runtime. Without reflection, applications must always use `protoc` to generate code for message types. These limitations apply only to the core gRPC and Protocol Buffers projects; for most languages, third-party libraries exist to fill the gap.
  • Browsers must use alternative stubs and a web-to-gRPC proxy such as `grpc-gateway` or `grpc-web`.

We see some interesting constraints in the supported languages above. It is obvious why Objective-C and Android Java don’t support gRPC servers: they target mobile devices that will always act as clients. For PHP, which is often used for building web servers, the omission is a technical one: PHP integrates with web servers as a module or via CGI, and these interfaces were not designed with HTTP/2 in mind, so they are not compatible (even if the web server itself, such as Apache, Nginx, or Lighttpd, supports HTTP/2).

Summary

In this post we learned the details of what gRPC is and what differentiates it from similar technologies. We also learned what makes gRPC a good fit for numerous applications and what languages can be used with gRPC.

If you’re interested in learning how to actually use gRPC and move into several more advanced topics for making the most of gRPC, please check out “Practical gRPC” by Joshua Humphries, David Konsumer, David Muto, Robert Ross and Carles Sistare, a how-to guide published by Bleeding Edge Press.

“Practical gRPC” at Bleedingedgepress.com

This post was excerpted from this book’s chapter titled, “What is gRPC,” by one of the authors, Joshua Humphries.

Joshua Humphries (jhump on GitHub) has been working with Protocol Buffers and building RPC systems and related facilities for over six years. He was first introduced to Protocol Buffers and "Stubby" (gRPC's forebear) while working at Google. Afterwards, he led a team that worked on protobuf-based RPC at Square, including "smart clients" in Java, Go, and Ruby that handled service discovery, load balancing, automatic retries, and automatic geographic failover. Joshua has been an advocate of gRPC since its initial release. He continues his work with Protocol Buffers and gRPC as part of an infrastructure team at FullStory, a customer experience management platform. He is a contributor to the Go open-source projects for gRPC and Protocol Buffers, the author of a Go library for Protocol Buffer reflection named protoreflect, and the author of gRPC-related projects open-sourced by FullStory, including grpcurl.
