EXPEDIA GROUP TECHNOLOGY — SOFTWARE

The Weird World of gRPC Tooling for Node.js, Part 3

Peak productivity parsing protos in-process

Russell Brown
Expedia Group Technology

--

If you’re creating a gRPC server in Node.js, loading and parsing protobuf schemas at runtime results in server code that is idiomatically JavaScript. Part 1 of this series surveyed the available tools for creating such a server, concluding that there were only two production-ready paths to a working product. Part 2 of this series explored the first of those: building a server using build-time code generation with protoc. The story exposed some limitations and tradeoffs inherent to that approach, the most salient of which was the need to manually create marshalling code for each request and response.

In this final installment, we’ll build a server using the other approach: dynamic, reflection-based marshalling. A complete, working application containing all of the sample code from this story and the other installments is available here.

A hiker in a dark forest striding toward sunlight
Photo courtesy of kiwihut on Unsplash

Retooling the server

Initializing a gRPC server requires passing a handlers object and a protobuf descriptor of the service to be provided; the descriptor includes pointers to the functions that marshal and unmarshal the request and response messages and their embedded data types. Dynamic proto parsing doesn’t change that, but it does change how the descriptor is created.

In static generation, protoc parsed the service and RPCs defined in library_api.proto to produce library_api_grpc_pb.js. This was accomplished via a codegen script in package.json. The dynamic version has no codegen in the start and debug scripts; node . is all that’s needed.
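For comparison, here is a sketch of what the scripts section of package.json can look like in the dynamic version (the debug flag shown is an assumption, not taken from the sample app):

"scripts": {
  "start": "node .",
  "debug": "node --inspect-brk ."
}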

However, using dynamic proto parsing means that the .proto files need to be included and distributed with your application. If they come from a third party, you will need to make sure that you have a process in place to keep your copies of them up-to-date and deployed.

Server construction

Constructing a server dynamically is similar to constructing one statically. The big difference is that, instead of importing the generated code, we must call two methods to parse and decorate the protobuf definitions. Static generation couldn’t handle the packages I described in the previous story, but I’ve reintroduced them to the .proto schema in this story.

proto
└── com
    └── rcbrown
        └── grpc
            └── v1
                ├── library_api.proto
                └── library_book.proto

Here is the new construction code:
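What follows is a minimal sketch rather than a verbatim listing: it assumes the proto layout shown above, and the LibraryApiHandlers class and bind address are placeholders.

const path = require('path');
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

// Parse the schema at runtime; this is the work protoc used to do at build time.
const libraryApiPackageDefinition = protoLoader.loadSync(
  path.join(__dirname, 'proto/com/rcbrown/grpc/v1/library_api.proto'),
  { includeDirs: [path.join(__dirname, 'proto')] }
);

// Decorate the parsed definition with serializers and a client constructor.
const libraryApiServiceDescriptor = grpc.loadPackageDefinition(
  libraryApiPackageDefinition
).com.rcbrown.grpc.v1.LibraryAPI;

const server = new grpc.Server();
server.addService(libraryApiServiceDescriptor.service, new LibraryApiHandlers());
server.bind('0.0.0.0:50051', grpc.ServerCredentials.createInsecure());
server.start();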

protoloader (the @grpc/proto-loader package) is the part that does at runtime what protoc did at build time. There are many examples like this on the internet, and most simply say, without further explanation, to take the result of protoloader.loadSync, pass it to grpc.loadPackageDefinition, and pass that to addService. But I’d like to dig a little deeper.

Here’s a debugger screenshot (since there’s no source code — apologies for accessibility) of what is returned from protoloader.loadSync:

Debugger screenshot showing a JavaScript object containing hierarchies like “com.rcbrown.grpc.v1.LibraryAPI/GetBooks”

It’s not obvious from the screenshot, but com.rcbrown.grpc.v1.LibraryAPI is a single string key, not nested objects collapsed by my IDE. The result is a flat map of the fully-qualified names found in library_api.proto and its imports, each with metadata describing the corresponding message or service. The metadata also provides fully-implemented serialization and deserialization functions. These are reflection-based generic functions, not some sort of eval or other runtime-generated source code.
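Abridged, and as an approximation rather than an exact dump, the returned object looks something like this (the method entries follow gRPC’s standard method-definition shape):

{
  'com.rcbrown.grpc.v1.LibraryAPI': {
    GetBooks: {
      path: '/com.rcbrown.grpc.v1.LibraryAPI/GetBooks',
      requestStream: false,
      responseStream: false,
      requestSerialize: [Function],
      requestDeserialize: [Function],
      responseSerialize: [Function],
      responseDeserialize: [Function],
    },
    CheckoutBook: { /* same shape */ },
  },
  'com.rcbrown.grpc.v1.LibraryBook': { /* message metadata */ },
}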

libraryApiServiceDescriptor, produced from grpc.loadPackageDefinition, looks like this:

Debugger screenshot showing an object hierarchy like “com/rcbrown/grpc/v1/LibraryApi/service”

That’s not much different. The two obvious changes from libraryApiPackageDefinition are:

  • The objects are now deeply nested; the string com.rcbrown.grpc.v1.LibraryAPI is now a chain of nested objects accessed using that notation.
  • There is a new service object under LibraryAPI. It contains an entry for each RPC, though the contents of each look like the same descriptors received from protoloader.loadSync.

The libraryApiServiceDescriptor is ready to be passed to addService, along with a handlers object.

Handling RPCs

Creating the handlers is very different when using protoloader. No marshalling code is required; the reflection-based messages take care of it. For this trivial example, the handlers can be one-liners:

getBooks = (call, callback) =>
  callback(null, { libraryBooks: this.dao.getBooks() });

checkoutBook = (call, callback) =>
  callback(null, { libraryBook: this.dao.checkoutBook(call.request.title) });

But this version is a bit more explicit:
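The sketch below expands the same handlers step by step (the dao calls and message field names come from the one-liners above):

getBooks = (call, callback) => {
  // call.request is the unmarshalled request, a plain JavaScript object.
  const libraryBooks = this.dao.getBooks();
  // Respond with a plain object; the reflection-based serializer marshals
  // it into the response message.
  callback(null, { libraryBooks });
};

checkoutBook = (call, callback) => {
  // Message fields arrive as ordinary properties on call.request.
  const { title } = call.request;
  const libraryBook = this.dao.checkoutBook(title);
  callback(null, { libraryBook });
};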

That’s more like it! If your gRPC schema and the format of your data in its store are structurally equivalent, you can skip all marshalling code except wrapping/unwrapping your data in message objects. And that is trivial because the message object is an ordinary JavaScript object.

For a fully modern JavaScript experience, you can use grpc-promise to promisify all the synthesized methods with a single call, then use await.
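A sketch of that, assuming unary RPCs and a client constructed as shown in the next section (sendMessage is grpc-promise’s promisified call):

const grpcPromise = require('grpc-promise');

// One call wraps every RPC method on the client.
grpcPromise.promisifyAll(client);

// Each promisified method returns an object whose sendMessage returns a Promise.
const { libraryBooks } = await client.getBooks().sendMessage({});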

A dynamic client

The library API client is constructed similarly to the server. Here is the entire client code:
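In sketch form, and assuming the same proto layout as the server (the address is a placeholder):

const path = require('path');
const grpc = require('grpc');
const protoLoader = require('@grpc/proto-loader');

const packageDefinition = protoLoader.loadSync(
  path.join(__dirname, 'proto/com/rcbrown/grpc/v1/library_api.proto'),
  { includeDirs: [path.join(__dirname, 'proto')] }
);

// loadPackageDefinition synthesizes a client constructor for the service.
const LibraryAPI = grpc.loadPackageDefinition(packageDefinition)
  .com.rcbrown.grpc.v1.LibraryAPI;

const client = new LibraryAPI('localhost:50051', grpc.credentials.createInsecure());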

The individual RPC calls were difficult to write because it was hard to know where to start. One would expect that examining the client class returned from the service descriptor would be clear enough, but here’s what it looks like in the debugger:

Debugger screenshot of LibraryApiClient. The display isn’t useful for understanding what functions the client provides.

It’s too dynamic to comprehend. I really wouldn’t have known how to write the client methods without looking for examples in the internet’s collective consciousness, as gRPC’s documentation is limited to a couple of terse examples.

I found it easier to reason about returning values from the client by returning Promises.
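For example, a unary call can be wrapped like this (a sketch; the method and field names follow the sample schema):

getBooks() {
  return new Promise((resolve, reject) => {
    this.client.getBooks({}, (err, response) => {
      if (err) {
        reject(err);
      } else {
        resolve(response.libraryBooks);
      }
    });
  });
}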

Two roads diverging in a snowy forest, meant to be evocative of Robert Frost’s poem “The Road Not Taken”
Photo courtesy of Oliver Roos on Unsplash

That has made all the difference

Static code generation, as described in the last installment of this series, had some significant shortcomings:

  • Can’t handle packages
  • Requires handwritten, unintuitive marshalling code
  • Complicates development scripts
  • Can still fail at runtime if you get it wrong

The dynamic version has its own relative shortcomings:

  • Requires schema to be deployed with application and kept current, or requires integration with a schema registry
  • Initial bootstrapping tricky without generated source to refer to

And both are sparsely documented, though there are many more examples of dynamic usage in the internet hive mind. For most use cases, the dynamic approach’s positive developer experience and maintainability make it the more attractive option.

Additional admonitions

Unless you can live with beta server functionality from grpc-js, you will be leaning on the grpc npm package for core gRPC functionality. This package contains the native C++ runtime, which has some implications for deploying the application.

We were writing an AWS Lambda function that needed to call a gRPC server. Our continuous integration (CI) server, Jenkins, runs on an Ubuntu Docker image. During the npm install phase of the build, node-pre-gyp automatically downloads the gRPC binary built for Ubuntu and bundles it into the application. Unfortunately, the Lambda runs on Amazon Linux 2, which requires a different binary. We therefore had to perform an npm rebuild in our npm build script (shown with line breaks inserted for readability):

"build": "mkdirp dist/node_modules .cloudformation \
&& cpx \"src/**/*\" dist \
&& cpx package.json dist \
&& cd dist \
&& npm install --production --no-package-lock \
&& npm rebuild grpc \
--target=12.13.1 \
--target_arch=x64 \
--target_platform=linux \
--target_libc=glibc \
&& bestzip ../.cloudformation/lambda.zip * \
&& rimraf ../dist",

The parameters to npm rebuild must match the target architecture. npm rebuild is itself mysterious, yet another reminder that you are a pioneer in a weird world of gRPC on Node.js.

--

I hope this blog series was enlightening to you. I found it an interesting voyage of discovery.

On the Expedia Group Technology Blog, Nikos Katirtzis of Hotels.com™ (part of Expedia Group™) published a thoughtful and informative blog series about implementing gRPC with Java in Kubernetes.

Of course, our blog also contains stories about a vast array of other technologies, up and down the stack.
