Startup friendly guide to securing GRPC connections using Traefik

Ishaan Bahal
Published in Meeve
9 min read · Jan 29, 2019

Anyone who has read even a little about GRPC has probably read about its fantastic TLS support. However, it’s actually hard to set up if you’re using GRPC with mobile devices and the web; at least it was for me. The issue was that most docs and tutorials talk about using a custom cert to talk directly to the GRPC server and securing that communication.

In this post, we cover SSL termination at the LB or reverse proxy level and still keep the original servers running without a change.

As a small example, we’ll try connecting a Flutter (Dart) app to our now secure setup.

Let’s get to know our options

  1. Nginx (v1.13.10): Nginx added GRPC support in version 1.13.10. If you’re already set up with Nginx, you can keep using it. The issue with Nginx is that it cannot renew certs automatically: you have to either get the certs from LetsEncrypt manually or set up a cron job. Read more about GRPC support for Nginx at: https://www.nginx.com/blog/nginx-1-13-10-grpc/ (this is pretty much the only documentation on the subject)
  2. Envoy proxy: Envoy is a sidecar proxy, and a great concept. The only issue was that it’s slightly harder to get started with and seemed to need a whole lot of configuration. It does have good support for HTTP/2 and GRPC. Also, the documentation for Envoy is a bit scary and mostly talks about configuration, so you have to read through a lot to understand the GRPC bit.
  3. Traefik: Traefik is a reverse proxy that is supposed to be simple. It automatically detects services, so you don’t need to write rules, supports HTTP/2 and GRPC, and renews certs from LetsEncrypt automatically. Oh, and it’s got 19K+ stars on GitHub at the time of writing.

So, as the title suggests, I chose Traefik.

Our setup

We’ll be talking about this particular setup:

  1. Services are all running in Docker containers.
  2. Multiple microservices, some GRPC and some normal REST.
  3. DNS-01 based ACME challenge to get Certs for the domain.
  4. Hosted on Google cloud (just environment variables for cert generation).

Note: This post isn’t about a complex setup with Kubernetes orchestration or an even more complex service mesh. Most online tutorials for Traefik cover only that, and finding newbie-friendly material is the hard part.

Getting started with Traefik

We’ll be using the latest traefik docker image (please make sure you have docker installed).

To get the latest image, which is built from scratch (so there is no shell to exec into):

docker pull traefik

There are more images, such as the alpine image, nano server etc. You can find those images here: https://hub.docker.com/_/traefik

To understand the Traefik basics, please visit: https://docs.traefik.io/basics/

Let’s create a docker network, which we’ll use with the other services as well.

docker network create proxy

If everything runs on the same server, you don’t strictly need this; you can point the container’s network to host and be done with it.

Traefik supports service discovery and can automatically detect the services running on the network if their containers are started with the correct labels. We’re not taking this approach, since in a small startup environment it’s mostly known which servers will spawn what service, and changing a config is much easier to manage.

traefik.toml

Traefik configs are written in TOML (https://github.com/toml-lang/toml). You mainly have to define your frontends, backends and entry points. Let’s take a look at the amazing illustration on traefik.io.

Closer look at Traefik. Source: https://docs.traefik.io/

It’s a self-explanatory image, but let’s see what it says. Traefik has three major parts:

  1. Entrypoints: Can be http (80), https (443) or any other custom entry point. You can even force a redirect from http entry points to https. Take a look at https://docs.traefik.io/basics/#entrypoints
  2. Frontends: This is where you define frontend rules based on hostname, path, header or a custom regex. https://docs.traefik.io/basics/#frontends
  3. Backends: Connect the frontends to the backend services and APIs. https://docs.traefik.io/basics/#backends

Time for the config (traefik.toml):

defaultEntryPoints = ["http", "https"]
logLevel = "INFO"

# No filePath here, so Traefik's own logs go to STDOUT.
[traefikLog]

[accessLog]
  filePath = "/var/log/access.log"
  format = "json"

[entryPoints]
  [entryPoints.http]
  address = ":80"
  [entryPoints.https]
  address = ":443"
    [entryPoints.https.tls]

# Enables the API and GUI.
[api]

# Enables the file provider, so the backends and frontends below are read.
[file]

[acme]
  email = "admin@example.com"
  storage = "acme.json"
  entryPoint = "https"
  acmeLogging = true
  [acme.dnsChallenge]
  provider = "gcloud"
  [[acme.domains]]
  main = "*.example.com"
  sans = ["www.example.com"]

[backends]
  # GRPC service, reached over h2c.
  [backends.foo]
    [backends.foo.servers.server1]
    url = "h2c://127.0.0.1:3000"
  # Plain REST service.
  [backends.bar]
    [backends.bar.servers.server1]
    url = "http://127.0.0.1:3001"

[frontends]
  # Frontend options must come before the routes sub-table,
  # otherwise TOML assigns them to the route instead of the frontend.
  [frontends.foo]
  backend = "foo"
  passHostHeader = true
  passTLSCert = false
    [frontends.foo.routes.server1]
    rule = "Host:foo.example.com"
  [frontends.bar]
  backend = "bar"
  passHostHeader = true
  passTLSCert = false
    [frontends.bar.routes.server1]
    rule = "Host:bar.example.com"

Alright, now let’s understand what’s happening here.

  • The defaultEntryPoints option tells Traefik to allow http and https by default. You can also set this per frontend, using the entryPoints option.
  • I’ve set up logging in two ways here: access logs and Traefik’s own logs. I wanted to be able to read the Traefik logs directly with docker logs traefik; leaving out the file path under [traefikLog] sends them to STDOUT, which is what enables that. The access logs are stored at a specific path, so they are not routed to STDOUT.
  • [api] is something you may or may not require. It enables the GUI, on port 8080 by default. You can either open that port with password-based security or restrict access to it through a VPN.
  • [file] tells Traefik that the configuration resides in the config file and that it should read it from there. If you don’t provide this option, the configs below might not work.
  • [acme] is where we set up our ACME certs. There are multiple ways to complete the challenge; in my case I wanted to use the DNS challenge. You’ll notice a provider: mine was Google Cloud, so the value is gcloud, but it will vary depending on your provider. For the full list of supported providers, visit https://docs.traefik.io/configuration/acme/#provider
  • [backends] defines the services on our servers. Traefik lets you specify a URL for each backend service, and the URL scheme also selects the protocol. We’ve used two services: one runs on plain http and the other runs on GRPC, which uses the h2c protocol (HTTP/2 without TLS). The 127.0.0.1 addresses only reach the services when Traefik shares the host’s network; on a docker network such as proxy, use the service’s container name as the host instead. For secure backends that require custom certs, you must use https only.
  • [frontends] lets you define rules for reaching your backends. Be it a path, a subdomain or even a complex regex, you can define those here. In our case, we have defined two subdomains, foo.example.com and bar.example.com.

Let’s start some dockers

Now that we have a working TOML configuration for Traefik, let’s go ahead and start Traefik.

Let’s first create an empty acme.json file.

touch acme.json

We create this file on the host so the certs are stored outside the docker container and can be reused if the container is removed or recreated for some reason.
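One thing worth doing at this point: Traefik 1.x refuses to use an ACME storage file whose permissions are more open than 600, so it is safer to lock the file down right after creating it. A minimal sketch, assuming acme.json lives in the current directory:

```shell
# Create the empty ACME storage file (a no-op if it already exists).
touch acme.json
# Restrict it to the owner; Traefik rejects a more permissive storage file.
chmod 600 acme.json
# Show the resulting permissions; expect -rw------- for the owner only.
ls -l acme.json
```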

docker run --rm -d -p 8080:8080 -p 80:80 -p 443:443 \
-v /path/to/traefik.toml:/traefik.toml \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /path/to/gca.json:/gca.json \
-v /path/to/acme.json:/acme.json \
-l traefik.frontend.rule=Host:local \
-l traefik.port=8080 \
--network proxy \
--name traefik \
-e GCE_PROJECT="project_name-123456" \
-e GCE_SERVICE_ACCOUNT_FILE="/gca.json" \
traefik

This is the complete docker command to run Traefik on GCE. Let’s go through it and see what’s done and why.

  1. We’ve mapped 3 ports: 8080 for the API, and 80 and 443 for http and https respectively.
  2. We have mounted 4 volumes: the TOML configuration; the docker.sock file, so that Traefik can still detect newly added services and start routing them automatically; gca.json, the service account file; and acme.json, the JSON file we created earlier to store the certs.
  3. The proxy network is selected using --network proxy.
  4. Two environment variables are provided. These are specific to the cloud provider. For Google Cloud, both are required: GCE_PROJECT gives the project name, while GCE_SERVICE_ACCOUNT_FILE gives the path of the service account file inside the container, which we mounted at /gca.json.
  5. We give the container the name traefik and start the image called traefik that we pulled from Docker Hub in the beginning.
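Before running the command, it can help to confirm that every file passed with -v actually exists: when a bind-mount source is missing, Docker creates a directory at that path instead of failing, and Traefik then cannot read its config or cert store. A minimal sketch, where preflight is a hypothetical helper (not part of Docker or Traefik) and the file names are the ones used in the command above:

```shell
# preflight DIR: check that the files bind-mounted into the Traefik
# container exist as regular files in DIR (defaults to the current directory).
# Returns 0 if all are present, 1 if any is missing.
preflight() {
  dir="${1:-.}"
  status=0
  for f in traefik.toml gca.json acme.json; do
    if [ ! -f "${dir}/${f}" ]; then
      echo "missing file: ${dir}/${f}" >&2
      status=1
    fi
  done
  return "$status"
}
```

You would call it as preflight /path/to before docker run, and only start the container if it succeeds.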

Once the container starts, monitor the Traefik logs with docker logs traefik. You’ll see the ACME TLS cert process starting, and you’ll also see the acme.json file filling up.

Note: Make sure the DNS for both your domain and the sans domain you’ve provided is in place. With the DNS-01 challenge, LetsEncrypt validates each name through a TXT record that Traefik creates via your DNS provider, and clients still need an A record to reach the proxy. You do not need a sans domain; the cert process works without it too.

If it goes through, you’ll receive an email from LetsEncrypt telling you that your domain has got a new certificate. Traefik will also make sure that your certs don’t expire: it renews them automatically when fewer than 30 days remain before expiry.
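To confirm from the outside that the proxy is serving the new certificate, one option is a small shell helper around openssl. This is only a sketch: check_cert is a hypothetical name, foo.example.com stands in for your own host, and it assumes the openssl CLI is installed.

```shell
# check_cert HOST [PORT]: print the issuer, subject and validity dates of the
# certificate served on HOST:PORT (PORT defaults to 443). It offers h2 via
# ALPN, as a GRPC client would, and sends HOST as the SNI name.
check_cert() {
  if [ -z "$1" ]; then
    echo "usage: check_cert HOST [PORT]" >&2
    return 2
  fi
  host="$1"
  port="${2:-443}"
  openssl s_client -connect "${host}:${port}" -servername "${host}" -alpn h2 \
    </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -subject -dates
}
```

Running check_cert foo.example.com should list a LetsEncrypt issuer once the cert has been generated.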

Securing the clients

Now that we’ve successfully set up SSL termination at the server end using Traefik, it’s time for our apps to start asking for the secure connection.

For this example, I’m going to show Dart code snippets that connect to a now secure GRPC service.

Let’s see the original connection class:

import 'package:grpc/grpc.dart';

import 'constants.dart';

class GrpcClientSingleton {
  // late: assigned once in the private constructor below.
  late final ClientChannel client;
  static final GrpcClientSingleton _singleton =
      GrpcClientSingleton._internal();

  factory GrpcClientSingleton() => _singleton;

  GrpcClientSingleton._internal() {
    client = ClientChannel(Constants.SERVICE_URL,
        port: Constants.SERVICE_PORT,
        options: ChannelOptions(
          // Plain-text channel: no TLS at all.
          credentials: ChannelCredentials.insecure(),
        ));
  }
}

Let’s modify it to use a secure connection in production, but an insecure one when the build environment is dev or staging. Please read:

to get an idea of how to create build variants for different environments.

The modified class would look something like this:

import 'package:grpc/grpc.dart';

import 'constants.dart';

class GrpcClientSingleton {
  // late: assigned once in the private constructor below.
  late final ClientChannel client;
  static final GrpcClientSingleton _singleton =
      GrpcClientSingleton._internal();

  factory GrpcClientSingleton() => _singleton;

  GrpcClientSingleton._internal() {
    client = ClientChannel(Constants.SERVICE_URL,
        port: Constants.SERVICE_PORT,
        options: ChannelOptions(
          // TLS only in production; dev and staging talk to the service directly.
          credentials: Constants.CURRENT_ENVIRONMENT == Environment.PROD
              ? ChannelCredentials.secure()
              : ChannelCredentials.insecure(),
        ));
  }
}

Now, depending on the rules each environment needs, you can configure the singleton accordingly. For me, it’s simple: prod = secure, everything else insecure.

The constants.dart for this scenario would look something like this:

import 'package:flutter/foundation.dart';

enum Environment { DEV, STAGING, PROD }

class Constants {
  // Active configuration. The original snippet does not show this declaration;
  // defaulting to dev until setEnvironment() is called is an assumption.
  static Map<String, dynamic> _config = _Config.debugConstants;

  static get SERVICE_URL {
    return _config[_Config.SERVICE_URL];
  }

  static get SERVICE_PORT {
    return _config[_Config.SERVICE_PORT] as int;
  }

  // Call once at startup, before the GRPC singleton is first used.
  static void setEnvironment(Environment env) {
    switch (env) {
      case Environment.DEV:
        _config = _Config.debugConstants;
        break;
      case Environment.STAGING:
        _config = _Config.qaConstants;
        break;
      case Environment.PROD:
        _config = _Config.prodConstants;
        break;
    }
  }

  static get CURRENT_ENVIRONMENT {
    return _config[_Config.ENV] as Environment;
  }
}

class _Config {
  static const SERVICE_URL = "SERVICE_URL";
  static const SERVICE_PORT = "SERVICE_PORT";
  static const ENV = "ENV";

  static Map<String, dynamic> debugConstants = {
    ENV: Environment.DEV,
    SERVICE_URL: "192.168.1.2",
    SERVICE_PORT: 3000
  };

  static Map<String, dynamic> qaConstants = {
    ENV: Environment.STAGING,
    SERVICE_URL: "foo.omgstaging.com",
    SERVICE_PORT: 3000
  };

  // Production goes through Traefik on the https entry point.
  static Map<String, dynamic> prodConstants = {
    ENV: Environment.PROD,
    SERVICE_URL: "foo.example.com",
    SERVICE_PORT: 443
  };
}

Here the staging and dev environments have a direct connection to the service, whereas the production system will be connected over port 443 using SSL.

If you’ve successfully managed to do this, your Flutter GRPC-based app is now running on a secure connection. Congratulations 🎉.

Notes:

  1. We have terminated SSL on the proxy (LB) end only, because we trust the internal communications. You might want to encrypt those too depending on your environment.
  2. All services must run on the proxy network for Traefik to be able to route to them. You can also simply start Traefik on the host network if everything is on a single machine and you don’t need the complexity right now.
  3. Google cloud provider explicitly needs access to the service account file. I couldn’t find a method to do this without it. And even this took a long while to figure out.
  4. Nginx is simpler to set up than this and doesn’t require any specific configs. You can set up a cron job to handle cert renewal using the certbot tool. Traefik, once set up correctly, takes away the headache of managing a lot of “enabled websites” files and does cert management automatically.
  5. Traefik does a lot of things that are not covered here. For things like load balancing, health checks etc., visit https://traefik.io
  6. HTTP traffic can be redirected to HTTPS automatically in Traefik, but I have chosen not to include that option: some people might be running older apps that still need http connections, and GRPC will simply refuse to connect to a secure channel using an insecure connection.
  7. If you are going to be shipping your iOS flutter app to the app store, you should start selecting app uses secure transport not powered by ATS and then choose HTTPS transport. This self declaration is required and if you choose no HTTPS, then Apple might block your app.

Hope this guide helps a few people who are looking to set up their apps with GRPC and secure them. It was a major pain point while setting up Meeve’s services and took a lot of trial and error to get reasonably secure.

Meeve is a hyperlocal platform that connects similar people through events happening nearby. Check out Meeve at:

Thanks for reading through, have a great day! 😀
