Inter-service authentication and the need for decentralised shared key exchange

Published in Practo Engineering · 9 min read · Nov 5, 2018

TL;DR Decentralised inter-service authentication is easy to implement, reduces maintenance woes and is more secure. It uses Diffie-Hellman and Request-In-Request mechanisms. (Scroll down for the all-explaining image)

Note: In this post, I stress the word decentralised because most teams and projects use a centralised solution for key distribution, and we will look into why a centralised solution can wreak havoc!

Let's start with a few reasonable assumptions

We all know what inter-service communication is, since almost everyone has adopted or read about service-oriented architecture and the benefits it brings to a system.
We have encapsulated micro-services (the ideal scenario) that communicate with each other over some protocol (such as HTTP, RPC, WebSockets, MQTT or AMQP).
We use a secure channel (such as HTTPS) for communication.

Now let's reiterate the problem at hand and the probable solutions.

Although communication within the channel is secured (i.e. an HTTPS channel), anyone who knows a micro-service resource URL can alter the state of the system with a simple POST request, or leak data that is core to the organisation with a simple GET, and we just can't afford that.

The above problem can be solved in many different ways; let's concentrate on the prominent ones.

Filter traffic to the service.
Pre-establish a shared secret between micro-services.
Centralised key store service/database that every other service connects to in order to fetch the latest secret for encrypting the data.
Decentralised key establishment between micro-services with TTL.

Now let's evaluate each of the solutions and try to come up with the best possible way to secure inter-service communication.

Filter traffic to the service

This can be achieved using an IP whitelist, by deploying the application in a private cloud where external access is restricted, and so on.

This is the first level of defence for every application and creates a wall between the trusted network and the untrusted internet. But this approach falls short if different micro-services have different levels of security and importance. If one micro-service is compromised, it should not bring down the whole system along with it.

Pre-establish shared secret between micro services

A key shared over a file does not stay safe for long (mainly because humans are involved in key establishment), so we need to rotate these keys often. Every rotation may require a deployment and/or planned downtime. This isn't a bad solution, but it is inefficient and resource intensive (human and otherwise) over time, and we need to stop relying on human presence for key establishment.

Centralised key store service

After looking into the pros and cons of the solutions above, the first viable solution that comes to mind is to build a key exchange service that every other service communicates with to fetch a shared key between any two services.

Meanwhile, we have to make sure that this new, important service is secure by making it inaccessible from the outside world, while also adding security features such as key expiry.

Well, this is a solution! But is it a good one?

We are creating a centralised system that every service depends on all the time, which means a devastating single point of failure. It doesn't stop there: we are also keeping all our shared secrets/keys in one place, which is really not a great solution.

Decentralised key establishment

What does this mean? In simple terms, each service pair should be capable of establishing a secret and rotating it whenever needed, without human intervention or downtime. Now let's work out how to achieve that.

Let's decentralise “key establishment”!

Now that we have understood that decentralised shared key exchange is a good problem to solve, let's Google “shared key exchange” (well, Bing it or use your favourite search engine).

The first thing you will notice is one algorithm that is mentioned over and over again: the Diffie-Hellman algorithm.

Fact: The algorithm was published by the cryptographers Whitfield Diffie and Martin Hellman in 1976, but it builds on the ideas of Ralph C. Merkle, one of the inventors of public key cryptography. Hellman himself has suggested that it should be called Diffie-Hellman-Merkle key exchange.

You can find a simple and easy to understand explanation of Diffie-Hellman here. I am repeating its first and most important sentence below for quick reference.

In simple words, Diffie-Hellman is a way of generating a shared secret between two people/systems in such a way that the secret can’t be seen by observing the communication.

Important distinction: You’re not sharing information during the key exchange, you’re creating a key together.
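To make "creating a key together" concrete, here is a minimal sketch of classic Diffie-Hellman with tiny textbook numbers. The values of `p`, `g`, `a` and `b` are illustrative assumptions and far too small for real use; only `A` and `B` would ever travel over the wire.

```python
# Toy Diffie-Hellman with tiny textbook numbers (never use sizes like this in practice).
p, g = 23, 5             # public modulus and generator, known to everyone

a = 4                    # Service A's private value (never sent)
b = 3                    # Service B's private value (never sent)

A = pow(g, a, p)         # A's public value, sent to B
B = pow(g, b, p)         # B's public value, sent to A

secret_a = pow(B, a, p)  # A combines its private value with B's public value
secret_b = pow(A, b, p)  # B combines its private value with A's public value

print(A, B, secret_a, secret_b)  # 4 10 18 18
```

Both sides arrive at the same secret even though the secret itself was never transmitted.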

There is always a catch!

Once we decide to use a new approach for the problem at hand, we tend to wonder what its drawbacks are. And yes, DH has drawbacks: it takes a lot of resources (time, which is an important factor, and computing power, though the latter is a worry mainly for small IoT devices) to create a 2048-bit key pair (1024-bit keys are not safe anymore, and the open source version of Java 6 doesn’t support DH > 1024).

As always, we have a good alternative — ECDH.

ECDH performs better than DH and you can find a lot of great resources about it. The philosophy/design behind DH and ECDH is the same, but they achieve their results using different mathematics. Elliptic curve cryptography (ECC) has provided a great means to improve the performance of many cryptographic algorithms, and ECDH over DH is one such improvement.

For instance, 2048-bit DH key generation takes more than 30 seconds in Python 2.7 and around 10 seconds on NodeJS, while ECDH on the NIST P-384 curve, which offers at least the same level of security, takes less than 10 ms to generate a public/private key pair across programming languages
(hardware used: Mac Pro 2015 with i5 processor and 8 gigs of RAM).
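For readers who want to see what an ECDH exchange looks like in code, below is a dependency-free sketch of X25519 (ECDH over Curve25519), written after the pseudocode in RFC 7748. Note the assumptions: it uses Curve25519 rather than the NIST P-384 curve mentioned above, it is not constant-time, and the helper name `generate_keypair` is made up for illustration. A production system should rely on a vetted cryptographic library instead.

```python
import secrets

P = 2**255 - 19                          # field prime of Curve25519
A24 = 121665                             # curve constant from RFC 7748
BASE_POINT = (9).to_bytes(32, "little")  # standard base point u = 9

def x25519(scalar: bytes, u_coord: bytes) -> bytes:
    """Scalar multiplication on Curve25519 (Montgomery ladder, teaching version)."""
    k = bytearray(scalar)
    k[0] &= 248                          # clamp the scalar as required by RFC 7748
    k[31] &= 127
    k[31] |= 64
    k_int = int.from_bytes(k, "little")
    x1 = int.from_bytes(u_coord, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k_int >> t) & 1
        swap ^= bit
        if swap:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")

def generate_keypair():
    """Return (private_key, public_key) as 32-byte strings."""
    private = secrets.token_bytes(32)
    return private, x25519(private, BASE_POINT)

a_priv, a_pub = generate_keypair()       # Service A
b_priv, b_pub = generate_keypair()       # Service B
secret_a = x25519(a_priv, b_pub)         # A: own private key + B's public key
secret_b = x25519(b_priv, a_pub)         # B: own private key + A's public key
print(secret_a == secret_b)
```

Each side only needs 32 random bytes and one scalar multiplication to produce a key pair, which is the structural reason ECDH key generation is so much cheaper than generating large finite-field DH keys.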

Hey! Most programming languages already support ECDH.

Yes, a major problem is solved by programming languages shipping good cryptographic modules, so you don’t have to worry about implementing DH yourself. But not all programming languages follow the same interface. For example, NodeJS crypto methods encode and expect public and private keys in 'latin1', 'hex' or 'base64' formats, whereas Python and Java work with 'DER' or 'PEM' encoding schemes.
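One way to cope with these differences is to normalise every key to raw bytes at the service boundary. The sketch below assumes the peer tells us which text encoding it used; the function names `decode_key` and `encode_key` are hypothetical. It only covers text encodings of raw bytes. Wrapping keys into DER or PEM needs the proper ASN.1 structure and is better left to a cryptographic library.

```python
import base64
import binascii

def decode_key(text: str, encoding: str) -> bytes:
    """Turn a key received as 'hex', 'base64' or 'latin1' text into raw bytes."""
    if encoding == "hex":
        return binascii.unhexlify(text)
    if encoding == "base64":
        return base64.b64decode(text, validate=True)
    if encoding == "latin1":
        return text.encode("latin-1")
    raise ValueError(f"unsupported encoding: {encoding}")

def encode_key(raw: bytes, encoding: str) -> str:
    """Turn raw key bytes into the text encoding the peer expects."""
    if encoding == "hex":
        return raw.hex()
    if encoding == "base64":
        return base64.b64encode(raw).decode("ascii")
    if encoding == "latin1":
        return raw.decode("latin-1")
    raise ValueError(f"unsupported encoding: {encoding}")

raw_key = bytes([4, 255, 16])            # stand-in for real public key bytes
as_hex = encode_key(raw_key, "hex")
as_b64 = encode_key(raw_key, "base64")
print(as_hex, decode_key(as_b64, "base64") == raw_key)
```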

It doesn’t stop there: we now have to worry about storing the established secret, since we can’t afford a key exchange for every request.

Establishing a fresh key for each communication session gives perfect forward secrecy. This is what DHE and ECDHE provide (the E stands for ephemeral), and they are used by most of your trusted protocols such as TLS and SSH.

Since the applications are deployed in a horizontally scaled environment, we need to store the secret in some secure place and we need a way to fetch it for every request, which is not solved by these standard libraries. Hence we are defining a way to generate secrets, store and access the established secrets, and expire them with TTL (Time to Live) support across heterogeneous setups and multiple programming languages.
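To illustrate what "store with TTL" means, here is a small in-memory stand-in for a key-value store with per-key expiry. It is only a sketch: the class name `TTLStore` is an assumption, and in a horizontally scaled deployment the data would live in an external store shared by all instances rather than in process memory. The clock is injectable so that expiry can be demonstrated without sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple

class TTLStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, value: bytes, ttl_seconds: float) -> None:
        # Writing to an existing key overwrites both the value and the expiry.
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]         # lazily drop expired entries
            return None
        return value

now = [0.0]                              # fake clock we can move forward by hand
store = TTLStore(clock=lambda: now[0])
store.put("uuid-1", b"shared-secret", ttl_seconds=60)
fresh = store.get("uuid-1")              # still within the TTL
now[0] = 61.0
expired = store.get("uuid-1")            # TTL has passed
print(fresh, expired)
```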

Well, it’s not over yet: DH/ECDH has one more drawback, the man-in-the-middle attack. Suppose you are trying to establish a communication channel between your Service A and an external Service B while a prankster Service C poses as Service B; in an untrusted environment you have no way to tell the two apart. That is why this article limits the discussion to inter-service communication. There are ways to establish communication with untrusted parties, but they again rely on a trusted channel or on trusted third-party services.

ECC-based algorithms such as ECDH (for key exchange) and ECDSA (for signatures) are used by your browser for SSL/TLS handshakes, by OpenSSL and by most secured communication over the Internet.
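The man-in-the-middle problem can be shown with the same kind of toy numbers used in textbook Diffie-Hellman. In this sketch (all values are illustrative assumptions), Service C intercepts the exchange and hands its own public value to both sides, so A and B each end up sharing a secret with C instead of with each other.

```python
p, g = 23, 5                     # toy public parameters, far too small for real use
a, b, c = 4, 3, 6                # private values of A, B and the attacker C

A_pub = pow(g, a, p)
B_pub = pow(g, b, p)
C_pub = pow(g, c, p)

# C intercepts both public values and forwards its own in each direction.
secret_a = pow(C_pub, a, p)      # what A believes it shares with B
secret_b = pow(C_pub, b, p)      # what B believes it shares with A
c_with_a = pow(A_pub, c, p)      # C can compute A's secret
c_with_b = pow(B_pub, c, p)      # C can compute B's secret

print(secret_a == c_with_a, secret_b == c_with_b)
```

Plain DH/ECDH has no way to detect this substitution, which is why the exchange has to be bound to an authenticated identity.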

Let's dig into the implementation!

Now we have arrived at the conclusion that we will use ECDH for key establishment. But are there any standards? Yes, there are standard curves created for this purpose and you can find these safe curves here.

Use ECDH key exchange to generate a temporary shared secret valid for a short duration, and store it in a central datastore that supports expiration.
Use a Request-In-Request mechanism rather than request-response during the ECDH key exchange to establish trust (via domain validation) for two-way authentication.
At the end of the ECDH key exchange, each service maintains a list of the non-secret identifiers of the temporary shared secrets, with their expiration timestamps, against the peer service so that it can reuse an existing valid secret if available.
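The third point above, keeping non-secret identifiers and expiry timestamps per peer so that a valid secret can be reused, could be sketched as follows. SQLite stands in for the relational database, and the table, column and function names are assumptions made for this example. Only identifiers are stored here; the secrets themselves live elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE shared_secret_refs (
        uuid       TEXT PRIMARY KEY,   -- non-secret identifier of the shared secret
        service    TEXT NOT NULL,      -- peer service the secret was established with
        expires_at INTEGER NOT NULL    -- expiry as a Unix timestamp
    )
    """
)
# Composite index so lookups by (service, expiry) do not scan the whole table.
conn.execute(
    "CREATE INDEX idx_service_expiry ON shared_secret_refs (service, expires_at)"
)

def record_secret(uuid: str, service: str, expires_at: int) -> None:
    conn.execute(
        "INSERT INTO shared_secret_refs (uuid, service, expires_at) VALUES (?, ?, ?)",
        (uuid, service, expires_at),
    )

def find_valid_secret(service: str, now: int):
    """Return the identifier of the longest-lived unexpired secret, or None."""
    row = conn.execute(
        "SELECT uuid FROM shared_secret_refs "
        "WHERE service = ? AND expires_at > ? "
        "ORDER BY expires_at DESC LIMIT 1",
        (service, now),
    ).fetchone()
    return row[0] if row else None

record_secret("u1", "service-b", expires_at=100)
record_secret("u2", "service-b", expires_at=500)
print(find_valid_secret("service-b", now=200))
```

When the lookup returns nothing, the caller knows it has to run a new key exchange before talking to that peer.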

I get the gist, but I need more than that.

Decentralised key establishment approach for Inter-service communication

Step 1: Service A generates its ECDH asymmetric key pair and a UUID and stores them in a key-value store, preferably one with TTL support such as DynamoDB or Redis, for a few seconds (configurable per application), with the UUID as the key and a value comprising the following:

ECDH curve, Generated private key, Generated public key, Service Info (B in this case)

Step 2: Service A makes a direct HTTPS request to Service B with the following details:

`POST /<optional-prefix>/_handshake-1`
`curve` — ECDH curve used
`ecdh_pub_key` — Public key generated by service A
`source service` — Identity of Service A
`UUID` — Unique identifier for the request

Step 3: Service B generates its ECDH asymmetric key pair using the curve provided in the request.

Step 4: Service B makes a direct HTTPS request to Service A passing the following details:

`POST /<optional-prefix>/_handshake-2`
`pub_key` — Public key generated by service B
`UUID` — Same UUID provided by Service A
`source service` — Identity of Service B

Step 5: Service A retrieves the data stored initially based on the UUID using a strongly consistent read operation.

Step 6: Service A checks that the service passed in the second request matches the value stored in the datastore.

Step 7: Service A generates the temporary shared secret by combining its own private key with the public key of service B, and stores it in a data store, preferably one with TTL support, for a pre-specified duration with the UUID as the key (overwriting the previous entry that was valid for a few seconds).

Step 8: Service A responds to service B’s request with a success response.

Step 9: On receiving the successful response, service B combines its own private key with the public key of A to generate the temporary shared secret and stores it in its data store for some duration (as per application requirement), with the UUID as the key and a value comprising the following:

Temporary shared secret, Service Info (Service A in this case)

Step 10: Service B records the UUID and the expiration of the shared secret against service A, for future use, in its relational database with a composite index on the (service, expiration) columns.

Note: In this approach, we use a relational database to store the service and expiration info against the UUID, and a key-value database with TTL support to store the generated secret against the UUID. Splitting the data this way removes a single point of failure in terms of data leaks.

Step 11: Service B responds with success to the initial request.

Step 12: On receiving the successful response, service A records the UUID and the expiration of the shared secret against service B, for future use, in its relational database with a composite index on the (service, expiration) columns.
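The twelve steps above can be simulated in a single process. This is a sketch under several assumptions, not the actual implementation: direct method calls stand in for the HTTPS requests, plain dictionaries stand in for the key-value store (TTL omitted) and the relational table, the `registry` lookup stands in for domain validation, and a toy finite-field DH group replaces ECDH so the example has no dependencies. The toy group is not secure, and all class, method and field names are made up for illustration.

```python
import secrets
import uuid

P = 2**127 - 1            # toy DH modulus standing in for an ECDH curve (NOT secure)
G = 3
CURVE = "toy-dh-m127"     # hypothetical value for the `curve` field

class Service:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry   # name -> Service; stands in for DNS plus HTTPS
        self.kv = {}               # stands in for Redis/DynamoDB
        self.index = {}            # stands in for the relational table
        registry[name] = self

    def _keypair(self):
        private = secrets.randbelow(P - 3) + 2
        return private, pow(G, private, P)

    def start_handshake(self, peer_name):
        private, public = self._keypair()                              # Step 1
        request_id = str(uuid.uuid4())
        self.kv[request_id] = {"curve": CURVE, "private": private,
                               "public": public, "service": peer_name}
        peer = self.registry[peer_name]
        ok = peer.handshake_1(CURVE, public, self.name, request_id)    # Step 2
        if not ok:
            return None
        self.index[request_id] = peer_name                             # Step 12
        return request_id

    def handshake_1(self, curve, peer_public, source_service, request_id):
        caller = self.registry.get(source_service)
        if curve != CURVE or caller is None:
            return False
        private, public = self._keypair()                              # Step 3
        # Step 4: call back the service we know under that name, not the caller.
        if not caller.handshake_2(public, request_id, self.name):
            return False
        secret = pow(peer_public, private, P)                          # Step 9
        self.kv[request_id] = {"secret": secret, "service": source_service}
        self.index[request_id] = source_service                        # Step 10
        return True                                                    # Step 11

    def handshake_2(self, peer_public, request_id, source_service):
        pending = self.kv.get(request_id)                              # Step 5
        if (pending is None or "private" not in pending
                or pending["service"] != source_service):              # Step 6
            return False
        secret = pow(peer_public, pending["private"], P)               # Step 7
        self.kv[request_id] = {"secret": secret, "service": source_service}
        return True                                                    # Step 8

registry = {}
service_a = Service("service-a", registry)
service_b = Service("service-b", registry)
request_id = service_a.start_handshake("service-b")
same = service_a.kv[request_id]["secret"] == service_b.kv[request_id]["secret"]
print(same)
```

Because Service B calls back the address it already associates with the claimed name, a caller that merely claims to be Service A never receives B's public key, which is the point of Request-In-Request over plain request-response.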

Wait! What happens when I have multiple keys, and how do I choose one?

Use the non-secret identifier (UUID) to communicate which temporary secret was used to sign the actual communication.
While checking the signature, if the shared secret is not found, respond with a different error message than when signature matching fails, so that the client can retry after creating a new temporary shared secret.
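A minimal sketch of this signing and verification flow is shown below. The post does not specify the signature scheme, so HMAC-SHA256 is an assumption here, as are the header names `X-Secret-Id` and `X-Signature` and the two exception classes. The point is only that a missing secret and a bad signature are reported differently.

```python
import hashlib
import hmac

class UnknownSecretError(Exception):
    """Secret id is unknown or expired: the client should run a new key exchange."""

class BadSignatureError(Exception):
    """Secret exists but the signature does not match: reject the request."""

def sign_request(secret_id: str, secret: bytes, body: bytes) -> dict:
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # The identifier is not secret, so it can travel alongside the signature.
    return {"X-Secret-Id": secret_id, "X-Signature": signature}

def verify_request(headers: dict, body: bytes, secrets_by_id: dict) -> None:
    secret = secrets_by_id.get(headers.get("X-Secret-Id"))
    if secret is None:
        raise UnknownSecretError("unknown or expired secret id")
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, headers.get("X-Signature", "")):
        raise BadSignatureError("signature mismatch")

known = {"uuid-1": b"temporary-shared-secret"}
body = b'{"amount": 10}'
headers = sign_request("uuid-1", known["uuid-1"], body)
verify_request(headers, body, known)     # passes silently for a valid request
```

A client that receives the "unknown secret" error can establish a new temporary shared secret and retry, whereas a signature mismatch should simply be rejected.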

And ta-da, we have created a mechanism for key exchange in a decentralised way across services, along with a way to store and expire the shared secret key.

Hope you enjoyed this post. I am ending this article with a quote from Bruce Schneier:
“The fewer and simpler the secrets that one must keep to ensure system security, the easier it is to maintain system security.”

Thanking great resources that helped me create this post

Special thanks to Kishore Kumar for coming up with this great approach for solving key establishment problem for inter-service communication.

P.S: We are working on libraries across programming languages and hope to open source them as soon as possible.

Thank you!
