Open-sourcing Knox, a secret key management service

Pinterest Engineering
Pinterest Engineering Blog
May 6, 2016

Devin Lundberg | Pinterest engineer, Infrastructure

Pinterest requires many internal secrets for everyday operation, such as login credentials for third-party integrations and cryptographic keys for protecting user data in transit via HTTPS. As we scale, it’s critical to keep these secrets confidential while still allowing for auditing and rotation. To solve these problems, we built Knox, a service that helps developers write secure code to protect both Pinterest and Pinners. Today, we’re open-sourcing Knox so other companies can securely store and manage their own credentials, keys and passwords, and to improve the security of the internet for everyone.

History

Before Knox, secrets were stored in source control at Pinterest, where any engineer could work with the information as needed for development. As the number of engineers grew, so did the risk that malware or phishing would leak these secrets. The more engineers had access to secrets, the harder it would be to determine the source if one did leak. Additionally, rotating this information required redeploying our service with code changes, and there was no mechanism for gradually rotating secrets to limit downtime for Pinners.

Features

To solve these problems, we built Knox, which is designed to make secrets accessible only to the machines and users who need them at any given time. Knox tracks who accesses the keys, and is built with a simple yet powerful versioning system so keys can be rotated out over time.

In addition to these important features, Knox is extensible so we can easily transition to more secure authentication, auditing and storage mechanisms as they’re built. Knox also provides high availability, which is important considering every request to Pinterest requires at least one secret that’s stored in Knox. Lastly, Knox provides a simple interface for developers to interact with so they can continue to move quickly and iterate on our product.

How Knox works

Knox has two important parts: the server and the client. The Knox server is responsible for enforcing access control, auditing all actions, performing all key management tasks and persisting encrypted keys to a database. The client is how both users and machines interact with Knox. It also provides a key caching mechanism for services, so that a Knox server outage can’t take down Pinterest and so that keys can be read quickly enough to serve requests efficiently. The GitHub wiki has more information on both the server and the client.
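To make the caching idea concrete, here is a minimal sketch in Python. It is not Knox's actual client code or API: `CachingKeyClient`, `FakeServer`, and the key name `tls_key` are hypothetical names invented for illustration. The sketch assumes that services read keys only from a local cache, and that a separate refresh step updates the cache and keeps the last known value when the server is unreachable.

```python
from typing import Callable, Dict


class FakeServer:
    """Stand-in for a key server (hypothetical, for illustration only)."""

    def __init__(self, keys: Dict[str, bytes]) -> None:
        self.keys = keys
        self.up = True  # flip to False to simulate an outage

    def fetch(self, key_id: str) -> bytes:
        if not self.up:
            raise ConnectionError("key server unreachable")
        return self.keys[key_id]


class CachingKeyClient:
    """Hypothetical client-side cache; not Knox's real client."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}

    def refresh(self, key_id: str) -> bool:
        """Try to update the cached copy; keep the old copy on failure."""
        try:
            self._cache[key_id] = self._fetch(key_id)
            return True
        except ConnectionError:
            return False

    def get(self, key_id: str) -> bytes:
        """Serve from the local cache only, so reads never wait on the server."""
        return self._cache[key_id]
```

In this sketch, a service would call `refresh` periodically in the background and `get` on the request path, so an outage only delays picking up new key versions rather than failing requests.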

Gradual rotation

When we first built Knox, there were no other open-source key management solutions. In the last year, several other solutions have been open-sourced with similarities to Knox. However, one of the key elements that makes Knox unique is the ability to rotate keys gradually, which lets us change secrets without downtime.

Let’s assume a simple scenario where we have a client and a server. The client has a secret to authenticate to the server. If this secret needs to rotate, two steps need to happen: the server needs to be updated to accept the new secret and the client needs to be updated to send it. If these two steps don’t happen at the exact same time, the client’s requests to the server will fail in the meantime. Either the client will have the new secret before the server understands it, or the server will have the new one and reject the old. When we control both the client and the server, this problem can be solved using quorums and protocols like Paxos, which are designed for consistency but take time and add complexity to the code base. When the client is outside of our control (such as when the client is a Pinner and the secret is their signed cookie), we can’t coordinate such actions.

The better solution is to support multiple active secrets and have the server accept both the old and new secret until the developer has confirmed the rotation is complete. Knox has this kind of rotation built into its data model and is designed to allow for these kinds of workflows, making rotation easier and less likely to break things. To our knowledge, no other key management solution offers this.
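The rotation workflow described above can be sketched as follows. This is an illustration in Python rather than Knox's actual data model or API: the `RotatingKey` and `KeyVersion` classes and the status labels `primary`, `active` and `inactive` are names assumed for this example. The idea is that clients send the primary secret, while the server accepts any version that has not been retired.

```python
import hmac
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyVersion:
    version_id: int
    secret: bytes
    status: str  # "primary", "active" or "inactive" (illustrative labels)


@dataclass
class RotatingKey:
    """Hypothetical multi-version key; not Knox's real data model."""

    versions: List[KeyVersion] = field(default_factory=list)

    def add_version(self, version_id: int, secret: bytes) -> None:
        """Add a secret. The first is primary; later ones start as active."""
        status = "active" if self.versions else "primary"
        self.versions.append(KeyVersion(version_id, secret, status))

    def promote(self, version_id: int) -> None:
        """Make a version primary; the old primary stays accepted as active."""
        if not any(v.version_id == version_id for v in self.versions):
            raise KeyError(version_id)
        for v in self.versions:
            if v.version_id == version_id:
                v.status = "primary"
            elif v.status == "primary":
                v.status = "active"

    def retire(self, version_id: int) -> None:
        """Stop accepting a version once the rotation is confirmed complete."""
        for v in self.versions:
            if v.version_id == version_id:
                if v.status == "primary":
                    raise ValueError("cannot retire the primary version")
                v.status = "inactive"
                return
        raise KeyError(version_id)

    def primary(self) -> bytes:
        """The secret that clients should send."""
        return next(v.secret for v in self.versions if v.status == "primary")

    def accepts(self, candidate: bytes) -> bool:
        """Server-side check: any non-retired version is valid."""
        ok = False
        for v in self.versions:
            # Constant-time comparison; check every version before returning.
            if v.status != "inactive" and hmac.compare_digest(v.secret, candidate):
                ok = True
        return ok
```

A rotation in this sketch is three separate steps: `add_version` so the server starts accepting the new secret, `promote` so clients start sending it, and `retire` on the old version once nothing uses it. Because the old and new secrets are both accepted between the first and last step, the client and server never have to switch at the same instant.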

Getting started

Knox is available now on GitHub. From the GitHub page, you can easily set up your own dev server and client by downloading, compiling and running the binaries. The wiki includes information on reconfiguring this development setup to use Knox as part of your infrastructure.

Acknowledgements: Contributors who’ve helped build Knox include Devin Lundberg, Amine Kamel, Zack Drach, Matt Jones and Jon Parise.
