by Nicolas Beguier (Site Reliability Engineer)
As SREs, one of our missions is to prevent intrusions into our infrastructure. How can authentication security be increased without penalising the thousands of engineers working on the platform?
A classic way to authenticate with SSH is to use a pair of keys. Then you can deploy each public key on every server and add your custom flavor.
In fact, OpenSSH has quite a few underused features, such as signed certificates, which greatly increase both security and reliability.
What is an SSH signed certificate?
Introduced in OpenSSH 5.4 (March 2010), an SSH signed certificate contains a public key and metadata: Validity, Principals and Extensions.
- Validity: Each key has an expiration time. For example, it can be one week for a developer and one day for a system administrator. When your certificate expires, you have to ask the Certificate Authority to sign your public key again.
- Principals: This is the list of usernames you are able to connect with. For example, every developer could have access to the ‘dev’ user, and administrators to ‘dev’ and ‘root’.
- Extensions: Maybe not the most important feature, but extensions let you limit what the user may do, for example by disabling port forwarding or pseudo-terminal allocation.
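To make the three fields concrete, here is a minimal Python sketch of how they map onto the ssh-keygen signing flags (-V for validity, -n for principals, -O for options). The helper name build_sign_command, the file names and the key identities are assumptions made for illustration; the function only builds the command line and does not run it.

```python
import shlex
from typing import List, Sequence


def build_sign_command(
    ca_key_path: str,
    public_key_path: str,
    key_id: str,
    principals: Sequence[str],
    validity: str = "+1w",
    options: Sequence[str] = (),
) -> List[str]:
    """Return the ssh-keygen argv that signs a user public key (hypothetical helper)."""
    # Policy choice of this sketch: refuse to build a certificate without principals.
    if not principals:
        raise ValueError("at least one principal is required")
    cmd = [
        "ssh-keygen",
        "-s", ca_key_path,            # CA private key used to sign
        "-I", key_id,                 # key identity stored in the certificate
        "-n", ",".join(principals),   # usernames the certificate may log in as
        "-V", validity,               # validity interval, e.g. +1w or +1d
    ]
    for option in options:
        cmd += ["-O", option]         # e.g. no-port-forwarding, no-pty
    cmd.append(public_key_path)       # public key to turn into a certificate
    return cmd


# One week for a developer, one day for an administrator (illustrative values).
developer = build_sign_command(
    "ca_key", "id_rsa.pub", "alice", ["dev"], "+1w", ["no-port-forwarding"]
)
admin = build_sign_command("ca_key", "id_rsa.pub", "bob", ["dev", "root"], "+1d")

print(shlex.join(developer))
print(shlex.join(admin))
```

Running the resulting command on a machine that holds the CA private key writes the certificate next to the public key, with a -cert.pub suffix.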
How does SSH authentication work?
The traditional way to connect to a server is with an SSH key pair. In this key-based scenario, the client submits its public key to the server. The server checks this public key against every key on its list.
It’s simple, but every public key has to be stored on every server. Also, keys generally have no expiry and need to be deployed by an admin.
By using an SSH certificate instead of an SSH public key, these issues are solved.
First of all, you need a Certificate Authority hosted on a server which turns public keys into certificates.
In this certificate-based scenario, the client submits its certificate to the server. The server then checks whether the certificate is signed by one of the trusted CAs deployed on the server.
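The server-side decision can be pictured with a simplified Python model. This is only a sketch of the logic, not OpenSSH's implementation: the cryptographic signature check is deliberately left out and replaced by a lookup of the signing CA's fingerprint, and the Certificate class, the server_accepts function and the fingerprints are hypothetical. In a real setup, the trusted CA public keys are usually listed in a file referenced by the TrustedUserCAKeys directive of sshd_config.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet


@dataclass(frozen=True)
class Certificate:
    key_id: str
    ca_fingerprint: str            # identifies the signing CA (signature check omitted)
    principals: FrozenSet[str]     # usernames the holder may log in as
    valid_after: datetime
    valid_before: datetime


def server_accepts(
    cert: Certificate,
    login_user: str,
    trusted_cas: FrozenSet[str],
    now: datetime,
) -> bool:
    """Simplified model of the checks a server applies to a user certificate."""
    if cert.ca_fingerprint not in trusted_cas:
        return False               # signed by an unknown CA
    if not (cert.valid_after <= now < cert.valid_before):
        return False               # not yet valid, or expired
    return login_user in cert.principals


# Hypothetical fingerprints for a primary and a backup CA.
trusted = frozenset({"SHA256:primary-ca", "SHA256:backup-ca"})
cert = Certificate(
    key_id="alice",
    ca_fingerprint="SHA256:primary-ca",
    principals=frozenset({"dev"}),
    valid_after=datetime(2017, 11, 7),
    valid_before=datetime(2017, 11, 14),
)

print(server_accepts(cert, "dev", trusted, datetime(2017, 11, 10)))
print(server_accepts(cert, "root", trusted, datetime(2017, 11, 10)))
print(server_accepts(cert, "dev", trusted, datetime(2017, 11, 15)))
```

The point of the model is that the server needs no per-user data: the trusted CA list and the metadata carried by the certificate are enough to decide.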
Security breach at leboncoin
Last June, one of our developers had his SSH key stolen. We don’t know how it happened since his laptop’s hard drive was encrypted and he was considered security savvy.
At some point, a triggered alarm made us notice some strange accesses on our infrastructure: an old legacy server still had a public IP and an open SSH port.
To stop the attack, we had to delete the public key on every server and deploy a new one. Nothing really bad happened that time, but anyone could lose their private key. Had it been an admin, with rights everywhere, it could have been quite critical.
The main advantage of a signed certificate is its validity period: certificates expire on their own after a while, so a stolen one stops working without any change on the servers.
State of the Art
Turning public keys into signed certificates is easy: it’s an ssh-keygen command. The real challenge was to convince every user to periodically request an SSH certificate. To do that, we needed to find a client interface which worked on several environments (Linux, macOS or Windows), was easy to use, and was sufficiently secure.
We tried several SSH key-signing projects, such as Niall Sheridan’s cashier, which was still under heavy development as of June 2017 but promising. However, we didn’t agree with its rights management and the repercussions on the client interface.
A good article has been written on Medium by Uber Security, but their work focuses on increasing the security of the OpenSSH feature by developing a new PAM module. Interesting, but we still needed a tool to sign certificates.
Since March 2017, HashiCorp has made it possible to sign an SSH certificate with a Vault secret backend. The API allows the same actions as ssh-keygen. Unfortunately, there is no privilege logic and no interface. Not everything we need is implemented by default, so…
Let’s talk about CASSH
OpenSSH features reach their limit when it comes to industrialization. We don’t want an administrator to sign every user’s public key by hand every day, so we need a service for that. That is exactly the purpose of CASSH: signing keys!
A user needs to follow these two steps to get their public key signed:
- Authentication: First of all, a user needs to be authenticated by an LDAP/AD directory.
- Registration: Every user needs to be present in the CASSH database. In this database, a user login has some static attributes such as: Key hash, Expiry, Principals and a Username (which differs from the LDAP/AD login and has to be unique).
If a user fulfills these two conditions, they can get their public key signed.
During the creation of a new user on a CASSH server, an admin needs to validate all information provided by the user. It won’t be possible for a normal user to change these data later on.
The authentication via LDAP/AD on CASSH ensures user identity. Then, static data in the CASSH database prevents users from signing other public keys or changing the information on signed certificates.
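The two conditions can be summarized in a few lines of Python. This is a sketch of the decision logic as described here, not CASSH's actual code: the UserRecord class, the may_sign function and the use of SHA-256 to hash the key are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UserRecord:
    """Static attributes validated by an admin at registration (hypothetical layout)."""
    username: str
    key_hash: str
    principals: Tuple[str, ...]
    expiry: str                    # e.g. "+1w"


def key_hash(public_key: str) -> str:
    # Assumption: any stable digest of the public key would do for this sketch.
    return hashlib.sha256(public_key.strip().encode("utf-8")).hexdigest()


def may_sign(
    ldap_authenticated: bool,
    record: Optional[UserRecord],
    submitted_public_key: str,
) -> bool:
    """Return True only if both conditions hold and the key is the registered one."""
    if not ldap_authenticated:
        return False               # condition 1: LDAP/AD authentication
    if record is None:
        return False               # condition 2: registration in the database
    # The stored hash stops a user from getting someone else's key signed.
    return record.key_hash == key_hash(submitted_public_key)


alice_key = "ssh-rsa AAAAexamplekey alice@laptop"      # placeholder, not a real key
other_key = "ssh-rsa AAAAanotherkey mallory@laptop"    # placeholder, not a real key
alice = UserRecord("alice", key_hash(alice_key), ("dev",), "+1w")

print(may_sign(True, alice, alice_key))
print(may_sign(True, alice, other_key))
```

Because the principals and expiry come from the stored record rather than from the request, the user has no way to influence what ends up in the signed certificate.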
Where is the SPOF?
The CASSH server is only in charge of key signing. When a user tries to SSH on a server in production, CASSH is not involved.
If the CASSH database is unreachable, only an admin user can sign their public key by forcing CASSH to do it. This option is disabled by default.
If the authentication backend (LDAP/AD) is unreachable, CASSH cannot authenticate users and becomes a SPOF.
If the CASSH server is unreachable, signed certificates of course remain valid until they expire. It is recommended to use a load balancer (like HAProxy/nginx/varnish/you name it) to distribute the traffic across several CASSH instances.
Finally, to prevent a lockdown situation, it’s crucial to have a backup CA deployed on every server and to store the associated private key in a vault.
Using CASSH as a client
We wrote a RESTful API because we wanted something easy to maintain and usable from various client applications.
In this project, a Command Line Interface (CLI) was developed for Python 2 & 3; it allows a user to interact with the CASSH server to sign their key.
This CLI was also containerized with Docker, and the image is updated whenever a minor version is released.
$ cassh status
Please type your LDAP password (firstname.lastname@example.org):
"expiration": "2017-11-14 10:19:54",
"ssh_key_hash": "4096 7b:5f:73:66:68:5f:73:fe:f1:02:4a:72:b7:d1:97 ",
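A script can use the expiration field of this output to decide when to request a new signature. The Python sketch below assumes the timestamp format shown above and assumes that it is expressed in the same timezone as the local clock; the needs_renewal function and the 12-hour threshold are hypothetical.

```python
from datetime import datetime, timedelta


def time_left(expiration: str, now: datetime) -> timedelta:
    """Time remaining before the certificate expires (negative once expired)."""
    expires_at = datetime.strptime(expiration, "%Y-%m-%d %H:%M:%S")
    return expires_at - now


def needs_renewal(
    expiration: str,
    now: datetime,
    threshold: timedelta = timedelta(hours=12),   # illustrative threshold
) -> bool:
    """True when the certificate expires within the threshold or has expired."""
    return time_left(expiration, now) < threshold


# Field name and value taken from the status output above.
status = {"expiration": "2017-11-14 10:19:54"}

day_before = datetime(2017, 11, 13, 10, 19, 54)
hour_before = datetime(2017, 11, 14, 9, 19, 54)

print(needs_renewal(status["expiration"], day_before))
print(needs_renewal(status["expiration"], hour_before))
```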
Finally, a web user interface can be hosted on the CASSH server. It allows users to sign their public key without any command line.
Integration of CASSH at leboncoin
We started by deploying the Certificate Authority’s public key on each server in the infrastructure. Then, when a user tries to authenticate on a server, OpenSSH checks whether their key is present in the authorized_keys file. If not, it verifies that it’s a certificate signed by the trusted CA.
The whole SRE team adopted CASSH despite some apprehension of removing their own public key from the infrastructure.
In the end, our emergency procedure for bypassing CASSH has never been used.
Every new user at leboncoin will automatically use CASSH. Meanwhile, current users are migrating slowly.
The next step will be to extend the Certificate Authority to every AWS instance.