Practical Implementation of Public Key Infrastructure at Cermati

Michael Sutiono
Cermati Group Tech Blog
9 min read · Jul 29, 2019

Cermati’s systems contain many critical components and resources. To do their jobs, our staff and services need access to resources such as databases, the cloud provider console, and third-party APIs. This leads to the problem of credential and access management.

How do we handle credential and access management at Cermati? To be honest, we started without any sophisticated strategy. We simply issued a new account with a username-password pair every time an employee needed access to a certain resource.

But our team is growing rapidly, and with it the risk of insider attacks and credential leakage. Our current approach to credential and access management neither lets us conveniently enforce a stronger security policy on access nor scales well with the organization.

We don’t have much control over the strength of each employee’s password or how they store it. When a credential leaks, we have to manually revoke it, or at least update the password on our internal services, which costs further productivity.

The sad truth about password strength (source).

Long story short, the need for a better way to solve our credential and access management problem at scale led us to implement our own self-managed Public Key Infrastructure (PKI). The PKI provides each of our team members with a certificate bundle used to authenticate to our internal resources, while also allowing us to revoke it easily.

We aren’t going to dig too much into the inner workings of PKI in this article. For those who are interested, the following IETF RFC specification documents might be relevant.

  • RFC3820, Proxy Certificate Profile
  • RFC2560, Online Certificate Status Protocol
  • RFC2527 & RFC3647, Certificate Policy and Certification Practices Framework
  • RFC2511, Certificate Request Message Format
  • RFC2797, Certificate Management Messages over CMS
  • RFC3039, Qualified Certificates Profile
  • RFC3161, Time-Stamp Protocol (TSP)
  • RFC3281, An Internet Attribute Certificate Profile for Authorization

Currently, our PKI implementation consists of a root and an intermediate Certificate Authority (CA). The intermediate CA signs any valid Certificate Signing Request (CSR) from a requester. We divide requesters into two major groups: team members and services. To aid certificate management and ease integration into our development workflow, we developed a CLI-based toolkit called pkictl that automates the certificate management workflow. We’re currently developing an HTTP server as the certificate management back-end.

The reason we are developing the HTTP server in a slightly later phase is that we need to roll this out company-wide, which includes the non-engineering teams. The non-engineering teams are most likely Windows users and not avid CLI users, while our initial pkictl is Linux-oriented and CLI-based.

Also, the original implementation of pkictl does all the heavy lifting on the user’s machine, such as interacting with the CA key-certificate pair stored on a remote instance to carry out the management workflow. With the current architecture, this client-side workflow would need to be implemented for several OS environments if we want to roll it out to the whole organization. To avoid that, we can use a client-server architecture that moves the processing from the user’s machine to a back-end server. This way, users run certificate-related operations through a thin client, and the developers only need to implement that thin client for multiple OS platforms.

Generally speaking, both groups acquire a certificate bundle in three phases: the request phase, the issue phase, and the retrieval phase. However, because we need to automate certificate bundle generation for services, we merge those phases into a single request for them.

Member Certificate Workflow

Authentication Method

Before we dig deeper into the workflow for getting a member certificate, let’s discuss the authentication method. For the initial phase, we are using GitHub as our identity provider and a GitHub repository as our certificate storage. But since we’re required to roll this out company-wide, we need a more universal identity provider on which everyone in the company has an account, because unless you’re a software developer it’s highly unlikely that you have a GitHub account. Google OAuth is a very good candidate, as our employees are already required to have a Google account.

Implementing OAuth in a CLI is a little tricky. We thought of two ways to implement the Google OAuth workflow.

The first way is to show the authorization code in the browser view (instead of passing it to a redirect URL) after the user permits the application to access their account. The user then has to manually copy and paste the code back into the CLI before continuing.

The second way is for the CLI to spawn a localhost server whose sole purpose is to catch the authorization code returned in the query parameter. We spawn this localhost server before opening the OAuth URL in the browser, so that once the user gives their consent and is redirected to the localhost server, it can capture the code, which is then used for authenticating to a remote HTTP server.

We ended up going with the latter. Although it’s a bit more complicated to implement, it offers a more seamless UX compared to the first flow. You can look at the workflow diagram below.

Google OAuth implementation in our command-line interface.
  1. A user executes a command related to certificate management (request, issue, get, or revoke).
  2. pkictl first gets the OAuth URL from the CA HTTP server.
  3. It then opens a browser at the OAuth URL, which displays the consent page.
  4. pkictl also spawns a temporary HTTP server on localhost to receive the authorization code returned at the Google OAuth redirect URL.
  5. Once consent is given, Google redirects the user to the localhost URL served in step 4.
  6. The temporary server parses the code from the query parameter and returns it to the pkictl main thread.
  7. pkictl shuts the temporary server down and sends the authorization code to the CA HTTP server to be exchanged for a JWT.
  8. The CA HTTP server uses the authorization code sent in step 7 to retrieve an access token from Google, then uses the access token to retrieve the user’s profile. If all is good, it generates a JWT.
  9. The generated JWT is then returned to pkictl to be used for further interaction.
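To make steps 4 to 6 more concrete, here is a minimal Python sketch of a one-shot loopback server that captures the code from the redirect. It is not pkictl's actual code: the names `CodeCatcher`, `start_code_catcher`, and `wait_for_code`, the `code` query parameter name, and the timeout value are all assumptions made for illustration.

```python
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


class CodeCatcher(BaseHTTPRequestHandler):
    """Handles the redirect request and stores its `code` query parameter."""

    def do_GET(self):
        query = urllib.parse.urlparse(self.path).query
        params = urllib.parse.parse_qs(query)
        # Keep the code on the server object so the main thread can read it.
        self.server.auth_code = params.get("code", [None])[0]
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"You can close this tab and return to the terminal.")

    def log_message(self, format, *args):
        pass  # keep the CLI output quiet


def start_code_catcher(port=0, timeout=120):
    """Start the temporary server; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), CodeCatcher)
    server.auth_code = None
    server.timeout = timeout  # handle_request() gives up after this many seconds
    # handle_request() serves exactly one request (or times out), then returns.
    thread = threading.Thread(target=server.handle_request, daemon=True)
    thread.start()
    return server, thread


def wait_for_code(server, thread):
    """Block until the redirect arrives or the timeout passes, then clean up."""
    thread.join()
    server.server_close()
    return server.auth_code  # None if no redirect arrived in time
```

The CLI would call `start_code_catcher()` before opening the browser, build the redirect URL from `server.server_address[1]`, and call `wait_for_code()` afterwards. Binding to 127.0.0.1 rather than all interfaces keeps the temporary server unreachable from other machines.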

Request

In the request phase, pkictl first prompts the requester to fill in some profile details such as Organization, Organizational Unit, and Email Address. It uses these details to generate a key and a CSR for that requester with the openssl command. After the key and CSR have been generated locally, pkictl executes the authentication workflow. Upon successful JWT retrieval, it uploads the CSR, along with some metadata about it, via a CA HTTP endpoint.
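As an illustration of the local key and CSR generation, the sketch below builds an `openssl req` invocation from the profile details. The helper names, the field order, the 2048-bit RSA key size, and the choice to leave the key unencrypted (`-nodes`) are assumptions, not a description of what pkictl actually runs.

```python
import subprocess


def build_subject(profile):
    """Render profile fields as an OpenSSL -subj string such as /O=.../CN=...

    Values containing "/" would need escaping; this sketch does not handle that.
    """
    order = ["O", "OU", "CN", "emailAddress"]
    return "".join(f"/{field}={profile[field]}" for field in order if field in profile)


def build_csr_command(profile, key_path, csr_path, bits=2048):
    """Return the argv for generating a new key and CSR in one step."""
    return [
        "openssl", "req", "-new",
        "-newkey", f"rsa:{bits}",  # assumed key type and size
        "-nodes",                  # assumption: the private key is not passphrase-protected
        "-keyout", key_path,
        "-out", csr_path,
        "-subj", build_subject(profile),
    ]


def generate_csr(profile, key_path, csr_path):
    """Run openssl; raises CalledProcessError if the command fails."""
    subprocess.run(build_csr_command(profile, key_path, csr_path), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when profile values contain spaces.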

Activity diagram of the member certificate request phase.

Issue

This phase can only be executed by authorized personnel (PKI administrators), so the first thing pkictl does is execute the authentication workflow and check whether the user is authorized to perform the task. If the user is authorized, pkictl retrieves from a CA HTTP endpoint the list of CSRs for the member named in the issue command invocation. The issuer is then prompted to choose which CSRs to issue, and pkictl sends the chosen CSR names to the back-end. The back-end service processes the issue request by executing an openssl command.

Activity diagram of the member certificate issue phase.

Retrieval

The retrieval phase is quite simple. pkictl runs the authentication workflow, prompts the user to pick which certificate(s) to download from a list of the user’s valid certificates, and finally downloads the chosen certificate archive, which consists of the signed certificate and the CA chain certificate (the intermediate CA certificate joined with the root CA certificate). Upon successful download, pkictl prompts the user for a password to protect the PKCS12 archive it then generates for web browser use.

Activity diagram of the member certificate retrieval phase.

Revocation

In general, there are two main methods for revoking a certificate: generating a Certificate Revocation List (CRL) and implementing the Online Certificate Status Protocol (OCSP). For now, we only implement CRL because our edge services that act as SSL validators (Vault and Nginx) only support CRLs for validating client certificates.

The general revocation workflow is similar to the issue phase, but instead of sending a request to the sign endpoint, pkictl sends a request to the revocation endpoint. The back-end service then executes two openssl commands: one to revoke the certificate and one to regenerate the CRL.
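The two back-end commands could look roughly like the following sketch, which uses the `-revoke` and `-gencrl` options of `openssl ca`. The function names and file paths are hypothetical, and we assume a CA configuration file that already points at the CA key, certificate, and index database.

```python
import subprocess


def build_revocation_commands(ca_config, cert_path, crl_path):
    """Return the two argv lists: mark the certificate revoked, then rebuild the CRL."""
    return [
        # Records the revocation in the CA database named by the config file.
        ["openssl", "ca", "-config", ca_config, "-revoke", cert_path],
        # Regenerates the CRL so that validators can pick up the change.
        ["openssl", "ca", "-config", ca_config, "-gencrl", "-out", crl_path],
    ]


def revoke_certificate(ca_config, cert_path, crl_path):
    """Run both commands in order, stopping at the first failure."""
    for command in build_revocation_commands(ca_config, cert_path, crl_path):
        subprocess.run(command, check=True)
```

Running the CRL generation immediately after the revocation matters because a revoked certificate keeps being accepted until validators load a CRL that lists it.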

Activity diagram of the member certificate revocation phase.

We also set up a cron job on the CA server to regenerate the CRL periodically (as a CRL also has an expiration date). To distribute the CRL, we set up another cron job on each of our SSL validator services to periodically fetch the CRL file from the CA server. Nginx is a special case: we also need to combine the CRL files generated at each CA level into a single file. Otherwise, every certificate is treated as invalid.
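Combining the CRLs for Nginx amounts to concatenating the PEM files. A minimal sketch of what the fetching cron job might do after downloading them is shown below; the function name and the file layout are assumptions.

```python
from pathlib import Path

PEM_CRL_HEADER = "-----BEGIN X509 CRL-----"


def combine_crls(crl_paths, output_path):
    """Concatenate PEM-encoded CRLs (e.g. root and intermediate) into one file."""
    blocks = []
    for path in crl_paths:
        text = Path(path).read_text().strip()
        if PEM_CRL_HEADER not in text:
            raise ValueError(f"{path} does not look like a PEM-encoded CRL")
        blocks.append(text)
    # Write to a temporary file and rename it, so that a reader never
    # sees a half-written CRL file.
    tmp_path = Path(str(output_path) + ".tmp")
    tmp_path.write_text("\n".join(blocks) + "\n")
    tmp_path.replace(output_path)
```

Nginx reads the `ssl_crl` file when its configuration is loaded, so the job would typically reload Nginx after writing the combined file.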

Service Certificate Workflow

Authentication

Certificates for services are only generated from our Jenkins pipeline, so the pipeline authenticates to the CA HTTP server with an API key.

Certificate Generation

As we mentioned earlier, we need to automate certificate bundle generation for our services. Therefore, for services, we merge the request, issuance, and retrieval phases into a single request. This workflow is also handled by pkictl, and the generation command is invoked from the Jenkins pipeline.

Certificate Revocation

The revocation workflow for a service certificate is similar to the member certificate revocation workflow, but instead of a member name, we supply the service name to the pkictl command.

Internal Resource Services Authentication Workflow

Currently, we have a myriad of self-provisioned internal resource services such as Jenkins, Nexus, Redash, and Kibana. At the time of publishing, we have only tested the PKI implementation on a few of them. So far, we have used a generic approach to PKI certificate authentication: we offload certificate validation to our Nginx reverse proxy server and forward a custom header indicating the authorized user, such as X-AUTH-USER = someone@example.com, which we derive from certificate properties such as the Common Name.

The Nginx configuration for enabling client certificate verification is pretty simple. We just need to add these lines to the server (or http) block configuration.

server {
    ...
    ssl_verify_client on;
    ssl_verify_depth 2; # if we use root and intermediate only, specify this as 2
    ssl_trusted_certificate /path/to/your-ca-chain.crt;
    ssl_crl /path/to/your.crl;
    ...
}

We can forward a custom header derived from the certificate details by mapping the Distinguished Name (DN) variable that Nginx provides as $ssl_client_s_dn to a custom variable of our choice, for example $ssl_client_s_dn_cn. We can put this mapping inside the http block.

http {
    ...
    map $ssl_client_s_dn $ssl_client_s_dn_cn {
        default "";
        ~CN=(?<CN>[^,]+) $CN;
    }
    ...
}
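To see what the regular expression in the map block extracts, here is the same pattern in Python, which spells the named group as `(?P<CN>...)` instead of `(?<CN>...)`. The sample DN strings are made up, and we assume the comma-separated DN format that the pattern itself expects.

```python
import re

# Same pattern as in the Nginx map block, using Python's named-group syntax.
CN_PATTERN = re.compile(r"CN=(?P<CN>[^,]+)")


def extract_cn(subject_dn):
    """Return the Common Name from a subject DN, or "" like the map's default."""
    match = CN_PATTERN.search(subject_dn)
    return match.group("CN") if match else ""


print(extract_cn("CN=someone@example.com,OU=Engineering,O=Cermati"))
# prints: someone@example.com
```

Because the character class stops at the first comma, the captured value is only the Common Name, regardless of where the CN attribute sits in the DN.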

To forward the custom variable as a header, we can reference it directly when setting a proxy header in the location block.

server {
    location / {
        ...
        proxy_pass http://nexus/;
        proxy_set_header X-AUTH-USER $ssl_client_s_dn_cn;
        ...
    }
}

Next, we need the certificate in PKCS12 format to access the protected resource service from the browser. We can generate it with the openssl command. To generate it properly, we need a full-chain certificate (a single file containing the certificate, the intermediate CA, and the root CA, in that order) and, of course, the certificate key. Then, we can install it in our browser.
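A sketch of the PKCS12 export, assuming the full-chain file and the key are already on disk, might look like this. The helper names and the `PKCS12_PASSWORD` environment variable are hypothetical; the password is handed to openssl through the environment (`-passout env:...`) so that it does not show up in the process list.

```python
import os
import subprocess


def build_pkcs12_command(fullchain_path, key_path, out_path,
                         password_env="PKCS12_PASSWORD"):
    """Return the argv that bundles the full chain and key into a PKCS12 archive."""
    return [
        "openssl", "pkcs12", "-export",
        "-in", fullchain_path,   # certificate, intermediate CA, root CA, in that order
        "-inkey", key_path,
        "-out", out_path,
        "-passout", f"env:{password_env}",  # openssl reads the password from this variable
    ]


def export_pkcs12(fullchain_path, key_path, out_path, password,
                  password_env="PKCS12_PASSWORD"):
    """Run openssl with the password set only in the child's environment."""
    env = dict(os.environ)
    env[password_env] = password
    command = build_pkcs12_command(fullchain_path, key_path, out_path, password_env)
    subprocess.run(command, check=True, env=env)
```

The resulting .p12 file is what gets imported into the browser's certificate store, where the user is asked for the same password.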

Conclusion

To improve our credential management, we implemented a self-managed Public Key Infrastructure. We divided the workflow into two distinct groups, team members and services. Both groups share the same generic workflow:

  • Generate the private key and CSR.
  • Send the CSR to the CA server to be signed.
  • The CA signs the CSR.
  • The certificate owner acquires the certificate from the CA.

To help manage those workflows, we developed a CLI client called pkictl and a CA HTTP server. We have also successfully added PKI authentication to a few of our internal resource services by offloading SSL validation to our Nginx reverse proxy server.

There is a lot more that we need to develop and improve to help Cermati’s business and organization scale. We’re also currently hiring more engineers to help us keep improving our system.

Stay tuned for more tech articles from us!
