Digital signatures: how Sleek leverages Cloud HSM to guarantee the integrity of legal documents
Posted by Oleksandr Rudenko, Head of Technology, Sleek, Julien Frerot, VP of Platform, Sleek, Jerome Poudevigne, Startup Architect, Google Cloud
Sleek is part of a new wave of corporate services providers who offer “online first” corporate and accounting compliance for Singapore-registered companies.
Traditionally, corporate services were similar to law firms: conservative and paper-based, with very little in the way of automation or digitalisation. Sleek’s online platform allows entrepreneurs and company owners to manage their company compliance from anywhere. An integral part of Sleek’s offering is the ability to use e-signatures to sign corporate documents such as constitutions and shareholder and board resolutions. Customers who have incorporated a Singapore company with international stakeholders can generate their governance documents, convert them into PDFs and send them to stakeholders all over the world to sign (by inserting their handwritten signature or typing in their name, with a full audit trail of the user actions). This cuts the process time from a few weeks to a few hours or even a few minutes, as there is no need to courier paper documents all over the world.
Once all parties are done, Sleek will digitally sign the document to ensure the signature process has been followed and can be audited, and will seal it to ensure the document integrity.
To achieve this in accordance with the official PDF format specification, Sleek relies on the Google Cloud Platform, and particularly the Cloud Key Management Service (Cloud KMS) and the Cloud HSM (Cloud Hardware Security Module), as well as a number of open source products, to orchestrate a complex cryptographic dance.
If you are familiar with digital signatures, and you just want to see how Sleek does it, you can jump there now. Otherwise, let’s first review a few general concepts around digital signatures, and we’ll then go through the code.
Digital signatures primer
Digital signatures provide the electronic means to assert that a document has not been tampered with, and to certify the identity of whoever signed it. So it boils down to how to generate a signature, and how to authenticate a signature (while making sure a signed document has not been altered).
How to construct a digital signature for a PDF
Digitally signing a PDF is conceptually simple. First, you need to own a public/private key pair (this is why public/private keys are cool) and a certificate to prove that this public key is yours. Then, you need a PDF file to sign (duh!).
With these, you will:
- Compute a digest of the content of the file (using a strong hash like SHA-2)
- Encrypt the digest itself with your private key.
- This encrypted digest is the document signature.
- And then distribute the PDF file, the signature (i.e. the encrypted digest), your public key wrapped in your certificate, and the hash method.
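The digest and signing steps above can be sketched with nothing but the JDK. This is a minimal illustration, not Sleek's code: the class name DigestSignSketch is hypothetical, the key pair is a throwaway one generated locally (no certificate, no HSM), and it uses the "SHA256withRSA" scheme for simplicity rather than the PSS padding selected later in this article. Embedding the result into a PDF is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.Signature;

final class DigestSignSketch {

    // Generates a throwaway 2048-bit RSA key pair (a stand-in for a real, certified key).
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 1 on its own: the SHA-256 digest of the content.
    static byte[] digest(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Steps 1 and 2 in one call: "SHA256withRSA" hashes the content and then
    // applies the private-key operation to the padded digest.
    static byte[] sign(byte[] content, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(content);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for the content of a PDF file.
        byte[] content = "pretend this is a PDF".getBytes(StandardCharsets.UTF_8);
        KeyPair keyPair = newKeyPair();
        System.out.println("digest bytes:    " + digest(content).length);
        System.out.println("signature bytes: " + sign(content, keyPair.getPrivate()).length);
    }
}
```

In practice the JDK Signature class performs the hashing and the private-key operation together, which is why the digest method is shown separately only to make the first step visible.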
Of course, sending these four pieces of data separately for each file would be impractical. Luckily, section 12.8 of the Portable Document Format specification describes a way to append them at the end of the PDF file. I’m simplifying, but the end result is that you can distribute the whole thing in a single file. Since everything is self-contained, it is easier to perform validations on the fly.
It’s all in the diagram below.
How to validate a digital signature for a PDF
OK, we have received a signed file… how do we proceed?
Well… if the signature contains a certificate (as it should), we go like this:
- Read the certificate and validate its authenticity (using the chain of trust provided by certification authorities)
- Extract the public key from the certificate
- Decrypt the signature using the public key and obtain the original digest
- Compute the file digest using the provided hash method
- Compare the digest we calculated with the decrypted digest
- If they are identical, it means that the PDF file we have is the same as the PDF file that was signed (by virtue of the properties of hash functions). In other words, it has not been tampered with (and neither has the signature, by the way).
- If they are not identical, something is fishy and you should not trust this document
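The digest comparison at the heart of these steps can be sketched with the JDK alone. This is an illustration under stated assumptions: the class name VerifySketch is hypothetical, the key pair is generated locally, the scheme is "SHA256withRSA", and the first step (validating the certificate against a chain of trust) is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

final class VerifySketch {

    // Throwaway key pair; in real life the public key comes from the signer's certificate.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Produces a signature so that the sketch has something to verify.
    static byte[] sign(byte[] content, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(content);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the digest of the content and checks it against the digest
    // recovered from the signature with the public key.
    static boolean verify(byte[] content, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(content);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keyPair = newKeyPair();
        byte[] original = "pretend this is a PDF".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(original, keyPair.getPrivate());

        // Flip one bit to simulate tampering.
        byte[] tampered = original.clone();
        tampered[0] ^= 1;

        System.out.println("original verifies: " + verify(original, signature, keyPair.getPublic()));
        System.out.println("tampered verifies: " + verify(tampered, signature, keyPair.getPublic()));
    }
}
```

A single flipped bit changes the digest, so the tampered copy no longer matches the signature even though the signature itself is untouched.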
The following diagram illustrates this process.
So, if the certificate is valid, and the file has not been tampered with, what do we know? Well, now, not only do we know that the content is correct, we also know that it was signed by the correct entity.
Quite essential info if this is a contract or some other important document!
All of this is done using the same technologies that basically secure all global payment systems, HTTPS in your browser, etc. So it’s pretty, pretty strong!
Signed documents in the wild and conforming readers
Adobe Acrobat and all conforming readers (i.e. PDF readers that conform to the Adobe public specification) are able to read this signature and execute the validation steps described above on the fly when opening the document.
They will show an alert if things are incorrect, meaning that the document might have been tampered with, the certificate might be invalid, the signature could be wrong etc.
Putting it all together
Let’s see now how we can put all this knowledge to good use to digitally sign our PDFs using a certificate. We need to:
- Get a public/private key pair
- Acquire a certificate for the public key
- Start signing.
We will use several tools for this. Each of them plays a specific role, as below.
- The Legion of Bouncy Castle: A solid open source encryption library that can be used to generate signatures, digests, etc.
- Apache PDFBox: An open source Java library to manipulate PDF files. We use it to insert digital signatures produced with BouncyCastle into documents.
- Google Cloud Key Management Service + Cloud HSM: A cloud-hosted key management service and specialized hardware to protect your keys.
Using the Google Cloud Key Management Service with the Cloud HSM
Everything in asymmetric encryption relies on your private/public keypair(s). And it is absolutely crucial that the private key remains a secret that nobody else can access.
You could manage most of this by hand, using OpenSSL to generate your keys and carefully hiding your files, but instead Sleek takes advantage of the Google Cloud Key Management Service (Cloud KMS). This service offers a vast array of features to generate, use and otherwise manage encryption keys (symmetric and asymmetric).
GCP also offers Cloud HSM, a hosted Hardware Security Module (HSM) service. An HSM is a special computer that hosts and protects your encryption keys. There are a lot of advanced features, but the one of particular interest to us is the fact that the private keys that we use and create cannot be accessed by a third party or extracted from the Cloud HSM.
This is why the main requirement for obtaining an enterprise document signing certificate from a Certificate Authority (CA) registered in the Adobe Approved Trust List (AATL) is that the private key be securely stored on an HSM.
The Cloud KMS and the Cloud HSM work together, so we can ask the Cloud KMS to create our private/public keypair and to secure it in the Cloud HSM. We will get access to the public key, but the private key will stay securely in the Cloud HSM.
Of course, since the private key never makes it out of the Cloud HSM, you need to use the APIs offered by the Cloud KMS for every operation that involves the use of the private key. In our case, both signing our documents and signing our certificate request require this access.
In order to use these technologies, you need an account in GCP and you need to have created a project, installed the Google Cloud SDK and logged in with your credentials.
Getting a public/private keypair with Cloud KMS
This is straightforward. You can use the Google Cloud console, or you can use the gcloud commands. You need to create a keyring first (all keys are in a keyring), and then tell the Cloud KMS to create a keypair in this keyring with the “Asymmetric sign” purpose. In order to use the Cloud HSM, we’ll set the protection level to ‘hsm’. You have to pick a region for the key; I’ll use europe-west2, but any available location will do.
The complete documentation is here. Using the command line interface, it would be
1. Creating the keyring
gcloud kms keyrings create my-sign-ring \
--location europe-west2
2. Creating the key sleeksign in the keyring. Note that we specify that we will use it for asymmetric signing. We also need to specify the algorithm we will use for signing. Here we select a 2048 bit RSA key with PSS padding and a SHA256 digest (this is important: it will have to match the algorithms we actually use later). This creates version 1 of the key.
Since we are going to ask for the keys to be secured in the HSM (by specifying the protection level), we will have to pick a location where the HSM is available. For example,
gcloud kms keys create sleeksign \
--keyring my-sign-ring \
--location europe-west2 \
--purpose asymmetric-signing \
--default-algorithm rsa-sign-pss-2048-sha256 \
--protection-level hsm
We now have a keypair in the cloud that is secured by a Hardware Security Module.
We are going to need the public key, so let’s download it into a local file, that we call
gcloud kms keys versions \
get-public-key 1 \
--location europe-west2 \
--keyring my-sign-ring \
--key sleeksign \
This will be used to get our certificate later.
Building the content signer Java code
The key trick here is to make sure that our cryptographic signature generator uses the private key stored in the Cloud KMS (and protected by the Cloud HSM).
PDFBox delegates everything about encryption to the BouncyCastle library, including digital signatures, and BouncyCastle has a number of predefined interfaces for signing, one of them called ContentSigner. So, in order to make use of the Cloud KMS, we will create an implementation of the BouncyCastle ContentSigner interface that delegates to it, and use it every time we need to sign something from PDFBox.
A crucial point is the choice of signature algorithm identifier. It has to match the algorithm we picked for the key. The names used by BouncyCastle are not exactly identical to the names used by the Cloud KMS, but there is a good description of the algorithms in the Google Cloud documentation and you should be able to match them relatively easily.
The first class basically implements the BouncyCastle ContentSigner interface by delegating to a signing class that handles the connection to GCP and the call to the appropriate APIs.
And here is the class that essentially provides a signing facility that connects to GCP.
From now on, every time we need a signature created, we can create an instance of this class, pass it to BouncyCastle as the signer, and the BouncyCastle code will delegate to it as needed.
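The delegation pattern described in this section can be illustrated with a JDK-only sketch. To be clear about what is assumed: this is not Sleek's code and it does not use the real BouncyCastle or Cloud KMS classes. DelegatingContentSigner is a hypothetical, simplified mirror of a content signer (the real BouncyCastle ContentSigner also exposes an algorithm identifier), and DigestSigner is a hypothetical stand-in for the remote call to the Cloud KMS, which is assumed here to receive a SHA-256 digest and return the signature bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class KmsContentSignerSketch {

    // Hypothetical stand-in for the remote signing call: digest in, signature out.
    interface DigestSigner {
        byte[] signDigest(byte[] sha256Digest);
    }

    // Hypothetical, simplified content signer: callers write the bytes to be
    // signed into the stream, then ask for the signature.
    static final class DelegatingContentSigner {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private final DigestSigner remote;

        DelegatingContentSigner(DigestSigner remote) {
            this.remote = remote;
        }

        OutputStream getOutputStream() {
            return buffer;
        }

        // Hashes everything written so far and hands only the digest to the
        // remote signer, so the private key never has to leave its home.
        byte[] getSignature() {
            return remote.signDigest(sha256(buffer.toByteArray()));
        }
    }

    static byte[] sha256(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience wrapper: write the content, then fetch the signature.
    static byte[] signBytes(DigestSigner remote, byte[] content) {
        DelegatingContentSigner signer = new DelegatingContentSigner(remote);
        try {
            signer.getOutputStream().write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return signer.getSignature();
    }

    public static void main(String[] args) {
        // Fake "KMS" that simply echoes the digest back; a real implementation
        // would call the Cloud KMS API here.
        DigestSigner fakeKms = digest -> digest.clone();
        byte[] signature = signBytes(fakeKms, "bytes to sign".getBytes(StandardCharsets.UTF_8));
        System.out.println("fake signature bytes: " + signature.length);
    }
}
```

The point of the shape is that the calling library only ever sees a stream and a signature, so swapping a local key for a remote, HSM-backed one does not change the calling code.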
That’s it! We have now hooked the Google Cloud KMS and Cloud HSM to the BouncyCastle encryption libraries.
Let’s now go over the rest of the process: getting our certificate and signing documents.
Getting a certificate: Certificate Signing Request (CSR)
Getting a certificate to sign our documents is a key step. We start from our key pair.
- Create a Certificate Signing Request (CSR), which essentially contains the public key we want to certify as ours.
- Submit the request to a Certificate Authority; then go through the validation steps that they require,
- and finally get our certificate.
Let’s do it.
We have already asked the Cloud KMS to give us a copy of the public key, and stored it into a file called
To build a proper CSR, we have to build a file that presents this public key to a certificate authority together with our complete identity, formatted according to the PKCS #10 specification, and signed with our private key. This way the certificate authority can validate that we are requesting a certificate for a public key that really belongs to us.
This is complicated to do by hand. But we can use the JcaPKCS10CertificationRequestBuilder class in BouncyCastle to do the work for us.
All we have to do is write a few lines of code that read the public key and invoke the proper library functions, in a simple standalone Java program. The signature will then be calculated using the GoogleKMSContentSigner class that we crafted above.
This will output the content of the CSR to the console.
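The first part of such a program, reading the PEM-encoded public key into a Java object, can be sketched with the JDK alone. This is an illustration, not the original program: the class name PemPublicKeySketch and its helper methods are hypothetical, the key is assumed to be an RSA key in the usual "BEGIN PUBLIC KEY" PEM format, and a locally generated key is used instead of reading the downloaded file. Building the PKCS #10 request itself requires BouncyCastle and is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Arrays;
import java.util.Base64;

final class PemPublicKeySketch {

    private static final String HEADER = "-----BEGIN PUBLIC KEY-----";
    private static final String FOOTER = "-----END PUBLIC KEY-----";

    // Strips the PEM armor, decodes the Base64 body and rebuilds the RSA public key.
    static PublicKey parseRsaPublicKey(String pem) {
        String base64 = pem.replace(HEADER, "").replace(FOOTER, "").replaceAll("\\s", "");
        try {
            byte[] der = Base64.getDecoder().decode(base64);
            return KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(der));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encodes a public key as PEM text, only so the sketch has input to parse.
    static String toPem(PublicKey key) {
        Base64.Encoder encoder = Base64.getMimeEncoder(64, "\n".getBytes(StandardCharsets.US_ASCII));
        return HEADER + "\n" + encoder.encodeToString(key.getEncoded()) + "\n" + FOOTER + "\n";
    }

    // Throwaway RSA public key standing in for the one downloaded from the Cloud KMS.
    static PublicKey newPublicKey() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair().getPublic();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PublicKey original = newPublicKey();
        PublicKey parsed = parseRsaPublicKey(toPem(original));
        System.out.println("round trip ok: " + Arrays.equals(original.getEncoded(), parsed.getEncoded()));
    }
}
```

The parsed PublicKey is the object that a CSR builder would then take as input, together with the subject identity and a content signer.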
After getting the CSR, we can send it to our certificate authority. There are various ways to do this, depending on which authority you chose.
Ultimately, after your request is approved (which can take time), you will get a shiny certificate (some CAs will send a .pem file, some will use a different format).
Signing your PDFs!
Now that we have a keypair and a certificate, we can finally get to sign our PDF files. Our application contains a service that creates the signature from a file. It implements a sign method which takes the PDF file to sign as a stream.
Here again, we will delegate the actual calculation of the signature to the Cloud KMS via our GoogleKMSContentSigner class.
We are using Apache PDFBox as a library to work with PDF files. Below is our code, based on this example from the Apache Foundation.
Note: by using a timestamp provided by a time stamp authority (TSA), we are additionally able to assert that the certificate was valid at the time of signature, which makes it possible to validate a document even if its certificate has since expired. This is not the main point here, but it’s useful to know about.
After the signed digest is returned, the PDFBox library will insert it into the PDF file. We ultimately save the signed file on a Google Cloud Storage bucket.
Now that the file is signed, anybody can get the signature, the certificate and the public key embedded into it, and validate that the file has not been tampered with.
It wasn’t that hard after all…
There can be a few challenges when building a digital signature product like we did.
The first point to keep in mind is that debugging can be hard when dealing with asymmetric encryption, and especially when using an HSM device. Your code can seem to work, and you can still have an invalid signature. We banged our heads against the wall a few times to understand where we were wrong. Here are a few gotchas to watch for:
- the private key algorithm, the public certificate algorithm and the CSR algorithm must all match. Otherwise, your code will run, and the documents will be signed, but you will get a cryptic “invalid signature” message when opening the document
- the certificate chain must be in the right order to validate correctly.
Be patient, be methodical.
The other important point is about planning. Getting a Document Signing Certificate from one of the very few CAs that are in the Adobe Approved Trust List (AATL) required a lot of time spent writing emails and going through a paper-based process. It can take from a few days to a few weeks between the purchase order, the test certificate issuance, and the actual certificate delivery. Plan your releases accordingly.
The last challenge for the certificate issuance was to be able to provide the Certification Authority with a proper Certificate Signing Request (CSR), in order for them to issue the public certificate. It took us quite some Googling to understand how to write a CSR generator that would leverage the Google KMS API (we got our initial inspiration from this one).
What about resilience and private keys backup?
By its nature, an HSM will not release private key information and, as a consequence, the loss of an HSM means the loss of all private key material contained in it. On a traditional HSM, a full backup of the HSM is possible through an encrypted token and a specific protocol defined by the HSM manufacturer.
The Google Cloud HSM Service is a managed service, where you don’t need to worry about operations management such as clustering, patching, scaling… The resilience of the key is ensured by an actual HSM cluster managing your key, and only a simultaneous failure of the whole cluster would expose you to the loss of your private key.
So, what if it still happens? One way to manage it would be to run redundant services in different regions, to make sure you never face a service interruption. This high availability would have a cost though, since you would need 2 sets of private keys, with 2 public certificates issued by your CA. You would have to balance the cost of having redundancy against the cost of losing the service for a period of time. In our case, we are sealing documents as a background operation, and we found it totally acceptable to cope with a short interruption of service without data loss.
Keys and HSM
Security is one of the main concerns when it comes to private key management, particularly in a world of increasing cyber threats, and we were very interested in the announcement of the release of Google’s own HSM back in August 2018. We had a deep look, even as the first version of our e-signature was going live.
We are still a young startup, and we found it very exciting to have access to an HSM as a commodity service. Securely storing our private keys without putting our budget in jeopardy became a no-brainer!
We think that all companies, not just startups, should strongly consider using those technologies. This is a path to better, more secure document exchange, and a more secure Internet. Hardware Security Modules, and particularly Google Cloud HSM, should be part of your toolkit.