Standard API for connecting HSMs with client applications

Aug 13, 2018

Hi developers,
This is my second Medium blog post. In it I’m going to explain what PKCS #11 is and how it isolates an application that uses an HSM as its cryptographic provider from the details of the underlying HSM. I hope you already know what an HSM is; if not, please go through my previous blog post about HSMs. It’s just a 4 min read.

So let’s dive in.

First you have to get familiar with the concept of “standards” and why we need them. If you already understand this, just move on to the section “What is PKCS #11?”

“Standards are documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics, to ensure that materials, products, processes and services are fit for their purpose.” — ISO

In simple terms, standards are all about establishing a set of rules, guidelines and heuristics by bringing together all interested parties related to a particular product, process or a service.

Let’s take our scenario as an example. What if all related parties, such as HSM vendors, application developers and Internet security experts, got together and agreed upon a standard API, establishing a set of rules and guidelines such that every cryptographic provider exposes its functionality by implementing that API? This makes it much easier for application developers to integrate any cryptographic provider that adheres to the standard API into their application. Cryptographic providers can also get to market more quickly and give customers some assurance about the HSM.

I hope you now know the answers to these two questions: “What is a standard?” and “Why do we need standards?”

As I discussed earlier, to isolate our application from the details of the underlying HSM, we need a standard API agreed upon by the HSM vendors in the market. This is where Public-Key Cryptography Standard #11 (PKCS #11) comes into the picture.

Let’s see what PKCS #11 is and how it works…

What is PKCS #11?

PKCS #11 is a standard API specified by OASIS Open, a global nonprofit organization that works on the development, convergence, and adoption of open standards for security, IoT, energy, content technologies, emergency management, and other areas. They define the PKCS #11 standard as follows:

“The PKCS#11 standard specifies an application programming interface (API), called “Cryptoki,” for devices that hold cryptographic information and perform cryptographic functions.” — OASIS Documentation

The goals of PKCS #11 and its basic approach are defined as follows:

“Cryptoki follows a simple object based approach, addressing the goals of technology independence (any kind of device) and resource sharing (multiple applications accessing multiple devices), presenting to applications a common, logical view of the device called a “cryptographic token”.” — OASIS Documentation

PKCS #11 is not an implementation of an API; it is a specification of the rules and guidelines that an implementation of the API must follow. OASIS Open provides only a set of ANSI C header files defining the interface exposed to the client application. The HSM vendor is responsible for providing a concrete implementation of the functionality specified in PKCS #11.
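To make the "specification versus implementation" split concrete, here is a minimal Python sketch. It is only an analogy: the real interface is a set of C functions, and the names `CryptokiLike`, `VendorAModule`, `VendorBModule` and `make_nonce` are made up for illustration. The two methods loosely mirror the roles of the real `C_GetInfo` and `C_GenerateRandom` functions, not their signatures.

```python
import os
import secrets
from abc import ABC, abstractmethod


class CryptokiLike(ABC):
    """Hypothetical, heavily reduced stand-in for the standard interface."""

    @abstractmethod
    def get_info(self) -> str:
        """Describe the module (role loosely similar to C_GetInfo)."""

    @abstractmethod
    def generate_random(self, length: int) -> bytes:
        """Return random bytes (role loosely similar to C_GenerateRandom)."""


class VendorAModule(CryptokiLike):
    """Made-up vendor A: one way of satisfying the interface."""

    def get_info(self) -> str:
        return "Vendor A module"

    def generate_random(self, length: int) -> bytes:
        return os.urandom(length)


class VendorBModule(CryptokiLike):
    """Made-up vendor B: a different implementation, same interface."""

    def get_info(self) -> str:
        return "Vendor B module"

    def generate_random(self, length: int) -> bytes:
        return secrets.token_bytes(length)


def make_nonce(module: CryptokiLike) -> bytes:
    """Application code: depends only on the interface, never on a vendor."""
    return module.generate_random(16)


for module in (VendorAModule(), VendorBModule()):
    print(module.get_info(), len(make_nonce(module)))
```

Swapping one vendor for another changes nothing in `make_nonce`, which is the isolation the standard is aiming for.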

For more information on PKCS #11 syntax and API specification visit here.

How does PKCS #11 work?

The diagram above shows the high-level communication flow of an application communicating with an HSM using the PKCS #11 API.

Here’s what happens in the diagram…

The application speaks directly to the PKCS #11 API, and the API is responsible for calling the PKCS #11 module through C calls. The module then speaks to the HSM via native calls. Finally, the module hands the response from the HSM back to the application through the C interface implementation.

Hope you have some idea how this works. :D

So let’s dig a little deeper and see how an application works with an HSM through the PKCS #11 API. First, there are some terms you need to be familiar with.

Token
A token is the logical view of the underlying cryptographic device. A token exposes the list of cryptographic functionality supported by the device.

Slot
A slot is a logical access point to the cryptographic device. Objects that reside within a given slot are not visible to other slots.

Session
A session is a logical connection between an application and a token. PKCS #11 defines two types of sessions: Read/Write (R/W) and Read-Only (R/O). R/W sessions can be used both for reading data from and writing data to the cryptographic device, while R/O sessions can only be used for reading data from the device.

User
A user is a person or an application that has access to the cryptographic device through a slot. PKCS #11 defines two users for each slot: SO (Security Officer) and USER. The SO has the authority to create the USER. The USER is responsible for using the device for cryptographic operations. There can be only one SO and one USER for a given slot.

Now you’re familiar with the basic terms used in PKCS #11.
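If it helps, the terms above can be pictured as a tiny data model. This is a toy sketch in Python, not the real API: the class names, the `store` and `read` helpers and the mechanism strings are all assumptions made for illustration, and users and PINs are left out to keep it short.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class SessionType(Enum):
    READ_ONLY = "R/O"
    READ_WRITE = "R/W"


@dataclass
class Token:
    """Logical view of the device: a label, what it supports, what it stores."""
    label: str
    mechanisms: tuple
    objects: dict = field(default_factory=dict)


@dataclass
class Slot:
    """Logical access point; holds at most one token in this toy model."""
    slot_id: int
    token: Optional[Token] = None


@dataclass
class Session:
    """Logical connection between an application and the token in a slot."""
    slot: Slot
    kind: SessionType

    def store(self, name: str, value: bytes) -> None:
        # Only R/W sessions may write to the token.
        if self.kind is SessionType.READ_ONLY:
            raise PermissionError("R/O session cannot write to the token")
        self.slot.token.objects[name] = value

    def read(self, name: str) -> bytes:
        return self.slot.token.objects[name]


slot0 = Slot(0, Token("token-0", ("AES", "RSA")))
slot1 = Slot(1, Token("token-1", ("AES",)))

writer = Session(slot0, SessionType.READ_WRITE)
reader = Session(slot0, SessionType.READ_ONLY)

writer.store("demo-object", b"hello")
print(reader.read("demo-object"))
print("demo-object" in slot1.token.objects)
```

The R/O session can read what the R/W session stored in the same slot, while the object never shows up in the other slot.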

Logical view provided by the PKCS #11 API

*Note: Here I have depicted tokens as residing inside slots. Actually, there can be several slots for a given token. What the application sees is a token inside each slot, but if there is only one HSM then the token is the same for all the slots. The application gets a view of multiple independent tokens, so the HSM can be used concurrently by other applications through different slots.

Let’s see what happens between the application and the logical cryptographic device, taking the following scenario as an example.

Crypto is an application that uses a PKCS #11-compliant HSM as its cryptographic provider. Crypto needs to generate an AES key using the HSM and encrypt a sample of data using the generated key.

The action sequence in simple terms:

Crypto creates a secure communication passage (i.e. a session between the application and the token residing within a slot) and authenticates itself to the HSM as the user ‘USER’.
Crypto asks the HSM to generate an AES key through the created communication passage (i.e. the session).
The HSM returns a handle to the generated AES key through the passage; the key itself normally stays inside the HSM.
Crypto sends the data that needs to be encrypted, together with the key handle, through the passage.
The HSM sends the ciphered data back to the application through the communication passage.
Crypto closes the communication passage.

The action sequence above is the same, with minor changes, for most cryptographic operations such as decryption, key generation, key storage, signing and hashing. As you can see, for any application to use the HSM, it must first initiate a session with a token. All cryptographic operations provided by the HSM are used via an initiated session.

The sample scenario above is based on a single application using a single HSM. Since HSMs play a mission-critical role in business, PKCS #11 has been designed with availability and scalability in mind, so that it supports multiple applications using multiple HSMs at a given time.
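The sequence can be sketched with a toy in-memory "HSM" in Python. Please read it as an illustration of the call order only. `ToyHsm` and its methods are made up; the comments name the real PKCS #11 functions that play the same role. The cipher is a placeholder (XOR with a SHA-256 keystream), NOT AES, because the standard library has no AES.

```python
import hashlib
import os


class ToyHsm:
    """In-memory stand-in for a token. Not a real HSM or PKCS #11 library."""

    def __init__(self, user_pin: str):
        self._user_pin = user_pin
        self._keys = {}        # key handle -> key bytes, never handed out
        self._sessions = {}    # session handle -> logged-in flag
        self._next_handle = 1

    def _new_handle(self) -> int:
        handle = self._next_handle
        self._next_handle += 1
        return handle

    def _require_login(self, session: int) -> None:
        if not self._sessions.get(session, False):
            raise PermissionError("session is closed or not logged in")

    def open_session(self) -> int:                      # role of C_OpenSession
        session = self._new_handle()
        self._sessions[session] = False
        return session

    def login(self, session: int, pin: str) -> None:    # role of C_Login
        if session not in self._sessions:
            raise ValueError("unknown session")
        if pin != self._user_pin:
            raise PermissionError("wrong PIN")
        self._sessions[session] = True

    def generate_key(self, session: int) -> int:        # role of C_GenerateKey
        self._require_login(session)
        handle = self._new_handle()
        self._keys[handle] = os.urandom(32)
        return handle                                   # a handle, not the key

    def encrypt(self, session: int, key_handle: int, data: bytes) -> bytes:
        # Role of C_EncryptInit + C_Encrypt, with a placeholder cipher.
        self._require_login(session)
        key = self._keys[key_handle]
        stream = b""
        counter = 0
        while len(stream) < len(data):
            stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    def close_session(self, session: int) -> None:      # role of C_CloseSession
        self._sessions.pop(session, None)


hsm = ToyHsm(user_pin="1234")
session = hsm.open_session()
hsm.login(session, "1234")
key_handle = hsm.generate_key(session)
plaintext = b"sample data"
ciphertext = hsm.encrypt(session, key_handle, plaintext)
roundtrip = hsm.encrypt(session, key_handle, ciphertext)  # XOR undoes itself
hsm.close_session(session)
```

Notice that the application only ever holds `key_handle`, an opaque number. The key bytes stay inside `ToyHsm`, which is the point of doing the operation on the device.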

Multiple applications using multiple HSMs through PKCS #11 API

This case is the same as the single-application, single-HSM scenario, but there can be multiple slots with the same ID and PIN, and a given application initiates a session with all of those slots to increase availability. The application doesn’t have to bear the burden of handling multiple HSMs, because that is handled behind the PKCS #11 API. Load balancing is typically built into the vendor’s PKCS #11 module (the specification itself does not define it), so that cryptographic operations are distributed fairly over the set of HSMs connected to the application. Another advantage is ease of use: adding applications or HSMs is a matter of changing a configuration file.

I hope you now understand what PKCS #11 is, how it works and how it achieves its design goals. Feel free to ask any questions.

My next set of blog posts will be on building a sample Java application that uses an HSM as a cryptographic provider.
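How the work is spread over several HSMs depends on the vendor’s module, so the following Python sketch is only one possible approach: a simple round-robin with failover. `FakeHsm`, `RoundRobinPool` and `HsmUnavailable` are hypothetical names, and `sign` here just returns a string instead of a real signature.

```python
import itertools


class HsmUnavailable(Exception):
    """Raised when a (fake) HSM cannot be reached."""


class FakeHsm:
    """Stand-in for one HSM; counts how many operations it served."""

    def __init__(self, name: str, online: bool = True):
        self.name = name
        self.online = online
        self.calls = 0

    def sign(self, data: bytes) -> str:
        if not self.online:
            raise HsmUnavailable(self.name)
        self.calls += 1
        return f"{self.name} signed {len(data)} bytes"


class RoundRobinPool:
    """Hands each operation to the next HSM, skipping ones that are down."""

    def __init__(self, hsms):
        self._hsms = list(hsms)
        self._cursor = itertools.cycle(range(len(self._hsms)))

    def sign(self, data: bytes) -> str:
        # Try each HSM at most once per operation.
        for _ in range(len(self._hsms)):
            hsm = self._hsms[next(self._cursor)]
            try:
                return hsm.sign(data)
            except HsmUnavailable:
                continue
        raise HsmUnavailable("no HSM in the pool is reachable")


hsm_a = FakeHsm("hsm-a")
hsm_b = FakeHsm("hsm-b", online=False)   # simulate a device that is down
hsm_c = FakeHsm("hsm-c")
pool = RoundRobinPool([hsm_a, hsm_b, hsm_c])

results = [pool.sign(b"payload") for _ in range(4)]
```

The application only calls `pool.sign`. Which device served the request, and the fact that one of them was down, stays hidden behind the pool.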

Cheers!!!

Written by Mevan Karunanayake