UML For Explaining Cryptography

Jim Hawkins
Published in VMware 360
9 min read · Dec 2, 2022

UML diagrams can be used to explain the cryptography of an enterprise security solution. I know because I’ve contributed to security white papers and similar explanatory documents while working in the enterprise software business.

Why I use UML

Some years ago a security engineer was giving me a verbal explanation of the relationships between a solution’s cryptographic resources. I was asking questions, and started drawing boxes and lines on a whiteboard. The security engineer corrected my understanding by erasing and redrawing some of the boxes and lines. I took a picture with my phone and transcribed the diagrams with a drawing tool on my computer.

Later, but at the same job, I was contributing to the security white paper of the product that I worked on. I came up with a convention, ad hoc, to represent in diagrams the relationships between its cryptographic resources.

Later still, at a different job, I was again writing a security white paper with diagrams. I couldn’t utilise the same ad hoc convention because that was the intellectual property of my previous employer. That’s when I turned to a standard with which I was already familiar, the Unified Modeling Language (UML).

UML as a toolkit

One great value of the UML standard is that it provides tools and isn’t prescriptive. My UML isn’t rigorous. I sometimes make use of standard elements in a non-standard way. But my diagrams do comply with what Martin Fowler described as the “fraction of UML that is most useful.”

The UML standard is a tool that’s going to help me explain the cryptography of a solution.

What should be explained

I explain all of the following with UML.

  • What the principal cryptographic resources of the solution are. For example, which encryption keys and salt values exist.
  • What algorithms are used.
  • What parameter values are used, for example how long in bits the keys and salt values are.
  • How each cryptographic resource is generated.
  • Which of the cryptographic resources are stored persistently and where, and which are not.
  • Which resources are protected by which keys, in other words the key hierarchy.

The explanation of all those pieces could help an adversary to mount a cyberattack on the solution, if it has security flaws. If there are flaws, then they should be fixed before the explanation is shown to an external audience, or the flawed parts should be omitted from the explanation. For internal audiences it’s better to include the flawed parts, and explain that they should be fixed.

An explanation that I give won’t really enable a competitor to copy the product, because it won’t be sufficiently detailed. You might think differently, in which case you could require a non-disclosure agreement (NDA) from any external party that wants to see your explanation.

Example Product Requirements

To illustrate how to use UML to explain cryptography I’m going to imagine a product with some security requirements. Then I’m going to propose an implementation, and explain the cryptography of the implementation with UML diagrams.

Imagine the product is Digital Encabulator for Enterprise (DEE) version 1. DEE could be available for end users on mobile devices, and on laptop and desktop computers. The security requirements are to do all the following.

  • Protect data at rest with passcode-based encryption (PBE). The passcode will be a secret value entered by the end user.
  • Support passcode change, with no outage in the availability of protected data.
  • Support data recovery in case the end user forgets their passcode.
  • Support data audit by the enterprise without the end user’s knowledge.
  • Use well-known and standard practices for cryptography.

The PBE requirements can be met by an implementation like this.

  • A passcode is set by each end user when they install the app.
  • A passcode key (PK) is generated from the passcode by a PBKDF2 (Password-Based Key Derivation Function 2) process with these parameters.
    - Hash-based message authentication code (HMAC) pseudorandom function.
    - SHA-256 hashing function.
    - 20,000 iterations.
  • A passcode salt (PS) value is included in the PBKDF2 inputs. PS will be generated by a secure random number generator (RNG).
  • The PS value is stored persistently on the device. Neither the PK value nor the passcode is stored.
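
The PBE steps above can be sketched with the Python standard library. This is a minimal sketch, not the DEE implementation: the function names are hypothetical, the 256-bit salt length is taken from the RNG default described later in this article, and the 256-bit length of PK is an assumption because the requirements don't state it.

```python
import hashlib
import secrets

ITERATIONS = 20_000  # iteration count from the example implementation
SALT_BYTES = 32      # 256-bit salt, the RNG default length in the class diagram
KEY_BYTES = 32       # assumption: 256-bit PK, the SHA-256 output size


def new_passcode_salt() -> bytes:
    """Generate PS with the operating system's secure RNG."""
    return secrets.token_bytes(SALT_BYTES)


def derive_passcode_key(passcode: str, salt: bytes) -> bytes:
    """Derive PK from the passcode and PS with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passcode.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )


ps = new_passcode_salt()                        # PS is stored persistently
pk = derive_passcode_key("example passcode", ps)  # PK is held in memory only
```

Running the derivation again with the same passcode and the stored PS gives the same PK, which is why neither PK nor the passcode needs to be stored.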

The passcode change requirement can be met by an implementation like this.

  • Protect the application data with an intermediate data encryption key (DEK). DEK will be a long random value generated by a secure RNG. DEK will be 256 bits long, so that it can’t be guessed by brute force in a practical amount of time.
  • Store DEK encrypted by PK in the device persistent storage. Never store DEK in clear in any persistent storage. Encryption will use the AES Key Wrap algorithm.
  • When the user changes their passcode, re-encrypt DEK with the new PK value. (Without DEK, all the application data would have to be re-encrypted when the passcode is changed.)
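
The passcode change described above could look something like the following sketch. The names change_passcode, derive_pk, wrap, and unwrap are hypothetical. The Python standard library has no AES Key Wrap, so the cipher is passed in as a pair of callables; with the third-party cryptography package, aes_key_wrap and aes_key_unwrap from its keywrap module would be candidates. Reusing the stored PS for the new passcode is an assumption made for brevity.

```python
import hashlib
from typing import Callable

# A key-wrap operation takes (key, data) and returns the transformed data.
KeyWrapFn = Callable[[bytes, bytes], bytes]


def derive_pk(passcode: str, ps: bytes) -> bytes:
    """Derive PK with PBKDF2-HMAC-SHA256 and 20,000 iterations."""
    return hashlib.pbkdf2_hmac("sha256", passcode.encode("utf-8"), ps, 20_000)


def change_passcode(
    old_passcode: str,
    new_passcode: str,
    ps: bytes,
    dek_wrapped_by_old_pk: bytes,
    wrap: KeyWrapFn,
    unwrap: KeyWrapFn,
) -> bytes:
    """Return DEK wrapped by the new PK. Application data isn't touched."""
    dek = unwrap(derive_pk(old_passcode, ps), dek_wrapped_by_old_pk)
    return wrap(derive_pk(new_passcode, ps), dek)
```

Only the small wrapped DEK is rewritten. The encrypted application data stays as it is, which is how the requirement for no outage in availability is met.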

The data recovery requirements can be met by an implementation like this.

  • Provide a data recovery service (DRS).
  • DRS will receive a set-up request (DRS-SU) when an end user generates DEK on their device. DRS-SU will include a user identifier and require user authentication out-of-band, for example by redirecting the user to an identity provider (IDP).
  • When DRS receives DRS-SU, it generates a key pair for asymmetric encryption. The key pair has a private key (RIK) that is generated and stored in a hardware security module (HSM), and a corresponding public key (RUK). RIK will be 2048 bits long.
  • DRS responds to DRS-SU by sending back RUK.
  • The user app stores DEK encrypted by RUK in the device persistent storage. Encryption will use the RSA algorithm with PKCS1 padding.
  • The end user app can send a recovery request (DRS-RY) to DRS. DRS-RY will include a user identifier and require user authentication, same as DRS-SU, and will include DEK encrypted by RUK.
  • When DRS receives DRS-RY, DEK is decrypted by RIK in the HSM. DRS responds to DRS-RY with DEK.

The data audit requirements can be met by an implementation like this.

  • DRS can receive an audit request (DRS-AT). DRS-AT will include the same values as DRS-RY and in addition an audit user identifier. The audit user will require authorisation.
  • The DRS processing and response to DRS-AT is otherwise the same as for DRS-RY.

Example Class Diagram

This diagram represents the basic cryptographic functions in the implementation as a UML class diagram.

Diagram 1: Digital Encabulator for Enterprise cryptography class diagram

The diagram expresses the following.

Secure Random Number Generator is a class. Its name will be abbreviated to RNG.

  • RNG instances have an attribute, length, with a default value of 256.
  • RNG instances have an operation, getNext, with no parameters that returns length bits.

Cipher is a class.

  • Cipher instances have an attribute, algorithm, with a default of AES-GCM (Advanced Encryption Standard in Galois/Counter Mode).
  • Cipher instances have an operation, encrypt, that takes two parameters, Key and Plaintext, and returns Ciphertext. No detail is given about the data types.
  • Cipher instances have an operation, decrypt, that takes two parameters, Key and Ciphertext, and returns Plaintext. No detail is given about the data types.

And so on.
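
For readers who think in code rather than diagrams, the two classes described above could be written roughly as follows. This is only an illustration of what the class diagram expresses. The Python names and the use of bytes as the data type are assumptions, since the diagram gives no detail about data types, and Cipher is left abstract because the diagram doesn't specify an implementation.

```python
import secrets
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SecureRandomNumberGenerator:
    """RNG class: one attribute, length in bits, with a default of 256."""

    length: int = 256

    def get_next(self) -> bytes:
        """The getNext operation: no parameters, returns length bits."""
        return secrets.token_bytes(self.length // 8)


class Cipher(ABC):
    """Cipher class: an algorithm attribute and two operations."""

    algorithm: str = "AES-GCM"

    @abstractmethod
    def encrypt(self, key: bytes, plaintext: bytes) -> bytes:
        """Take Key and Plaintext, return Ciphertext."""

    @abstractmethod
    def decrypt(self, key: bytes, ciphertext: bytes) -> bytes:
        """Take Key and Ciphertext, return Plaintext."""
```

An instance that overrides a default, such as SecureRandomNumberGenerator(length=2048), corresponds to an object in the later diagrams that shows an explicit attribute value.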

Example Deployment Diagram

This diagram represents the storage and protection of the cryptographic resources in the implementation as a UML deployment diagram.

Diagram 2: Digital Encabulator for Enterprise cryptography deployment diagram

Interjected apology: the diagram deviates from the UML standard in the following ways.

  • Artefacts, for example Application Data, are shown as rectangles without a document marker.
    The document marker seems superfluous in the standard. It’s already clear from the flat rectangle that Application Data, for example, isn’t an execution environment.
  • Objects, for example the RNG instances, are shown in a deployment diagram.
    The style used here, rectangles with square corners and underlined text, is taken from the UML object diagram standard.
    Drawing a separate object diagram with the same instances but without their deployed context would require the reader to look at an extra diagram.
  • Labels on connections don’t exactly show communication. Instead they show parameter names and therefore relationships.
  • The Asymmetric Cipher instance is shown in the Application Run-Time Memory environment. That’s correct for encryption, but incorrect for decryption.
    That could perhaps be addressed by expanding the diagram to show separate environments, or documents, for a DRS set-up request and a DRS recovery request.

In combination with the class diagram, above, the deployment diagram expresses the following about data storage.

  • This data is stored persistently by the application.
    - PS.
    - DEK encrypted by PK.
    - Encrypted Application Data.
    - DEK encrypted by RUK.
  • No other data is stored persistently by the application.
  • RIK is stored in an HSM within DRS.
  • RUK is generated by the DRS but isn’t stored there.
  • DEK is stored encrypted by two different keys, RUK and PK. For the PK encryption, the AES-KW (Advanced Encryption Standard Key Wrap) algorithm is used instead of the default AES-GCM.
  • PS is a random value of the default length, 256 bits. RIK and RUK are based on a random value of specified length, 2048 bits.

The diagram also expresses the following about data protection.

  • Application data can be obtained from persistent storage only if DEK is obtained.
  • DEK can be obtained from persistent storage if Passcode is known. PS is in persistent storage, and PK can be obtained by running KDF on the Passcode and PS.
  • DEK can also be obtained from persistent storage if RIK is accessible. DEK encrypted by RUK is persistently stored, and can be decrypted by RIK. The diagram doesn’t express that DEK encrypted by RUK has to be sent to the DRS in order to process the decryption.

The diagram shows data locations and protective relationships statically. It doesn’t show how the data comes to be stored, nor how the relationships are set up. Showing that will require a different type of diagram, one that can show sequences of interaction between components.

Example Activity Diagram

This diagram represents the processing for setting the passcode as a UML activity diagram.

Diagram 3: Digital Encabulator for Enterprise Set Passcode activity diagram

The following standard elements are used.

  • Actions, such as a key derivation, have round corners.
  • Object nodes, with square corners, are used to represent cryptographic resources such as keys and salt values.
  • Pins, small squares with text labels, indicate parameters to actions. Pins are used only when there is more than one parameter.
    - The input parameters to an encrypt() process have pins because there are two, key and plaintext.
    - The output of an encrypt() process has only one parameter, ciphertext, so it doesn’t have a pin.
  • Thick bars indicate the start and end of independent processing. For example, the RNG getNext() to generate PS is independent of Passcode being a parameter to KDF processing. This mightn’t be quite standard but avoids having two flows from a resource, which definitely isn’t standard.

In combination with the class diagram, above, the activity diagram expresses that the processing to set the passcode is as follows.

  1. Processing starts when the passcode has been entered.
  2. A secure random number generator (RNG) is run with a default length, as shown in the class diagram. The output is the passcode salt (PS), which is stored persistently.
  3. A key derivation function is run with the passcode as the secret, the PS value as the salt, and default algorithm and iteration count, as shown in the class diagram. The output is the passcode key (PK), which isn’t stored persistently.
  4. Another RNG is run with a default length, as shown in the class diagram. The output is the data encryption key (DEK), which isn’t stored persistently.
  5. A cipher encrypt process is run with PK as the key, DEK as the plaintext, and algorithm AES-KW. The output is stored persistently.
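
The five steps above map onto code fairly directly. The following is a sketch under assumptions, not the DEE implementation: PersistentStore and set_passcode are hypothetical names, and the AES-KW cipher is passed in as a callable taking key and plaintext, because the Python standard library doesn't include AES Key Wrap.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class PersistentStore:
    """The values that the activity diagram shows as stored persistently."""

    ps: bytes
    dek_encrypted_by_pk: bytes


def set_passcode(
    passcode: str,  # step 1: the passcode has been entered
    encrypt: Callable[[bytes, bytes], bytes],  # assumed AES-KW: (key, plaintext)
) -> Tuple[PersistentStore, bytes]:
    # Step 2: RNG with default length, 256 bits, generates PS.
    ps = secrets.token_bytes(32)
    # Step 3: KDF with passcode as the secret and PS as the salt gives PK.
    pk = hashlib.pbkdf2_hmac("sha256", passcode.encode("utf-8"), ps, 20_000)
    # Step 4: another RNG with default length generates DEK.
    dek = secrets.token_bytes(32)
    # Step 5: encrypt with PK as the key and DEK as the plaintext.
    store = PersistentStore(ps=ps, dek_encrypted_by_pk=encrypt(pk, dek))
    # PK goes out of scope here. DEK is returned for use in memory only.
    return store, dek
```

The return value makes the storage decisions explicit: what goes into PersistentStore is written to the device, and DEK in clear is handed back to the caller without being stored.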

The following processing could also be represented by UML activity diagrams.

  • Set up data recovery.
  • Change passcode.
  • Launch app, which would include retrieving the data encryption key. Retrieval could be from persistent storage, by regenerating the passcode key from a passcode entered by the user, or from the data recovery service.
  • Store application data.

Conclusion

The above examples show that UML diagrams can be used to explain cryptography. Different types of UML diagram can be used to explain different aspects.

  • Class diagrams show which cryptographic standards and parameters are used.
  • Deployment diagrams show where resources are stored, if they are stored, and which resources are protected by which keys.
  • Activity diagrams show sequences of processing. Object nodes show which resources are involved in the processing.

References and Further Reading

For an example of an explanation of a real solution’s cryptography, take a look at the white paper published here.
developer.vmware.com/…/MobileApplicationManagement.pdf
(The UML diagrams are in the section Workspace ONE Encryption of Data at Rest, under the heading Passcode-Based Encryption Diagrams.)

The diagrams in this article and in the white paper were drawn with the diagrams.net tool, also known as the draw.io tool.

For background on the Unified Modeling Language (UML) standard, see the https://uml.org website and the book UML Distilled Third Edition by Martin Fowler.

For a reference on cryptography standards and terms, see the book Serious Cryptography by Jean-Philippe Aumasson.

Jim Hawkins
VMware 360

Software developer working on secure mobilisation of enterprise data to Android and iOS devices.