Security Engineer Interview Questions: What’s an HMAC?


At the OWASP Bay Area Meetup recently, I ran into Tad Whitaker. He works in security and was instrumental in organizing the OWASP Bay Area meetup along with Prashanth KV and the rest of the OWASP Bay Area leadership. As I was perusing his public GitHub repos, I came across one titled “Security Engineer Interview Questions”. This intrigued me, and I opened it to find a list of interview questions that Tad had curated from the Security Engineer interview questions posted on Glassdoor.

I found the repo to be extremely useful. Not only are these questions a reflection of what the industry expects, but they are also very useful to know for experienced and aspiring security engineers around the globe. The repo is here, btw. As I read through some of the questions, I realized that several were AppSec related. In addition, I felt that they were a great way to help a lot of folks understand some of these key concepts, especially from the perspective of AppSec use cases and vulnerabilities, which is why I decided to repurpose the project differently.

My overall aim is to put together a “compendium of answers” for most/all/some of these questions, with a special focus on AppSec. My objective is to answer the question first, and subsequently identify real-world use-cases, anecdotes or “trench stories” for the question, as applicable. The idea is to present the most relevant information about the question without having to refer to several sources at a time. I am doing this more as a personal project, and a way of giving back something to the community at large. So, before I get started, I would like to thank Tad for the initial inspiration.

Question: What is an HMAC?

I picked this question first, because I see a lot of security folks and developers get a little confused by this one. The uses of HMAC are also quite extensive in several authentication and crypto libraries. So, there’s a great need for security engineers, developers and technologists alike to understand what an HMAC is.

HMAC stands for “Hash-based Message Authentication Code”. Before we get to the “Hash-based” part, let’s quickly get into the “Message Authentication Code” part of this term. A Message Authentication Code is a short piece of information (a tag) sent along with a message so that the receiver can verify both the integrity and the authenticity of that message. For instance, let’s assume that Alice sends a message to Bob, with the following content.

Hi Bob — Please transfer $1000 to Account Number: 012349124212

but Sally, a malicious actor, intercepts the message and changes the account number to her account:

Hi Bob — Please transfer $1000 to Account Number: 022389124391

When this message is received by Bob, he would act on what appears to be a genuine message from Alice, and transfer the money to Sally’s account.

If, on the other hand, Alice had sent a Message Authentication Code along with the message, computed with a secret key that only she and Bob share, Bob would have received an HMAC alongside the message. When Bob recomputes the HMAC over the message that Sally altered, he gets a different value from the one he received.

And since

a47dee64d793e51dabac8125591264bf827ab84d211bd85ce8fb4856c663e779 != 70974227ea4b52d45163666abd171662f03f29131baa68e90bbd0f681963c8ab

Bob would have realized that this message was probably tampered with in transit and therefore compromised. Bob has now verified that this message has been modified and is not authentic.
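Bob’s check can be sketched in a few lines of Python. This is only a minimal sketch: it assumes Alice and Bob use HMAC-SHA256 and share a made-up key (`shared_key` is a hypothetical value), and the helper names `make_tag` and `is_authentic` are invented for illustration. It will not reproduce the digests shown above, because the key used for those is not given.

```python
import hashlib
import hmac

# Hypothetical shared secret known only to Alice and Bob (illustrative value).
shared_key = b"alice-and-bob-shared-key"

def make_tag(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of the message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def is_authentic(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it to the received one in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

original = b"Hi Bob - Please transfer $1000 to Account Number: 012349124212"
tampered = b"Hi Bob - Please transfer $1000 to Account Number: 022389124391"

# Alice sends the message together with its tag.
tag_from_alice = make_tag(shared_key, original)

print(is_authentic(shared_key, original, tag_from_alice))  # untouched message
print(is_authentic(shared_key, tampered, tag_from_alice))  # Sally's version
```

Because Sally does not know the shared key, she cannot compute a matching tag for her altered message, so the second check fails.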

An HMAC is a recipe for using a hashing algorithm as a Message Authentication Code. With an HMAC, you combine a popular hashing algorithm such as SHA-256 with a secret key to generate a Message Authentication Code. Besides HMACs, block ciphers like AES and DES can be used to generate a CMAC (Cipher-based Message Authentication Code). However, I am guessing that owing to speed and ease of use, the HMAC is the more popular way of generating Message Authentication Codes.
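The “recipe” itself is specified in RFC 2104: the key is padded to the hash’s block size, XORed with two fixed pads, and the hash is applied twice, once inside and once outside. The sketch below spells that out for SHA-256 and checks the result against Python’s `hmac` module. The function name `hmac_sha256` is made up for illustration, and in real code the library implementation should be preferred.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes its input in 64-byte blocks

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Illustrative from-scratch HMAC-SHA256 following the RFC 2104 construction."""
    # Keys longer than the block size are hashed first.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    # Keys shorter than the block size are right-padded with zero bytes.
    key = key.ljust(BLOCK_SIZE, b"\x00")

    inner_pad = bytes(b ^ 0x36 for b in key)  # key XOR ipad
    outer_pad = bytes(b ^ 0x5C for b in key)  # key XOR opad

    inner_hash = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner_hash).digest()

# Sanity check against the standard library implementation.
key, message = b"s3cr3tk3y", b"Hello World"
expected = hmac.new(key, message, hashlib.sha256).digest()
print(hmac_sha256(key, message) == expected)
```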

How does an HMAC work?

For an HMAC to work, you would need:

  • An HMAC implementation of a hashing algorithm. For example, HMAC-SHA256 is the HMAC implementation, or recipe, for SHA-256.
  • A Secret Key
  • A Message

In this code snippet we are using Python’s standard `hmac` module, with a not-very-secure key called “s3cr3tk3y” and the message “Hello World”. The message could be JSON (popular for Web Services), a file, or any other data type. I am using the SHA-256 hashing function to generate the HMAC.
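A minimal sketch of such a snippet, using the same key and message with the standard library’s `hmac` and `hashlib` modules, might look like this (the variable names are illustrative):

```python
import hashlib
import hmac

key = b"s3cr3tk3y"        # the not-very-secure demo key from the text
message = b"Hello World"  # could equally be JSON bytes or file contents

# hmac.new(key, msg, digestmod) builds the HMAC object;
# hexdigest() returns the tag as a hexadecimal string.
mac = hmac.new(key, message, hashlib.sha256)
tag = mac.hexdigest()

print(tag)  # 64 hex characters, i.e. a 256-bit tag
```

When checking a received tag against a freshly computed one, `hmac.compare_digest()` is preferable to `==` because it compares in constant time.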

Where are HMACs used?

Transport Layer Security

One of the key differences between “SSL” and the current standard “TLS” is how Message Authentication Codes are computed. SSL 3.0 used its own keyed-hash MAC construction that predates the standardized HMAC, whereas TLS (from version 1.0 onward) uses HMAC as defined in RFC 2104. The HMAC construction has a stronger security analysis and is more resilient to weaknesses, such as collisions, in the underlying hash function. The use of HMACs for TLS has increased the level of security provided by TLS implementations the world over. However, even with TLS, with reference to HMACs, it is ideal that you use a strong set of CipherSuites.

This is a CipherSuite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256. The SHA256 at the end of the CipherSuite name denotes the hash algorithm used with HMAC to generate Message Authentication Codes.
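To make the naming convention concrete, the sketch below splits a TLS 1.2-style CipherSuite name into its parts. The function `split_cipher_suite` and its dictionary keys are hypothetical, invented for illustration, and the parsing assumes the `TLS_<key exchange and authentication>_WITH_<cipher>_<hash>` pattern. For AEAD suites such as the GCM ones, the trailing hash names the hash used by the handshake’s pseudorandom function rather than an HMAC over the records.

```python
def split_cipher_suite(name: str) -> dict:
    """Split a TLS 1.2-style CipherSuite name into its components."""
    prefix, _, suffix = name.partition("_WITH_")
    if not prefix.startswith("TLS_") or not suffix:
        raise ValueError(f"unexpected CipherSuite name: {name}")
    # The last underscore-separated field is the hash; the rest is the cipher.
    cipher, _, hash_name = suffix.rpartition("_")
    return {
        "key_exchange_and_auth": prefix[len("TLS_"):],
        "cipher": cipher,
        "hash": hash_name,
    }

parts = split_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256")
print(parts)
# {'key_exchange_and_auth': 'ECDHE_RSA', 'cipher': 'AES_128_CBC', 'hash': 'SHA256'}
```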

With the practical SHA-1 collision demonstrated by Google and CWI Amsterdam in 2017[link], it’s highly recommended that you use CipherSuites with SHA-256 and higher.

JSON Web Tokens

JSON Web Tokens (JWT) have become a very popular implementation for stateless applications, especially web services, modern micro-services and single page applications. In a JWT implementation, the server signs an Authorization Token and transmits it over to the client-side. The Token can be signed either with an HMAC or with a Private Key from a Public-Private Keypair. Using HMACs to sign JWTs is an extremely common practice.

For instance, in a JWT, the application would generate a JSON payload, sign it with the HMAC-SHA256 algo (called HS256 for short) and transmit that to the client. The client would use this Token in all subsequent requests to the application. The application would decode the base64url-encoded token and verify that the HMAC in the token matches the one it computes at runtime with its Secret Key. If the HMAC matches and the other parameters like Expiration check out, the request is considered authenticated.

One of the key concerns with using HMACs with JWTs is the use of the Secret Key. If the Secret Key is exposed or is weak, then it’s trivial for the attacker to forge authenticated requests to the application and bypass authentication and authorization controls. Please see my article and repo on JWTs for additional information[link]
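A minimal sketch of HS256 signing and verification, using only Python’s standard library, follows. The helper names (`b64url_encode`, `b64url_decode`, `hs256_encode`, `hs256_decode`) are hypothetical, written here for illustration and not taken from any JWT library. The sketch does not validate claims such as expiration, so a maintained JWT library is the better choice in production.

```python
import base64
import hashlib
import hmac
import json

def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def hs256_encode(payload: dict, key: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)

def hs256_decode(token: str, key: bytes) -> dict:
    """Verify the HS256 signature and return the payload, or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("token must have three dot-separated parts")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    # Constant-time comparison of the recomputed and received signatures.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))

key = b"s3cr3tk3y"  # the weak demo key from the text
token = hs256_encode({"Hello": "World"}, key)
print(token)
print(hs256_decode(token, key))
```

Verifying the same token with a different key raises `ValueError`, which is exactly why a leaked or guessable key lets an attacker mint tokens the application will accept.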

In the code snippet above, I have created a JWT with a JSON payload of {“Hello”: “World”}. I have used the key “s3cr3tk3y” and the HMAC-SHA256 algorithm to generate the token.

I have used the decode() function to check if the token is valid. Turns out, it is :)

PBKDF2 — Password Protection

PBKDF2 (Password-Based Key Derivation Function 2) has emerged as one of the “go-to” ways of protecting passwords at rest. Several organizations have started migrating (and it’s about time) from storing passwords in plaintext (gasp) or with salted hashing algos to key-stretching, intentionally slow algorithms like PBKDF2 and BCrypt.

PBKDF2 uses the following parameters to generate a password hash:

DK = PBKDF2(PRF, Password, Salt, c, dkLen)
  • The PRF is the Pseudorandom Function, which is essentially an HMAC
  • The Password is the user’s plaintext password
  • The Salt is a random, per-password value that ensures identical passwords produce different hashes and defeats precomputed (rainbow table) attacks
  • The `c` parameter is the number of iterations that the process would repeat itself. For instance — 10,000 iterations
  • dkLen is the desired length of the derived key (commonly called the password hash)

At its core, PBKDF2 uses HMAC as its Pseudorandom function. Again, it’s recommended not to use algos like SHA-1 that are considered weak. I recommend using SHA-256 and above for the HMAC.

PBKDF2 uses the password as the HMAC key and the salt (followed by a block counter) as the first message. It then feeds each HMAC output back in as the next message, repeating this for `c` iterations and XORing all the intermediate outputs together to produce the derived key.
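To show where the HMAC sits inside PBKDF2, here is an illustrative from-scratch version of the loop defined in RFC 8018, with HMAC-SHA256 as the PRF and the password used as the HMAC key. The function name `pbkdf2_hmac_sha256` and the demo inputs are made up for this sketch, and its output is checked against `hashlib.pbkdf2_hmac` from the standard library, which is what real code should call.

```python
import hashlib
import hmac

def pbkdf2_hmac_sha256(password: bytes, salt: bytes, c: int, dk_len: int) -> bytes:
    """Illustrative PBKDF2 with HMAC-SHA256 as the PRF (per RFC 8018)."""
    h_len = hashlib.sha256().digest_size  # 32 bytes per PRF output
    blocks_needed = -(-dk_len // h_len)   # ceiling division
    derived = b""
    for i in range(1, blocks_needed + 1):
        # U_1 = PRF(Password, Salt || INT(i)); the password is the HMAC key.
        u = hmac.new(password, salt + i.to_bytes(4, "big"), hashlib.sha256).digest()
        t = int.from_bytes(u, "big")
        for _ in range(c - 1):
            # U_j = PRF(Password, U_{j-1}), XORed into the running block.
            u = hmac.new(password, u, hashlib.sha256).digest()
            t ^= int.from_bytes(u, "big")
        derived += t.to_bytes(h_len, "big")
    return derived[:dk_len]

# Sanity check against the standard library (hypothetical demo inputs).
args = (b"correct horse", b"demo-salt", 1000, 32)
print(pbkdf2_hmac_sha256(*args) == hashlib.pbkdf2_hmac("sha256", *args))
```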

In this code snippet we are using Python’s `passlib` library to generate a PBKDF2-HMAC-SHA256 hash, with 10,000 iterations and a random salt.
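As a hedged stand-in that needs no third-party package, the sketch below does the equivalent with the standard library’s `hashlib.pbkdf2_hmac` rather than `passlib`. The helper names `hash_password` and `verify_password`, the 16-byte salt length and the sample passwords are assumptions made for illustration. The 10,000 iterations mirror the figure in the text, and current guidance generally recommends considerably higher counts.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # figure used in the text; tune upward for real deployments

def hash_password(password: str) -> tuple:
    """Return (salt, derived_key) for a new password using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(derived, expected)

salt, stored = hash_password("hunter2")
print(verify_password("hunter2", salt, stored))      # correct password
print(verify_password("wrong-guess", salt, stored))  # incorrect password
```

The salt is not secret and is stored alongside the derived key, so the same derivation can be repeated at login time.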