Getting to know HKDF

The HMAC-based Key Derivation Function makes modern cryptographic capabilities easier for Node.js developers.

Although the HMAC-based Key Derivation Function (HKDF) is not new (its specification, RFC 5869, was published by the IETF in May 2010), it is gaining prominence in more types of applications and systems.

For example, HKDF is one of the algorithms used within the Cryptography Specification of the Exposure Notification system created by Google and Apple, which lies at the heart of the Covid-19 contact tracing applications NearForm and others have produced to help slow the spread of the virus. You can also find HKDF buried in the internals of the new QUIC protocol's TLS 1.3 handshake, at the heart of the new Hybrid Public Key Encryption (HPKE) scheme, as a component of several evolving distributed identity frameworks, and in many other systems.

As part of the effort to implement the standard Web Crypto API in Node.js 15, we also implemented built-in support for HKDF. Here, we introduce the fundamentals of HKDF and illustrate how it can be used on both the server, in Node.js, and the client.

Introducing HKDF

One of the central goals of the Covid-19 digital contact tracing initiative has been to protect the privacy of individuals as much as possible. From the ground up, exposure notification applications such as Covid Green have been specifically built around this fundamental protection, and the underlying protocols provided by Apple's iOS and Google's Android operating systems have this protection built in.

While full details of the Google-Apple Exposure Notification (GAEN) protocol can be found in the Bluetooth Specifications and published FAQs, we want to focus on one very specific component called the Rolling Proximity Identifier (RPI).

After the contact tracing application is installed and active on your mobile device, the GAEN service will allocate a random Temporary Exposure Key every 24 hours. This key is stored securely on your device and is only shared with the exposure notification system if and when you test positive for Covid-19 and manually choose to anonymously share your diagnosis so others can be notified.

At 15-minute intervals throughout the day, your phone takes that Temporary Exposure Key (TEK) and uses it as part of a function to create a new RPI. Your phone then continuously broadcasts the current RPI over Bluetooth to any other devices in the local vicinity. Neither the TEK nor the RPI contains any information that connects to your phone or you personally in any way.

When a mobile device detects another phone nearby that is broadcasting proximity identifiers, the identifiers are recorded and stored securely on the device, along with a timestamp of when each one was seen.

If a user of the application later tests positive for Covid-19, they open the application and choose to share the list of TEKs their phone has accumulated over the past two weeks. A copy of all uploaded keys is then sent to every phone with the app. When my phone receives the lists of keys, it goes through each, regenerates the associated RPI and checks whether that identifier has been encountered at any time during the past two weeks. If it has, I will get a notification that I may have been exposed.

This system is designed such that none of the keys and identifiers can be correlated with any single individual or device, while still allowing deterministic and algorithmic generation and verification. The HKDF is one of the core mechanisms it uses to accomplish this.

The specific algorithm used in the exposure notification service is straightforward:

tek_i ← CRNG(16)
RPIK_i ← HKDF(tek_i, NULL, UTF8("EN-RPIK"), 16)
RPI_i,j ← AES128(RPIK_i, PaddedData_j)

The steps here are essentially:

  1. Generate the TEK (the tek) by generating 16 cryptographically random bytes.
  2. Generate the 16-byte RPI Key (the RPIK) by passing the tek into the HKDF algorithm using no salt, and the UTF-8 encoded string literal "EN-RPIK" as the info.
  3. Finally, generate the RPI by passing the RPIK into the AES-128 encryption function along with some standard padding data that includes a rolling counter called the "ENIntervalNumber".

The use of HKDF and AES-128 together is enough to ensure that the 16-byte RPI generated has a very low probability of collision, is fully deterministic and does not leak any private information.
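The derivation above can be sketched with the Node.js crypto module. This is a minimal illustration, not the GAEN implementation: it assumes SHA-256 as the HKDF hash, AES-128 in single-block ECB mode, and a PaddedData layout of "EN-RPI", six zero bytes and a little-endian 32-bit ENIntervalNumber. The function names and the interval number are hypothetical.

```javascript
const { createCipheriv, hkdfSync, randomBytes } = require('crypto');

// RPIK_i <- HKDF(tek_i, NULL, UTF8("EN-RPIK"), 16), assuming SHA-256 as the hash.
function deriveRpiKey(tek) {
  // "No salt" is passed as a zero-length buffer.
  return Buffer.from(hkdfSync('sha256', tek, Buffer.alloc(0), 'EN-RPIK', 16));
}

// RPI_i,j <- AES128(RPIK_i, PaddedData_j)
function deriveRpi(rpiKey, enIntervalNumber) {
  // Assumed layout: "EN-RPI" | 6 zero bytes | ENIntervalNumber (uint32, little-endian)
  const paddedData = Buffer.alloc(16);
  paddedData.write('EN-RPI', 0, 'utf8');
  paddedData.writeUInt32LE(enIntervalNumber, 12);

  // Exactly one 16-byte block is encrypted, so automatic padding is disabled.
  const cipher = createCipheriv('aes-128-ecb', rpiKey, null);
  cipher.setAutoPadding(false);
  return Buffer.concat([cipher.update(paddedData), cipher.final()]);
}

const tek = randomBytes(16);             // tek_i <- CRNG(16)
const rpiKey = deriveRpiKey(tek);
const rpi = deriveRpi(rpiKey, 2650000);  // hypothetical interval number
console.log(rpi.toString('hex'));        // 16 bytes, different for every tek
```

Because every step is deterministic, a phone that later receives the same tek can call the same two functions for each interval and compare the results against the identifiers it has recorded.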

HKDF in the new QUIC protocol's TLS handshake

Aside from contact tracing, HKDF is also a critical component of the new QUIC protocol's TLS handshake. Specifically, HKDF is used to derive the initial set of secrets that kick off a QUIC connection.

When a QUIC client wants to initiate communication with the server, the first thing it does is create a new identifier for itself called a Connection ID (CID). Each endpoint participating in a QUIC connection is identified by a CID throughout the connection's lifetime, although the CID for an endpoint can change multiple times. At the very beginning of the conversation, however, the client sends its initial CID to the server to start the flow of data.

The server takes that initial CID and passes it into the HKDF function with standard salt and info values to generate the initial set of secrets the two peers will use to establish the rest of the TLS session. The algorithm is straightforward:

text
initial_salt = 0xafbfec289993d24c9e9786f19c6111e04390a899
initial_secret = HKDF-Extract(initial_salt, client_dst_connection_id)

client_initial_secret = HKDF-Expand-Label(initial_secret,
                                          "client in", "",
                                          Hash.length)
server_initial_secret = HKDF-Expand-Label(initial_secret,
                                          "server in", "",
                                          Hash.length)

The salt is a fixed value defined by the QUIC specification for a given protocol version and is used for all connections of that version, as are the info values "client in" and "server in". The CID provides the source of pseudo-randomness that helps to avoid collisions in the generated initial keys. Once the TLS handshake has started and a stronger cryptographic session has been established, these initial keys are discarded.

Note that the HKDF mechanism works such that, so long as both the client and server are using the same initial CID, they can both independently derive the client and server initial secrets. That determinism lies at the heart of the HKDF scheme.
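As a rough sketch of that determinism, the derivation can be written with plain HMAC calls from the Node.js crypto module. It assumes SHA-256 and the HKDF-Expand-Label encoding from TLS 1.3 (a "tls13 " prefix on the label), and it only handles outputs of at most one hash block. The salt is the value quoted above; the connection ID and function names are hypothetical.

```javascript
const { createHmac } = require('crypto');

const HASH = 'sha256';
const HASH_LEN = 32;

// HKDF-Extract(salt, IKM) = HMAC-Hash(salt, IKM)
function hkdfExtract(salt, ikm) {
  return createHmac(HASH, salt).update(ikm).digest();
}

// HKDF-Expand-Label in the TLS 1.3 style, limited to one hash block of
// output so that a single HMAC call, T(1), is sufficient.
function hkdfExpandLabel(secret, label, context, length) {
  if (length > HASH_LEN) {
    throw new RangeError('this sketch supports length <= HashLen only');
  }
  const fullLabel = Buffer.from('tls13 ' + label, 'ascii');
  const hkdfLabel = Buffer.concat([
    Buffer.from([length >> 8, length & 0xff]),   // uint16 output length
    Buffer.from([fullLabel.length]), fullLabel,  // length-prefixed label
    Buffer.from([context.length]), context,      // length-prefixed context
  ]);
  // T(1) = HMAC-Hash(PRK, info | 0x01)
  return createHmac(HASH, secret)
    .update(hkdfLabel)
    .update(Buffer.from([0x01]))
    .digest()
    .subarray(0, length);
}

function deriveInitialSecrets(clientDstConnectionId) {
  const initialSalt = Buffer.from('afbfec289993d24c9e9786f19c6111e04390a899', 'hex');
  const initialSecret = hkdfExtract(initialSalt, clientDstConnectionId);
  const empty = Buffer.alloc(0);
  return {
    client: hkdfExpandLabel(initialSecret, 'client in', empty, HASH_LEN),
    server: hkdfExpandLabel(initialSecret, 'server in', empty, HASH_LEN),
  };
}

// Hypothetical connection ID, for illustration only
const cid = Buffer.from('8394c8f03e515708', 'hex');
const clientSide = deriveInitialSecrets(cid);
const serverSide = deriveInitialSecrets(cid);
console.log(clientSide.client.equals(serverSide.client)); // true
```

Both peers run the same function on the same CID, so neither has to send the secrets over the wire.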

How it works

The HKDF scheme is detailed precisely in RFC 5869 and consists of two distinct phases, each of which can be used independently or together. The first phase, called "Extract", computes an HMAC over the initial keying material, using the salt as the HMAC key; the result is a fixed-length pseudorandom key (PRK). Any standard cryptographic hash algorithm can be used, but the most common are the SHA family of hashes, with SHA-256 being the most frequent choice.

Once the cryptographic hash is generated, the second phase, called "Expand", takes the hash through a series of transformations and subsequent additional HMAC generations to produce the output keying material. To give an idea of what this looks like, here is a snippet of the algorithm as documented in RFC 5869:

text
N = ceil(L/HashLen)
T = T(1) | T(2) | T(3) | ... | T(N)
OKM = first L octets of T
where:
T(0) = empty string (zero length)
T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)
T(2) = HMAC-Hash(PRK, T(1) | info | 0x02)
T(3) = HMAC-Hash(PRK, T(2) | info | 0x03)
...

The end result is a byte string that appears random but that can be deterministically regenerated given the identical inputs.
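To make the two phases concrete, here is a minimal sketch of Extract and Expand written directly on top of HMAC, following the RFC 5869 pseudocode above. It is fixed to SHA-256 for brevity, and the function names are illustrative rather than part of any API. The last lines compare the result with the hkdfSync function from the Node.js crypto module, which is covered later in this article.

```javascript
const { createHmac, hkdfSync } = require('crypto');

const HASH = 'sha256';
const HASH_LEN = 32; // SHA-256 output size in bytes

// Extract: PRK = HMAC-Hash(salt, IKM)
function hkdfExtract(salt, ikm) {
  // RFC 5869 treats a missing salt as HashLen zero bytes.
  const key = salt.length > 0 ? salt : Buffer.alloc(HASH_LEN);
  return createHmac(HASH, key).update(ikm).digest();
}

// Expand: T(i) = HMAC-Hash(PRK, T(i-1) | info | i), OKM = first L octets of T
function hkdfExpand(prk, info, length) {
  const n = Math.ceil(length / HASH_LEN);
  if (n > 255) throw new RangeError('requested length is too large');
  const blocks = [];
  let previous = Buffer.alloc(0); // T(0) is the empty string
  for (let i = 1; i <= n; i++) {
    previous = createHmac(HASH, prk)
      .update(previous)
      .update(info)
      .update(Buffer.from([i]))
      .digest();
    blocks.push(previous);
  }
  return Buffer.concat(blocks).subarray(0, length);
}

const ikm = Buffer.from('initial key');
const salt = Buffer.from('salt');
const info = Buffer.from('info');

const okm = hkdfExpand(hkdfExtract(salt, ikm), info, 42);
const builtIn = Buffer.from(hkdfSync('sha256', ikm, salt, info, 42));
console.log(okm.equals(builtIn)); // true
```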

The cryptographic strength of HKDF depends on both the initial keying material and the salt selected. It is important that at least one of them, and ideally the initial keying material itself, comes from a strong source of entropy to ensure the resulting key material is unique and unpredictable.

How to use it

The Web Crypto API has built-in support for the HKDF scheme, and whereas browsers have had support for a while, Node.js has only recently gained the built-in ability to use HKDF.

text
// In Node.js, get the Web Crypto implementation from the crypto module
import { webcrypto as crypto } from 'crypto';
const { subtle } = crypto;

// First set up our initial key
const key = await subtle.importKey(
  'raw',
  new TextEncoder().encode('initial key'),
  { name: 'HKDF' },
  false,
  ['deriveKey', 'deriveBits']);

// Then, perform the HKDF Extract and Expand.
// Both info and salt must be buffers or typed arrays, not strings.
const out = await subtle.deriveBits(
  {
    name: 'HKDF',
    info: new TextEncoder().encode('the info'),
    salt: crypto.getRandomValues(new Uint8Array(16)),
    hash: 'SHA-256'
  },
  key,
  128);  // 128 bits = 16 bytes

The Web Crypto API is promise-based, and the Node.js implementation ensures that the HKDF operations occur off the main event-loop thread within the libuv thread pool. This should help performance when an application is generating a large number of derived keys.

With the exception of the first two lines, which are the Node.js-specific way of getting to the Web Crypto API, the example above will work consistently on both the client and server side.
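The base key above was also imported with the 'deriveKey' usage. As a hedged sketch of what that enables, the following derives an AES-GCM key directly from HKDF rather than raw bits. The choice of AES-GCM with a 256-bit key, and the helper names deriveAesKey and demo, are assumptions for illustration only.

```javascript
const { webcrypto } = require('crypto');
const { subtle } = webcrypto;

// Derive a non-extractable AES-GCM key from raw secret bytes using HKDF.
async function deriveAesKey(secretBytes, salt, info) {
  const baseKey = await subtle.importKey(
    'raw', secretBytes, { name: 'HKDF' }, false, ['deriveKey']);
  return subtle.deriveKey(
    { name: 'HKDF', hash: 'SHA-256', salt, info },
    baseKey,
    { name: 'AES-GCM', length: 256 },
    false,
    ['encrypt', 'decrypt']);
}

async function demo() {
  const encoder = new TextEncoder();
  const secret = encoder.encode('initial key');
  const salt = webcrypto.getRandomValues(new Uint8Array(16));
  const info = encoder.encode('the info');

  // Two independent derivations from identical inputs yield the same key.
  const senderKey = await deriveAesKey(secret, salt, info);
  const receiverKey = await deriveAesKey(secret, salt, info);

  const iv = webcrypto.getRandomValues(new Uint8Array(12));
  const ciphertext = await subtle.encrypt(
    { name: 'AES-GCM', iv }, senderKey, encoder.encode('hello'));
  const plaintext = await subtle.decrypt(
    { name: 'AES-GCM', iv }, receiverKey, ciphertext);
  return new TextDecoder().decode(plaintext);
}

demo().then((text) => console.log(text)); // prints "hello"
```

Decryption only succeeds because both keys were derived from the same secret, salt and info, which is the same determinism the earlier examples rely on.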

For the sake of being complete, however, we took the additional step of adding HKDF support to the existing legacy Node.js crypto module in both an asynchronous callback version and a synchronous blocking version.

text
const { hkdf, hkdfSync } = require('crypto');

hkdf('sha512', 'key', 'salt', 'info', 64, (err, derivedKey) => {
  if (err) throw err;
  console.log(Buffer.from(derivedKey).toString('hex'));  // '24156e2...5391653'
});

const derivedKey = hkdfSync('sha512', 'key', 'salt', 'info', 64);
console.log(Buffer.from(derivedKey).toString('hex'));  // '24156e2...5391653'

There shouldn't be anything surprising about these two variations if you're already familiar with the legacy crypto API in Node.js. Each performs both phases (Extract and Expand) of the HKDF scheme, and both accept the name of the digest algorithm to use for the HMAC, the initial key, the salt, the info and the total number of bytes to generate. Like the Web Crypto variation, the asynchronous callback function defers the cryptographic operation to the libuv thread pool, off the main Node.js event loop, while the synchronous version blocks the current thread until it completes. The APIs have been designed to just do what they are supposed to do without getting in the way.
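One common way to use the info argument, in the same spirit as the "client in" and "server in" labels in the QUIC example, is to derive several independent subkeys from a single initial key. The sketch below assumes SHA-256 and 32-byte outputs; the deriveSubkey helper and the 'encryption' and 'signing' labels are hypothetical.

```javascript
const { hkdfSync } = require('crypto');

// Derive a purpose-specific 32-byte subkey from one initial key.
function deriveSubkey(initialKey, salt, purpose) {
  return Buffer.from(hkdfSync('sha256', initialKey, salt, purpose, 32));
}

// Placeholder inputs; real initial keys should come from a strong entropy source.
const initialKey = Buffer.from('key');
const salt = Buffer.from('salt');

const encryptionKey = deriveSubkey(initialKey, salt, 'encryption');
const signingKey = deriveSubkey(initialKey, salt, 'signing');

// Different info values give unrelated outputs...
console.log(encryptionKey.equals(signingKey)); // false
// ...while identical inputs always reproduce the same subkey.
console.log(encryptionKey.equals(deriveSubkey(initialKey, salt, 'encryption'))); // true
```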

What happens next

This is just a brief introduction to HKDF, a quick demonstration of a few ways it can be used and an illustration of the APIs available for JavaScript developers to use it. One of our motivations for adding the scheme to Node.js was to make it easier for developers to build more modern cryptographic capabilities into their Node.js applications.

The HKDF support is just one of the new pieces that we've been working on, which also include the full Web Crypto API implementation, support for generating standards-compliant random UUIDs, the allocation and use of secure memory regions, and more. In an upcoming post, we will explore more of those new capabilities by looking at the new UUID generation API and drilling down on its performance relative to popular alternatives in the Node.js ecosystem. Stay tuned for more.
