# The HMAC Algorithm

Apr99: The HMAC Algorithm

William is a consultant, lecturer, and author of books on data communications and computer networking. His most recent book is Cryptography and Network Security: Principles and Practice, Second Edition (Prentice-Hall, 1998). He can be contacted at http://www.shore.net/~ws.

Message authentication is a procedure that allows communicating parties to verify that received messages are authentic. The two important aspects are verifying that the contents of the message have not been altered and that the source is authentic. The Message Authentication Code (MAC) is a widely used technique for performing message authentication. A variation on the MAC algorithm has emerged as an Internet standard for a wide variety of applications -- HMAC, short for "Keyed-Hashing for Message Authentication." HMAC was introduced in "Keying Hash Functions for Message Authentication," by M. Bellare, R. Canetti, and H. Krawczyk in Proceedings, CRYPTO '96 (Springer-Verlag,andathttp://wwwcse .ucsd.edu/users/mihir). HMAC is currently an Internet draft that has been distributed by the Internet Engineering Task Force as Request For Proposal (RFP) 2104.

### MAC

MAC algorithms involve the use of a secret key to generate a small block of data, known as a "message authentication code," that is appended to the message. This technique assumes that two communicating parties, say A and B, share K as a secret key. When A has a message to send to B, it calculates the message authentication code as a function of the message and the key. The message and code are transmitted to the intended recipient. The recipient performs the same calculation on the received message, using the same secret key, to generate a new message authentication code. The received code is compared to the calculated code; see Figure 1. If you assume that only the receiver and the sender know the identity of the secret key, and if the received code matches the calculated code, then:

1. The receiver is assured that the message has not been altered. If an attacker alters the message but does not alter the code, then the receiver's calculation of the code will differ from the received code. Because the attacker is assumed not to know the secret key, the attacker cannot alter the code to correspond to the alterations in the message.

2. The receiver is assured that the message is from the alleged sender. Because no one else knows the secret key, no one else could prepare a message with the proper code.

3. If the message includes a sequence number (such as is used with X.25, HDLC, and TCP), then the receiver can be assured of the proper sequence, because an attacker cannot successfully alter the sequence number.

A number of algorithms could be used to generate the code. The National Bureau of Standards (NBS), in its publication DES Mode of Operation (http://www.nist .gov/itl/div897/pubs/fip113.htm), recommends the use of the DES algorithm. The DES algorithm is used to generate an encrypted version of the message, and the last bits of ciphertext are used as the code. A 16- or 32-bit code is typical. HMAC is a more efficient, and increasingly popular, alternative.

The process just described is similar to encryption. One difference is that the authentication algorithm does not need to be reversible, as it must for decryption. It turns out that because of the mathematical properties of the authentication function, it is less vulnerable than encryption to being broken.

### HMAC

In recent years, there has been increased interest in developing a MAC derived from a cryptographic hash code, such as MD5, SHA-1, or RIPEMD-160. The motivations for this interest are:

• Cryptographic hash functions generally execute faster in software than symmetric block ciphers such as DES.
• Library code for cryptographic hash functions is widely available.

• There are no export restrictions for cryptographic hash functions, whereas symmetric block ciphers, even when used for MACs, are restricted.

A hash function such as MD5 was not designed for use as a MAC and cannot be used directly for that purpose because it does not rely on a secret key. There have been a number of proposals to incorporate a secret key into an existing hash algorithm. HMAC received the most support. HMAC has been chosen as the mandatory-to-implement MAC for IP Security, and is used in other Internet protocols, such as Transport Layer Security (TLS, soon to replace Secure Sockets Layer) and Secure Electronic Transaction (SET).

### HMAC-Design Objectives

Design objectives that RFC 2104 lists for HMAC include:

• To use, without modifications, available hash functions. In particular, hash functions that perform well in software, and for which code is freely and widely available.
• To allow for easy replaceability of the embedded hash function in case faster or more secure hash functions are found or required.

• To preserve the original performance of the hash function without incurring a significant degradation.

• To use and handle keys in a simple way.

• To have a well understood cryptographic analysis of the strength of the authentication mechanism based on reasonable assumptions on the embedded hash function.

The first two objectives are important to the acceptability of HMAC. HMAC treats the hash function as a black box. This has two benefits. First, an existing implementation of a hash function can be used as a module in implementing HMAC. The bulk of the HMAC code is prepackaged and ready to use without modification. Second, to replace a given hash function in an HMAC implementation, you must simply remove the existing hash function module and drop in the new module. This could be done if a faster hash function were desired. More important, if the security of the embedded hash function were compromised, the security of HMAC could be retained simply by replacing the embedded hash function with a more secure one (replacing MD5 with SHA-1, for example).

The last design objective in the preceding list is, in fact, the main advantage of HMAC over other proposed hash-based schemes. HMAC can be proven secure provided that the embedded hash function has some reasonable cryptographic strengths.

### The HMAC Algorithm

Figure 2 illustrates the overall operation of HMAC (see Table 1 for definition of the terms in Figure 2). HMAC can then be expressed; see Figure 3. In other words:

1. Append zeros to the left end of K to create a b-bit string K+ (for example, if K is of length 160 bits and b = 512, then K will be appended with 44 zero bytes 0x00).

2. XOR (bitwise exclusive OR) K+ with ipad to produce the b-bit block Si.

3. Append M to Si.

4. Apply H to the stream generated in Step 3.

5. XOR K+ with opad to produce the b-bit block So.

6. Append the hash result from Step 4 to So.

7. Apply H to the stream generated in Step 6 and output the result.

Note the XOR with ipad results in flipping one-half of the bits of K. Similarly, the XOR with opad results in flipping one-half of the bits of K, but a different set of bits. In effect, by passing Si and So through the compression function of the hash algorithm, you have pseudorandomly generated two keys from K. HMAC should execute in approximately the same time as the embedded hash function for long messages. HMAC adds three executions of the hash compression function (for Si, So, and the block produced from the inner hash).

### HMAC Security

The security of any MAC function based on an embedded hash function depends in some way on the cryptographic strength of the underlying hash function. The appeal of HMAC is that its designers have been able to prove an exact relationship between the strength of the embedded hash function and the strength of HMAC. The security of a MAC function is generally expressed in terms of the probability of successful forgery with a given amount of time spent by the forger and a given number of message-MAC pairs created with the same key. In essence, it can be shown that, for a given level of effort (time, message-MAC pairs), on messages generated by legitimate users and seen by attackers, the probability of a successful attack on HMAC is equivalent to one of the following attacks on the embedded hash function:

1. Attackers are able to compute an output of the compression function even with an Initial Value (IV) that is random, secret, and unknown to attackers.

2. Attackers find collisions in the hash function even when the IV is random and secret.

In the first attack, you can view the compression function as equivalent to the hash function applied to a message consisting of a single b-bit block. For this attack, the IV of the hash function is replaced by a secret, random value of n bits. An attack on this hash function requires either a brute-force attack on the key, which is a level of effort on the order of 2n, or a birthday attack, which is a special case of the second attack.

In the second attack, attackers are looking for two messages, M and M',that produce the same hash: H(M)=H(M'). This requires a level of effort of 2n/2 for a hash length of n. On this basis, the security of MD5 is called into question, because a level of effort of 264 looks feasible with today's technology. Does this mean that a 128-bit hash function such as MD5 is unsuitable for HMAC? The answer is no. To attack MD5, attackers can choose any set of messages and work on these offline on a dedicated computing facility to find a collision. Because attackers know the hash algorithm and the default IV, attackers can generate the hash code for each of the messages that attackers generate. However, when attacking HMAC, attackers cannot generate message/code pairs offline because attackers do not know K. Therefore, attackers must observe a sequence of messages generated by HMAC under the same key and perform the attack on these known messages. For a hash code length of 128 bits, this requires 264 observed blocks (273 bits) generated using the same key. On a 1-Gbps link, you would need to observe a continuous stream of messages with no change in the key for about 250,000 years to succeed. Thus, if speed is a concern, it is fully acceptable to use MD5 rather than SHA-1 or RIPEMD-160 as the embedded hash function for HMAC.

Listing One, the appendix to RFC 2104, is sample code for the implementation of HMAC with MD5. Listing Two (also from RFC 2104) presents test vectors for Listing One (trailing '\0' of a character string not included).

DDJ

#### Listing One

```/* Function: hmac_md5 */
void
hmac_md5(text, text_len, key, key_len, digest)
unsigned char* text; /* pointer to data stream */
int text_len; /* length of data stream */
unsigned char* key; /* pointer to authentication key */
int key_len; /* length of authentication key */
caddr_t digest; /* caller digest to be filled in */
{
MD5_CTX context;
unsigned char tk[16];
int i;
/* if key is longer than 64 bytes reset it to key=MD5(key) */
if (key_len > 64) {
MD5_CTX      tctx;
MD5Init(&tctx);
MD5Update(&tctx, key, key_len);
MD5Final(tk, &tctx);

key = tk;
key_len = 16;
}
/* the HMAC_MD5 transform looks like:
* where K is an n byte key
* ipad is the byte 0x36 repeated 64 times
* opad is the byte 0x5c repeated 64 times
* and text is the data being protected
*/
/* start out by storing key in pads */

for (i=0; i<64; i++) {
}
/* perform inner MD5 */
MD5Init(&context);               /* init context for 1st pass */
MD5Update(&context, text, text_len); /* then text of datagram */
MD5Final(digest, &context);          /* finish up 1st pass */
/* perform outer MD5 */
MD5Init(&context);               /* init context for 2nd pass */
MD5Update(&context, digest, 16); /* then results of 1st hash */
MD5Final(digest, &context);      /* finish up 2nd pass */
}
```

#### Listing Two

```key =         0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
key_len =     16 bytes
data =        "Hi There"
data_len =    8  bytes
digest =      0x9294727a3638bb1c13f48ef8158bfc9d

key =         "Jefe"
data =        "what do ya want for nothing?"
data_len =    28 bytes
digest =      0x750c783e6ab0b503eaa86e310a5db738
key =         0xAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
key_len       16 bytes
data =        0xDDDDDDDDDDDDDDDDDDDD...
..DDDDDDDDDDDDDDDDDDDD...
..DDDDDDDDDDDDDDDDDDDD...
..DDDDDDDDDDDDDDDDDDDD...
..DDDDDDDDDDDDDDDDDDDD
data_len =    50 bytes
digest =      0x56be34521d144c88dbb8c733f0e8b3f6

```

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.

# First C Compiler Now on Github

The earliest known C compiler by the legendary Dennis Ritchie has been published on the repository.

# HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

# Building Bare Metal ARM Systems with GNU

All you need to know to get up and running... and programming on ARM

# Amazon's Vogels Challenges IT: Rethink App Dev

Amazon Web Services CTO says promised land of cloud computing requires a new generation of applications that follow different principles.

# How to Select a PaaS Partner

Eventually, the vast majority of Web applications will run on a platform-as-a-service, or PaaS, vendor's infrastructure. To help sort out the options, we sent out a matrix with more than 70 decision points to a variety of PaaS providers.

More "Best of the Web" >>