Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Channels ▼


Keccak: The New SHA-3 Encryption Standard

Figure 2 shows the data structure of the internal state S. This C struct, named spongeState, contains eight fields, two of which are aligned arrays (marked in red). The state[] field holds the actual state bytes, while the dataQueue[] field holds the message bytes to be combined and permutated.

struct holding the internal state of the Keccak hash function
Figure 2: The struct holding the internal state of the Keccak hash function.

The capacity field is the hash capacity (c). Its value is set to twice the hash size (2*n). The rate field is the number of message bits processed per round. It is set to the value of 1600-c. The bitsInQueue field is the number of message bits left in the dataQueue[] field. The fixedOutputlength field is the desired hash size (n). The squeezing field is a mode flag. When set to zero, Keccak is in compression mode. When set to 1, Keccak is in extraction mode. And the field bitsAvailableForSqueezing holds the number of state bits that can be assembled into the final message hash.

Keccak uses 24 permutation rounds to reduce the message text into a hash. Each round invokes five modules in succession as shown in Figure 3.

The modules invoked in each permutation round.
Figure 3: The modules invoked in each permutation round.

The theta module renders the internal state into a 5-by-5 array of 64-bit elements. It computes the parities of each column and combines them with an exclusive-or (XOR) operator. Then it XORs the resulting parity to each state bit as follows:

S[i][j][k] ^= parity(S[0...4][j-1][k]) 
	^ parity(S[0...4][j+1][k-1])

where i = 0...4; j = 0...4; k = 0...63

The rho module rotates each 64-bit element by a triangular number. However, it excludes element S[0][0] from rotation. The phi module permutates the 64-bit elements. Permutation follows the fixed pattern assignment shown below:

S[j][2*i + 3*j] = S[i][j]

The chi module adds a non-linear aspect to the permutation round. It combines the row elements using only three bitwise operators: AND, NOT, and XOR. Then it writes the result back to the state array as follows:

S[i][j][k] ^= ~S[i][j + 1][k] & S[i][j + 2][k]

The iota module breaks up any symmetry caused by the other modules. This it does by XORing one of the array elements to a round constant. The module has 24 round constants to choose from. These constants are defined internally by Keccak. The Keccak source code in C is, however, too large to be listed here in its entirety.

Evaluating SHA-3

So, how does SHA-3 compare against its predecessors SHA-1 and SHA-2? To answer this question, I subjected the three hash functions to four separate tests:

  • The first test is the collisions test. Here, I prepare a set of message texts and have the hash functions process each text. A well-designed hash function should emit a unique hash for each text, even if the two texts differ by a mere character. If a collision does occur, it should be rare, with a complexity of at least 2n.
  • The second test is a run-bit test. This test reveals how well the bits are distributed on the hash value. Ideally, a hash value should have a balanced number of 1s and 0s. If we divide the hash by two and count the number of 1s on each half, we should get an equal count. If one half has more 1s than the other, it means a potential skew, which could translate to a possible collision.
  • The next test is the avalanche or cascade test. This one measures a hash function's ability to react to changes in a message text. Ideally, a change in one message byte should cause about half the hash bytes to change; and the changes should happen uniformly along the hash. Too small or too large a change might be indicative of a collision.
  • My final test is a timing test. This usefully measures how fast a hash function can process the message text. A good hash function should take no more than half a second to process a large message text. Longer hash times are undesirable, especially if the function is to be used as part of a high-traffic service.

Table 1 shows how SHA-3 compares with its two predecessors. The test machine is an Apple MacBook with a 2 GHz Intel Core 2 Duo and 2GB of DDR2 RAM. The operating system is MacOS X 10.6.7. The test message is an excerpt from the William Shakespeare's play Hamlet — the "To be or not to be" soliloquy, consisting of 33 lines of text for a total of 261 words. Because SHA-1 can emit only 160-bit hashes, outputs for both SHA-2 and SHA-3 are restricted to 224 bits, the closest comparable size they emit.

comparing SHA-3 with its predecessors
Table 1: Comparing SHA-3 with its predecessors.

Compared with SHA-1 and SHA-2, SHA-3 has a smaller cascade rate. It changes only three hash bytes for every change in message byte. SHA-3 showed a slight bit of skew. The right half of its message hash has twelve more 1s than the left half. SHA-3 is faster than SHA-2, taking 2.80 nanoseconds less to hash the test message.

Table 2 shows how well SHA-3 performs when emitting four different hash sizes. The same machine and message text were used to obtain the test results.

how well SHA-3 performs when emitting four different hash size
Table 2: How well SHA-3 performs when emitting four different hash size.

Here, too, SHA-3 showed a small cascade rate for all four sizes. This seems to be characteristic of Keccak and its sponge engine. SHA-3 also showed a slight skew. Three of the message hashes have more 1 bits in their right halves than on their left. The 512-bit hash, however, has more 1 bits on its left half than on the right. SHA-3 is still faster, taking only 2.10 nanoseconds to produce a 512-bit message hash. That is 2.3 nanoseconds faster than a 256-bit SHA-2.

The tables do not show any collision test results, because the test itself did not find one. The test message is probably too short or too varied to induce a collision. However, larger message texts, especially those with minute variations, might cause at least one collision in one of the hash functions. The Xcode project FooSHA3 is available for reference.

Deploying SHA-3

SHA-3 has not yet made its way into most mainstream platforms. Some platforms (like MacOS X and iOS) still rely on the libcrypto library, which supplies only SHA-1 and SHA-2. One way to deploy SHA-3 to the desired platform is to compile the hash algorithm as a static library. You can then install the new library into /usr/lib or /usr/share, assuming you have write access to those directories. But there is another way to deploy SHA-3, one that works for both MacOS X and iOS, which is as a framework bundle.

A framework bundle is a hierarchical collection of files that together form a dynamic library. Unlike most bundles, a framework bundle is not opaque; its contents can be perused like those in a normal directory. Framework bundles can sit in a common area (such as ~/Library/Frameworks), or they can be copied into the application bundle and used as a private resource. Access to a framework bundle is through the NSBundle class or through Core Foundation's Bundle Services.

Figure 4 shows the structure of the bundle SHA3.framework. At its top level are four shortcuts and a directory named Versions. Inside that directory is another shortcut and the subdirectory A.

structure of the bundle
Figure 4: Installing SHA-3 as a framework bundle.

In subdirectory A are four items. The SHA3 file is the executable binary of the hash algorithm. The Resources directory holds the support files needed by the executable and by the framework. The Headers directory holds the source headers that a Cocoa app can import. And the PrivateHeaders directory holds those headers meant only for the framework to use.

Finally, the Current shortcut points to the location of the latest SHA3 executable, which is subdirectory A in this example. If a more recent executable is in another subdirectory, such as subdirectory B, the Current shortcut should point to that location instead. As for the remaining four shortcuts, they point to their respective framework items.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.