Channels ▼
RSS

Improving the Performance of the Secure Hash Algorithm (SHA-1)



Nowadays we are seeing more and more services in the Internet and personal computing requiring secure data communications and storage. HTTPS with the SSL and TLS protocols are replacing non-secured HTTP traffic in online commerce, banking, secure web mail, and generally what is now being called "the Internet cloud." File system encryption is becoming ubiquitous on client operating systems as well as in enterprise solutions. In addition to security applications, we are seeing the usage of cryptographic hash algorithms in data de-duplication applications for storage and networking. These trends are very well reflected by the feedback we receive from our customers and partners who are seeing a continuous shift in their workload distribution towards more cryptographic computations. We are doing our best to address these compute intensive workloads. In 2010 Intel introduced AES-NI -- the new Intel Architecture (IA) instruction set extension for the Advanced Encryption Standards (AES), the first implemented in processors produced with 32-nm technology like Intel Xeon 5600 and 2010 Intel CoreTM processor families, utilizing u-architecture codenamed "Westemere", with dramatic performance improvements [1]. We also found that it is possible to leverage previously introduced IA instruction set extensions with some algorithmic innovations to increase the performance of the widely used Secure Hash Algorithm (SHA-1) [2]. This is the subject of this article.

Brief History of SHA-1

SHA-1 has been implemented many times in various software and hardware products, including implementations for the Intel Architecture instruction set. One must note that virtually all widely used software implementations of SHA-1 are using only scalar ALU operations on the general purpose register set, while Intel has introduced several generations of vector (or SIMD) instruction set extensions to IA32 and Intel64. Some of them, namely Intel SSE2, Intel Supplemental SSE3 (SSSE3) and Intel SSE4 are particularly useful for integer algorithms.

SHA-1 received its share of researchers' attention. Besides research devoted to the security aspects such as strength and potential weaknesses, there have also been explorations paying attention to implementation efficiency with regard to modern superscalar and parallel computer architectures, like [3] or [4]. The summary is that although generally SHA-1 is a vastly parallel algorithm in terms of instruction level parallelism (it would have been able to efficiently utilize up to 7 ALU pipelines), it does not suit itself easily for an efficient implementation with SIMD instruction set architectures. Our work, however, demonstrates that SHA-1 can efficiently use SIMD to better utilize the capabilities of IA for enhanced performance.

There have been attempts to implement SHA-1 with SSE2. One good example is the implementation made by Dean Gaudet [5]. We built on some of Dean's ideas with several new important improvements.<.p>

It would be appropriate for the reader who is not familiar with SHA-1 to give a brief introduction. Certainly more details can be found in [2] or [6].

SHA-1 produces a 160 bit (20 byte) hash value (digest), taking as an input a sequence of 1 or more 512 bit (64 byte) data blocks. The original source data also requires some padding according to the standard. The data is treated as an array of big-endian 32-bit values. Processing each of the 64-byte input blocks consists of 80 iterations also known as "rounds."

Schematically SHA-1 looks like this using C-like syntax:


int F1(int B, int C, int D) { return (D ^ ( B & (C ^ D)); }
int F2(int B, int C, int D) { return (D ^ B ^ C); }
int F3(int B, int C, int D) { return (B & C) | (D & (B ^ C)); }

#define K1 0x5A827999
#define K2 0x6ED9EBA1
#define K3 0x8F1BBCDC
#define K4 0xCA62C1D6

int K( int i ) { return
    i < 20 ? K1 :
    i < 40 ? K2 :
    i < 60 ? K3 : K4;
}

int F( int i, int B, int C, int D ) { return
    i < 20 ? F1(B, C, D) :
    i < 40 ? F2(B, C, D) :
    i < 60 ? F3(B, C, D) :
             F2(B, C, D) ;
}

// Update HASH[] by processing a one 64-byte block in MESSAGE[]
void SHA1( int HASH[], int MESSAGE[] )
{
    // these arrays are not necessary but used to better highlight dependencies
    int A[81], B[81], C[81], D[81], E[81];
    int W[80];

    int i, FN;

    A[0] = HASH[0];
    B[0] = HASH[1];
    C[0] = HASH[2];
    D[0] = HASH[3];
    E[0] = HASH[4];

    for ( i=0; i<80; ++i )
    {
        // W[i] calculation, also known as 'Message Scheduling'
        if ( i < 16 )
            // reverse the order of bytes on little-endian
            W[i] = BIG_ENDIAN_LOAD( MESSAGE[i] );
        else
            W[i] = ROTATE_LEFT( W[i-3] ^ W[i-8] ^ W[i-14] ^ W[i-16], 1 );

        FN = F( i, B[i], C[i], D[i] );

        A[i+1] = FN + E[i] + ROTATE_LEFT( A[i], 5 ) + W[i] + K(i);
        B[i+1] = A[i];
        C[i+1] = ROTATE_LEFT( B[i], 30 );
        D[i+1] = C[i];
        E[i+1] = D[i];
    }

    HASH[0] += A[80];
    HASH[1] += B[80];
    HASH[2] += C[80];
    HASH[3] += D[80];
    HASH[4] += E[80];
}


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video