Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Security

A Tool for Secret Key Cryptography


AUG88: A TOOL FOR SECRET KEY CRYPTOGRAPHY

John Michener is a senior research scientist. he is working in the field of simulation and beam probing. He can be reached at 177 Moore St., Princeton, NJ 08540.


Many of the proprietary algorithms used in commercial software packages have proven to be vulnerable to the sophisticated statistical tools of cryptanalysis. As you might expect, this creates havoc and even panic in those environments (banks, defense contractors, and so on) where security is a necessity. With this in mind, this article will discuss a cryptographic algorithm that can be used to construct secret key cryptographic systems that are relatively resistant to analysis.

Before discussing crypto-algorithms, however, a few words should be said about how codes are broken Cryptanalysis is a form of applied mathematics and statistics in which the cryptanalyst attempt to strip from the underlying message the confusion supplied by the cryptographic process. The basic tool of the cryptographer is a thorough knowledge of the language and its properties. Written English typically has a redundancy on the order of 75 percent (when using a 26 character alphabet), resulting in each character of English text transmitting somewhat over 1 bit of information. Thi_ se_ten_e i_ an_exa_ple_ of _his _eff_ct _eca_se _ne _n fo_r c_ara_ter_ ha_ be_n d_let_d, _nd _epl_ced_by _ _. It is not very difficult to determine that the text of the previous sentence is "This sentence is an example of this effect because one in four characters has been deleted and replaced by a _." English text encoded as ASCII characters has a redundancy on the order of 85 percent (8 bits per character, resulting in an alphabet of 256 characters). In fact, manu individuals enjoy such language reconstruction problems to the extent that they spend substantial portions of their own time solving similar problems in the form of acrostics and crossword puzzles.

Many documents may be viewed as having far lower information contents than the 1 + bit per character estimate. If we look at "boilerplate" phrases :(such as we always find in the fine print on car loan, credit card, or mortgage agreements) we find that such phrases tend to be highly standardized within any given field. Rather than sending the phrase or paragraph, it would, in principle, by sufficient to send two pieces of information denoting the specialty area and the phrase identifier. Typically, only a few bytes would be necessary to represent an entire paragraph.

Computer language files are frequently far more structured and predictable than documents written in English. This is particularly true for highly structured transactions where it must be assumed that the attacker has detailed knowledge of the message structure. An example of such a situation is the communications of bank money transfer machines with their host mainframes. A likely attacker of such a system is a bank employee who has detailed knowledge of the protocols used, but does not have the key to encipher the communications. The initiating machine must provide the user's identification number, PIN number, the requested transaction, and authentication information to the host computer in such a manner that an attacker is unable to extract money from the system.

Predictability (including the message structure) and the redundancy of source material provide cryptanalysts with their starting material as well as checks for the accuracy of their assumptions. Compression of the data allows a substantial reduction in the message redundancy and length, as well as largely destroying the message structure. Encryption of the compressed message results in ciphertext with a far higher resistance to cryptanalysis than ciphertext of an uncompressed message. It should be noted that the output of any encryption system appears to be quite random. Because enciphered text can not be compressed, text must be compressed before it is enciphered.

Both the history of cryptographic security and common sense tell us that any builder of cryptographic systems must expect that attackers will have a detailed knowledge of the cryptographic system itself. One must also assume that the attackers will have large amounts of matched ciphertext and plaintext available for analysis (this is called a known plaintext attack). no cryptographic system should be considered for use if it cannot resist such an attack. The builder of the cryptographic system should make every effort to minimize the attacker's knowledge of the message structure. One of the best ways of doing this is to reduce the redundancy of the message before it is encrypted.

Data Compaction

The scheme taken to compact the source material is typically dependent upon the material to be compressed, but it may well involve some form of Huffman encoding at the character of word level, the use of dictionary compaction schemes, codebook techniques for common phrases, general purpose compaction techniques, or some combination of these.

Huffman encoding, one of the better known compression techniques, commonly encodes ASCII text into a character set with a variable number of bits. The most frequent characters are assigned to the shortest encoded representation. huffman encoding at the character level typically reduces English text file lengths by 30 percent. Because of variable length character, a single byte in Huffman encoding may represent from less than one to parts of three characters.

Well designed compaction algorithms can reduce the number of bits necessary to encode a character in arbitrary English text to between 2 and 3 bits. Compaction algorithms for highly structured messages can and should be optimized to remove the message structure and redundancy in order to maximize the difficulties faced by the cryptanalysist. When transposition operation are applied to compacted text, structure needed for the decompression routines is diffuse over the message, further hindering the cryptanalysist, who must both decipher the encryption scheme and reconstruct the compressed text.

A Generalized Rotor Cryptographic Operator

In the 1930s encoding machines were introduce that utilized a series of rotors: electrical scrambling units that substitute one character for another. The shaft-mounted rotors were turned by gears and cams. Encryption keys were established by changing rotors, their order in the machine, rotational settings, and other mechanical parameters, thus making analysis of the resulting code difficult. However, the relatively regular manner in which the rotors moved in relationship to one another, as well as the fixed order of the rotors during any given encipherment session, rendered electro-mechanical rotor machines vulnerable to analysis.

Software implementations of rotor techniques are not constrained by the limitations imposed by mechanical gearing on the electro-mechanical machines. A linked substitution list (rotor) technique is resistant to analysis and is well suited for implementation on microcomputers. I call this technique the "Generalized rotor." Its only drawback is the requirement to store the rotor sets in memory during processing and in a readable medium between processing sessions.

This technique utilizes a set of substitution lists (rotors) such that each rotor is a randomly scrambled list of numbers from 0 to p-1, where p is the list length (rotor size), with each element appearing only once. The decryption rotors are the inverse of the encryption rotors, and are calculated form them, but need to be available as a separate rotor set. The code in Listing One provided a means of generating a set of encryption and decryption rotors given a sufficiently long stream of random bytes. No relationships or correlations should exist between the various rotors in the rotor set or between elements in any given rotor. In the descriptions that follow, p and the number of rotors in the rotor set are 256 because of the calculational convenience of this choice. Other choices could easily be made. If ASCII files are to be transformed and it is desired, form an implementation point of view, to keep the rotors and other working variables within a 64K block, it would be logical to use 128 rotors, each 128 bytes long.

The generalized rotor operates by using the outputs of pseudo-random sequence generators to choose the rotors, as well as the relative offsets of each rotor in the working set. After each character is enciphered, the rotors in the working set and their offsets are set to the new values supplied by the pseudo- random sequence generators. The total change of the rotor set and offset values after each character is processed eliminates the regularities that render mechanical rotor systems vulnerable to analysis. There are several "keys" in this system: the contents of the encryption and decryption rotors, the algorithms used in the pseudo-random sequence generators, and their starting contents. The seed of each pseudo-random generator, or the string form which it is created, is the working key for the system and should be changed frequently. The rotor contents and pseudo- random sequence generator algorithms constitute the fixed key and can be changed at greater intervals. The C code block in Listing Two implements encipherment and decipherment of individual characters utilizing generalized rotor substitution operations where the rotors are stored as two dimensional arrays. Use of these routines requires that the message stream be fed through the transformations character by character.

The generalized rotor operator may be viewed as a very complex mixer of multiple pseudo-random sequences with a chosen text stream. The mixing of both is resistant to analysis and is easy to implement. The encryption and decryption rotor sets are implemented as large single arrays, minimizing the index computations necessary to perform the operations. Each rotor uses two pseudorandom values, one for the rotor selection and the other for the rotor offset. The most secure way of selecting these values is to produce each one by a separate generator. The generators have periods that are long and do not share common factors, preventing undesired periodicities in the sequences. A four rotor system thus needs eight sequence generators. The additive generators discussed earlier are well suited for such sequences.

The use of generalized rotor operations--rather than addition--as the combining operation in such a feedback shift register is fast and hard to analyze. If the two values fed back determine the rotor choice and rotor offset values, the resulting value is derived from the rotor transformation X(n) = rot[X(n-a)][X(n-b)]. Such a combining operation has apparently random mapping characteristics and is not easily inverted. The inverse operation of determining what prior states could have yielded a given output is many to one: for rotor sets of 256 rotors, each 256 long, there are 256 previous states that could have yielded a given output.

There is no theory to predict the period lengths of the sequences produced by such generators. The apparently random nature of the mappings suggests that the period is likely to be on the order of p^-1, but can not be guaranteed to be this large. Such sequence generators are very fast due to the speed of memory access, and they add considerable difficulties to the work of the cryptanalysist. This is the whole point of encryption.

Reference Notes

Martin Kochanski, "A Survey of Data Insecurity Packages," Cryptologia (Jan. 1987): 1-15.

Rivest, Shamir, and Adleman "A Method for Obtaining Digital Signatures and Public Key Cryptosystems," Proceedings of the IEEE (March 1979): 397-426.

Martin E. Hellman, "The Mathematics of Public Key Cryptography," Scientific American (Aug. 1979): 146.

Martin E. Hellman and W. Diffie, "New Directions in Cryptography," IEEE Transactions on Information Theory (Nov. 1976): 614-654.

W. Diffie and Martin E. Hellman, "Privacy and Authentication, an Introduction to Cryptography," Proceedings of the IEEE (March 1977): 397-426.

David Kahn, The Codebreakers (New York, N.Y:: MacMilian Co., 1967).

Rudolph F. Lauer, Computer Simulation of Classical Substitution Cryptographic Systems, (Laguna Hills, Calif.: Aegean Park Press, 1981>.

Gilbert Held, Data Compression (New York, N.Y.: John Wiley & Sons, 1983). Frank Rubin, "Experiments in Text File Compression," Communications of the ACM (Nov. 1976): 617-623.

Eugene S. Schwartz, "A Dictionary for Minimum Redundancy Encoding," Journal of the ACM (Oct. 1963): vol. 10, no. 4, 423-439.

Richard Tropper, Binary Coded Text," Byte (April 1982): 396-413.

Jorma Rissanen, "A Universal Data Compression System," IEEE Transactions on Information Theory (Sept. 1983): 656-664.

J. Ziv and A. Lempel, "Compression of Individual Sequences via Variable Rate Coding," IEEE Transactions on Information Theory (Sept. 1978): 530-536.

Bruce Hahn, "A New Technique for Compression and Storage of Data," Communications of the ACM (Aug. 1974): 434-436.

Terry A. Welch, "A Technique for High Performance Data Compression," IEEE Computer (June 1984): 8-19.

Ian H. Witten, Neal M. Radford, and John G. Cleary, "Arithmetic Coding for Data Compression," Communications of the ACM (June 1987): 520-540.

John G. Cleary and Ian H. Whitten, "Data Compression Using Adaptive Coding and Partial String Matching," IEEE Transactions Communication (April. 1984): 396-402.

Donald E. Knuth, The Art of Computer Programming, vol.1, Seminumerical Algorithms, 2d ed, (Reading, Mass.: Addison-Wesley 1981).

Solomon W. Golomb, Shift Register Sequences, Revised Second Edition, (Laguna Hills, Calif. Aegean Park Press, 1982).

Henry Becker and Fredrick Piper, Cipher Systems, (New York, N.Y.: John Wiley & Sons, 1982).

T. Siegentanler, "Correlation-Immunity of Nonlinear Combining Functions for Cryptographic Applications," IEEE Transactions on Information Theory (Sept. 1984): 776-780.

Herbert S. Bright and Richard L. Enison, "Quasi-Random Number Sequences From a Long-Period TLP Generator With Remarks on Application to Cryptography," Computing Surveys (Dec. 1979): 357-370.

T. Etzion and A. Lempel, "Algorithms For The Generation of Full-Length Shift-Register Sequences," IEEE Transactions on Information Theory (May 1984): 480-484.

Ronald L. Rivest, "Forwards and Backwards Encryption," Cryptologia (Jan. 1980): 30-34.

John R. Michener, "The `Generalized Rotor' Cryptographic Operator and Some of its Applications," Cryptologia (April 1985): 97-114.

John R. Michener, "Application of the Generized Rotor Cryptographic Operator in the Construction of Substitution-Permutation Network Block Codes," Cryptologia (July 1985): 193-202.

John R. Michener, "The Application of Key Dependent and Variable Rotor Sets to Generalized Rotor Cryptographic Systems," Cryptologia, (July 1987): 166-171.

Claude E. Shannon, "Communication Theory of Secrecy Systems," Bell System Technical Journal (Oct. 1949) 656-715.

John Kam and George Davida, "Structured Design of Substitution-Permutation Encryption Networks," IEEE Transactions on Computers (Oct 1979): 747-753.

Horst Feistel, "Cryptography and Computer Privacy," Scientific American (May 1973): 15-23.

Arthur Sorkin, "Lucifer, A Cryptographic Algorithm," Cryptologia (Jan 1984): 24-41.

Dov Andelman and James Reeds, "On the Cryptanalysis of Rotor Machines and Substitution-Permutation Networks," IEEE Transactions on Information Theory (July 1982): 578-584.

John R. Michener, "The Use of Complete, Nonlinear, Block Codes for Nonlinear, Noninvertible Mixing of Pseudo-Random Sequences," Cryptologia (April 1987): 108-112.

M.B. Greenlee, "Requirements for Key Management Protocols in the wholesale Financial Services Industry," IEEE Communications Magazine (Sept. 1985): 22-28.

David M. Baleson, "Automated Distribution of Cryptographic Keys- Using the Financial Institute Key Management Standard," IEEE Communications Magazine (Sept. 1985): 41-46

Christian Mueller-Schloer, "A Micro processor-Based Cryptoprocessor, IEEE Micro (Oct. 1983): 5-15.

_A TOOL FOR SECRET KEY CRYPTOGRAPHY_ by John R. Michener

[LISTING ONE]

<a name="017e_0008">


/* -------------------- Listing 1 -----------------------------*/
/* this routine uses a supplied file (n>64K) of random one byte
numbers to generate a set of encryption and
decryption rotors.  These rotors are supplied sequentially in a
128K file.  The first 64K of the file contains
the encryption rotors in series, the second 64K contains the
decryption rotors in series.  The order within the
file is the 256 elements of the first rotor in sequence followed
by the 256 elements of the second rotor, on
through all the rotors in the set.

Since random files are not easily created by users it is
necessary to create a close approximation to random
files.  This can be done by taking a variety of text and program
files, compressing them, encrypting them with
random keys, and concatenating them to form a long binary file.
If English text was the source material this
compressed, binary file should be at least 256K bytes long
(making allowances for the redundancy of English).
The resulting 4:1 overlap of compressed English removes the
redundancies and residual order present in the
original language file.  The length of the binary file should be
entered into the #define NN line since the
binary nature of the file prevents EOF characters.  Use command
line file direction to read the random file
into the program and direct the output into the rotor file. */

#include <stdio.h>
#define NN   /* length of random file in bytes */

main()
    {
    unsigned char rnd[65536];   /* random number array   */
    unsigned char rotor[2*65536];
    register int k, temp, c;
    register long i, j;

    for(i=0, j=0; i<NN; i++)
        {
        rnd[j++] ^= ((unsigned char)(c=getchar()));
        if (j==65536)
            j = 0;
        }

    for(i=0; i<65536; i+=256)
        for(j=0; j<256; j++)
            rotor[i+j] = j;

    for(i=0; i<65536; i+=256)
        for(j=0; j<256; j++)
            {
            k=rnd[i+j];
            temp = rotor[i+j];
            rotor[i+j] = rotor[i+k];
            rotor[i+k] = temp;
            rotor[65536 + i + rotor[i+j]] = j;
            rotor[65536 + i + rotor[i+k]] = k;
            }

    for(i=0; i<2*65536; i++)
        putchar(rotor[i]);
    }

-------------------------------------------------------------------------

<a name="017e_0009"><a name="017e_0009">
<a name="017e_000a">
[LISTING TWO]
<a name="017e_000a">

/* ----------------------------Listing 2 -------------------------------*/
/* in the following subroutines the PR numbers used for GR
operations are stored in rndout[] and are accessed
by the offset values ro.  These values are incremented by 8 for
each character processed to allow for the 8 PR
numbers needed per GR operation. */

/*________ character substitution encryption _________________*/
unsigned char sub(ch,ro)
unsigned char ch;       /* character to be encrypted */
int ro;                 /* offset in random number array */
    {
    int i;

    for(i=0;i<4;i++)
ch=rotor[(int)((rndout[i+ro])<<8)+(int)((rndout[i+4+ro]+ch)&255)];
    return(ch);
    }

/*__________ character substitution decryption _____________*/
unsigned char unsub(ch,ro)
unsigned char ch;       /* character to be decrypted */
int ro;                 /* offset in random number array */
    {
    int i;

    for(i=3;i>=0;i--)
ch=(rotor[65536+(int)((rndout[i+ro])<<8)+(int)ch]-rndout[i+4+ro])&255;
    return(ch);
    }

----------------------------------------------------------------------










Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.