Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Channels ▼

Global Developer

The Book Cipher Algorithm

Source Code Accompanies This Article. Download It Now.

Unless you're a professional cryptanalyst, writing cryptography code means meddling with "powers" you cannot fully comprehend, and seemingly insignificant slips can be fatal. During World War II, for instance, Polish and British mathematicians broke Germany's Enigma code only because the same message-key was enciphered twice at the beginning of every message. The Germans did this to avoid mistakes caused by radio interference, but at the same time, it ruined their carefully planned cryptosystem. And how many slips are there in the code that multiply big numbers, look for 1000-digits primes, and encrypt the fixed header of your document?

With the Book cipher algorithm, you're safe from these kinds of errors because it is simple enough that you can code it in a few lines of C that are completely understandable, but still extremely secure. The so-called Beale ciphers (unmuseum.org/beal.htm), which point to a location of buried treasure somewhere in Bedford county, were coded in 1885, but still have not been decoded. This secret (or maybe hoax) has occupied some of the best cryptanalytic minds. Likewise, when Simon Singh gave 10 problems in the appendix of The Code Book, problem #5 (Book cipher) was the most difficult one for the winners of the £10,000 prize (www.simonsingh.com/Cipher_ Challenge.html). Still, the Book cipher has probably never been used in commercial software.

Book Cipher Algorithms

Basically, the Book cipher algorithm uses letters of subsequent words in some text or book as a key to encode a message. Figure 1 is the simplest form, usually called the "running key cipher." In this case, text (usually from a book) is used to provide a very long key stream. The book used is agreed upon in advance, while the passage used is chosen randomly for each message and secretly indicated somewhere in a previous message. In this example, we agreed to use J.K. Rowling's Harry Potter and the Order of the Phoenix and to start on page 335, line 28, with the sentence, "Hermione bit her lip and did not answer." We write this text under the plaintext and use it as the running key. The particular message to send is "DRDOBBS." We XOR the corresponding characters of the message and the running key to get the ciphertext 12 23 22 2 11 13 29.

Plaintext         D   R   D   O   B   B   S
Plaintext (hex)   44  52  44  4F  42  42  53
Running key       H   E   R   M   I   O   N
Running key (hex) 48  45  52  4D  49  4F  4E
Ciphertext (hex)  0C  17  16  02  0B  0D  1D
Ciphertext        12  23  22  2   11  13  29

Figure 1: Running key cipher.

The running key cipher is much better than the famous Vigenère cipher because we do not repeat the key—a book is hopefully long enough to encode everything we have to say. However, security is still poor because the entropy per character of both the plaintext and the running key is low, and the combining operation is relatively easily inverted. By guessing probable plaintexts along the ciphertext, attackers will eventually recognize the book and break the code.

A better idea is to replace words in the plaintext with the location of words from a book. For example, to encode the word "computer," you look for the first appearance of "computer" in the previously chosen book and enter its position as the cipher text. The second appearance of the word "computer" is replaced by the position of the second appearance of that word in the book, and so on. The real problem with this approach is finding the word: If you agree to use David Copperfield as a code book, and then try to encrypt an article about hash functions, you are unlikely to find all the necessary words.

An alternative approach that gets around this problem is to replace individual letters rather than words, in which case the Book cipher is properly a cipher. Figure 2 illustrates the concept. We are encoding a message "DRDOBBS" using the same passage from Harry Potter and the Order of the Phoenix. To code the letter "D," we look for the first word in the passage starting with "D" (it's the 6th word, "did"). Then we look for the first word starting with "R" (the 11th word, "rang"), then for the next word starting with "D" (the 16th word, "down"), and so on. The final ciphertext is 6, 11, 16, 17, 2, 10, 15.

[Click image to view at full size]

Figure 2: Encoding the message "DRDOBBS" using the Book cipher.

Decoding is even simpler. We take the ciphertext number-by-number and look for the corresponding words in the book, generating the original plaintext.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.