Channels ▼


The Block Cipher Square Algorithm

Source Code Accompanies This Article. Download It Now.

Dr. Dobb's Journal October 1997: The Block Cipher Square Algorithm

A fast block cipher that uses a 128-bit key

Joan is an employee of Banksys, Brussels. He can be contacted at [email protected] Lars and Vincent are with the COSIC research group of the Department of Electrical Engineering of the Katholieke Universiteit Leuven, Belgium. They can be contacted at [email protected] and [email protected], respectively.

Sidebar: The Galois Field GF(28)

Square is a new fast block cipher that encrypts data in blocks of 128 bits, using a 128-bit key. Most cipher designs are optimized for one specific implementation platform. Square's structure (that is, its building blocks and their interaction) has been carefully chosen to allow very efficient implementations on a wide range of processors. This versatility is achieved through the use of operations in the finite field GF(28), which is a new approach in block cipher design. A reference implementation was written in C by Paulo Barreto ([email protected]) and runs at 2.63 MB on a 100 Mhz Pentium (available electronically; see "Availability," page 3). By optimizing in assembler, this speed was doubled.


Square is an iterated, but not a Feistel, cipher. Instead, every round acts upon all 16 bytes of its the input. One round of Square consists of four different operations. These operations can be efficiently combined in a single set of table lookups and XOR operations. The round operations are easiest to visualize if you think of the 16 bytes as being arranged in a 4×4 array; see Figure 1. An intermediate value a can then be represented by the 16 bytes ai,j,0<=i,j< 4. The four operations are:

  • A linear transformation that operates separately on the four rows of four bytes each. [theta] is determined by four coefficients cj,0<=j<4. b=[theta](a) corresponds to

    In this formula, additions and multiplications are performed in the Galois field GF(28); see the accompanying text box entitled "The Galois Field GF(28)."

  • A nonlinear transformation [gamma] that uses one invertible 8-bit S-box. Each of the 16 bytes is transformed independently by the S-box.
  • A byte permutation [pi] that rearranges the bytes by interchanging rows and columns (that is, it transposes the array).
  • Key addition. Exclusive-Or with the round key kt is denoted [sigma][kt] (Exclusive-or corresponds to addition in the field GF(28)).
  • The round operation is denoted [rho][kt] and defined as
    Square is defined as eight rounds preceded by an extra key addition and by q-1: Square[k]=


Square was designed following the Wide Trail Design Strategy (see "Cipher and Hash Function Design Strategies based on Linear and Differential Cryptanalysis," by Joan Daemen, doctoral dissertation, March 1995, K.U. Leuven), which separates the functionality of the block cipher round operation in separate layers -- the nonlinear substitution, linear diffusion layer, and key addition. Strict criteria are imposed on each layer to build ciphers that resist linear and differential cryptanalysis.

For a description of the basic principles of linear and differential cryptanalysis, refer to "Differential and Linear Cryptanalysis," by Bruce Schneier (DDJ, January 1996). In short, differential cryptanalysis works by applying small changes to the input of the algorithm and predicting the resulting differences at various intermediate stages of the algorithm. Resistance against differential cryptanalysis can be improved by ensuring that small differences at the input cause large differences at the output (and intermediate) values. In Square, this diffusion of differences is ensured by the two linear transformations: [theta] and [pi].

q takes care of the intrarow diffusion. If one byte of a row at the input is different, all four bytes of this row in the output are different. For example, if n bytes of a row at the input are different (1<=n<=4), at least 5-n output bytes of this row are different.

[pi] produces interrow diffusion. After [pi], the four bytes of a row are spread over the four different rows.

Together, [theta] and [pi] cause a very quick spread of differences; see Figure 2. Suppose two bytes of the same row at the input of the first round are different, then after [theta], at least three bytes of this row are different. [rho] distributes these bytes to three different rows at the input of the next round. Then 3×4 of the output bytes of [theta] are different. The differences are distributed by [rho] over the four rows, and so on.

An immediate consequence of the mathematical structure of Square is that the resistance against linear cryptanalysis can be treated in an analogous way. Square is resistant to linear and differential cryptanalysis after six of its eight rounds.

There exists, however, a dedicated attack that breaks six rounds of Square (see our paper "The Block Cipher Square," in Fast Software Encryption, E. Biham, editor, Springer-Verlag 1995). Since Square has eight rounds, this dedicated attack poses no problems.


To illustrate how Square can be used, we have written a smart card application for the Motorola M68HC05 microcontroller. (Smart cards are becoming important as a platform for implementing cryptography and security.) The machine code occupies 547 bytes of ROM and needs 36 bytes of RAM. One execution of Square (including the key schedule) takes about 7500 cycles. This corresponds to less than 2 milliseconds with a 4-MHz clock. In comparison, a typical DES implementation occupies about 1 KB of ROM. One DES operation encrypts only 64 bits and needs about 16,000 cycles.

Writing out the consecutive transformations of the cipher for 32-bit processors, the succession of steps

occurs seven times. These transformations can be combined into a single set of table lookups. An intermediate value of the text can be represented by four 32-bit words, each containing a row [ai]. Its transpose is denoted by [ai]T. For

we have Figure 3(a). We define the tables M and T as Figure 3(b). T and M have 256 entries of four bytes each. The table M implements the multiplications in GF(28). T combines the nonlinear substitution with the multiplications. Define rotrj([ai]) as the rotation of the row [ai] with j bytes to the right. This yields Figure 3(c). We conclude that

can be done with 16 table lookups, 12 rotations, and 16 XORs of 32-bit words. This implementation needs the table T, with 256 entries of four bytes -- 1 KB in total.

Further Reading

More details about Square are available in our paper "The Block Cipher Square," which has been published in Fast Software Encryption (E. Biham, editor, Springer-Verlag 1995) and at http://www


Copyright © 1997, Dr. Dobb's Journal

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.