Programmers have used the Cyclic Redundance Check (CRC) algorithm for years
to uncover errors in a data transmission. It turns out that you can also use
CRCs to *correct* a single-bit error in any transmission. I first heard
about error correcting CRCs in a conversation I had several years ago [1]. At
the time, I thought this feature of CRCs was general knowledge, but as I did
more research, I saw no mention of CRC error correction in the popular literature.
The traditional response to a CRC error is re-transmission. However, the advance
of computer technology has led to some situations where it is actually preferable
to correct single-bit errors rather than to resend. Some examples include:

- Satellite transmission -- If a host is sending data via a satellite, the cost of sending a regular packet is high, so the cost of a resend just doubles the price for the packet.
- High-speed transmission -- In the future, there may be a tendency to push the technology. (Let's crank this baby up and see what it will do.)The faster bits move through a medium, the higher the probability of error.
- PowerLine Carriers -- Metricom Corporation, a supplier of integrated circuits for computer applications states, "There is a growing interest in the use of PowerLine Carrier (PLC) for data communication using the intrabuilding electric power distribution circuits. Power lines were not designed for data communications and exhibit highly variable levels of impedance, signal attenuation and noise... Harmful effects of impulse noise on data communications systems can be expected." [2].

### Error Correcting CRCs

The algorithm for error correcting CRCs involves determining the remainder after dividing in binary (modulo 2). The algorithm is table driven. The length of the message is determined, a priori, and agreed to by both the sender and the receiver. A generator polynomial value that can be used to determine the position of errors anywhere in the message is also selected at that time.

Before a message is sent, a checksum is calculated and appended onto the end
of the message. The message and checksum are then sent. The receiver gets the
message and calculates a second checksum (of both parts). If the new checksum
value is **0** then the message is considered valid. If not, the checksum
value is used as a subscript in an error correction table to determine the position
of the bad bit. This method will find and correct 1-bit errors. The receiver
builds the error correction table using the chosen Generator Polynomial (GP).
The GP has to be carefully selected, because different GPs have different characteristics.
I discuss the selection of an error correcting GP later in this article.

### Building the Tables

To build the error correction table, I begin with a series of **0**s representing
correct data. One of the bits is then set to **1**. Using modulo 2 division
(exclusive-or), the receiver divides the message by the GP, calculating the
remainder that is used as a subscript in the error correction table. See Algorithm
1.

Algorithm 1:

for (i = 1; i<=Message_Length; i++) { set all bits in the Message to 0 change the i'th bit to 1 calculate the checksum (cs) EC_table[cs] = i }You only have to build this table once for each generator polynomial.

### Building the EC Table for 1011

In the example that follows, I have chosen a 4-bit generator polynomial. With
4-bit GPs, only two GP values can be used for error correction--decimal values
**11** and **13**. I chose **11**. When one divides the message by
the 4-bit GP, the result is a 3-bit remainder, yielding values from **0**
to **7**. This result implies that I can use this GP for a message with a
total length from 1 to 7 bits. (Four bits for the message and 3 bits for the
checksum). The result of the calculation of the checksums is shown in Table
1. At the bottom of the table are the decimal values of the remainders after
the division.

These values are then used to fill the Error Correction (EC) table. The EC
table, in checksum order, is shown in Table
2. You could initialize an array in C with these values. Usually, **EC[0]**
is not used.

int EC[] = {0,7,6,4,5,1,3,2};This process makes up the test for error correcting generator polynomial validity. For a given number, if the entire table cannot be built (i.e., if two or more numbers map to 1 slot), the number chosen cannot be used as an error correcting generator polynomial. In practice, when data is sent, the sender computes the checksum (cs) and then appends the checksum onto the end of the data. When the receiver receives the message combined with the checksum, the receiver computes another checksum (cs2). If the second checksum is zero, the data is (supposedly) ok. If cs2 is NOT zero,

**EC[cs2]**contains the location of the bit in error.

### Sending a Message with GP = 1011

Start with the following generator polynomial: **1011**. Assume the Original
Data is **1100**. First append 3 additional bits (with value 000) on the
end. Then calculate the checksum (using exclusive-or):

1100000 1011 ---- 0111000 1011 ---- 1010 1011 ---- 010The resulting checksum is:

**010**and the string that is sent is the original data appended with the checksum:

**1100010**. Suppose the following string of bits is received

**1101010**. (Numbering from left to right, bit 4 has changed.) The receiver computes the checksum.

1101010 1011 ---- 01100 1011 ---- 1111 1011 ---- 1000 1011 ---- 011The checksum is binary

**011**, which is

**3**in decimal.

**EC[3]**is 4, which is the position of the bad bit! (Refer to Table 2) You can use this scheme for much longer messages, but the preceeding example demonstrates the essence of the algorithm. I wrote a program that determines whether a given bit pattern can be used as an error correcting GP (see Listing 1). None of the existing widely used GPs work for error correction (see the sidebar titled "Generator Polynomials").

There are ways of finding the bad bit without using tables. The following code is a slight modification of an algorithm presented by Fred Halsall [4] for computing an 8-bit CRC.

Algorithm 2:

hpo2 = 8; v = 1; for (i = hpo2-1; i>=1; i--) { if (v == checksum) bad_bit_position = i; v = v + v; if (v >= hpo2) v = v ^ GP; }If you have a lot of packets to check, table lookup is faster.

### Using a Finite State Machine

Intrigued by the idea of CRC error correction, I looked at the method from
different angles, and, as a result, I discovered a method for error correcting
CRCs using finite state methods. Given the following generator polynomial: **11**
(**1011** in binary), I first determine the value of the left-most 1 bit
(8) in the GP. I call it **hpo2**, and it is equal to the highest power of
2 in the GP. Then I build a Finite State Table (FST) for GP = **1011**. This
table represents states of the machine. The table has **hpo2**-1 rows and
2 columns (**0** and **1**). The procedure for building an FST is as follows:
Let **t** equal the current row number. The row number represents the current
state. Double it. (**t = t * 2**) Add the current column number (zero or
one) to **t**. If the value of **t** is **>= hpo2**, then exclusive-or
it with the GP. Place the value of **t** into the current row and current
column of the table. The following C code computes the values in the FST.

Algorithm 3:

for (r = 0; r < hpo2; r++) for (c = 0; c < 2; c++) { t = r*2 + c; if (t >= hpo2) t ^= GP; FST[r][c] = t; }For GP =

**1011**, this algorithm produces the finite state table shown in Table 3.

Both the sender and the receiver must build the FST. Only the receiver builds the error correction table and does so using the following code:

Algorithm 4:

pos = 7; cc = 1; for (r = 0; r < hpo2; r++) { EC[cc] = pos; pos--; cc *= 2; if (cc >= hpo2) cc ^= GP; }The result is the CRC error correction table for

**1011**and is equivalent to Table 2.

### Correcting a 1-bit Error Using an FST

An example sequence of sending a message, losing a bit, then recovering the bad bit follows:

Algorithm 5:

Sender Given an original message, say 1100 Append zeros for the checksum on the end: 1100000. Traverse the FST... cs = 0; for (i=1; i < hpo2; i++) cs = FST[cs][msg[i]]; ... giving a 2 (010 in binary) for the checksum (cs). Send the message with a binary 2 appended: 1100 010 Receiver Receive a message with 1 bit changed (say bit 4): 1101 010 Traverse the FST to get a value (cs2) of 3 if cs2 = 0 then the message is ok else EC[3] = 4, therefore bit 4 is bad Correct the bad bit: 1100 010 Discard the checksum: 1100

### A Sample Program

Listing 2 is a sample program that builds
the tables, generates a random string, and changes a random bit. Part 2 of the
program determines the position of the changed bit and corrects it. (The code
is written with the message and checksum in an array of **int**. This approach
demonstrates the logic while making the code much simpler to read.)

### References

[1] Conversation with Maartin van Sway, now Professor Emeritus at Kansas State University.

[2] Direct quote from "http://www.metricom-corp.com/fec.html"

[3] Andrew S. Tanenbaum, Computer Networks, Prentice-Hall. 1996.

[4] Fred Halsall *Data Communications, Computer Networks and Open Systems*,
Addison-Wesley, 1996, (p. 137).

### About the Author

Bill McDaniel received his Ph.D. from Kansas State University and
is currently the Chair of the Department of Computer Science at the University
of Central Oklahoma. His interests include networking, encryption, CGI programming,
and operating systems. He is the author of the article "Efficiently Sorting
Linked Lists," which appeared in the June 1999 issue of *Dr. Dobbs Journal*.