Reed-Solomon Error Correction

Reed-Solomon error correction is usually implemented in hardware. Here, Hugo presents a highly optimized software implementation of Reed-Solomon error correction, written in C++ and assembly language.


January 01, 1997
URL: http://www.drdobbs.com/cpp/reed-solomon-error-correction/184410107

Transmission media, such as telephone lines, wide-area networks, and satellite links, or storage media, like optical/magnetic disk and tape, are usually imperfect. To make digital data transmission and storage reliable, error-correcting codes are indispensable. The efficient Reed-Solomon class of error-correcting codes is highly popular in many applications, including CD, DAT (digital audio tape) players, tape-backup systems, and similar devices. These all use a hardware implementation of Reed-Solomon. However, because microprocessor speeds have increased considerably, software implementations are also feasible in some applications.

In this article, I'll present a highly optimized software implementation of Reed-Solomon error correction, which I implemented while developing the "Video Backup System" for the Commodore Amiga. This product lets users connect a VCR to the Amiga via a hardware interface to use the VCR as a file-backup device. I developed the software that encodes computer data in a video signal and performs error correction on information read back from the VCR to compensate for drop-outs on the video tape. Even though the original Amiga implementation was written in C and 68000 assembly language, the version I'll discuss here is implemented in 32-bit Intel x86 assembly language using Borland C++ 4.2 and TASM 4.0.

Reed-Solomon Codes

A codeword of a Reed-Solomon (RS) code consists of a sequence of symbols over the Galois field GF(q). GF(q) is a finite field with q elements, where q=p^m for some prime number p and positive integer m. (For an introduction to Galois fields, see "Error-Recovery Codes," by Bart de Canne, DDJ, December 1994.) Reed-Solomon codes have, by definition, a codeword length of N=q-1 symbols. Out of these N, let K be the number of actual information symbols. The other N-K are redundancy symbols. The amount of redundancy in each codeword defines the maximum number of symbol errors that can be corrected. The necessary error-correcting capability of the code can be determined from the quality of the medium (raw error rate) and the desired reliability of the application. Let T be the number of symbol errors that can be corrected in each codeword. Because all Reed-Solomon codes are maximum-distance separable, the necessary overhead to correct up to T symbol errors is only 2T symbols. Therefore, the actual information content of each codeword, K, is N-2T symbols. The resulting data rate is K/N, or 1-(2T/N).
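
For example, with the parameters used later in this article (N=255 and T=8), each codeword carries K = 255-16 = 239 information symbols, for a data rate of 239/255, or roughly 94 percent.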

Each codeword (c_0, c_1, ..., c_(N-1)) can be identified with a polynomial over GF(q) in the following way:

c(x) = c_0 + c_1x + c_2x^2 + ... + c_(N-1)x^(N-1)

Reed-Solomon codes are cyclic. For each codeword (c_0, ..., c_(N-1)) in a RS code C, (c_(N-1), c_0, c_1, ..., c_(N-2)) is also in C. Viewing codewords as polynomials, the aforementioned statement can be written as:

if c(x) is in C, then x*c(x) mod (x^N - 1) is also in C.

A T-error-correcting RS code C can be described by a single unique polynomial g(x), its generator polynomial. g(x) is a polynomial of degree 2T, and the coefficient of x^(2T) in g(x) is 1 by definition. c(x) is a codeword in C only if it is divisible by g(x). Let α be a primitive element of GF(q). This means that every nonzero element of GF(q) can be written as a power of α. For a RS code, the roots of g(x) are 2T consecutive powers of α; for the code constructed in Listing Two,

g(x) = (x - α^0)(x - α^1)(x - α^2) ... (x - α^(2T-1))

Choosing Code Parameters

To maximize efficiency, choose code parameters that match the architecture of the implementation platform as closely as possible. A convenient unit of information on the 32-bit Intel CPU (and most other computer architectures) is a byte, which can be stored, retrieved, and manipulated very efficiently. Therefore, in my RS implementation, each symbol is represented by one byte. (For a more complete example of encoding/decoding, see the section at the end of this article entitled "A Complete Reed-Solomon Encoding and Decoding Example.") A byte has 256 possible values, so the most convenient Galois field size is q=256=2^8, with p=2 and m=8. When shortened RS codes are ignored, this automatically implies that N=255. The only degree of freedom left at this point is the choice of T, the maximum correctable number of symbol errors per codeword. Although the computer architecture does not dictate a value for T, there is a slight advantage if T is a power of 2. For my application, I implemented RS for T=8. Using this as a starting point, it is easy to create your own implementation for other values of T.
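
The header rs.h, which the listings include, is not reproduced in this article. The following sketch shows plausible definitions for the constants the listings rely on, derived from the parameter choices above; the actual header may differ in detail.

#define GF_M     8                /* symbols are elements of GF(2^m), m = 8      */
#define FIELDSZ  (1 << GF_M)      /* 256 field elements                          */
#define MAXELT   (FIELDSZ - 1)    /* 255 = largest symbol value, also N          */
#define T        8                /* correct up to 8 symbol errors per codeword  */
#define N        MAXELT           /* codeword length: 255 symbols                */
#define K        (N - 2*T)        /* information symbols per codeword: 239       */
typedef unsigned char UBYTE;      /* one symbol occupies one byte                */
typedef struct {                  /* polynomial type used by Listing Two;        */
    int   degree;                 /* sized here just large enough for g(x)       */
    UBYTE c[2*T + 1];
} Polynomial;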

Implementation of GF(2^8)

GF(2^8) can be regarded as the set of polynomials of degree <8, with coefficients in GF(2); that is, each coefficient is 0 or 1. Addition, subtraction, multiplication, and division are all defined operators. In GF(2^8), adding two elements is accomplished by adding their corresponding coefficients modulo 2. Since subtraction modulo 2 is the same as addition (1-1 = 1+1 = 0 mod 2, 0-1 = 0+1 = 1 mod 2), subtraction and addition are the same in GF(2^8) as well.

In Example 1, the most-significant bit (MSb) in a byte represents the coefficient of x^7; the least-significant bit (LSb) represents the coefficient of x^0. For example, x^7+x^4+x^2 is represented by the byte 10010100. The assembly language equivalent of computing the sum of two elements of GF(2^8) is applying the exclusive-or instruction (XOR) to the byte representations of the elements. Multiplication in GF(2^8) is done modulo a polynomial f(x) of degree 8. f(x) has to be irreducible over GF(2), which means that it cannot be written as a product of two or more other polynomials over GF(2) of lesser degree. If f(x) is irreducible and every nonzero element of GF(2^8) can be written as a power of x modulo f(x), f(x) is called a "primitive polynomial" and x is a primitive element. Bart de Canne's article contains code for searching for such primitive polynomials (there is always at least one for every Galois field). I use the following primitive polynomial to construct GF(2^8): x^8 + x^6 + x^5 + x^4 + 1.

Example 1: Addition and subtraction in GF(2^8).

(x^4 + x^2) + (x^2 + 1) = x^4 + 1
0 - (x^4 + x) = x^4 + x
x + x = 0
x - x = 0
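
Because addition and subtraction both reduce to XOR on the byte representations, a C version of Example 1 is a one-liner. The helper name gf256_add below is mine, not part of the article's code; UBYTE is the one-byte symbol type used throughout the listings.

/* GF(2^8) addition (and subtraction) is XOR of the byte representations.  */
UBYTE gf256_add(UBYTE a, UBYTE b) { return a ^ b; }
/* Example 1 in bytes: x^4 + x^2 = 0x14, x^2 + 1 = 0x05,                   */
/* and gf256_add(0x14, 0x05) == 0x11, the byte for x^4 + 1.                */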

Example 2 illustrates multiplication of elements of GF(2^8). My implementation uses a 256×256-byte array called RSG_multiply to implement it efficiently. The Intel architecture has a quirk that makes it efficient to access this array, given two values (I and J) to multiply: of the lower 16 bits of the data registers eax through edx, the CPU can separately access the upper and lower byte. Given that the upper 16 bits of ebx are 0 and that edi points to RSG_multiply, Example 3 is all that is needed to load the product of I and J into al. Understanding this makes the assembly listing easier to read. If you are hard-pressed for registers, you could even free up the register edi by making sure that the RSG_multiply array resides at a 64-KB boundary in memory, to which the upper 16 bits of ebx could then point.

Example 2: Multiplication in GF(2^8) is done modulo f(x) = x^8 + x^6 + x^5 + x^4 + 1.

(x^4 + x^2) * (x^2 + 1) = x^6 + x^2
x^8 = x^6 + x^5 + x^4 + 1
x^255 = x^0 = 1

Example 3: Loading the product of I and J.
mov bl,I
mov bh,J
mov al,[edi+ebx]
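
In C, given the two symbol values I and J, the same lookup is a single array reference. The check value in the comment is mine, derived from Example 2 (x^6 + x^2 is the byte 0x44):

UBYTE product = RSG_multiply[I][J];   /* same table lookup as the asm above  */
/* For instance, RSG_multiply[0x14][0x05] == 0x44,                           */
/* because (x^4 + x^2)(x^2 + 1) = x^6 + x^2.                                 */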

Division of an element a by an element b is accomplished by multiplying a by b^-1, the multiplicative inverse of b. The array RSG_multinv contains the multiplicative inverse of every nonzero element of GF(2^8). Finally, RSG_logarithm is a logarithm table that tells you which power of x each nonzero GF(2^8) element is. For instance, since x^8 = x^6 + x^5 + x^4 + 1, log_x(x^6 + x^5 + x^4 + 1) = 8; and because x^6 + x^5 + x^4 + 1 is represented by the byte 01110001 binary (71 hex), RSG_logarithm[0x71] equals 8. Listing One, gf256.c, is the code to prepare the arrays needed for GF(2^8) arithmetic. These arrays are used by both the C and the assembly language parts of the implementation.

Listing One

/****************************************************** 
 * File: gf256.c -- contains code to construct GF(2^8) and all required
 * arrays for GF(2^8) arithmetic.
 * Copyright (c) 1995 by Hugo Lyppens with permission to print in DDJ
 ******************************************************/
#include "hdr.h"
#include "rs.h"



#define POL              0x171    /* primitive polynomial used to construct
                                     GF(2^8): x^8 + x^6 + x^5 + x^4 + 1,
                                     represented as 101110001 binary */
UBYTE                    RSG_powers[FIELDSZ];
UBYTE                    RSG_logarithm[FIELDSZ];
UBYTE                    RSG_multinv[FIELDSZ];
UBYTE                    RSG_multiply[FIELDSZ][FIELDSZ];



static UBYTE             multiply_func(UBYTE, UBYTE);
static UBYTE             multinv_func(UBYTE);



void        RSG_ConstructGaloisField()
{
     int          v;
     UBYTE        i, j;
     v = 1;
     for(i = 0; ; i++) {
          RSG_powers[i]         = (UBYTE)v;
          RSG_logarithm[v]  = i;
          v <<=1; // multiply v by alpha and reduce modulo POL
          if(v & (1<<GF_M)) {
                v ^= POL;
          }
          if(i==MAXELT-1)
                break;
     }
     RSG_powers[MAXELT] = 1;
     // construct multiplication table
     for(i = 0; ; i++ ) {
          if(i) RSG_multinv[i] = multinv_func(i);
          for(j = i; ; j++) {
              RSG_multiply[i][j] = RSG_multiply[j][i] = multiply_func(i, j);
              if(j==MAXELT)
                    break;
          }
          if(i==MAXELT)
                break;
     }
}
// compute multiplicative inverse of x. The value y for which x*y = 1
static UBYTE multinv_func(x)
UBYTE x;
{
    if(!x)
        return 0;
    return RSG_powers[FIELDSZ-1-RSG_logarithm[x]];
}
// compute product of i and j
static UBYTE multiply_func(i, j)
UBYTE i, j;
{

     int r, b, k;
     r = 0; k = j;
     for(b = 1; b<FIELDSZ; b<<=1, k<<=1) {
        if(k & (1<<GF_M)) k^=POL;
         if(i & b)       r ^= k;
     }
     return((UBYTE)r);
}
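
To illustrate how these tables work together, here is a small sketch of GF(2^8) division built on the arrays from Listing One. The function name gf256_div is mine, not part of the article's code.

/* a / b = a * b^-1; b must be nonzero. */
UBYTE gf256_div(UBYTE a, UBYTE b)
{
    return RSG_multiply[a][RSG_multinv[b]];
}
/* RSG_logarithm inverts RSG_powers: RSG_logarithm[RSG_powers[i]] == i      */
/* for 0 <= i < 255, which is why RSG_logarithm[0x71] == 8 as noted above.  */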

Encoding

Again, each codeword is a multiple of the generator polynomial g(x). To encode an information sequence a(x) = a_0 + a_1x + ... + a_(K-1)x^(K-1), you could multiply it by g(x) to turn it into a codeword. However, it is more efficient to compute b(x) = a(x)x^(N-K) mod g(x). A possible encoding of a(x) is then b(x) + a(x)x^(N-K). Yet, I have the intuitive notion that the information symbols should come first and the redundancy symbols should be attached to the end of the data array. Since RS codes are cyclic, this is no problem, and a valid codeword that satisfies this notion can be created by cyclically rotating this encoding to the left by N-K symbol positions. The final encoding of a(x) then becomes a(x) + b(x)x^K.

As the encoder processes each information symbol in a(x), it maintains b(x). To speed things up, the code uses a table (RSG_shiftregtab in Listing Two) that contains the product of g(x) with each possible symbol value. When the encoder finishes computing b(x), it writes it to the Kth position in the data array to create the final encoding of a(x). Listing Two (genpol.c) calculates the generator polynomial and prepares this table. The function RSG_encode in the program codecsup.asm (available electronically, along with executables) is the assembly language implementation of the encoder.
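
RSG_encode itself is written in assembly, but the shift-register division it performs is easy to express in C. The sketch below is my own reconstruction of the idea, not the article's routine; it assumes the constants from rs.h and the tables from Listings One and Two, and it consumes the information symbols starting with the coefficient of x^(K-1).

/* Compute b(x) = a(x)*x^(2T) mod g(x) by synthetic division.                */
/* info[0] holds a_(K-1), info[K-1] holds a_0.                               */
/* RSG_shiftregtab[f][j] is f multiplied by the jth generator coefficient.   */
void rs_remainder_sketch(const UBYTE *info, UBYTE b[2*T])
{
    int   i, j;
    UBYTE feedback;
    memset(b, 0, 2*T);
    for (i = 0; i < K; i++) {
        feedback = info[i] ^ b[2*T - 1];      /* symbol entering the divider */
        for (j = 2*T - 1; j >= 1; j--)
            b[j] = b[j - 1] ^ RSG_shiftregtab[feedback][j];
        b[0] = RSG_shiftregtab[feedback][0];
    }
}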

Listing Two

/********************************************************************* 
 * File: genpol.c -- contains code to generate the generator polynomial
 * of degree 2*T for Reed-Solomon encoding/decoding
 * Copyright (c) 1995 by Hugo Lyppens with permission to print in DDJ
 *********************************************************************/
#include "hdr.h"
#include "rs.h"



UBYTE                    RSG_shiftregtab[FIELDSZ][2*T];
Polynomial               RSG_genpol;
void    RSG_CalcGeneratorPoly()
{
    int     i, j;
    UBYTE   a, *p;
    RSG_genpol.degree = 2*T;
    memset(RSG_genpol.c, 0, 2*T);
    RSG_genpol.c[0] = RSG_powers[0];
    RSG_genpol.c[1] = 1;
    for(j = 1; j<2*T; j++) {
        a = RSG_powers[j];
        for(i = 2*T; i>=1; i--) {
           RSG_genpol.c[i] = RSG_genpol.c[i-1]^multiply(RSG_genpol.c[i], a);
        }
        RSG_genpol.c[0] = multiply(a, RSG_genpol.c[0]);
    }
    printf("Generator polynomial: ");
    printpoly(&RSG_genpol);
    for(i = 0; ; i++) {
        p = &RSG_multiply[i][0];
        for(j = 0; j<2*T; j++) {
            RSG_shiftregtab[i][j] =
                 p[RSG_genpol.c[j]];
        }
        if(i==MAXELT)
            break;
     }
}

Decoding

Decoding of received words is the most challenging part of any system that uses error-correcting codes. Fortunately, an efficient algorithm exists for the decoding of RS codes. Decoding a received word r(x) amounts to finding the closest codeword, the one that was most likely transmitted. You need to know at which locations the errors occurred and what their magnitudes are.

Since

g(x) = (x - α^0)(x - α^1) ... (x - α^(2T-1))

then

g(α^i) = 0 for i = 0, 1, ..., 2T-1.

Given that all codewords are a multiple of g(x),

c(α^i) = 0 for i = 0, 1, ..., 2T-1

holds for any valid codeword c(x). Since we know that all multiples of g(x) are codewords, you can infer that a received word r(x) is erroneous when any of the values r(α^i) is nonzero. This set of 2T values is actually referred to as the syndrome of the received word r(x), and it can be expressed as the polynomial

S(x) = S_0 + S_1x + ... + S_(2T-1)x^(2T-1), where S_i = r(α^i).

It is important to realize that the syndrome contains all the information needed to decode the received word r(x).

Conceptually, there are three stages in the decoding process:

  1. Compute the syndrome of the received codeword.
  2. If the syndrome is nonzero, find the number of errors, their locations, and their magnitudes.
  3. Correct the errors.

Stage 1. To compute the syndrome, the decoder could successively evaluate r(α^0) through r(α^(2T-1)). Unfortunately, this basic syndrome computation cannot be optimized as well as the encoding algorithm. I realized later that, by definition, the syndrome of r(x) is the same as the syndrome of m(x) = r(x) modulo g(x). And this is good, because m(x) is a much shorter polynomial (of degree up to 2T-1) than r(x), which makes computing m(α^0) through m(α^(2T-1)) cheap. For calculating m(x) from r(x) and g(x), the decoder uses the same efficient technique as the encoder. Obviously, if m(x) = 0, then r(x) is a valid codeword, and there is no need for further decoding. The function calc_syndrome in codecsup.asm (available electronically) contains the code to compute m(x) and the syndrome of the received word.
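
calc_syndrome is in assembly, but the computation it performs can be sketched in C as follows. This is my reconstruction, not the article's code; m[] is assumed to hold the coefficients of m(x) = r(x) mod g(x), lowest degree first.

/* S_i = m(alpha^i) for i = 0 .. 2T-1, evaluated with Horner's rule. */
void calc_syndrome_sketch(const UBYTE m[2*T], UBYTE S[2*T])
{
    int i, j;
    for (i = 0; i < 2*T; i++) {
        UBYTE x   = RSG_powers[i];             /* alpha^i       */
        UBYTE acc = 0;
        for (j = 2*T - 1; j >= 0; j--)         /* Horner's rule */
            acc = RSG_multiply[acc][x] ^ m[j];
        S[i] = acc;
    }
}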

Stage 2. The main question in this stage is: How can we derive the error locations and magnitudes from the syndrome? What is their mathematical relationship? Again, assume that α is a primitive element of GF(2^8). I will define the set B of error locations as

B = { α^i : the symbol at location i of the received word is in error }.

For each β in B, the corresponding error magnitude is e_β: the difference between the received symbol and the transmitted symbol at that location. Given this mathematical representation of error magnitudes and locations, the error-locator polynomial σ(x) and the error-evaluator polynomial ω(x) of a received word are defined by Figure 1. The error locations are simply the set of values for x where σ(x)=0, which is B. For each error locator β, the corresponding error magnitude is shown in Figure 2.

Figure 1: (a) Defining the error-locator polynomial; (b) defining the error-evaluator polynomial.

Figure 2: Error magnitude.

At this point, I'll turn to the key equation for the decoding algorithm. After all, all that is known is the syndrome S(x). Recall that T is the error-correcting capability of the code and the degree of the generator polynomial is 2T. The key decoding equation is:

σ(x)S(x) ≡ ω(x) (mod x^(2T))

It is interesting that a variation of the ancient algorithm discovered around 300 B.C. by Euclid can be used to determine σ(x) and ω(x) from the aforementioned equation. I am referring to the extended algorithm for finding the greatest common divisor of two polynomials. In this case, it is applied to the two polynomials x^(2T) and S(x) until the degree of the resulting polynomial falls below T.

If, at this stage, the degree of σ(x) is greater than T, it can be concluded that too many errors occurred, and any attempt to correct them will be utterly fruitless. The decoder will return an error code. Now that σ(x) and ω(x) are known, the next step is to find B, the set of error locations. There is a shortcut for the case where the received codeword has only one error, which implies that σ(x) has degree 1. Then, σ(x) can be written as ax+b. This polynomial is zero when x=b/a, so B={b/a}.
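
In terms of the lookup tables from Listing One, that single error locator is obtained with one table reference; this fragment is my sketch rather than the article's exact code:

/* sigma(x) = a*x + b has its only zero at x = b/a = b * a^-1. */
UBYTE single_error_locator = RSG_multiply[b][RSG_multinv[a]];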

When the degree of σ(x) is greater than 1, all nonzero elements of GF(2^8) are plugged into σ(x) to find all its zeros. This stage is actually performed by an assembly language routine because assembly offers optimization opportunities in this kind of computation. Examine codecsup.asm to see how short the code is (at label evalpoly, in the function find_roots) that evaluates σ(x) in each iteration! If, at this stage, the number of zeros found does not match the degree of σ(x), the decoder returns an error code because too many errors must have occurred. Finally, for each error locator β, the corresponding magnitude e_β is computed using the error-magnitude formula defined above (Figure 2).
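
The C equivalent of that search is a brute-force evaluation of σ(x) at every nonzero field element. The sketch below is mine (find_roots in codecsup.asm does the same work, much faster); sigma[] holds the coefficients of σ(x), lowest degree first.

/* Collect every nonzero beta with sigma(beta) == 0; returns the root count. */
int find_roots_sketch(const UBYTE *sigma, int degree, UBYTE *roots)
{
    int   n = 0, i, j;
    UBYTE beta, acc;
    for (i = 1; i <= MAXELT; i++) {
        beta = (UBYTE)i;
        acc  = 0;
        for (j = degree; j >= 0; j--)            /* Horner's rule again */
            acc = RSG_multiply[acc][beta] ^ sigma[j];
        if (acc == 0)
            roots[n++] = beta;
    }
    return n;
}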

Stage 3. At this stage, you are finally able to enjoy the fruits of your labor, as the algorithm applies the right corrections at the right places in the received symbol sequence to recover the original codeword! By definition, each β in B is equal to α^loc, where loc is the error location. loc can therefore be expressed as log_α(β), a simple lookup in the RSG_logarithm table. The correction of the error at loc is done by adding e_β (the error magnitude) to the locth symbol in the received word. This translates to the following simple statement in C: data[loc] ^= magn;
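
Putting the last two steps together, the correction loop amounts to little more than the following sketch. The arrays roots[] and magn[] are my names for the error locators found in stage 2 and their magnitudes (computed from the formula in Figure 2); data[] is the received word.

/* For each root beta of sigma(x): its logarithm gives the error location, */
/* and XORing in the magnitude repairs that symbol.                        */
for (i = 0; i < nroots; i++) {
    int loc = RSG_logarithm[roots[i]];  /* beta = alpha^loc, so loc = log(beta) */
    data[loc] ^= magn[i];
}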

The program decode.c (available electronically) contains the function RSG_decode that integrates stages 1, 2, and 3 of the decoding process. The program codecsup.asm contains the assembly function find_roots that searches for all zeros of σ(x).

Main Program

The main program (rsmain.c, Listing Three) lets you play with the Reed-Solomon error-correction functions that I have discussed. When you run the program, it sets up the arrays for GF(2^8) arithmetic and computes the generator polynomial. Then, it asks you for an original string. This string is padded with zeros to create an array of K symbols. This array is encoded to form the original codeword. Next, it asks you for a string with up to T symbol errors. The program now overwrites the first K symbols of the original codeword with the second string and executes the decoding algorithm. After this, the process repeats. To exit the program, just enter an empty string at either of the prompts.

Listing Three

/****************************************************** 
 * File: rsmain.c -- main function: repeatedly asks for original string and 
 * string with errors and tries to recover original string from string with 
 * errors using Reed-Solomon decoding
 * Copyright (c) 1995 by Hugo Lyppens with permission to print in DDJ
 ******************************************************/
#include "hdr.h"
#include <conio.h>
#include "rs.h"



main(argc, argv)
int    argc;
char  *argv[];
{
    char         str[N], str2[N];
    int          r;



    RSG_ConstructGaloisField();
    RSG_CalcGeneratorPoly();



    for(;;) {
        printf("Enter original string or enter empty string to quit:\n");
        memset(str, 0, N); str2[0] = 0;
        gets(str);
        if(!str[0]) break;
        RSG_encode((UBYTE *)str);



        printf("Enter string with up to %d symbol errors or enter 
                                            empty string to quit:\n", T);
        gets(str2);
        if(!str2[0]) break;



        strncpy(str, str2, K);
        r = RSG_decode((UBYTE *)str);
        if(r < 0) {
            printf("Decoding error -- too many errors!\n");
        } else {
            printf("Decoding OK, recovered '%s' from '%s' by correcting %d 
                                                errors!\n", str, str2, r);
        }
    }
    return(0);
}

The program generates a lot of diagnostic output to let you follow the decoding process. Figure 3 shows the output produced when you enter the strings "Coding theory is fun!" and "C0ding th.ory is f&n?".

Figure 3: Program output.
Generator polynomial: 54 + 01x^1 + F3x^2 + 6Cx^3 + 
1Bx^4 + 5Fx^5 + 62x^6 + D4x^7 + B2x^8 + CFx^9 + 
1Ex^10 + 72x^11 + BAx^12 + F4x^13 + E7x^14 + 
81x^15 + 01x^16
  
  Enter original string or enter empty string to quit:
  Coding theory is fun!
  Enter string with up to 8 symbol errors or enter empty
  string to quit:
  C0ding th.ory is f&n?
  Syndrome 0: 59
  Syndrome 1: 8d
  Syndrome 2: 5d
  Syndrome 3: 4d
  Syndrome 4:  5
  Syndrome 5: bf
  Syndrome 6: ae
  Syndrome 7: 5c
  Syndrome 8: 18
  Syndrome 9: ad
  Syndrome 10: 6b
  Syndrome 11: b4
  Syndrome 12: c9
  Syndrome 13: c3
  Syndrome 14: e6
  Syndrome 15: fe
  
  Degree of err. loc. polynomial: 4
  Error locator polynomial: CC  + B8x^1 + 05x^2 + 21x^3 +
  3Cx^4
  Error evaluator polynomial: 5C  + 2Fx^1 + 4Dx^2 + ADx^3
  number of roots of elp: 4
  correcting error at location 20 of magnitude 1E
  correcting error at location 9 of magnitude 4B
  correcting error at location 1 of magnitude 5F
  correcting error at location 18 of magnitude 53
  Decoding OK, recovered `Coding theory is fun!' from
  `C0ding th.ory is f&n?' by correcting 4 errors!

Although this program only lets you test cases where the information symbols (the first K symbols of each codeword) are affected by errors, the algorithm also works if some or all of the errors occur in the redundancy part of a codeword.

Conclusion

Reed-Solomon codes are extremely efficient because they only need 2T redundancy symbols to achieve an error-correction capability of up to T symbol errors. They are also fairly easy to decode. As demonstrated, a Reed-Solomon code with the right parameters lends itself extremely well to implementation in software on the Intel 32-bit x86 and other computer architectures. As my positive experiences with the Amiga Video Backup System suggest, software Reed-Solomon implementations could be used to build a wide variety of systems (especially low-volume or custom applications) that wouldn't otherwise be feasible because of the high cost of hardware error correction.



Hugo is a programmer/analyst at Goldman, Sachs & Company. He can be reached at [email protected] or at http://www.stack.urc.tue.nl/~hugo.
