Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

C/C++

Reed-Solomon Error Correction


Decoding

Decoding of received words is the most challenging part of any system that uses error-correcting codes. Fortunately, an efficient algorithm exists for the decoding of RS codes. Decoding a received word r(x) amounts to finding the closest codeword, the one that was most likely transmitted. You need to know at which locations the errors occurred and what their magnitudes are.

Since

then

Given that all codewords are a multiple of g(x),

holds for any valid codeword c(x). Since we know that all multiples of g(x) are codewords, you can infer that a received word r(x) is erroneous when any of the values is nonzero. This set of 2T values is actually referred to as the syndrome of the received word r(x), and it can be expressed as the polynomial

It is important to realize that the syndrome contains all the information needed to decode the received word r(x).

Conceptually, there are three stages in the decoding process:

  1. Compute the syndrome of the received codeword.
  2. If the syndrome is nonzero, find the number of errors, their locations, and their magnitude.
  3. Correct the errors.

Stage 1. To compute the syndrome, the decoder could subsequently evaluate r() through . Unfortunately, this basic syndrome computation cannot be optimized as well as the encoding algorithm. I realized later that, by definition, the syndrome of r(x) is the same as the syndrome of m(x) = r(x) modulo g(x). And this is good, because m(x) is a much shorter polynomial (of degree up to 2T) than r(x), which makes computing m(1) through m(2T-1) cheap. For calculating m(x) from r(x) and g(x), the decoder uses the same efficient technique as the encoder. Obviously, if m(x) = 0, then r(x) is a valid codeword, and there is no need for further decoding. The function calc_syndrome in codecsup.asm (available electronically) contains the code to compute m(x) and the syndrome of the received word.

Stage 2. The main question in this stage is: How can we derive the error locations and magnitudes from the syndrome? What is their mathematical relationship? Again, assume that is a primitive element of GF(28). I will define the set B of error locations as

For each , the corresponding error magnitude is . Given this mathematical representation of error magnitudes and locations, the error locator polynomial, (x), and the error evaluator polynomial, (x) of a received word, are defined by Figure 1. The error locations are simply the set of values for x, where (x)=0, which is B. For each error locator , the corresponding error magnitude is shown in Figure 2.

Figure 1: (a) Defining the error-locator polynomial; (b) defining the error-evaluator polynomial.

Figure 2: Error magnitude.

At this point, I'll turn to the key equation for the decoding algorithm. After all, all that is known is the syndrome S(x). Recall that T is the error-correcting capability of the code and the degree of the generator polynomial is 2T. The key decoding equation is:

It is interesting that a variation of the ancient algorithm discovered around 300 b.c. by Euclid can be used to determine (x) and (x) from the aforementioned equation. I am referring to the extended algorithm for finding the greatest-common divisor of two polynomials. In this case, it is applied to the two polynomials x2T and S(x) until the degree of the resulting polynomial falls below T.

If, at this stage, the degree of S(x) is greater than T, it can be concluded that too many errors occurred, and any attempt to correct them will be utterly fruitless. The decoder will return an error code. Now that (x) and (x) are known, the next step is to find B, the set of error locations. There is a shortcut for the case where the received codeword has only 1 error, which implies that (x) has degree 1. Then, (x) can be written as ax+b. This polynomial is zero when x=b/a and B={b/a}.

When the degree of (x) is greater than 1, all nonzero elements of GF(28) are plugged into (x) to find all its zeros. This stage is actually performed by an assembly language routine because assembly offers optimization opportunities in this kind of computation. Examine codecsup.asm to see how short the code is (at label evalpoly, in the function find_roots) that evaluates (x) in each iteration! If, at this stage, the number of zeros found does not match the degree of (x), the decoder returns an error code because too many errors must have occurred. Finally, for each , is computed using the error magnitude formula defined above.

Stage 3. At this stage, you are finally able to enjoy the fruits of your labor, as the algorithm seems to apply the right corrections at the right places in the received symbol sequence to recover the original codeword! By definition, each is equal to , where loc is the error location. loc can be expressed as . The correction of the error at loc is done by adding (the error magnitude) to the locth symbol in the received word. This translates to the following simple statement in C data[loc] ^= magn.

The program decode.c (available electronically at the top of the first page of this article) contains the function RSG_decode that integrates stages 1, 2, and 3 of the decoding process. The program codecsup.asm contains the assembly function find_roots that searches for all zeros of (x).

Main Program

The main program (rsmain.c, Listing Three) lets you play with the Reed-Solomon error-correction functions that I have discussed. When you run the program it sets up the arrays for GF(28) arithmetic and computes the generator polynomial. Then, it asks you for an original string. This string is padded with zeros to create an array of K symbols. This array is encoded to form the original codeword. Next, it asks you for a string with up to T symbol errors. The program now overwrites the first K symbols of the original codeword with the second string and executes the decoding algorithm. After this, the process will repeat itself. To exit the program, just enter an empty string at any of the prompts.

Listing Three

/****************************************************** 
 * File: rsmain.c -- main function: repeatedly asks for original string and 
 * string with errors and tries to recover original string from string with 
 * errors using Reed-Solomon decoding
 * Copyright (c) 1995 by Hugo Lyppens with permission to print in DDJ
 ******************************************************/
#include "hdr.h"
#include <conio.h>
#include "rs.h"



main(argc, argv)
int    argc;
char  *argv[];
{
    char         str[N], str2[N];
    int          r;



    RSG_ConstructGaloisField();
    RSG_CalcGeneratorPoly();



    for(;;) {
        printf("Enter original string or enter empty string to quit:\n");
        memset(str, 0, N); str2[0] = 0;
        gets(str);
        if(!str[0]) break;
        RSG_encode((UBYTE *)str);



        printf("Enter string with up to %d symbol errors or enter 
                                            empty string to quit:\n", T);
        gets(str2);
        if(!str2[0]) break;



        strncpy(str, str2, K);
        r = RSG_decode((UBYTE *)str);
        if(r < 0) {
            printf("Decoding error -- too many errors!\n");
        } else {
            printf("Decoding OK, recovered '%s' from '%s' by correcting %d 
                                                errors!\n", str, str2, r);
        }
    }
    return(0);
}

The program generates a lot of diagnostic output to let you follow the decoding process. In Figure 3, the strings "Coding theory is fun!" and "C0ding th.ory is f&n?" are entered by you.

Figure 3: Program output.
Generator polynomial: 54 + 01x^1 + F3x^2 + 6Cx^3 + 
1Bx^4 + 5Fx^5 + 62x^6 + D4x^7 + B2x^8 + CFx^9 + 
1Ex^10 + 72x^11 + BAx^12 + F4x^13 + E7x^14 + 
81x^15 + 01x^16
  
  Enter original string or enter empty string to quit:
  Coding theory is fun!
  Enter string with up to 8 symbol errors or enter empty
  string to quit:
  C0ding th.ory is f&n?
  Syndrome 0: 59
  Syndrome 1: 8d
  Syndrome 2: 5d
  Syndrome 3: 4d
  Syndrome 4:  5
  Syndrome 5: bf
  Syndrome 6: ae
  Syndrome 7: 5c
  Syndrome 8: 18
  Syndrome 9: ad
  Syndrome 10: 6b
  Syndrome 11: b4
  Syndrome 12: c9
  Syndrome 13: c3
  Syndrome 14: e6
  Syndrome 15: fe
  
  Degree of err. loc. polynomial: 4
  Error locator polynomial: CC  + B8x^1 + 05x^2 + 21x^3 +
  3Cx^4
  Error evaluator polynomial: 5C  + 2Fx^1 + 4Dx^2 + ADx^3
  number of roots of elp: 4
  correcting error at location 20 of magnitude 1E
  correcting error at location 9 of magnitude 4B
  correcting error at location 1 of magnitude 5F
  correcting error at location 18 of magnitude 53
  Decoding OK, recovered `Coding theory is fun!' from
  `C0ding th.ory is f&n?' by correcting 4 errors!

Although this program only lets you test cases where the information symbols (the first K symbols of each codeword) are affected by errors, the algorithm also works if some or all of the errors occur in the redundancy part of a codeword.

Conclusion

Reed-Solomon codes are extremely efficient because they only need 2T redundancy symbols to achieve an error correction capability of up to T symbol errors. They are also fairly easy to decode. As demonstrated, a Reed-Solomon code with the right parameters lends itself extremely well to implementation in software on the Intel 32-bit x86 and other computer architectures. As my positive experiences with the Amiga Video Backup System suggest, software Reed-Solomon implementations could be used to build a wide variety of systems (especially low-volume or custom applications) which wouldn't otherwise be feasible because of the high cost of using hardware-error correction.

References

van Tilborg, H.C.A. Coding Theory: A First Course. Eindhoven Technical University, 1993.

Hoffman, D.G., D.A. Leonard, C.C. Lindner, K.T. Phelps, C.A. Rodger, and J.R. Wall. Coding Theory: The Essentials. Auburn, AL: Marcel Dekker. 1991.


Hugo is a programmer/analyst at Goldman, Sachs & Company. He can be reached at [email protected] or at http://www.stack.urc.tue.nl/~hugo.

Dr. Dobb's Journal January 1997: A Complete Reed-Solomon Encoding and Decoding Example


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.