The modern high-speed network is not perfect. Errors can creep into message data during transmission or reception, altering or erasing one or more message bytes. Sometimes, errors are introduced deliberately to sow disinformation or to corrupt data. Several algorithms have been developed to guard against message errors. One such algorithm is Reed-Solomon.
This article takes a close, concise look at the Reed-Solomon algorithm. I discuss the benefits offered by Reed-Solomon, as well as the notable issues it presents. I examine the basic arithmetic behind Reed-Solomon, how it encodes and decodes message data, and how it handles byte errors and erasures. Readers should have a working knowledge of Python. The Python class
ReedSolomon is available for download.
Overview of Reed-Solomon
The Reed-Solomon algorithm is widely used in telecommunications and in data storage. It is part of all CD and DVD readers, RAID 6 implementations, and even most barcodes, where it provides error correction and data recovery. It also protects telemetry data sent by deep-space probes such as Voyagers I and II. And it is employed by ADSL and DTV hardware to ensure data fidelity during transmission and reception.
The algorithm is the brainchild of Irving Reed and Gustave Solomon, both engineers at MIT's Lincoln Labs. Its public introduction was through the 1960 paper "Polynomial Codes over Certain Finite Fields." Interestingly enough, that paper did not provide an efficient way to decode the error codes presented. A better decoding scheme was developed in 1969 by Elwyn Berklekamp and James Massey.
Reed-Solomon belongs to a family of error-correction algorithms known as BCH. BCH algorithms use finite fields to process message data. They use polynomial structures, called "syndromes," to detect errors in the encoded data. They add check symbols to the data block, from which they can determine the presence of errors and compute the correct valuse. BCH algorithms offer precise, customizable control over the number of check symbols. They have simplified code structures, making them attractive for hardware implementations.
Reed-Solomon is also a linear algorithm because it processes message data as discrete blocks. And it is a polynomial algorithm because of its use of modular polynomials in the encoding and decoding processes.
But Reed-Solomon is not without its issues. It performs poorly with large message blocks. Doubling the block size quadruples the time it takes to encode or decode the message data. You can minimize this limitation by maintaining small and uniform blocks. Reed-Solomon also relies heavily on modular operations: And modular operations, especially multiplication, consume more clock cycles than non-modular ones.
The Mathematics of Reed-Solomon
As stated, Reed-Solomon uses a finite field in its encoding and decoding processes. A finite (or Galois) field is a square matrix, whose elements are the possible bytes values for both message and error data. Its size (2
m) is always a power of two, with
m being a prime number. It is also commutative. Combining two of its elements with a primitive modular operator (addition, subtraction, and so on) will return another element.
Figure 1 shows a simple finite field, size
4 (m = 2). Notice its largest element is
3 (0b11), which is less than the matrix size. If you take two field elements (
0b11) and add them, you get
0b00, which is also a field element.
Listing One shows how the Python class
ReedSolomon prepares its finite fields. Here, it declares two list properties. The property
__GFEXP (line 4) is the actual field, with 256 possible byte values. The property
__GFLOG (line 7) is the complement field. Its 256 elements are the binary weights of each field element in
__GFEXP. Together, these properties help simplify modular multiplication and division as you shall see later.
class ReedSolomon: # Galois fields # -- exponents (anti-logarithms) __GFEXP =  * 512 # -- logarithms __GFLOG =  * 256 # INITIALIZATION CONSTRUCTOR def __init__(self): # prepare the exponential and logarithmic fields self.__GFEXP = 1 byteVal = 1 for bytePos in range(1,255): byteVal <<= 1 if (byteVal & 0x100): byteVal ^= 0x11d # update the field elements self.__GFEXP[bytePos] = byteVal self.__GFLOG[byteVal] = bytePos # finalize the exponential field for bytePos in range(255,512): self.__GFEXP[bytePos] = self.__GFEXP[bytePos - 255] # ... continued
Next, the constructor method
__init__() starts by setting element
__GFEXP to 1 (line 13). Then it populates the first 255 elements in both
__GFLOG (lines 15-22). Finally, it populates the remaining 256 elements in
__GFEXP (lines 25-26), by copying the first 256 elements from that same property.
Listing Two shows how the class handles modular multiplication and division. The private methods
__gfDivi() get the same two arguments:
__gfMult() method checks if both arguments are zero and returns a zero result (lines 9-10). Otherwise, it reads an element from
argX as the list index (line 13). Then it reads an element from
argY as the index, and adds that element to
byteValu (line 14). Finally, it reads an element from
byteValu as the index (line 15), and returns that element as the result.
class ReedSolomon: # ...previous listings # # Galois multiplication # argX, argY: multiplicand, multiplier # byteValu: product def __gfMult(self, argX, argY): # parametre checks if ((argX == 0) or (argY == 0)): byteValu = 0 else: # perform the operation byteValu = self.__GFLOG[argX] byteValu += self.__GFLOG[argY] byteValu = self.__GFEXP[byteValu] # return the product result return (byteValu) # Galois division # argX, argY: dividend, divisor # byteValu: quotient def __gfDivi(self, argX, argY): # validate the divisor if (argY == 0): raise ZeroDivisionError() # validate the dividend if (argX == 0): byteValu = 0 else: # perform the division byteValu = self.__GFLOG[argX] - self.__GFLOG[argY] byteValu += 255 byteValu = self.__GFEXP[byteValu] # return the division result return (byteValu) # ... continued
__gfDivi() method also checks for zero arguments. But it returns a zero if
argX is zero, and it raises a
argY is zero (lines 25-30). Otherwise, it uses both
argY to read elements from
__GFLOG. It subtracts the two elements (
byteValu) and adds 255 to the difference (lines 33-34). Then it uses
byteValu to access
__GFEXP and returns that element as the quotient (line 35).
Next, Reed-Solomon uses polynomials in its encoding and decoding processes. In Python, you can represent a polynomial as a list object (Figure 2). Each element in the list corresponds to a coefficient, each index to a term power. Notice the coefficients of each polynomial term is a hexadecimal number.
One important polynomial is the generator polynomial (Figure 3). This is a normalized polynomial. All its terms have a coefficient of 1. It is irreducible. It cannot be factored into two or more polynomials.
Listing Three shows how the class
ReedSolomon prepares a generator polynomial. This private method
_rsGenPoly() gets one argument: the number of error symbols (
errSize). It assigns the local
polyValu a single list element of 1 (line 8). Then it builds the generator polynomial by creating a two-term list object (
__GFEXP (line 11), and combining
_gfPolyValu() (line 12). The final value of
polyValu then becomes the generator polynomial (line 15).
class ReedSolomon: # ...previous listings # # Prepare the generator polynomial # errSize: number of error symbols # polyValu: generator polynomial def _rsGenPoly(self, errSize): polyValu =  for polyPos in range(0, errSize): polyTemp = [1, self.__GFEXP[polyPos]] polyValu = self._gfPolyMult(polyValu, polyTemp) # return the polynomial result return (polyValu) # ... continued
Listing Four shows the four private methods that
ReedSolomon uses to process its polynomial list objects. The method
_gfPolyAdd() (lines 7-20) combines its two arguments,
polyB, through modular addition. The method
_gfPolyMult() (lines 25-36) combines its two arguments through modular multiplication. Notice addition is done with exclusive-or, while multiplication is done with
class ReedSolomon: # ...previous listings # # Polynomial addition # polyA, polyB: polynomial addends # polySum: polynomial sum def _gfPolyAdd(self, polyA, polyB): # initialise the polynomial sum polySum =  * max(len(polyA), len(polyB)) # process the first addend for polyPos in range(0, len(polyA)): polySum[polyPos + len(polySum) - len(polyA)] = polyA[polyPos] # add the second addend for polyPos in range(0, len(polyB)): polySum[polyPos + len(polySum) - len(polyB)] ^= polyB[polyPos] # return the sum return (polySum) # Polynomial multiplication # polyA, polyB: polynomial factors # polyProd: polynomial product def _gfPolyMult(self, polyA, polyB): # initialise the product polyProd = len(polyA) + len(polyB) - 1 polyProd =  * polyProd # start multiplying for posB in range(0, len(polyB)): for posA in range(0, len(polyA)): polyProd[posA + posB] ^= self.__gfMult(polyA[posA], polyB[posB]) # return the product result return (polyProd) # Polynomial scaling # argPoly: polynomial argument # argX: scaling factor # polyVal: scaled polynomial def _gfPolyScale(self, argPoly, argX): # initialise the scaled polynomial polyVal =  * len(argPoly) # start scaling for polyPos in range(0, len(argPoly)): polyVal[polyPos] = self.__gfMult(argPoly[polyPos], argX) # return the scaled polynomial return (polyVal) # Polynomial evaluation # argPoly: polynomial argument # argX: independent variable # byteValu: dependent variable def _gfPolyEval(self, argPoly, argX): # initialise the polynomial result byteValu = argPoly # evaluate the polynomial argument for polyPos in range(1, len(argPoly)): tempValu = self.__gfMult(byteValu, argX) tempValu = tempValu ^ argPoly[polyPos] byteValu = tempValu # return the evaluated result return (byteValu) # ... continued
The next method,
_gfPolyScale(), takes two arguments: a polynomial (
argPoly) and an integer (
argX). It multiplies each polynomial term by
__gfMult() (lines 47-48). The method
_gfPolyEval() also gets
argX as its arguments. However,it solves the polynomial
__gfMult() to combine
argX with each term, and tallies the sum using exclusive-or (lines 62-65).
Encoding with Reed-Solomon
To encode a message block with Reed-Solomon, first you need to set the number of error symbols (
errSize). These symbols are generated by Reed-Solomon and appended to the message block. They are later used to correct any erasures or errors found in the block.