In Listing Five is the entry method `RSEncode() `

from the class `ReedSolomon`

. It starts with a call to `_rsGenPoly()`

, passing for input the argument errSize and storing the polynomial result into `polyGen`

(line 10). It sets the local `outBuffer`

to the combined size of both `argMesg`

and `errSize`

and initializes the list elements to 0 (lines 13-14). Then, it copies each message byte into `outBuffer`

(lines 17-19).

##### Listing Five

class ReedSolomon: # ...previous listings # ## REED-SOLOMON ENCODING # ------ # argMesg: the message block # errSize: number of error symbols # outBuffer: the encoded output buffer def RSEncode(self, argMesg, errSize): polyGen = self._rsGenPoly(errSize) # prepare the output buffer outBuffer = (len(argMesg) + errSize) outBuffer = [0] * outBuffer # initialise the output buffer for mesgPos in range(0, len(argMesg)): mesgChar = argMesg[mesgPos] outBuffer[mesgPos] = ord(mesgChar) # begin encoding for mesgPos in range(0, len(argMesg)): mesgChar = outBuffer[mesgPos] if (mesgChar != 0): for polyPos in range(0, len(polyGen)): tempValu = self.__gfMult(polyGen[polyPos], mesgChar) outBuffer[mesgPos + polyPos] ^= tempValu # finalise the output buffer for mesgPos in range(0, len(argMesg)): mesgChar = argMesg[mesgPos] outBuffer[mesgPos] = ord(mesgChar) # return the output buffer return (outBuffer) # ... continued

Next, `RSEncode()`

begins encoding. It reads each message byte and checks its value (lines 23-24). For a non-zero byte, `RSEncode()`

combines the byte with the generator polynomial using `__gfMult() `

(line 26). Then it updates each element in `outBuffer`

using an exclusive-or (line 27).

Once all the message bytes are encoded, `RSEncode() `

copies the message bytes again into `outBuffer`

(line 30-32); and it returns `outBuffer`

as the encoded result.

Figure 4 illustrates the encoding process itself. In this process, I have the word "Lorem" as my sample block. I assumed text encoding to be 8-bit ASCII and the number of error symbols to be 5.

**Figure 4.**

The top is the sample block, the bottom the encoded block. Before encoding, the message bytes populates the first half of the list object `outBuffer`

. Each encoding pass then clears a message byte and updates the rest of `outBuffer`

. Note the update moves from left to right. When encoding ends, note that none of the original message bytes remain, while the error symbols occupy the right half of `outBuffer`

.

### Decoding with Reed-Solomon

To decode a message block, you still need to supply the number of error symbols. Make sure to use the same number used for encoding. Otherwise, Reed-Solomon will fail to decode the block and correct any erasures or errors present.

To Reed-Solomon, erasures and errors are two different things (Figure 5). An erasure is a missing byte. Its location is known; its value usually a -1 (or `0xff`

for 8-bit ASCII). An error is an incorrect byte. Its location is not known. Reed-Solomon has to take extra steps to locate the affected byte(s).

**Figure 5.**

There is a limit to a number of erasures and errors that Reed-Solomon can correct. That limit is tied to the number of error symbols appended to the message. As a rule, Reed-Solomon can correct up to `errSize`

erasures, but only` errSize/2`

errors. It can also correct a combination of erasures and errors, provided enough error symbols are available for both.

The decoding process consists of five distinct steps (Figure 6). First step is to locate and count the erasures present in the message block. Next is to create the syndrome polynomial. This polynomial reveals any errors present in the block. If all its terms have zero coefficients, then the block is free of errors and decoding ends.

Third step is to calculate the error locator polynomial. This polynomial maps out the location of each incorrect message byte. Next is to calculate the error evaluator polynomial. And the last step is, of course, to correct the encoded block.

**Figure 6.**

Listing Six shows how the `ReedSolomon`

class prepares the syndrome polynomial. This private method `_rsSyndPoly()`

gets two arguments: the encoded block (`argCode`

) and the number of error symbols (`errSize`

). It resizes the local `polyValu`

to `errSize`

and initializes its elements to zero (line 10). Then it calculates the polynomial terms using `__GFEXP`

and `_gfPolyEval()`

(lines 14-15). It stores each term into `polyValu`

and returns the local once done (lines 18).

##### Listing Six

class ReedSolomon: # ...previous listings # # Generate the syndrome polynomial # argCode: the code block # errSize: number of error symbols # polyValu: the syndrome polynomial def _rsSyndPoly(self, argCode, errSize): # initialise the polynomial polyValu = [0] * errSize # compute the polynomial terms for errPos in range(0, errSize): byteValu = self.__GFEXP[errPos] polyValu[errPos] = self._gfPolyEval(argCode, byteValu) # return the polynomial return (polyValu) # ... continued

Listing Seven shows how `ReedSolomon`

calculates its error-locator polynomial. This method, `_rsForney(),`

has three arguments: the syndrome polynomial (`polySynd`

), the list of erasures (`eraseLoci`

), and the number of error symbols. It implements the Forney algorithm, which uses interpolation to generate the polynomial.

##### Listing Seven

class ReedSolomon: # ...previous listings # # The Forney algorithm # polySynd: the syndrome polynomial # eraseLoci: list of erasures # errSize: number of error symbols # polyValu: the error locator polynomial def _rsForney(self, polySynd, eraseLoci, errSize): # make a copy of the syndrome polynomial polyValu = list(polySynd) # compute the polynomial terms for posI in range(0, len(eraseLoci)): termX = errSize - 1 - eraseLoci[posI] termX = self.__GFEXP[termX] for posJ in range(0, len(polyValu) - 1): termY = self.__gfMult(polyValu[posJ], termX) termY ^= polyValu[posJ + 1] polyValu[posJ] = termY polyValu.pop() # return the polynomial result return (polyValu) # ... continued

The method starts by copying `polySynd`

into its local `polyValu`

(line 11). In its outer loop, `_rsForney()`

parses each erasure and uses the position to access `__GFEXP`

(lines 15-16). In its inner loop, `_rsForney()`

combines each term from `polyValu`

with the value from `__GFEXP`

using `__gfMult()`

(line 17-19). And it uses the resulting product to update `polyValu`

(line 20). After parsing the erasures,` _rsForney()`

removes the last polynomial term with a call to `pop()`

(line 21) and returns `polyValu`

as its result (line 24).

In Listing Eight is the method `_rsFindErr()`

, which `ReedSolomon`

uses to identify the affected bytes. Its arguments are the error locator polynomial (`errLoci`

) and the number of error symbols (`errSize`

). It uses the Berlekamp-Massey algorithm to prepare its locator polynomial `errPoly`

(lines 14-30). That algorithm creates a minimal polynomial, one that focuses solely on errors. Then, `_rsFindErr()`

counts the number of terms in `errPoly`

(line 33) and compares that count against `polySynd`

(line 34). It returns a `None`

when the encoded block has too many errors (line 36).

##### Listing Eight

class ReedSolomon: # ...previous listings # # Locate the message errors # errLoci: error locator polynomial # errSize: number of error symbols def _rsFindErr(self, errLoci, errSize): # initialise the polynomial locals errPoly = [1] tempPoly = [1] # generate the error locator polynomial # - Berklekamp-Massey algorithm for posSynd in range(0, len(errLoci)): tempPoly.append(0) termSynd = errLoci[posSynd] for posErr in range(1, len(errPoly)): termPoly = errPoly[len(errPoly) - posErr - 1] termPoly = self.__gfMult(termPoly, errLoci[posSynd - posErr]) termSynd ^= termPoly if (termSynd != 0): if (len(tempPoly) > len(errPoly)): tNewP = self._gfPolyScale(tempPoly, termSynd) tempPoly = self._gfPolyScale(errPoly, self.__gfDivi(1, termSynd)) errPoly = tNewP tempValu = self._gfPolyScale(tempPoly, termSynd) errPoly = self._gfPolyAdd(errPoly, tempValu) # count the number of errors errCount = len(errPoly) - 1 if ((errCount * 2) > len(errLoci)): print "Too many errors to correct" return (None) else: print "Error count: ", errCount, len(errLoci) # calculate the polynomial zeroes errList = [] for errPos in range(0, errSize): errZed = self._gfPolyEval(errPoly, self.__GFEXP[255 - errPos]) if (errZed == 0): errZed = errSize - errPos - 1 errList.append(errZed) if (len(errList) != errCount): print "Could not locate the errors" return (None) else: return (errList) # ... continued

Next, `_rsFindErr()`

evaluates `errPoly`

, solving for zeroes (lines 41-46) and storing the results into `errList`

. Each zero marks the location of each affected byte. Again, `_rsFindErr() `

returns a `None`

when no zeroes were produced (line 50). Otherwise, it returns `errList`

.

Listing Nine contains the entry method `RSDecode()`

. Its arguments are the encoded message block (`argCode`

) and in the number of error symbols. It begins by copying the block into the local `codeBuffer`

(line 10). It parses `codeBuffer`

and counts any erasures it finds (lines 13-17). If the number of erasures exceeds `errSize`

, `RSDecode() `

returns a `None`

for a result (lines 18-20).

##### Listing Nine

class ReedSolomon: # ...previous listings # # Main decode routine # argCode: the message code block # errSize: number of error symbols def RSDecode(self, argCode, errSize): # initialise the code buffer codeBuffer = list(argCode) # count the number of erasures eraseCount = [] for codePos in range(0, len(codeBuffer)): if (codeBuffer[codePos] < 0): codeBuffer[codePos] = 0 eraseCount.append(codePos) if (len(eraseCount) > errSize): print "Too many erasures" return (None) # prepare the syndrome polynomial polySynd = self._rsSyndPoly(codeBuffer, errSize) if (max(polySynd) == 0): print "The message has no errors" return (codeBuffer) # prepare the error locator polynomial errLoci = self._rsForney(polySynd, eraseCount, len(codeBuffer)) # locate the message errors errList = self._rsFindErr(errLoci, len(codeBuffer)) if (errList == None): print "Could not find any errors" return (None) else: print "Located errors: ", errList # start correcting errors and erasures outMesg = self._rsCorrect(codeBuffer, polySynd, (eraseCount + errList)) return (outMesg)

Next, `RSDecode()`

calls `_rsSyndPoly()`

and checks the resulting syndrome polynomial `polySynd`

(lines 23-26). In this case, it returns `codeBuffer`

if `polySynd`

revealed no errors in the encoded block.

`RSDecode() `

then calls `_rsForney()`

, storing the error locator polynomial into the local `errLoci`

(line 29). It calls `_rsFindErr()`

, storing the list of error locations into the local `errList`

(line 32). Finally, `RSDecode()`

calls `_rsCorrect()`

to perform the necessary corrections (lines 40). It stores the corrected message into `outMesg`

and returns `outMesg`

as its result (line 41).

### Conclusion

Reed-Solomon is a fast and effective way to protect data integrity. It is not the only error-correction algorithm: Other algorithms exists that are faster and perhaps better than Reed-Solomon under specific circumstances. But Reed-Solomon an excellent solution that is widely used wherever small data blocks need to be verified.