Channels ▼
RSS

Testing

Error Correction with Reed-Solomon


In Listing Five is the entry method RSEncode() from the class ReedSolomon. It starts with a call to _rsGenPoly(), passing for input the argument errSize and storing the polynomial result into polyGen (line 10). It sets the local outBuffer to the combined size of both argMesg and errSize and initializes the list elements to 0 (lines 13-14). Then, it copies each message byte into outBuffer (lines 17-19).

Listing Five

class ReedSolomon:
	# ...previous listings
	# 
	## REED-SOLOMON ENCODING
	# ------
	# argMesg: the message block
	# errSize: number of error symbols
	# outBuffer: the encoded output buffer
	def RSEncode(self, argMesg, errSize):
		polyGen = self._rsGenPoly(errSize)
		
		# prepare the output buffer
		outBuffer = (len(argMesg) + errSize)
		outBuffer = [0] * outBuffer
		
		# initialise the output buffer
		for mesgPos in range(0, len(argMesg)):
			mesgChar = argMesg[mesgPos]
			outBuffer[mesgPos] = ord(mesgChar)
		
		# begin encoding
		for mesgPos in range(0, len(argMesg)):
			mesgChar = outBuffer[mesgPos]
			if (mesgChar != 0):
				for polyPos in range(0, len(polyGen)):
					tempValu = self.__gfMult(polyGen[polyPos], mesgChar)
					outBuffer[mesgPos + polyPos] ^= tempValu
		
		# finalise the output buffer
		for mesgPos in range(0, len(argMesg)):
			mesgChar = argMesg[mesgPos]
			outBuffer[mesgPos] = ord(mesgChar)
		
		# return the output buffer
		return (outBuffer)
	
	# ... continued

Next, RSEncode() begins encoding. It reads each message byte and checks its value (lines 23-24). For a non-zero byte, RSEncode() combines the byte with the generator polynomial using __gfMult() (line 26). Then it updates each element in outBuffer using an exclusive-or (line 27).

Once all the message bytes are encoded, RSEncode() copies the message bytes again into outBuffer (line 30-32); and it returns outBuffer as the encoded result.

Figure 4 illustrates the encoding process itself. In this process, I have the word "Lorem" as my sample block. I assumed text encoding to be 8-bit ASCII and the number of error symbols to be 5.

encoding process
Figure 4.

The top is the sample block, the bottom the encoded block. Before encoding, the message bytes populates the first half of the list object outBuffer. Each encoding pass then clears a message byte and updates the rest of outBuffer. Note the update moves from left to right. When encoding ends, note that none of the original message bytes remain, while the error symbols occupy the right half of outBuffer.

Decoding with Reed-Solomon

To decode a message block, you still need to supply the number of error symbols. Make sure to use the same number used for encoding. Otherwise, Reed-Solomon will fail to decode the block and correct any erasures or errors present.

To Reed-Solomon, erasures and errors are two different things (Figure 5). An erasure is a missing byte. Its location is known; its value usually a -1 (or 0xff for 8-bit ASCII). An error is an incorrect byte. Its location is not known. Reed-Solomon has to take extra steps to locate the affected byte(s).

erasures and errors
Figure 5.

There is a limit to a number of erasures and errors that Reed-Solomon can correct. That limit is tied to the number of error symbols appended to the message. As a rule, Reed-Solomon can correct up to errSize erasures, but only errSize/2 errors. It can also correct a combination of erasures and errors, provided enough error symbols are available for both.

The decoding process consists of five distinct steps (Figure 6). First step is to locate and count the erasures present in the message block. Next is to create the syndrome polynomial. This polynomial reveals any errors present in the block. If all its terms have zero coefficients, then the block is free of errors and decoding ends.

Third step is to calculate the error locator polynomial. This polynomial maps out the location of each incorrect message byte. Next is to calculate the error evaluator polynomial. And the last step is, of course, to correct the encoded block.

decoding process
Figure 6.

Listing Six shows how the ReedSolomon class prepares the syndrome polynomial. This private method _rsSyndPoly() gets two arguments: the encoded block (argCode) and the number of error symbols (errSize). It resizes the local polyValu to errSize and initializes its elements to zero (line 10). Then it calculates the polynomial terms using __GFEXP and _gfPolyEval() (lines 14-15). It stores each term into polyValu and returns the local once done (lines 18).

Listing Six

class ReedSolomon:
	# ...previous listings
	# 
	# Generate the syndrome polynomial
	# argCode: the code block
	# errSize: number of error symbols
	# polyValu: the syndrome polynomial
	def _rsSyndPoly(self, argCode, errSize):
		# initialise the polynomial
		polyValu = [0] * errSize
		
		# compute the polynomial terms
		for errPos in range(0, errSize):
			byteValu = self.__GFEXP[errPos] 
			polyValu[errPos] = self._gfPolyEval(argCode, byteValu)
		
		# return the polynomial
		return (polyValu)
	
	# ... continued

Listing Seven shows how ReedSolomon calculates its error-locator polynomial. This method, _rsForney(), has three arguments: the syndrome polynomial (polySynd), the list of erasures (eraseLoci), and the number of error symbols. It implements the Forney algorithm, which uses interpolation to generate the polynomial.

Listing Seven

class ReedSolomon:
	# ...previous listings
	# 
	# The Forney algorithm
	# polySynd: the syndrome polynomial
	# eraseLoci: list of erasures
	# errSize: number of error symbols
	# polyValu: the error locator polynomial 
	def _rsForney(self, polySynd, eraseLoci, errSize):
		# make a copy of the syndrome polynomial
		polyValu = list(polySynd)
		
		# compute the polynomial terms
		for posI in range(0, len(eraseLoci)):
			termX = errSize - 1 - eraseLoci[posI]
			termX = self.__GFEXP[termX]
			for posJ in range(0, len(polyValu) - 1):
				termY = self.__gfMult(polyValu[posJ], termX)
				termY ^= polyValu[posJ + 1]
				polyValu[posJ] = termY
			polyValu.pop()
		
		# return the polynomial result
		return (polyValu)
	
	# ... continued

The method starts by copying polySynd into its local polyValu (line 11). In its outer loop, _rsForney() parses each erasure and uses the position to access __GFEXP (lines 15-16). In its inner loop, _rsForney() combines each term from polyValu with the value from __GFEXP using __gfMult() (line 17-19). And it uses the resulting product to update polyValu (line 20). After parsing the erasures, _rsForney() removes the last polynomial term with a call to pop() (line 21) and returns polyValu as its result (line 24).

In Listing Eight is the method _rsFindErr(), which ReedSolomon uses to identify the affected bytes. Its arguments are the error locator polynomial (errLoci) and the number of error symbols (errSize). It uses the Berlekamp-Massey algorithm to prepare its locator polynomial errPoly (lines 14-30). That algorithm creates a minimal polynomial, one that focuses solely on errors. Then, _rsFindErr() counts the number of terms in errPoly (line 33) and compares that count against polySynd (line 34). It returns a None when the encoded block has too many errors (line 36).

Listing Eight

class ReedSolomon:
	# ...previous listings
	# 
	# Locate the message errors
	# errLoci: error locator polynomial
	# errSize: number of error symbols
	def _rsFindErr(self, errLoci, errSize):
		# initialise the polynomial locals
		errPoly = [1]
		tempPoly = [1]
		
		# generate the error locator polynomial
		# - Berklekamp-Massey algorithm
		for posSynd in range(0, len(errLoci)):
			tempPoly.append(0)
			termSynd = errLoci[posSynd]
			
			for posErr in range(1, len(errPoly)):
				termPoly = errPoly[len(errPoly) - posErr - 1]
				termPoly = self.__gfMult(termPoly, errLoci[posSynd - posErr])
				termSynd ^= termPoly
			
			if (termSynd != 0):
				if (len(tempPoly) > len(errPoly)):
					tNewP = self._gfPolyScale(tempPoly, termSynd)
					tempPoly = self._gfPolyScale(errPoly, self.__gfDivi(1, termSynd))
					errPoly = tNewP
				
				tempValu = self._gfPolyScale(tempPoly, termSynd)
				errPoly = self._gfPolyAdd(errPoly, tempValu)
		
		# count the number of errors
		errCount = len(errPoly) - 1
		if ((errCount * 2) > len(errLoci)):
			print "Too many errors to correct"
			return (None)
		else:
			print "Error count: ", errCount, len(errLoci)
		
		# calculate the polynomial zeroes
		errList = []
		for errPos in range(0, errSize):
			errZed = self._gfPolyEval(errPoly, self.__GFEXP[255 - errPos])
			if (errZed == 0):
				errZed = errSize - errPos - 1
				errList.append(errZed)
		
		if (len(errList) != errCount):
			print "Could not locate the errors"
			return (None)
		else:
			return (errList)
	
	# ... continued

Next, _rsFindErr() evaluates errPoly, solving for zeroes (lines 41-46) and storing the results into errList. Each zero marks the location of each affected byte. Again, _rsFindErr() returns a None when no zeroes were produced (line 50). Otherwise, it returns errList.

Listing Nine contains the entry method RSDecode(). Its arguments are the encoded message block (argCode) and in the number of error symbols. It begins by copying the block into the local codeBuffer (line 10). It parses codeBuffer and counts any erasures it finds (lines 13-17). If the number of erasures exceeds errSize, RSDecode() returns a None for a result (lines 18-20).

Listing Nine

class ReedSolomon:
	# ...previous listings
	# 
	# Main decode routine
	# argCode: the message code block
	# errSize: number of error symbols
	def RSDecode(self, argCode, errSize):
		
		# initialise the code buffer
		codeBuffer = list(argCode)
		
		# count the number of erasures
		eraseCount = []
		for codePos in range(0, len(codeBuffer)):
			if (codeBuffer[codePos] < 0):
				codeBuffer[codePos] = 0
				eraseCount.append(codePos)
		if (len(eraseCount) > errSize):
			print "Too many erasures"
			return (None)
		
		# prepare the syndrome polynomial
		polySynd = self._rsSyndPoly(codeBuffer, errSize)
		if (max(polySynd) == 0):
			print "The message has no errors"
			return (codeBuffer)
		
		# prepare the error locator polynomial
		errLoci = self._rsForney(polySynd, eraseCount, len(codeBuffer))
		
		# locate the message errors
		errList = self._rsFindErr(errLoci, len(codeBuffer))
		if (errList == None):
			print "Could not find any errors"
			return (None)
		else:
			print "Located errors: ", errList
		
		# start correcting errors and erasures
		outMesg = self._rsCorrect(codeBuffer, polySynd, (eraseCount + errList))
		return (outMesg)

Next, RSDecode() calls _rsSyndPoly() and checks the resulting syndrome polynomial polySynd (lines 23-26). In this case, it returns codeBuffer if polySynd revealed no errors in the encoded block.

RSDecode() then calls _rsForney(), storing the error locator polynomial into the local errLoci (line 29). It calls _rsFindErr(), storing the list of error locations into the local errList (line 32). Finally, RSDecode() calls _rsCorrect() to perform the necessary corrections (lines 40). It stores the corrected message into outMesg and returns outMesg as its result (line 41).

Conclusion

Reed-Solomon is a fast and effective way to protect data integrity. It is not the only error-correction algorithm: Other algorithms exists that are faster and perhaps better than Reed-Solomon under specific circumstances. But Reed-Solomon an excellent solution that is widely used wherever small data blocks need to be verified.

References

C.K.P. Clarke. (2002). Reed-Solomon Error Correction. [PDF] BBC R&D White Paper.

Jeff A Reid. (2008). Tutorial on Reed-Solomon Error Correction Coding. [PDF] NASA Tech Briefs, MSC-2183.

Joel Sylvester. (2001). Reed-Solomon Codes. [PDF] Electrokbit.


José Cruz is a freelance engineering writer based in British Columbia. He frequently contributes articles to Dr. Dobb's.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video