## Calculating CRC Checksums in C++

### Colin Mahoney

Here's a handy template that computes a variety of CRC checksums.

A while back I was asked to produce some C++ code to calculate CRC (cyclic redundancy code) checksums. I was given a copy of Ross Williams' article, "A Painless Guide to CRC Error Detection Algorithms" [1], which includes a detailed description of the basic CRC algorithm, its optimization using tables and other means, and information about various CRC protocols. The article describes a parameterized model for CRC algorithms called the "Rocksoft™ Model CRC Algorithm." The article includes C source code for an implementation of the model. The model seemed to be made to measure for an implementation in C++ using templates. In this article I describe **CRCGenerator**, a C++ class which I produced based on Ross's original C code. Readers who want a detailed description of CRC algorithms should refer to Ross's original article.

#### CRCs — The Basics

CRCs create hash values for a block of data. These values can be used to check for file integrity and transmission errors in streams. They should not be used to check for malicious tampering with data, as it is quite easy to change the original data in a way that conserves the original CRC value.

A CRC checksum is obtained by treating the input data as a very long binary number, and dividing it by a shorter number — usually 16 or 32 bits. The remainder of this division is the CRC value. The quotient is not used and thus does not have to be recorded. The division is carried out modulo two without carries, and can be calculated efficiently using shifts and XORs (exclusive ORs). The divisor is known as the "polynomial," for reasons related to the mathematical theory underlying such calculations. CRC checksums have a useful property: if the checksum is appended to the original data, the new checksum for <data + original checksum> is zero. This makes testing for transmission errors very easy; the checksum can be transmitted after the data, and the receiver has only to check that the checksum for all the data received is zero.

#### The CRCGenerator Implementation

CRCGenerator (Figure 1 a partial listing of **crc.h**)
uses a nested table class, **CRCTable**, to do most of the work of calculating
the checksum. **CRCTable** has one data member, **Table**, and three member
functions — a constructor, a subscripting operator, and **CalcTableEntry**.
The latter calculates the values for the elements of the table, and is called
repeatedly from the constructor to fill the table. **CRCGenerator** has a
static data member **Table**, of type **CRCTable**. The **unsigned long**
member **Register** holds the current value of the CRC calculation. It is
initialized to the value of the template parameter **Init**. The code uses
template parameters in conditional expressions to select the appropriate element
of **Table** when processing data. This will produce a lot of "conditional
expression is constant" style messages on compilation. You may want to disable
this warning when compiling **crc.h**.

**CRCGenerator** is defined in the namespace **cdm_crc**. Also defined in this namespace are **CRCMaxBits**, which defines the maximum number of bits that can be held in an **unsigned long**, and an insertion operator to write the CRC value to an ostream.

#### The Rocksoft Model

The Rocksoft model uses seven parameters to describe each CRC protocol. Six of these make natural template parameters in C++, while the seventh is a check value used to validate implementations of the algorithm. I wrote the **CRCGenerator** class to use just five of the Rocksoft parameters as well as one additional parameter. The declaration of the **CRCGenerator** class is:

template < int Width, unsigned long Poly, unsigned long Init, unsigned long XOrOut, bool Ref, bool Direct=true > class CRCGenerator;

If you want to calculate CRC-16 checksums, you can typedef the type:

typedef CRCGenerator < 16, 0x8005, 0, 0, true > CRC16;

The template parameters are as follows.

#### Width

This parameter describes the width of the polynomial used to calculate the checksum. It is equal to the highest exponent of x used in the polynomial representation of the divisor. **CRCGenerator** works with polynomials of widths from 8 to 32 bits.

#### Poly

This parameter is the value representing the polynomial used to calculate the CRC, but *not* including the most significant bit. This value should have **Width** bits. The ability of a CRC algorithm to detect errors depends to a large part on the mathematical properties of the polynomial chosen. For this reason it is not really a good idea to "invent" your own polynomial — unless you really know what you're doing you should use one that has been tried and tested.

#### Init

This is the value used to initialize the CRC value, as given by the CRC protocol in question. Common values are **0**, as in CRC-16, and **0xFFFFFFFF**, as in CRC-32.

#### XOrOut

This is the value to be XORed with the final result of the CRC calculation to obtain the CRC value for the data. If **XOrOut** is non-zero, the **CRCGenerator** function **GetNormalCRC** will not return zero after processing **<data+ original checksum>**.

#### Ref

Some CRC protocols process bytes least-significant bit first — the bytes are reflected. **Ref** controls whether the input bytes should be reflected before being processed, and whether the final CRC value should be reflected before it is read. The Rocksoft model uses separate parameters for input and output, but my implementation uses a single value for both.

#### Direct

When **Direct** is true (the default value) each byte processed is passed immediately through **Register** — that is, there are no bytes sitting in **Register** waiting to be propagated out by the following data. The effect is as if each byte processed had been pushed through **Register** by **Width** zero bits. Any actual data that follows is superimposed on top of these zero bits and then processed in the same way.

The value that has been fed into **Register** by the end of the calculation is equal to the total input data augmented by **Width** zero bits. This is the required behavior for most CRC alorithms. For some algorithms, however, the bytes processed should be placed at the bottom of the register then pushed through it by the following data. You can get this behavior by setting the **Direct** parameter to **false**. If you have **Direct** set to **false** and you want to force all the data through the register, you can combine calls to **Process** and **ProcessBits** to pass **Width** zero bits through the register after processing all your data.

The "Check" parameter in the Rocksoft Model is the value that should be obtained when the string **"123456789"** is processed.

#### The CRCGenerator Interface

The public interface for **CRCGenerator** is shown below:

CRCGenerator(); unsigned long GetCRC() const; unsigned long GetNormalCRC() const; bool GetDirect() const; bool GetReflect() const; bool GetWidth() const; void LoadRegister ( unsigned long value ); void Process(unsigned char byte); void Process ( unsigned char* block, int block_size ); template<class InIter> void Process(InIter first, InIter last); void ProcessBits ( unsigned char bits, int count ); void Reset(); void Write(ostream& os) const;

A synopsis of these functions follows.

#### CRCGenerator

Only a default constructor is required since all the necessary information is given in the template parameters.

#### GetCRC

This function returns the value in the CRC register after XORing with **XOrOut**. If the **Ref** parameter is **true** this value is bitwise reflected. If you want to feed this value into the CRC calculation, you should do so least significant *byte* first — you don't need to worry about the bits, as the calculation takes care of reflecting them. If **Ref** is **false** the bytes are correctly ordered. The value is returned in the bottom **Width** bits of the return value.

#### GetNormalCRC

This function reflects the bytes in the CRC register if **Ref** is **true**, and returns the CRC value in the most significant bytes of the return value. The value returned from **GetNormalCRC** can be fed into the calculation most-significant byte first, irrespective of whether **Ref** is **true** or **false**.

#### GetDirect, GetReflect, GetWidth

These functions return the values of the **Direct**, **Ref**, and **Width** parameters respectively. These may be useful for shifting or formatting the value returned by **GetCRC** or **GetNormalCRC**.

#### LoadRegister

This function can be called to XOR a value into the register. The value is reflected as necessary. This function should not be required for most calculations.

#### Process

The three **Process** functions allow data to be fed into the calculation either one byte at a time or in blocks. An alternative to the version using member templates is provided; it is commented out and can be used if your compiler doesn't support member templates.

#### ProcessBits

Processes **count** bits from bits. If **Ref** is **false** they are read in least-significant bit first; if **Ref** is **true**, most-significant bit first. Unless you want to do something unusual with CRCs having widths that are not integral numbers of bytes you probably won't need to use this function.

#### Reset

This function sets the CRC register value to the value of the **Init** parameter.

#### Write

Writes the CRC value to a stream, taking **Ref** into account. **Write** can be used to append the CRC value to a file stream. The file will then produce a CRC value of zero if read in and processed through **CRCGenerator**.

#### An Example: CRC-16

I have already shown the typedef used by the standard CRC-16 protocol. The following code runs the test case for the Rocksoft model:

typedef CRCGenerator < 16, 0x8005UL, 0UL, 0UL, true > CRC16; unsigned char data[] = {'1','2','3', '4','5','6','7', '8','9'}; CRC16 crc16; crc16.Process(data, 9); unsigned long crc_val; crc_val=crc16.GetNormalCRC();

Figure 2 shows complete code to calculate and print
CRC-16 checksums for a list of files specified on the command line. Note the
call to **infile.unsetf(ios_base::skipws)**, which is necessary because **istream_iterator<>**
skips whitespace by default.

#### Conclusion

The above examples demonstrate the convenience offered by C++ template classes. The template mechanism takes care of the creation and initialization of the static look-up table, allowing the class user to write code uncluttered by global variables or calls to initialization functions. My intention in writing the **CRCGenerator** class was to provide efficiency, ease of use, and flexiblity. In as much as this has been achieved, it is because C++ has made these often conflicting goals easily attainable.

#### Note

[1] Available online at Ross's web site at http://www.ross.net/crc/.

*Colin Mahoney teaches English and is an independent developer of software for language learners in Sabadell, Spain. He can be contacted at cmahoney@readysoft.es.*