Channels ▼

Al Williams

Dr. Dobb's Bloggers

Ham It Up

May 14, 2012

I was having a conversation this week about gray codes and thought I should write a little something about them. About half way through, I realized it sounded familiar and discovered I had written almost the exact same thing over a year ago (see Shades of Gray). It is a bad sign when you start repeating yourself!

Gray codes always make me think of hamming distance, though, since they are a case where each adjacent code has a hamming distance of 1 from each other. For the case of a mechanical or optical encoder, that's exactly what you want. If you are looking for fault tolerance, then you want more hamming distance, not less.

Consider some networking scheme where I have a one byte address for each node. In theory, then, my network could support 256 nodes (although I probably use a few addresses for special purposes). I probably don't have 256 nodes though. In fact, you probably have way less in a typical project.

Let's take the silly example of having three nodes. You might just as well number them 0, 1, and 2. As long as everything works as you expect, there's nothing wrong with that. What happens if things don't work like you expect though? If some sort of error causes a single bit to flip, then address 0 could become address 1 or 2. Conversely, addresses 1 or 2 could easily become zeros. That's because the hamming distance between 0 and the other two addresses is 1. Of course, you could have errors that generated other addresses (like 1 to 3) but those aren't as important because there is no address 3 and it would be easy to figure out that some error occurred. There's no way to tell a bogus address 2 from a real address 2, so that's the kind of error you worry about.

You can't have a single bit error that causes address 1 to become 2 or vice versa, though. The hamming distance there is 2, so it would take two errors to get an undetected mistake (and, again, most errors would cause a totally illegal address — I'm not worried about those as much because I can detect them).

A smarter move would be to use those extra bits in the address to make sure that errors could not easily corrupt an address into another good address. For example, suppose you numbered the nodes (in hex) 03, 0C, and 30. Now any single bit error would generate an illegal address. There's so much extra room that you could easily create a set of 3 numbers that had an even greater hamming distance (07, 38, and C0, for example). In real life, you probably have more than just 3 nodes in an 8-bit address space, but you can still work it out.

Oddly enough, last week I was writing about bit population count algorithms. I wasn't thinking about hamming distance, but the bit population count is key to computing hamming distance. If you want to know the distance between two bit strings A and B, simply XOR them together and then do a bit population count of the result. The XOR will have a 1 bit for every bit position that is different and the bit population count tells you how many 1 bits you got. Using the code from last week:

unsigned int bit_count(unsigned int v)
   unsigned int count;
  for (count=0; v; count++)
  return count;
unsigned int hamming_dist(unsigned a, unsigned b)
    return bit_count(a^b); 

If you think about it, parity is just a simple case of using an extra bit to generate a hamming distance of two. For example, a 7-bit ASCII character with parity has at least a distance of 2 between each character (because adjacent characters will have different parity bits).

If you are like me and too lazy work out codes yourself, you might find this tool useful. Of course, as with any redundant scheme, you do lose the use of some bits. Using that tool, you can see that you could shoehorn 128 numbers into a byte and maintain a minimum hamming distance of 2. With a hamming distance of 3, the total items you can fit drops to 16! At a hamming distance of 6, you can only hold 2 distinct numbers.

Hamming distance is important in reliable systems and is also used in some cryptography algorithms. Of course, the idea of it is also important for gray codes. It is one of those simple bit tricks everyone should have in their toolkit.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.