Channels ▼

Al Williams

Dr. Dobb's Bloggers

Stock in Trade

May 04, 2012

In years past, I used to travel around teaching week-long seminars on different programming topics. Although I haven't done that for awhile, I do miss teaching and working with students. I guess because I miss it, I try to drop in on some high-school engineering classes from time to time (and yes, these days high schools have engineering classes). I also talk to regular classes about what it means to be an engineer.

More Insights

White Papers

More >>

Reports

More >>

Webcasts

More >>

One theme I talk about with kids is that engineers often make trades. I want a car that is perfectly safe, costs $10, and gets 200 miles per gallon of gas. But I can't have all of those things, so I make trades, or perhaps trade-offs would be a better way to put it. More metal makes the car safer, but makes it get less gas mileage and cost more. That's a trade.

The classic trade in computer science is trading memory for execution speed. You can see that in a nearly pure form by looking at how you'd write a C function to count the number of bits set in a word (sometimes known as the bit population count).

At first glance this seems pretty easy:

unsigned int value, count;
for (count=0; value; value >>=1)  count+=value & 1;

That's easy enough and doesn't take much memory once compiled. The downside is the speed. If you have an 8-bit byte, you could make 8 passes through the loop. At least the code does stop early if you run out of set bits, but you can't count on it ever taking less than 8 passes for a byte. For a 32-bit word, you could take 32 loops.

One easy way to speed things up is to trade memory for speed. I can go faster if I'm willing to invest in memory for a lookup table.


unsigned int value, count;
unsigned char bitset[] =  { 0, 1, 1, 2, 1, 2, 2, 3, …. };
count=bitset[value];

That's great and takes only one array lookup. But the price is a 256 element array to process a byte. A 32-bit word? Forget about it! You'd need 232 entries in your table.

On the other hand, both of these are extreme cases. Consider this code:

unsigned int value, count=0;
unsigned char bit4set[] = 
   { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4 };
while (value)
   {
     count+=bit4set(value&0xF);
     value>>=4;
   }

This is the middle trade. The table only works on 4 bits at a time, and you have to (potentially) take more than one pass through the loop. For a byte, you might make 0, 1, or 2 loops. A 32-bit word could take as much as 8 loops (which is still better than 32).

There's at least one more engineering lesson to be learned here: You can optimize code to some point, but really big optimizations are usually due to selecting a smarter algorithm. Consider this algorithm that Peter Wegner published in 1960:

unsigned int value, count;
for (count=0; value; count++)
    {
   value&=value-1;
    }

You have to think about this one a little. If the number is already zero, the for loop doesn't execute and count is zero, as it should be. Any other number is going to have a 1 in some position. Calculating value-1 gives me the same number except that the rightmost 1 bit will become a 0, and all the bits to the right of that will be 1s. For example:

00000100 -> 00000011
10101000 -> 10100111
00000001 -> 00000000

Using the and operator ensures that all the rightmost bits will go to 0, but none of the bits ahead of the rightmost 1 will change. What this means is that the loop only executes once for every 1 bit in the word. Granted, it could still take 8 loops to count a byte, but assuming the inputs are evenly distributed, that won't happen often. Using the simple non-table approach, half the possible inputs would take 8 loops and 64 would require 7 loops. Only one bit pattern would take a single loop.

With the value-1 method, there are 8 different byte-sized inputs that take a single loop. There is only one input (11111111) that actually requires 8 loops. So while it is memory efficient, it should be faster on the average.

For a final engineering lesson from this simple example, you can also consider reading the documentation. If you use GCC, you could use the __builtin_popcount function (here) to provide the correct count. You can presume that it uses assembly language and any special features the CPU might have and ought to be faster than anything you could build yourself in C. Then again, maybe characterizing each method in theory and then in practice would be a pretty good exercise for one of these high-school classes.

Related Reading






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video