# The Decimal Divide

July 15, 2011

Pretty much everyone knows the "multiply by 10" trick. After all, you need to multiply decimal input by 10 all the time and many processors don't directly have a multiply instruction, or they have one that is slow. The problem, however, is dividing by 10 — another common operation when dealing with decimal input or output.

But first, back to multiplication. Even if your CPU has a multiply instruction (and it's fast enough you can use it), you can probably still speed the operation up by not using the instruction (depending on your processor). If you don't have a multiply instruction, you can simply do a loop to add the number 10 times, but that's very inefficient:

int i;  /* counter */
int x=3;  /* number */
int y=0;  /* will have 10*n in it */
for (i=0;i<10;i++) y+=x;

The way to simplify this is to realize that 10x=8x+2x. Since 8x is x<<3 and 2x is x<<1 you can simply write:

y=(x<<3)+(x<<1);

Most processors can shift rapidly, and even those that have to shift one bit at a time will still only need 5 instructions to make this work. You should look at your compiler's output to see if it is smart enough to optimize, particularly if your CPU only shifts one bit at a time. In that case, you want something like:

y=x<<1;
y=y+(y<<2);

This reuses the first shift so you don't have to do it again. If your CPU can do both shifts in one instruction each, this optimization probably doesn't matter.

Sadly, there is no simple way to convert a division by 10 into a few shifts. You can count how many times you can subtract 10, but like the looping multiply, that's not very efficient:

int y=0;  /* will hold the answer */
int x=33; /* will hold the remainder */
while (x>=10)
{
y++;
x-=10;
}

There are, however, some slightly smarter ways. For example, 205/2048 is very nearly 1/10. If you have a reasonable multiple routine you can compute:

y=(x*205)>>11;

Just make sure x*205 doesn't overflow.

Another trick that works well on 32 bit machines is to multiply by 1/10 x 4,294,967,296 (that is, 429,496,729 or 429,496,730 if you prefer to round) and then shift right by 32. If your CPU does a 32-bit multiply into a 64-bit register, you can simply throw the bottom half away after the multiply.

You can also get close by observing that 255 is divisible by 5 (255/5=51) and that 255/256 is very nearly 1. Therefore, if you multiply a number by 51 and throw out the last byte, that is very nearly the same as multiplying by 1/5 or dividing by 5. Over a byte's worth of data, you'll be off by one, so the expression is:

y=((x*51)>>8)+1;   /* /5 */

Keep in mind that a machine with 16-bit registers that you can access as a byte can do a >>8 operation for free by just taking the upper half of the register. This is the divide by 5 value, so to get back to 10 you need to shift the answer right again one more time.

This isn't always useful for decimal conversion though because there is some error involved and the routine rounds rather than provides a remainder. So using the above algorithm, we learn that 32/10=3, 35/10=3, but 36/10=4. You need to examine the behavior of such shortcuts over the range of numbers you expect to handle. In particular, make sure that x*51 doesn't overflow!

I've also seen people use the approximation 26/256. This is very simple and fast because 26x = 16x+8x+2x. However, the error is too large as the numbers increase. For example, for 1,024 the result is 104 (instead of 102). On the other hand, depending on your needs, you may be able to find an approximation that works for you.

If you want to read more about division techniques on integer computers, take a look at Douglas Jones' tutorial on reciprocal multiplication. If you are interested in efficient math, it is well worth reading.

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.