## Standard C

### Math Primitives

### P.J. Plauger

*P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is Standard C, which he co-authored with Jim Brodie. You can reach him at [email protected] via uunet.*

#### Introduction

Last month, I began a discussion of the functions declared in <*math.h*>. I gave a brief history of the evolution of the C math library. I outlined the goals I set myself when I wrote my own set of math functions. I ended that column with a brief sampler of code.

I present a much heavier dose of code this month. My goal is to show you the underpinnings of a math library. These are the functions that isolate most of the numerical code from the actual representation of floating-point values on a given computer. They make possible a combination of precision, portability, and speed that you could not otherwise hope to achieve. These functions also form a handy collection of nuts and bolts. You may have occasion to write a function not already in the math library. Knowing how to use the low-level functions can save you a lot of grief.

You don't have to like numerical programming to benefit from a study of these functions. Programming abounds with interfaces. An important skill to master is encapsulating the peculiarities of an interface in an economical set of type definitions and functions. That's the essence of designing good objects in languages like C++. Even without all the trappings of an object-oriented language, however, you can encapsulate and isolate to advantage.

The Standard C library contains a number of explicit encapsulations. Most of the 15 standard headers provide collections of related types, macros, and function declarations. <*stdarg.h*> holds everything you need to walk a variable argument list. <*ctype.h*> has a comprehensive set of character classification and mapping functions. A few headers are catchalls, to be sure. (<*stddef.h*> and <*stdlib.h*> are the worst offenders.) The rest form respectable "objects," even if they lack the discipline that C++ imposes.

The library also contains a few *implicit* encapsulations. Consider the storage allocation functions — *calloc*, *free, malloc*, and *realloc*. They cooperate to maintain a heap from which you can allocate data objects on the fly. They also conspire to hide details of the implementation from the program. You can identify a handful of the functions declared in <*stdio.h*> as fundamental. Get them working and you can write all the rest of the library in terms of calls to those low-level primitives. (UNIX programmers know that you can get a lot of mileage out of the system calls *close, lseek, open, read*, and *write*.) The math library is built atop a similar implicit encapsulation.

You don't have to even *know* numerical programming to benefit from a study of the math primitives. Here is where the numbers meet the bits. These functions do more masking and shifting than any sort of arithmetic. At that, the arithmetic involves integer operations, not floating-point. The whole idea is to depend as little as possible on how the implementation performs floating-point, particularly with extreme values. You want to avoid overflow, underflow, and zero divide at all costs. Otherwise, you have to master how each machine generates special codes or causes traps.

#### The Primitives

The C math library includes quite an assortment of semi-numerical functions. You would expect these to be the set of primitives we're looking for. Indeed, the earliest implementations of C used these visible functions to construct many of the others.

Those functions predate the IEEE 754 floating-point standard, however. They have no provision for handling infinities and NaNs (for not-a-number). I found it more convenient, therefore, to build on a different set of primitives. The existing semi-numerical functions become mostly simple calls on the lower-level functions.

A good way to illustrate what the primitives do is to show them in conjunction with their visible counterparts. I began that process last month with the function *fabs*. That trivial function is not so trivial when you have to worry about the special codes of IEEE 754. I simplified it as much as possible, and made it more portable in the bargain, by having it call the primitive *_Dtest*.

Listing 1 is a repeat of the function *_Dtest*.
It classifies its *double* argument broadly as a NaN, an infinity, zero,
or some finite value. For space considerations, I won't repeat *fabs* or
the relevant portions of the header *"xmath.h"*. See last month's column
if you're curious.
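For readers without last month's column, the flavor of *_Dtest* can be suggested by a portable sketch that classifies by value alone, using IEEE 754 comparison behavior instead of bit masking. The function name and return codes below are my own assumptions, not Plauger's actual interface:

```c
#include <math.h>   /* HUGE_VAL: infinity on IEEE 754 implementations */

/* Assumed classification codes, not the column's actual macros. */
#define T_NAN    2
#define T_INF    1
#define T_ZERO   0
#define T_FINITE (-1)

/* Classify a double by value alone: a NaN compares unequal to itself,
   and inf - inf yields NaN, so x - x is nonzero only for an infinity.
   Assumes IEEE 754 semantics; the real _Dtest inspects stored bits. */
static int dtest_sketch(double x)
{
    if (x != x)
        return T_NAN;
    if (x == 0.0)
        return T_ZERO;
    if (x - x != 0.0)
        return T_INF;
    return T_FINITE;
}
```

On an IEEE machine, `dtest_sketch(HUGE_VAL)` reports an infinity and `dtest_sketch(HUGE_VAL - HUGE_VAL)` reports a NaN.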

Listing 2 shows the function *ceil* and Listing
3 shows the function *floor*. They are the next simplest semi-numerical
functions after *fabs*. Each function requires that any fraction part of
its argument be set to zero. Moreover, each needs to know whether the fraction
part was initially nonzero. Each function then adjusts the remaining integer
part in slightly different ways. The common operation is to clear fraction bits
and test whether any bits got cleared. That's what the primitive *_Dint*
is for.
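The calling pattern just described can be sketched with the standard function *modf* standing in for *_Dint* (an assumption; the real code tests and clears bits directly):

```c
#include <math.h>

/* Truncate, remember whether a fraction was dropped, then adjust by
   one in the right direction. modf stands in for the _Dint primitive. */
static double ceil_sketch(double x)
{
    double ip;
    double frac = modf(x, &ip);         /* ip = x truncated toward zero */
    return (frac > 0.0) ? ip + 1.0 : ip;
}

static double floor_sketch(double x)
{
    double ip;
    double frac = modf(x, &ip);
    return (frac < 0.0) ? ip - 1.0 : ip;
}
```

Note that the adjustment direction is the only difference between the two functions, which is why a single primitive serves them both.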

Listing 4 shows the function *_Dint*. If the value pointed at by *px* is finite, the function tests and clears all fraction bits less than a threshold value. That threshold is effectively two raised to the power *xexp*. (Other functions have occasion to call *_Dint* with values of *xexp* other than zero.) The code for clearing fraction bits is a bit tricky.

Note the use of an index within an index in the term *ps[sub[xchar]]*. The index *sub[xchar]* corrects for differences in layout of floating-point values on different computer architectures. The *switch* statement contains a cascade of *case* labels, a practice that is generally misleading and unwise. I indulge in both practices here in the interest of performance.
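Here is a sketch of the bit-clearing idea, assuming an IEEE 754 *double* viewed as a single 64-bit word rather than the column's array of four 16-bit *shorts*. The *sub[xchar]* layout correction and the subnormal fine points are elided:

```c
#include <stdint.h>
#include <string.h>

/* Clear all fraction bits worth less than 2^xexp and report whether
   any nonzero bits were discarded. A bit i of the stored fraction has
   weight 2^(e - 52 + i), where e is the unbiased exponent, so the bits
   below the threshold are bits 0 .. (52 - e + xexp - 1). */
static int dint_sketch(double *px, short xexp)
{
    uint64_t bits;
    memcpy(&bits, px, sizeof bits);
    int biased = (int)((bits >> 52) & 0x7ff);       /* stored exponent */
    if (biased == 0x7ff)
        return 0;                                   /* inf or NaN: leave alone */
    int frac_bits = 52 - (biased - 1023) + xexp;    /* bits below threshold */
    if (frac_bits <= 0)
        return 0;                                   /* already integral */
    if (frac_bits > 52) {                           /* |x| below threshold */
        int nonzero = (bits & 0x7fffffffffffffffULL) != 0;
        bits &= 0x8000000000000000ULL;              /* keep only the sign */
        memcpy(px, &bits, sizeof bits);
        return nonzero;
    }
    uint64_t mask = ((uint64_t)1 << frac_bits) - 1;
    int cleared = (bits & mask) != 0;
    bits &= ~mask;
    memcpy(px, &bits, sizeof bits);
    return cleared;
}
```

With *xexp* zero this truncates toward zero, which is exactly what *ceil* and *floor* need before they adjust the result.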

Listing 5 shows the function *modf*, which is only
slightly more ornate than *ceil* and *floor*. Like those functions,
*modf* relies on the function *_Dint* to do the hard part.

Listing 6 shows the function *frexp* that unpacks
the exponent from a finite argument *x*. Once again, a reasonably simple
function is complicated by the presence of the various special codes. And once
again, a more flexible low-level function does most of the hard work.

Listing 7 shows the function *ldexp*, which faces
problems similar to *frexp*, only in reverse. Once it dispatches any special
codes, it still has a nontrivial task to perform. It too calls on a low-level
function. Let's look at the two low-level functions.
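Before looking at those two low-level functions, it helps to restate the contract *frexp* and *ldexp* implement, using only standard calls (the helper names here are my own):

```c
#include <math.h>

/* frexp unpacks a finite nonzero x into f and e with x == f * 2^e and
   0.5 <= |f| < 1; ldexp repacks. Scaling by a power of two is exact,
   so the round trip reproduces x bit for bit. */
static int bin_exp(double x)
{
    int e;
    frexp(x, &e);
    return e;
}

static double round_trip(double x)
{
    int e;
    double f = frexp(x, &e);    /* unpack: fraction and binary exponent */
    return ldexp(f, e);         /* repack: exactly x again */
}
```

For example, 96.0 unpacks as 0.75 times two to the seventh.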

Listing 8 shows the function *_Dunscale*, which
combines the actions of *_Dtest* and *frexp* in a form that is handier
for several other math functions. By calling *_Dunscale*, the function *frexp*
is left with little to do.

*_Dunscale* itself has a fairly easy job except when presented with a gradual underflow. In the IEEE 754 representation, a normalized value has a nonzero characteristic and an implicit fraction bit to the left of the most-significant fraction bit that is represented. Gradual underflow is signaled by a zero characteristic and a nonzero fraction with *no* implicit leading bit. Both these forms must be converted to a normalized fraction in the range [0.5, 1.0), accompanied by the appropriate binary exponent. The function *_Dnorm*, described later, handles this messy job.
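The shape of *_Dunscale* can be sketched with standard *frexp* standing in for the bit-level work (an assumption; the library's *frexp* already copes with gradual underflow, which is the part *_Dnorm* handles in the real code). The return codes mirror the *_Dtest* classification; the names are assumed:

```c
#include <math.h>

/* Classify *px; for a finite nonzero value, also replace it with its
   normalized fraction in [0.5, 1) and store the binary exponent. */
static int dunscale_sketch(short *pex, double *px)
{
    if (*px != *px)
        return 2;                   /* NaN */
    if (*px == 0.0) {
        *pex = 0;
        return 0;                   /* zero */
    }
    if (*px - *px != 0.0)
        return 1;                   /* infinity */
    int e;
    *px = frexp(*px, &e);           /* fraction in [0.5, 1), sign kept */
    *pex = (short)e;
    return -1;                      /* finite: fraction and exponent out */
}
```

Given this interface, *frexp* itself reduces to one call plus a copy of the exponent.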

Listing 9 shows the function *_Dscale*. It, too,
frets about special codes, because of the other ways that it can be called.
Adding the *short* value *xexp* to the exponent of the finite value pointed
at by *px* can cause overflow, gradual underflow, or underflow. You even
have to worry about integer overflow in forming the new exponent. That's
why the function first computes the sum in a *long*. Most of the complexity
of the function *_Dscale* lies in forming a gradual underflow. The operation
is essentially the reverse of *_Dnorm*.
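The exponent arithmetic *_Dscale* must get right can be sketched as follows, with *frexp* and *ldexp* standing in for the bit-level work (an assumption). The point to notice is the sum formed in a *long*:

```c
#include <limits.h>
#include <math.h>

/* Scale a finite *px by 2^xexp. Forming the new exponent in a long
   means short-plus-short can never overflow; ldexp then produces the
   overflow, gradual underflow, or zero result as appropriate. */
static void dscale_sketch(double *px, short xexp)
{
    if (*px == 0.0 || *px != *px || *px - *px != 0.0)
        return;                     /* zero, NaN, infinity: unchanged */
    int e;
    double f = frexp(*px, &e);
    long newe = (long)e + xexp;     /* the sum, formed in a long */
    if (newe > INT_MAX)
        newe = INT_MAX;             /* clamp before narrowing back */
    else if (newe < INT_MIN)
        newe = INT_MIN;
    *px = ldexp(f, (int)newe);
}
```

The real *_Dscale* also builds the gradual-underflow bit pattern by hand; here *ldexp* is trusted to do that.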

Listing 10 shows the function *_Dnorm*. It normalizes
the fraction part of a gradual underflow and adjusts the characteristic accordingly.
To improve performance, the function shifts the fraction left 16 bits at a time
whenever possible. That's why it must be prepared to shift right as well as
left one bit at a time. It may overshoot and be obliged to back up.

Listing 11 shows the function *fmod*, which
is the last of the semi-numerical functions declared in <*math.h*>.
It is also the most complex. In principle, it subtracts the magnitude of *y*
from the magnitude of *x* repeatedly until the remainder is smaller than
the magnitude of *y*. In practice, that could take an astronomical amount
of time, even if it could be done with any reasonable precision.

What *fmod* does instead is scale *y* by the largest possible power of two before each subtraction. That can still require dozens of iterations, but the result is reasonably precise. Note the way *fmod* uses *_Dscale* and *_Dunscale* to manipulate exponents. It uses *_Dunscale* to extract the exponents of *x* and *y* to perform a quick but coarse comparison of their magnitudes. If *fmod* determines that a subtraction might be possible, it uses *_Dscale* to scale *x* to approximately the right size.
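The strategy can be sketched with *frexp* and *ldexp* standing in for *_Dunscale* and *_Dscale* (an assumption), for finite *x* and nonzero finite *y*. Because the scaled subtrahend always satisfies *t* ≤ remainder < 2*t*, each subtraction is exact, which is why the result stays precise:

```c
#include <math.h>

/* Scale |y| up by the largest power of two that keeps it no bigger
   than the remainder, subtract, repeat; the result keeps x's sign. */
static double fmod_sketch(double x, double y)
{
    double ax = fabs(x), ay = fabs(y);
    while (ax >= ay) {
        int ex, ey;
        frexp(ax, &ex);                 /* coarse magnitude comparison */
        frexp(ay, &ey);
        double t = ldexp(ay, ex - ey);  /* ay scaled near ax's size */
        if (t > ax)
            t = ldexp(t, -1);           /* one notch too big: halve */
        ax -= t;                        /* t <= ax < 2t: exact */
    }
    return (x < 0.0) ? -ax : ax;
}
```

Each pass removes at least the leading bit of the remainder, so the loop runs at most once per bit of exponent range rather than once per multiple of *y*.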

#### Conclusion

That's the complete set of low-level primitives you need to isolate machine dependencies in the math library. They work well for a broad range of floating-point representations. (They need work if a *double* is not a 64-bit quantity representable as four 16-bit *shorts*. They fail miserably if the floating-point base is not a power of two.) On some machines, you can replace many of them with assembly-language versions that are much faster. But you don't have to.

The representation still affects the rest of the code in subtle ways. For example, a sixth-order polynomial may be a good approximation for IEEE 754 arithmetic with 53 bits of precision. That would be overkill on a machine that retains only 32 bits of precision. It would fail on a machine that retains much more than 53 bits. You have similar issues in dealing with the range of exponents the machine can represent.

Still, having a narrow interface like this helps tremendously. It gives you a language in which to describe the higher-level math functions. As you will see, that can make higher-level math functions not just more portable, but more readable as well.