Implementing Half Floats in D
Programming languages usually come with a suite of built-in types, such as int, long, char, float, double, and so on. Built-in types enjoy advantages over library types such as better optimization, compile-time execution, and specialized literals. D has an unusually large set of them. But there are never enough, and there's always pressure to add more.
For example, some graphics systems employ a half-float type. This is a 16-bit type used to store an IEEE floating-point value. (Floating-point types are normally 32 or 64 bits.) The half-float type is used only for compact storage: it is expanded to a 32-bit float type for computation, and crushed back down to 16 bits for storage.
The alternative to a built-in type is, of course, a user-defined library type. How close can D get to implementing a half-float type as a user-defined type? Is it close enough that the pressure for making it built-in is eased? Let's find out.
Laying down some requirements:
- Implicit promotion to float to do any computations. Hardware floats are so fast that they are sure to beat a software-emulated 16-bit computation. Besides, 16 bits loses too much precision in intermediate results.
- Explicit conversion from float to half float.
- A nice literal for the half floats.
That doesn't look so daunting. The promotion rules mirror those of C for shorts (and for floats, too, which C allows to be promoted to double for computation).
Starting with the obvious:
struct HalfFloat {
  private:
    ushort s = 0x7C01;
}
The 0x7C01 is the default initializer. Its exponent bits are all ones and its fraction is nonzero, which makes it a NaN value for half floats.
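To see why 0x7C01 is a NaN, here is a minimal sketch of the IEEE half-precision layout: 1 sign bit, 5 exponent bits, and 10 fraction bits. The helper name isHalfNaN is hypothetical and is not part of the implementation described in this article.

```d
// Hypothetical helper: true if the 16-bit pattern encodes a half-float NaN.
bool isHalfNaN(ushort bits) pure nothrow @nogc @safe
{
    enum ushort expMask  = 0x7C00; // the 5 exponent bits
    enum ushort fracMask = 0x03FF; // the 10 fraction bits
    // NaN: exponent all ones and a nonzero fraction.
    return (bits & expMask) == expMask && (bits & fracMask) != 0;
}

static assert( isHalfNaN(0x7C01)); // the default initializer
static assert(!isHalfNaN(0x7C00)); // +infinity: fraction is zero
static assert(!isHalfNaN(0x3C00)); // 1.0
```

Any nonzero fraction would do; 0x7C01 is simply the smallest such pattern.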
In order to enable casting to a HalfFloat, add a constructor:
this(float f) {
    s = floatToShort(f);
}
The floatToShort() function abstracts away the dirty details of manipulating floating-point values, so our struct can focus on the mechanics of creating a new type. Back to the constructor: it's a bit greedy. It'll accept any argument that implicitly converts to a float. To restrict it to accepting only floats, turn it into a template and check the type with static assert:
this(T : float)(T f) {
    static assert(is(T == float));
    s = floatToShort(f);
}
And now we can write expressions like:
float f;
HalfFloat hf = cast(HalfFloat)f;
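The pieces so far can be put together in a self-contained sketch. The floatToShort() below is a simplified stand-in written for illustration, not the article's actual implementation: it truncates instead of rounding, flushes results too small for a normal half float to zero, and does not preserve the sign of -0.0. The bits() accessor is likewise a hypothetical addition, used here only to inspect the result.

```d
// Simplified stand-in for floatToShort(); see the caveats above.
ushort floatToShort(float f) pure nothrow @nogc @safe
{
    if (f != f)
        return 0x7C01;                      // NaN
    ushort sign = 0;
    if (f < 0)
    {
        sign = 0x8000;
        f = -f;
    }
    if (f >= 65536.0f)
        return cast(ushort)(sign | 0x7C00); // too large or infinite: infinity
    if (f < 0x1p-14f)
        return sign;                        // below smallest normal half: zero
    int e = 0;                              // unbiased exponent
    while (f >= 2.0f) { f /= 2.0f; ++e; }   // normalize f into [1, 2)
    while (f < 1.0f)  { f *= 2.0f; --e; }
    immutable int frac = cast(int)((f - 1.0f) * 1024.0f); // top 10 fraction bits
    return cast(ushort)(sign | ((e + 15) << 10) | frac);
}

struct HalfFloat
{
    // T is deduced from the argument, then required to be exactly float.
    this(T : float)(T f)
    {
        static assert(is(T == float));
        s = floatToShort(f);
    }

    ushort bits() const { return s; }       // hypothetical accessor

  private:
    ushort s = 0x7C01;
}

static assert( __traits(compiles, HalfFloat(1.0f))); // float: accepted
static assert(!__traits(compiles, HalfFloat(1)));    // int: rejected
static assert(!__traits(compiles, HalfFloat(1.0)));  // double: rejected
```

The three static asserts show the effect of the templated constructor: only an argument whose type is exactly float gets through.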
Implementing implicit conversions to float is a bit less obvious:
@property float toFloat() { return shortToFloat(s); }
alias toFloat this;
The alias toFloat this construct is unusual. It tells the compiler that if it can't find the struct member it is looking for, it should resolve the lookup through the toFloat member; the same fallback is what lets a HalfFloat convert implicitly to float. The toFloat member is a property that yields the HalfFloat converted to a float. Again, we've abstracted away the bit twiddling into shortToFloat().
Now, we can write:
HalfFloat g;
HalfFloat hf = cast(HalfFloat)(g + 3.2f);
and g gets implicitly converted to a float before being added to 3.2f using the floating-point hardware.
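The promotion can be checked with a small self-contained sketch. The shortToFloat() here is a simplified stand-in written for illustration, not the article's implementation, and the field s is left public only to keep the example short.

```d
// Simplified stand-in for shortToFloat(): decodes sign, exponent, fraction.
float shortToFloat(ushort s) pure nothrow @nogc @safe
{
    immutable bool negative = (s & 0x8000) != 0;
    immutable int exp  = (s >> 10) & 0x1F;   // 5-bit biased exponent
    immutable int frac = s & 0x3FF;          // 10-bit fraction
    float v;
    if (exp == 0x1F)
        v = (frac != 0) ? float.nan : float.infinity;
    else if (exp == 0)
        v = frac * 0x1p-24f;                 // zero and subnormals
    else
    {
        v = 1.0f + frac / 1024.0f;           // implicit leading 1
        foreach (_; 15 .. exp) v *= 2.0f;    // scale by 2^(exp - 15)
        foreach (_; exp .. 15) v /= 2.0f;
    }
    return negative ? -v : v;
}

struct HalfFloat
{
    ushort s = 0x7C01;                       // public only for this sketch
    @property float toFloat() const { return shortToFloat(s); }
    alias toFloat this;
}

// The sum of a HalfFloat and a float has type float, not HalfFloat.
static assert(is(typeof(HalfFloat.init + 3.2f) == float));
// 0x3C00 encodes 1.0, so adding 0.5f gives exactly 1.5f.
static assert(HalfFloat(0x3C00) + 0.5f == 1.5f);
```

Because the result of the addition is a float, storing it back into a HalfFloat needs the explicit cast, which matches the requirements listed at the start.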
No, I didn't forget the HalfFloat literals. They're as simple as:
template hf(float v)
{
    enum hf = HalfFloat(v);
}
and used like:
HalfFloat h = hf!1.3f;
The template takes a single argument of type float, which is the 1.3f, constructs a HalfFloat out of it, and assigns it to the manifest constant hf, which becomes the result of the template.
I know what you're thinking: "Like hell that's a user-defined literal. It's calling a bunch of runtime code!" Allow me:
HalfFloat foo() {
    return hf!1.3f;
}
Compiling it, then disassembling the code generated for foo():
push EAX
mov word ptr [EAX],03D33h
pop ECX
ret
The 0x3D33 is, indeed, the half-float representation of 1.3f. The compiler, in turning HalfFloat(v) into a manifest constant, ran the constructor and the floatToShort(f) all at compile time.
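The same claim can be checked without a disassembler, since static assert only accepts values the compiler can compute itself. The sketch below assumes a reduced stand-in for floatToShort() that handles only positive values in the normal half-float range and truncates instead of rounding; the bits() accessor and the names one and onePointThree are hypothetical additions for illustration.

```d
// Reduced stand-in for floatToShort(): positive normal range only, truncating.
ushort floatToShort(float f) pure nothrow @nogc @safe
{
    assert(f >= 0x1p-14f && f < 65536.0f, "outside this sketch's range");
    int e = 0;                               // unbiased exponent
    while (f >= 2.0f) { f /= 2.0f; ++e; }    // normalize f into [1, 2)
    while (f < 1.0f)  { f *= 2.0f; --e; }
    return cast(ushort)(((e + 15) << 10) | cast(int)((f - 1.0f) * 1024.0f));
}

struct HalfFloat
{
    this(T : float)(T f)
    {
        static assert(is(T == float));
        s = floatToShort(f);
    }

    ushort bits() const { return s; }        // hypothetical accessor

  private:
    ushort s = 0x7C01;
}

template hf(float v)
{
    enum hf = HalfFloat(v);
}

enum one = hf!1.0f;
enum onePointThree = hf!1.3f;

// These only compile if the constructor ran at compile time.
static assert(one.bits == 0x3C00);
static assert(onePointThree.bits == 0x3D33);

// Decoding 0x3D33 by hand: exponent field 15 (unbiased 0), fraction 307.
static assert(((0x3D33 >> 10) & 0x1F) == 15 && (0x3D33 & 0x3FF) == 307);
static assert(1.0f + 307 / 1024.0f == 1.2998046875f);
```

The last assert shows the cost of the compact format: the closest half float to 1.3 is 1.2998046875.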
So there we have it, a nice user-defined half-float implementation, and one that can serve as a model for creating many other types of unusual user-defined arithmetic types.
Here is the complete half float implementation, unit tests and all.
Imperfections
While the half-float type looks and acts pretty much like a built-in type, there remain some subtle differences. Built-in types tend to run faster because the optimizer can take advantage of mathematical identities and any available hardware acceleration, and because back-end optimizations often can deal only with built-in types.
Conclusion
D offers a set of features, the confluence of which enables the creation of user-defined types that work well enough to take the pressure off of adding more built-in types. Adding a built-in type is a major disruptive change to any language, so there's a really high bar to justify it. But a user-defined type can be quickly added by anyone with just a few lines of code.
Thanks to Jason House for reviewing a draft of this.



