Implementing Half Floats in D

January 19, 2013

Programming languages usually come with a suite of built-in types, such as int, long, char, float, double, and so on. Built-in types enjoy advantages over library types such as better optimization, compile-time execution, and specialized literals. D has an unusually large set of them. But there are never enough, and there's always pressure to add more.

For example, some graphics systems employ a half-float type. This is a 16-bit type used to store an IEEE floating-point value. (Floating-point types are normally 32 or 64 bits.) The half-float type is used only for compact storage: it is expanded to a 32-bit float type for computation, and crushed back down to 16 bits for storage.

The alternative to a built-in type is, of course, a user-defined library type. How close can D get to implementing a half-float type as a user-defined type? Is it close enough that the pressure for making it built-in is eased? Let's find out.

Laying down some requirements:

1. Implicit promotion to float to do any computations. Hardware floats are so fast that they are sure to beat a software-emulated 16-bit computation. Besides, 16 bits loses too much precision in intermediate results.

2. Explicit conversion from float to half float.

3. A nice literal for the half floats.

That doesn't look so daunting. The promotion rules mirror those of C for shorts (and for floats, too, which C allows to be promoted to double for computation).

Starting with the obvious:

struct HalfFloat {
private:
    ushort s = 0x7C01;
}

The 0x7C01 forms the default initializer, and that value is a NaN value for half floats.
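For reference, the IEEE 754 half-precision format packs 1 sign bit, 5 exponent bits, and 10 fraction bits into those 16 bits, and a NaN is any pattern whose exponent bits are all ones and whose fraction is non-zero. A small sketch that checks 0x7C01 against that rule follows; the mask names and isHalfNaN() are made up for illustration and are not part of the article's code:

```d
// Field masks for the IEEE 754 half-precision layout:
// 1 sign bit, 5 exponent bits, 10 fraction bits.
enum ushort SIGN_MASK     = 0x8000;
enum ushort EXPONENT_MASK = 0x7C00;
enum ushort FRACTION_MASK = 0x03FF;

// Hypothetical helper: a bit pattern is a NaN when the exponent
// field is all ones and the fraction field is non-zero.
bool isHalfNaN(ushort bits) pure nothrow @nogc @safe
{
    return (bits & EXPONENT_MASK) == EXPONENT_MASK
        && (bits & FRACTION_MASK) != 0;
}

unittest
{
    assert(isHalfNaN(0x7C01));   // the default initializer above
    assert(!isHalfNaN(0x7C00));  // +infinity: fraction is zero
    assert(!isHalfNaN(0x3C00));  // 1.0
}
```

Compiling with dmd -unittest -main runs the checks.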

In order to enable casting to a HalfFloat, add a constructor:

this(float f) {
    s = floatToShort(f);
}

The floatToShort() function abstracts away the dirty details of manipulating floating-point values, so our struct can focus on the mechanics of creating a new type. Back to the constructor: it's a bit greedy. It'll accept any argument that implicitly converts to a float. To restrict it to accepting only floats, turn it into a template and check the type with static assert:

this(T : float)(T f) {
    static assert(is(T == float));
    s = floatToShort(f);
}

And now we can write expressions like:

float f;
HalfFloat hf = cast(HalfFloat)f;
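The constructor leans on floatToShort(), which is left abstract here. One minimal way it could look is sketched below; this is an assumption for illustration, not the implementation behind the article. It truncates the fraction rather than rounding to nearest, and it flushes values too small for a normal half float to zero rather than producing subnormals. The static asserts after the struct check that the templated constructor accepts float and nothing else.

```d
// Sketch only: truncates instead of rounding, and has no subnormal support.
ushort floatToShort(float f)
{
    uint u = *cast(uint*)&f;                        // raw single-precision bits
    uint sign = (u >> 16) & 0x8000;                 // sign moved down to bit 15
    int  exp  = cast(int)((u >> 23) & 0xFF) - 127;  // unbiased exponent
    uint frac = u & 0x007F_FFFF;                    // 23 fraction bits

    if (exp == 128)   // infinity or NaN: a NaN keeps a non-zero fraction
        return cast(ushort)(sign | 0x7C00 | (frac != 0 ? 0x0200 : 0));
    if (exp > 15)     // too large for a half float: becomes infinity
        return cast(ushort)(sign | 0x7C00);
    if (exp < -14)    // too small for a normal half float: flushed to zero
        return cast(ushort)sign;

    // Re-bias the exponent (half-float bias is 15), keep the top 10 fraction bits.
    return cast(ushort)(sign | ((exp + 15) << 10) | (frac >> 13));
}

struct HalfFloat
{
    this(T : float)(T f)
    {
        static assert(is(T == float));
        s = floatToShort(f);
    }

private:
    ushort s = 0x7C01;
}

// A float argument is accepted...
static assert( __traits(compiles, HalfFloat(1.0f)));
// ...but arguments that merely convert to float are rejected.
static assert(!__traits(compiles, HalfFloat(1)));
static assert(!__traits(compiles, HalfFloat(1.0)));

unittest
{
    float f = 1.3f;
    HalfFloat h = cast(HalfFloat)f;  // the cast calls the constructor
    assert(h.s == 0x3D33);           // private is module-level in D
}
```

A template constraint, this(T)(T f) if (is(T == float)), would be another way to express the same restriction; the static assert form has the advantage of pointing at a specific line when the check fails.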

Implementing implicit conversions to float is a bit less obvious:

@property float toFloat() { return shortToFloat(s); }
alias toFloat this;

The alias toFloat this construct is unusual. It tells the compiler that when a HalfFloat doesn't work where it is used, either because the struct member being looked up doesn't exist or because the expression needs some other type, it should try the toFloat member instead. The toFloat member is a property that yields the HalfFloat converted to a float. Again, we've abstracted away the bit twiddling into shortToFloat().

Now, we can write:

HalfFloat g;
HalfFloat hf = cast(HalfFloat)(g + 3.2f);

and g gets implicitly converted to a float before being added to 3.2f using the floating-point hardware.
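To make the promotion concrete, here is a hedged sketch of the conversion in the other direction, together with the alias this hookup. The body of shortToFloat() is an assumption (it treats subnormal half floats as zero to stay short), and fromBits() is a hypothetical factory added only so the example does not need a float-to-half conversion.

```d
import std.math : isNaN;

// Sketch only: subnormal half floats are treated as zero.
float shortToFloat(ushort s)
{
    uint bits = s;
    uint sign = (bits & 0x8000) << 16;   // sign moved up to bit 31
    uint exp  = (bits >> 10) & 0x1F;     // 5-bit biased exponent
    uint frac = bits & 0x03FF;           // 10 fraction bits

    uint u;
    if (exp == 0x1F)        // infinity or NaN
        u = sign | 0x7F80_0000 | (frac << 13);
    else if (exp == 0)      // zero (and, in this sketch, subnormals)
        u = sign;
    else                    // normal value: re-bias by 127 - 15 = 112
        u = sign | ((exp + 112) << 23) | (frac << 13);
    return *cast(float*)&u; // reinterpret the bits as a float
}

struct HalfFloat
{
    // Hypothetical factory, not part of the article's struct.
    static HalfFloat fromBits(ushort bits)
    {
        HalfFloat h;
        h.s = bits;
        return h;
    }

    @property float toFloat() const { return shortToFloat(s); }
    alias toFloat this;

private:
    ushort s = 0x7C01;
}

unittest
{
    auto g = HalfFloat.fromBits(0x3E00);            // 1.5
    static assert(is(typeof(g + 0.25f) == float));  // the sum is a plain float
    float sum = g + 0.25f;                          // g is read through toFloat
    assert(sum == 1.75f);
    assert(isNaN(HalfFloat.init.toFloat));          // default value is a NaN
}
```

The values 1.5 and 0.25 were picked because they are exactly representable, so the equality test does not depend on rounding.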

No, I didn't forget the HalfFloat literals. They're as simple as:

template hf(float v)
{
    enum hf = HalfFloat(v);
}

and used like:

HalfFloat h = hf!1.3f;

The template takes a single argument of type float, which is the 1.3f, constructs a HalfFloat out of it, and assigns it to the manifest constant hf, which becomes the result of the template.

I know what you're thinking: "Like hell that's a user-defined literal. It's calling a bunch of runtime code!" Allow me:

HalfFloat foo() {
    return hf!1.3f;
}

Compiling it, then disassembling the code generated for foo():

push    EAX
mov    word ptr [EAX],03D33h
pop    ECX
ret

The 0x3D33 is, indeed, the half-float representation of 1.3f. The compiler, in turning HalfFloat(v) into a manifest constant, ran the constructor and the floatToShort(f) all at compile time.
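The same thing can be checked without a disassembler, because a static assert only compiles if the compiler itself can evaluate the condition. The sketch below makes several assumptions: floatToShort() is a compact stand-in that handles only normal, in-range values; bits() is a hypothetical accessor added so the stored pattern can be inspected; and the compiler's compile-time evaluator is assumed to support reinterpreting a float as a uint through a pointer cast.

```d
// Compact stand-in: normal, in-range values only, fraction truncated.
ushort floatToShort(float f)
{
    uint u = *cast(uint*)&f;                    // raw single-precision bits
    return cast(ushort)(((u >> 16) & 0x8000)    // sign bit
        | ((((u >> 23) & 0xFF) - 112) << 10)    // exponent, re-biased 127 -> 15
        | ((u >> 13) & 0x03FF));                // top 10 fraction bits
}

struct HalfFloat
{
    this(T : float)(T f)
    {
        static assert(is(T == float));
        s = floatToShort(f);
    }

    // Hypothetical accessor, not part of the article's struct.
    @property ushort bits() const { return s; }

private:
    ushort s = 0x7C01;
}

template hf(float v)
{
    enum hf = HalfFloat(v);
}

// Both lines are evaluated entirely by the compiler.
enum HalfFloat k = hf!1.3f;
static assert(k.bits == 0x3D33);

HalfFloat foo()
{
    return hf!1.3f;
}

unittest
{
    assert(foo().bits == 0x3D33);
}
```

If the constructor or floatToShort() could not run at compile time, the enum declaration would fail to compile rather than fall back to a runtime call.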

So there we have it, a nice user-defined half-float implementation, and one that can serve as a model for creating many other types of unusual user-defined arithmetic types.

Here is the complete half float implementation, unit tests and all.

Imperfections

While the half-float type looks and acts pretty much like a built-in type, there remain some subtle differences. Built-in types tend to run faster because the optimizer can take advantage of mathematical identities and any available hardware acceleration, and because back-end optimizations often can only deal with built-in types.

Conclusion

D offers a set of features, the confluence of which enables the creation of user-defined types that work well enough to take the pressure off of adding more built-in types. Adding a built-in type is a major disruptive change to any language, so there's a really high bar to justify it. But a user-defined type can be quickly added by anyone with just a few lines of code.

Thanks to Jason House for reviewing a draft of this.
