Porting the D Compiler to Win64
64-bit Windows was the last major x86 platform that the dmd compiler didn't support, so last summer my colleagues and I decided it was past due. Given that dmd already supports 64-bit compiling on Linux, OS X, and FreeBSD, I didn't expect it to be too complicated. How hard could it be? I was about to find out.
The only major decision was to pick one of the following options:
- Build our own line of tools, as dmd does for 32-bit Windows
- Use the GCC toolset for Win64
- Fit into the Microsoft Visual Studio 64-bit toolchain
We decided on option 3 because a lot of D users wanted to use D in conjunction with Visual Studio.
The first problem was the file formats. Visual Studio C uses its own object file format and its own object library file format. Fortunately, there's a spec for them. Unfortunately, specs tend to be vague, incomplete, and incorrect, and this is no exception. That's why I always build a file dumper for them, which helps me not only understand the formats involved, but also compare dmd's output with what the baseline tools emit.
dmd has a tool, dumpobj, to dump OMF, Dwarf, and Mach-O object files. Adding MS-Coff support was simple, and once that was done, I extended the disassembler obj2asm to also disassemble MS-Coff files. Now I could look at the output of the Visual Studio compiler in detail.
(The Visual Studio library format was almost trivial, being a variant of the ar format found on Linux. Only one minor hiccup: The Visual Studio librarian regarded as "optional" what the spec called "required".)
I started reworking the dmd back end to emit MS-Coff for 64-bit files, while sticking with OMF for 32-bit files. This turned out to be more tedious than it should have been, because most of the code was written 30 years ago, and was a mess. The Elf and Mach-O support had been woven in with conditional compilation and was intertwined with the rest of the back-end logic. It all had to be refactored into an interface that used virtual functions. I'd intended to fix that years ago, but it was never a priority. I did it a piece at a time, running the full regression test suite after each piece to ensure nothing broke.
Once that was done, I created a new derived interface for the MS-Coff format. Of course, I had no test suite for that; the only test was to feed the output into the Visual Studio linker and see what happened. Sometimes, the linker would crash. dumpobj would say the generated MS-Coff file was fine, but obviously the linker had different ideas. Trying to find the problem was a lot like poking a stick through the bars of a cage in the zoo while blindfolded, trying to figure out what species of animal lived there. But I'm used to debugging the hard way; it's all in a day's work.
After a while, the object file output graduated to a point where the Visual Studio linker wouldn't crash, but would exit with a grumpy message about "corrupt" files. I spent a lot of time staring at the output of dumpobj for dmd's output versus Visual Studio's output. (Since D is not a C compiler, it does not emit the same object file records as a C compiler, nor does it emit records saying that it came from C source code, nor a Microsoft compiler, nor the Microsoft compiler options used. I hoped that whatever Visual Studio tools read object files did not require those records. Fortunately, the gods were kind and I did not need to have dmd pretend to be a Microsoft compiler to work with the Microsoft toolchain.)
The object file format, though, is hardly the end of it. It also has to work with the Visual C++ startup code, as it will link with the Visual Studio runtime library. D supports garbage collection, so it needs to know the beginning and end of the data section. Linkers for OMF and Elf helpfully define and emit a couple of magic variables that bracket the data section. Mach-O does one better by providing a system call to get the beginning and end of any section. With MS-Coff, there is no obvious way to do it. The linker does not emit any magic variables that I could find, and all the documentation points to using the dbghelp dll to get the information. This would require every D program to ship with dbghelp, which was unacceptable.
I could not discover a precise way of doing this, but the gc doesn't actually require a precise method — the data section just has to be between the values selected. So I picked a name out of the startup code that was linked in first to get the beginning, and I picked a name out of the exception handling tables that went at the end. Ugly, but it worked.
The exception handler tables were another bugaboo. I used the "triplets" method to find the beginning and end of them. This means always emitting the exception handling data in a trio of sections, always in the same order, so the first section is linked first and has only one declaration in it (the locomotive), the second section has the exception handling data (the payload cars), and the third section has only one declaration (the caboose of the data).
The only problem I had was the Visual Studio linker would decide to emit arbitrary numbers of null bytes between the sections. Maddeningly, it would neglect my imperious commands to set the alignment so they'd butt up against each other, and would do its own thing. To make it work, I had to have the exception handling reader code skip over nulls until it found real data. Gaak.
(People sometimes ask me why I like to write my own linker. Well, it's infuriating things like the aforementioned that force wacky workarounds.)
So, things had progressed to the point where the linker was accepting the dmd output, the map file looked good, and code execution had at least arrived at D's entry point. Then, of course, it promptly crashed. dmd didn't emit any sort of symbolic debug info in any format that the Visual Studio debugger understood, so I had to figure things out in assembler mode. But hey, it was still a lot better than the bad old days where I used an oscilloscope to debug code.
The only really bad thing about assembler debugging in Visual Studio was VS absolutely refused to show a stack trace. I was to find out why much later. Also, Microsoft's Windbg.exe, when going from Windows XP to Windows 7, forgot how to do stack traces, so I knew something weird was up.
But back to the crashing. Native programming languages all conform to an Application Binary Interface (ABI) of some sort that defines how data is passed to and from a function. Linux, OS X, and FreeBSD all share a common ABI, despite having different object file formats, so making the ABI work for one pretty much made it work for the others. I had also done some refactoring of the code gen to make it, I thought, easy to adapt to different ABIs.
But the 64-bit Visual Studio ABI was different enough to thoroughly break all of my glorious, ABI-flexible code. The trouble was that all values, regardless of their size, are passed as 8-byte quantities. That's no problem if the size is less than or equal to 8 bytes. The trouble comes when they are larger — then the value is copied to a temporary, and an 8-byte pointer to that temporary is passed instead. Conceptually this is not difficult, but the dmd code generator just was not designed to work that way. Most of the trouble and bugs came from finding, one by one, the assumptions embedded in the code generation that value types were passed by value and reference types by reference, no exceptions.
Passing that hurdle led to the next one. Visual Studio has no support for the 80-bit long double type. Much of the floating-point part of the D runtime library depended on C's 80-bit support. I didn't want to give up 80-bit support in dmd, so that meant rolling my own 80-bit math functions. That work is still incomplete, but I hope to get it done soon. It'll be nice to have it done anyway, as C 80-bit runtime library support tends to be erratic. With D's own, it can be predictable, fixable, and reliable.
At last, D was running through the bulk of its test suite. It was time to investigate why stack tracing didn't work, and to bring up symbolic debugging.
The first problem I needed to knock down was stack tracing. Visual Studio's debugger didn't recognize the stack frames generated by the compiler (although the debugger on every other platform did). It turns out that the debugger relied on some static tables inserted into the object file that tell it how the stack frame is set up and torn down. I fail to understand the point of these, as it's pretty easy to just read the instructions in the prolog and figure out the frame layout directly, but my job is not to reason why, it's to get the debugger to work with D code. By generating the tables, with a bit of trial and error (since dmd's frames are different from Visual C++'s frames), I got it to work.
Symbolic debug info is in yet another file format embedded within the MS-Coff format. This is not an officially documented format, although I (of course) think it ought to be. Googling it reveals that it is a variant of the older Codeview format, and various people have decoded parts of it and put it up on the Internet. Going back to my old trusty technique of writing a file dumper for it, and with a little help from colleagues, I was able to figure it out and generate symbolic debug info. Of course, the debugger has never heard of D, so the symbolic debug info emitted pretends to be C++. Microsoft's linker has a checker for the symdeb info. Unfortunately, if the symdeb info is invalid only a generic message comes out. It can be rather frustrating figuring out where it went wrong.
But, eventually I got it to work, and it's nice to see the symbols, locals, and so on, suddenly appear in the debugger. Of course, it's never going to be perfect since the debugger thinks it's C++ code and, for example, that strings are 0-terminated.
Things aren't quite done yet. Some 80-bit floating point processing code still needs to be written, and I'm not satisfied with the way stack frames are laid out, but it works well enough for people to try as an alpha version.
Thanks to Rainer Schuetze, Manu Evans, Herb Sutter, and Brad Roberts for their help in getting the Win64 compiler to work.
Thanks to Rainer Schuetze and Andrei Alexandrescu for their invaluable comments on this article.