Walter Bright

Dr. Dobb's Bloggers

Targeting OS X 64 Bit

December 27, 2011

A while back, I targeted the D programming language compiler to generate 32-bit code for OS X, and 64-bit code for Linux. For the last several years, all Mac OS X machines have included 64-bit CPUs, so the obvious next step is to generate 64-bit code for OS X.

With a debugged and working 32-bit port to OS X and a debugged and working 64-bit code generator already in hand, this should be straightforward. (Hah!)

The object file format for OS X is Mach-O, which is unique to OS X (the Linux universe uses the ELF format). The first step is to convince my dumpobj utility to recognize and dump the Mach-O 64 format.

Yes, I know there are existing off-the-shelf object file dumpers, but by writing my own I learn how the file format really works. This was a quick and straightforward job, as the Apple documentation on it is good. The next job was to retarget the obj2asm disassembler. Obj2asm already handled 64-bit instructions, so it just had to learn the Mach-O 64 format. Again, this was simple, and soon I had the tools to examine the output of GCC.
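To give a flavor of what "recognizing" Mach-O 64 means, here is a minimal sketch in Python (not the actual dumpobj code, which is not shown in this post). It assumes a little-endian file and uses the magic number and header layout from Apple's `<mach-o/loader.h>`; the function name `parse_macho64_header` is made up for illustration.

```python
import struct

# Constants as defined in Apple's <mach-o/loader.h> and <mach/machine.h>.
MH_MAGIC_64 = 0xFEEDFACF       # magic number of a 64-bit Mach-O file
CPU_TYPE_X86_64 = 0x01000007   # x86 CPU type with the 64-bit ABI flag set
MH_OBJECT = 0x1                # filetype of a relocatable object file

# mach_header_64 consists of eight 32-bit fields: magic, cputype,
# cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved.
HEADER_64 = struct.Struct("<8I")
FIELD_NAMES = ("magic", "cputype", "cpusubtype", "filetype",
               "ncmds", "sizeofcmds", "flags", "reserved")


def parse_macho64_header(data):
    """Return the mach_header_64 fields as a dict, or None if data does
    not start with a little-endian 64-bit Mach-O header."""
    if len(data) < HEADER_64.size:
        return None
    fields = HEADER_64.unpack_from(data, 0)
    if fields[0] != MH_MAGIC_64:
        return None
    return dict(zip(FIELD_NAMES, fields))
```

Feeding it the first 32 bytes of an object file yields the header fields, after which a dumper would walk the `ncmds` load commands that follow the header.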

The D compiler can generate library (.a) files directly, so the next job was to figure out that format and adjust the compiler as needed. This turned out to be trivial, as the .a format was the same as for 32-bit object files; the library code just had to deal with Mach-O 64 files as members. With the knowledge I gained from dumpobj and obj2asm, this was quick work.
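The reason the library format does not care about 32 versus 64 bits is that a .a file is just a container of members behind fixed-size text headers. A rough sketch of walking one, assuming the common Unix ar layout with BSD-style long names (the variant OS X uses), might look like this. The function name is hypothetical, and symbol-table members such as `__.SYMDEF` are simply returned like any other member.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # fixed-size ASCII header in front of every member


def list_archive_members(data):
    """Yield (name, contents) for each member of a Unix ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        # Every member header ends with the two bytes "`\n".
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))
        body = data[pos + HEADER_SIZE:pos + HEADER_SIZE + size]
        # BSD-style long names are written as "#1/<length>", with the
        # real name stored at the start of the member body.
        if name.startswith("#1/"):
            name_len = int(name[3:])
            name = body[:name_len].rstrip(b"\0").decode("utf-8")
            body = body[name_len:]
        yield name, body
        # Members start on even offsets; odd sizes get one padding byte.
        pos += HEADER_SIZE + size + (size & 1)
```

Nothing in this loop looks inside a member, which is why the same archive code can carry Mach-O 64 object files unchanged.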

Tackling the compiler output involves:

  1. Adjusting the object file generator to output Mach-O 64 format. This was easy, now that I'd learned the format by adjusting dumpobj.

  2. Conforming to the 64-bit ABI. Fortunately, OS X follows the same C ABI as 64-bit Linux does. This meant that the variadic function support I had agonized over for Linux worked out of the box for OS X. I didn't have to change a thing. Phew!

  3. Fixups. When a reference is made in a source file to a symbol, such as printf, the compiler doesn't know what address to use for the symbol. Instead, it outputs a "fixup record" that consists of a symbol and a location in the object file that must be "fixed up" with the real address of that symbol when it becomes known. The linker learns the address when it combines object files and resolves symbols like printf; later, the loader adjusts those addresses to match where the program actually winds up in memory.

Fixup records and schemes are different for every system. They're usually defined by the object file format, which is called Mach-O for OS X.
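The basic idea of a fixup record can be shown with a toy model. This is not Mach-O's relocation format (real records also carry a relocation type, a size, and a PC-relative flag); the `Fixup` tuple, the `apply_fixups` function, and the address used for printf are all invented for illustration.

```python
import struct
from collections import namedtuple

# A toy fixup record: which symbol is wanted, and the offset in the
# section where its address must be patched in.
Fixup = namedtuple("Fixup", ["symbol", "offset"])


def apply_fixups(section, fixups, symbol_addresses):
    """Return a copy of section with each fixup location overwritten by
    the 64-bit little-endian address of its symbol."""
    patched = bytearray(section)
    for fixup in fixups:
        if fixup.symbol not in symbol_addresses:
            raise KeyError("undefined symbol: " + fixup.symbol)
        address = symbol_addresses[fixup.symbol]
        struct.pack_into("<Q", patched, fixup.offset, address)
    return bytes(patched)


# "Compiler" output: the two opcode bytes of a 64-bit load-immediate into
# RAX, followed by an 8-byte placeholder where printf's address belongs.
code = b"\x48\xb8" + b"\x00" * 8
fixups = [Fixup("printf", 2)]

# "Linker" step: printf's address is now known (a made-up value here).
linked = apply_fixups(code, fixups, {"printf": 0x100000F20})
```

The compiler emits `code` and `fixups` without knowing any addresses; only the later step that has the symbol table can fill in the placeholder. A loader that moves the program would repeat the same kind of patching with adjusted addresses.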

Fixups used to be simple. Back in the early MS-DOS days, there weren't any fixups at all for COM programs. Just copy the bytes into memory and jump to them. (Any relocation was done by the hardware through the segment registers.) Those glory days didn't last long. Now we've got multiple addressing modes, multiple sections, shared libraries, position-independent code, offsets known and unknown by the compiler, etc.

Just for starters, there's:

  1. data referencing other data
  2. data referencing code
  3. code referencing code
  4. code referencing data

Throw in the other cases and there's quite a smorgasbord of quirky and obscure detail.
