The Mac's Move to Intel

October, 2005: The Mac's Move to Intel

Tom is a technical writer providing support for J2ME wireless technologies at KPI Consulting. He can be contacted at tom_thompson@lycos.com.


A Portable Operating System


Apple Computer's CEO Steve Jobs clearly dropped a bombshell when he told software developers at Apple's Worldwide Developers Conference (WWDC) that the Macintosh computer platform was going to switch from PowerPC to Intel x86 processors. He added, of course, that there would be time to adapt and manage the change, because Macs with Intel processors would be phased in over a two-year period.

Mac old-timers will recall that Apple accomplished a similar major processor transition in 1994. Back then, the switch was from Motorola's 68K processors to Motorola/IBM's PowerPC. For software developers, the transition's difficulty depended upon whether their programs used high-level code exclusively, or accessed lower-level system services. In the latter case, it required that developers make fundamental changes in how they wrote their code; some of these changes I helped document in the book The Power Macintosh Programmer's Guide (Hayden, 1994). As a consequence, Apple has ample prior experience to guide its migration to the x86 processor.

Pundits and others have already debated the wisdom and reasons for making the processor switch. The decision has been made; it's not worth repeating those arguments here. Seasoned Mac programmers are determining how painful and how expensive the transition is going to be. In this article, I examine the migration plan, and describe pitfalls you should be aware of.

Infrastructure Support

Migrating an application to the new hardware platform requires that there be a certain amount of infrastructure in place to support its development and execution. There are several key technical pillars that comprise this infrastructure. There must be:

  • A native version of the operating system available to provide the needed system services.
  • A mechanism to package an application's code for distribution and execution on disparate processors during the transition period. This scheme must be transparent to less tech-savvy users, or else you frustrate them when the application won't run because it's on the wrong platform.
  • Tools that translate the application's source code into the platform's native code.

At the WWDC announcement, Jobs revealed that, since its introduction in 2001, every PowerPC release of Mac OS X has had an x86-based doppelgänger: a separate Intel version of the OS was quietly developed and maintained (see the accompanying text box entitled "A Portable Operating System" for how Mac OS X's design allowed this). Mac OS X 10.4 (aka Tiger), which was released this April for the PowerPC Macs, is slated to be the preliminary x86-based OS release. The first infrastructure pillar is therefore already in place and has been tested for years.

The core of the Mac x86 platform's distribution mechanism is the universal binary file for Mac OS X applications. A universal binary carries two versions of the application, one implemented as PowerPC machine code and one implemented as x86 machine code, stored together in a single, larger executable file. However, the application's GUI elements (TIFF images of buttons or other visual controls) can be shared between the two versions. Sharing these GUI elements, known as "resources," helps keep the universal binary application's size manageable.

The universal binary scheme is an extension of Mac OS X's Mach-O executable files (see http://developer.apple.com/documentation/DeveloperTools/Conceptual/MachORuntime/FileStructure/chapter_2.1_section_7.html#//apple_ref/doc/uid/TP40000895-//apple_ref/doc/uid/20001298-154889-CJBCFJGH). Universal binary files consist of a simple archive that contains multiple Mach-O files; see Figure 1. Each Mach-O file contains contiguous bytes of an application's code and data for one processor type. Special file headers enable the platform's runtime to quickly locate the appropriate Mach-O image in the file. Listing One shows how this is done. The "fat" header identifies the file as a universal binary and specifies the number of "fat" architecture structures that follow it. Immediately past the fat header, each fat architecture data structure references the code for a different processor type in the file. These architecture structures supply the runtime with the CPU type, an offset to the start of the embedded Mach-O file within this file, and its size. When a universal binary application is launched, the operating system uses the headers to automatically locate, load, and execute the Mach-O file of the application that matches the Mac's processor type.
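To make the header walk concrete, here is a minimal sketch in C of how a loader might search a universal binary for the slice that matches a given processor. It is an illustration only, not Apple's loader code: the function name find_slice and the helper read_be32 are hypothetical, the file is assumed to be already loaded into memory, and the sketch relies on the fat header fields being stored in big-endian order regardless of the host processor. The two CPU constants mirror the values in Apple's <mach/machine.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC        0xcafebabeu
#define CPU_TYPE_I386    7u     /* value as defined in <mach/machine.h> */
#define CPU_TYPE_POWERPC 18u    /* value as defined in <mach/machine.h> */

#define FAT_HEADER_SIZE  8u     /* magic + nfat_arch */
#define FAT_ARCH_SIZE    20u    /* five 32-bit fields per fat_arch */

/* Read a 32-bit big-endian value, independent of the host's byte order. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: search an in-memory universal binary for the
 * slice whose cputype equals wanted_cpu. On success, store the slice's
 * file offset and size and return 0. Return -1 if the buffer is not a
 * universal binary, is truncated, or holds no slice for that CPU.
 */
int find_slice(const unsigned char *buf, size_t len, uint32_t wanted_cpu,
               uint32_t *offset, uint32_t *size)
{
    assert(buf != NULL && offset != NULL && size != NULL);

    if (len < FAT_HEADER_SIZE || read_be32(buf) != FAT_MAGIC)
        return -1;

    uint32_t nfat = read_be32(buf + 4);
    if (nfat > (len - FAT_HEADER_SIZE) / FAT_ARCH_SIZE)
        return -1;                      /* header claims more entries than fit */

    for (uint32_t i = 0; i < nfat; i++) {
        const unsigned char *arch = buf + FAT_HEADER_SIZE + (size_t)i * FAT_ARCH_SIZE;
        if (read_be32(arch) == wanted_cpu) {
            *offset = read_be32(arch + 8);   /* fat_arch.offset */
            *size   = read_be32(arch + 12);  /* fat_arch.size */
            return 0;
        }
    }
    return -1;
}
```

A caller would pass CPU_TYPE_I386 or CPU_TYPE_POWERPC as wanted_cpu and then hand the bytes at the returned offset to whatever code understands a single Mach-O image. Reading the fields byte by byte, rather than casting the buffer to the structures in Listing One, keeps the sketch correct on both processor families.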

Universal binaries thus form the second support pillar, the distribution mechanism. Typical users are unaware of the dual sets of code packaged in the file, and they can copy the application by simply dragging and dropping it. When they launch the application, the Mac OS X runtime performs a sleight-of-hand that loads the appropriate application code from the universal binary file, then executes it. No matter what Mac it's installed on, a universal binary application thus executes at native speeds on either PowerPC- or x86-based systems.

Mac old-timers will recognize the universal binary scheme as an echo of the "fat" binary file format that handled software distribution during the Mac platform's transition to the PowerPC. Fat binaries consisted of two versions of the application—68K and PowerPC—and the Mac OS determined which version to load and run. The fat binary distribution scheme worked very well, and based on its success, I have high expectations that the universal binary scheme will work, too.

The third pillar, the development tools necessary to generate x86 code for the x86 Mac platform, is represented by Apple's Xcode 2.1 development tool set. Xcode consists of an IDE with GUI-based tools such as a source-code editor and debugger. It uses GCC 4 compilers to generate x86 machine code from C, C++, Objective-C, and Objective-C++ source code. Source-level debugging is possible through the use of the standard GDB tool. For x86 development, you'll need to install Xcode 2.1, along with the 10.4 Universal SDK. This SDK contains the APIs and header files that enable you to generate PowerPC code, x86 code, and universal binaries. Generating a universal binary becomes just a matter of selecting both PPC and Intel processors for code generation in Xcode's controls, and building the program. Metrowerks, whose CodeWarrior toolset helped Apple get through the 68K/PowerPC transition, will not be participating in this transition. The company has sold its x86 compiler and linker to a third party, and thus, the CodeWarrior toolset can't generate universal binaries.

For a limited time, Apple offered a Developer Transition Kit that contained the Xcode tools and universal SDK. For actual testing on the target platform, the kit also had a preliminary x86 hardware platform with a 3.6-GHz Pentium 4 processor, running a preview release of Mac OS X 10.4.1 for Intel.

Code Casualties

Apple has laid a solid foundation for making the migration possible. However, any programmer who's done a code port regards this plan with a healthy skepticism, because the infrastructure is still preliminary in some areas. More important, not all applications will be easy to port, and some applications will be left behind, due to design issues and costs. Let's see if we can't draw up a triage list of which applications are most likely to survive the transition.

First and foremost, any application ported to the x86 Mac platform must be a Mac OS X application. Fortunately, Mac OS X provides a wealth of different APIs for writing and migrating applications—there's Carbon, Cocoa, Java, and BSD UNIX. Table 1 provides a brief summary of the APIs that Mac OS X offers.

To start the triage list, it should be obvious that if you're writing a kernel extension, driver, or other low-level system service that requires intimate knowledge of the kernel plumbing or processor architecture, you've got a code rewrite ahead of you, no matter what API you use.

Mac OS 8/9 applications won't survive the transition unless they're ported to the Carbon API. Furthermore, existing Carbon apps that use the PowerPC-based Preferred Executable Format (PEF) will have to be rebuilt with Xcode 2.1 for conversion to the Mach-O executable format. The reason is that Mac OS X uses the dyld runtime, which is the native execution environment for both PowerPC and Intel Mac platforms. The dyld runtime uses the Mach-O format for its executable files, and as we've already learned, universal binaries rely on the Mach-O format to package PowerPC and x86 binary images.

Applications that use common system services should port easily. However, caveats abound. For example, how the two processors store data in memory can cause all sorts of problems even for simple applications.

Architecture's Impact

High-level application frameworks hide the gritty hardware details from developers to improve code portability and stability. When the platform's processor changes, fundamental differences in hardware behavior can ripple up through these frameworks and hurt you. Let's take a look at two of these differences and see how they affect porting a PowerPC application to the x86 platform.

One such problem is known as the "Endian issue" and occurs because of how the PowerPC and Intel processors arrange the byte order of data in memory. The PowerPC processor is big-endian in that it stores a data value's most significant byte (MSB) in the lowest byte address, while the Intel processor is little-endian because it places the value's least significant byte (LSB) in the lowest byte address. Normally, the Endian issue doesn't rear its ugly head unless you access multibyte variables through overlapping unions, use a constant as an offset into a data structure, or use bitfields larger than a byte. In these situations, the position of bytes within the variable matters. Although the program's code executes flawlessly, the retrieved data is garbled due to where the processor placed the bytes in memory, and spurious results occur. To fix this problem, reference data in structures using field names and not offsets, and be prepared to swap bytes if necessary.
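The overlapping-union case is easy to demonstrate. The following C sketch is an illustration with hypothetical names (overlay, lowest_address_byte, and so on), not code from any Apple sample. It peeks at the byte stored at the lowest address of a 32-bit value, which differs between the two processors, and contrasts that with extracting the MSB arithmetically, which gives the same answer everywhere.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit value overlaid with its four individual bytes. */
union overlay {
    uint32_t word;
    uint8_t  bytes[4];
};

/* Return the byte stored at the lowest address of a 32-bit value.
 * The result depends on the processor's byte order. */
uint8_t lowest_address_byte(uint32_t value)
{
    union overlay u;
    u.word = value;
    return u.bytes[0];   /* MSB on PowerPC, LSB on x86 */
}

/* Return 1 on a little-endian host (x86), 0 on a big-endian host (PowerPC). */
int host_is_little_endian(void)
{
    uint8_t b = lowest_address_byte(0x11223344u);
    assert(b == 0x44 || b == 0x11);   /* rule out exotic mixed byte orders */
    return b == 0x44;
}

/* Portable alternative: pick the MSB by shifting, not by memory position. */
uint8_t most_significant_byte(uint32_t value)
{
    return (uint8_t)(value >> 24);
}
```

Code that was written like lowest_address_byte and quietly assumed it was getting the MSB is exactly the kind that compiles cleanly on the x86 Mac and then returns the wrong byte. The shift-based version has no such dependency.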

The Endian issue manifests itself another way when an application accesses data piecewise as bytes and then reassembles it into larger data variables or structures. This occurs when an application reads data from a file or through a network. The bad news is that any application that performs disk I/O on binary data (such as a 16-bit audio stream), or uses network communications (such as an e-mail client or web browser), can be plagued by this problem. The good news is that each Mac OS X API provides a slew of methods that can perform the byte swapping for you. Consult the Universal Binary Programming Guidelines (http://developer.apple.com/documentation/MacOSX/Conceptual/universal_binary/universal_binary.pdf) from Apple for details.
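As a rough illustration of the file I/O case, the C sketch below decodes 16-bit audio samples that were written in big-endian order. The function name decode_be16_samples is hypothetical, and the sketch deliberately uses plain shifts so that it compiles anywhere. In a real Mac OS X application you would more likely call one of the system's byte-order routines, such as Core Foundation's CFSwapInt16BigToHost, which does the equivalent conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode count big-endian 16-bit samples from raw file bytes into
 * host-order values. Because each sample is assembled from its two
 * bytes by shifting, the result is the same on PowerPC and x86.
 */
void decode_be16_samples(const unsigned char *raw, int16_t *out, size_t count)
{
    assert(raw != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        uint16_t v = (uint16_t)((raw[2 * i] << 8) | raw[2 * i + 1]);
        /* Reinterpret as signed; assumes the usual two's-complement
         * conversion that Mac compilers provide. */
        out[i] = (int16_t)v;
    }
}
```

The fragile version of this routine reads the file straight into an int16_t array with a single read call. That works on the processor whose byte order matches the file and silently produces noise on the other one.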

The Endian issue sets another potential trap if your Mac application uses custom resources. Mac OS X understands the structure of its own resources and will automatically perform any byte-swapping if required. However, for a custom resource whose contents are unknown to the OS, you will have to write a byte-swapping routine for it. Those applications that use CodeWarrior's PowerPlant framework require a byte-swapping routine to swap the custom PPob resources that this framework uses. Appendix E in the Universal Binary Programming Guidelines document has some example code that shows how to swap PPob resources, and this code serves as a guideline on how to write other byte-swapping routines.
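For a sense of what such a routine looks like, here is a minimal C sketch that swaps a made-up custom resource in place. The window_prefs layout and the function names are hypothetical and stand in for whatever structure your application defines; this is not the PPob code from Apple's appendix, and it does not show how the routine is hooked into the system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical custom resource layout, invented for this example. */
struct window_prefs {
    uint16_t version;
    uint16_t flags;
    uint32_t width;
    uint32_t height;
};

static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v >> 8) & 0x0000FF00u)  |  (v >> 24);
}

/*
 * Swap every multibyte field of the resource in place. Each field is
 * swapped at its own width; swapping the structure as one block of
 * bytes would scramble the field boundaries. Calling the routine
 * twice restores the original contents.
 */
void flip_window_prefs(struct window_prefs *p)
{
    assert(p != NULL);
    p->version = swap16(p->version);
    p->flags   = swap16(p->flags);
    p->width   = swap32(p->width);
    p->height  = swap32(p->height);
}
```

The key design point is that the swap follows the field layout, so the routine has to be kept in step with the structure definition whenever a field is added or resized.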

It's In the Vector

Another major processor architectural issue affects applications that make heavy use of the PowerPC's AltiVec instructions for scientific computing and video editing, some of the Mac's bread-and-butter applications. (AltiVec is a floating-point and integer SIMD instruction set referred to as "AltiVec" by Motorola, "VMX" by IBM, and "Velocity Engine" by Apple; see http://developer.apple.com/hardware/ve/.) AltiVec consists of more than 160 special-purpose instructions that operate on vectors held in 32 128-bit data registers. A 128-bit vector may be composed of multiple elements, such as four 32-bit integers, eight 16-bit integers, or four 32-bit floats. The AltiVec instructions can perform a variety of Single Instruction Multiple Data (SIMD) arithmetic operations (multiply, multiply-add, and others) on these elements in parallel, yielding high-throughput data processing.

Applications relying on AltiVec instructions must be rewritten to use Intel's SIMD instructions, either its Multimedia extensions (MMX) instructions or its Streaming SIMD Extensions (SSE) instructions. There are several flavors of SSE instructions (SSE, SSE2, and SSE3) and they work on eight 128-bit data registers.

The ideal solution for this problem is to use Apple's Accelerate Framework. It implements vector operations independently of the underlying hardware. An application that uses the Accelerate Framework can operate without modification on both Mac platforms. This framework provides a ready-made set of optimized algorithms for image processing, along with DSP algorithms for video processing, medical imaging, and other data-manipulation functions.

If you must port your AltiVec code to the x86 SSE instructions, on the plus side, Intel provides a high-level C interface that simplifies the use of these instructions. Another major plus is that you've already "vectorized" your high-level code for use with AltiVec, and these modifications apply to using SSE instructions as well. That is, you should have unrolled code loops and modified the data structures they reference to take advantage of the SIMD instruction's parallel processing capabilities.
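To give a flavor of that C interface, the sketch below performs a multiply-add over arrays of floats, four elements per step, using the SSE intrinsics declared in <xmmintrin.h>. The function name multiply_add is hypothetical, the unaligned load and store intrinsics are chosen for simplicity rather than speed, and a scalar loop handles any leftover elements. The intrinsics are guarded by the compiler's __SSE__ macro so that the same file still builds, in scalar form, for a processor without SSE.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_mul_ps, ... */
#endif

/* Compute out[i] = a[i] * b[i] + c[i] for n floats. */
void multiply_add(const float *a, const float *b, const float *c,
                  float *out, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL && out != NULL);

    size_t i = 0;
#if defined(__SSE__)
    /* Vector loop: one 128-bit register holds four 32-bit floats. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }
#endif
    /* Scalar loop: the tail, or everything when SSE is unavailable. */
    for (; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}
```

Note that SSE expresses the operation as a separate multiply and add, where AltiVec offers a single fused multiply-add instruction. Small mismatches of this kind are typical of what a port has to absorb even when the loop structure carries over unchanged.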

The big negative to porting to SSE is that the rest of your code will need to be heavily revised due to the differences between the AltiVec and SSE instructions. For example, there's no direct correspondence in behavior between the AltiVec and x86 permute instructions. The magnitude of the shift performed by the AltiVec permute operation can be changed at runtime, while the x86 permute requires the magnitude be set at compile time. This makes it difficult for the x86 permute to clean up misaligned data accesses, especially for use with the SSE instructions themselves. In general, AltiVec instructions that execute on the vector complex integer unit (such as the multiply-accumulate operations) have no direct counterparts in the SSE instruction set, and these portions of the vector code will need the most work.

Returning to the triage list, even applications written in Cocoa and Carbon aren't immune to certain processor issues. Applications that do any file or network I/O will have to be examined and modified, due to the Endian issue. Even mundane applications that make use of special data structures will need to be checked carefully. Those applications that make use of AltiVec will have to be completely rewritten, either to the Accelerate Framework or to SSE3 instructions. Whether they survive the transition depends on how much it will cost to correct these issues.

A Real-World Example

How well will this transition go? Some early developer reports pegged the initial porting process at taking anywhere from 15 minutes to 24 hours, depending upon how well Apple's guidelines were followed when the application was written. Those developers whose applications were written in Cocoa usually experienced the least difficulty, which shouldn't come as a surprise because Cocoa was engineered from the ground up as an application framework for NeXTSTEP, which became Mac OS X.

Bare Bones Software's port of BBEdit, its industrial-strength programming editor, offers an interesting glimpse of the process (http://www.barebones.com/products/bbedit/). Portions of BBEdit were written in C++ and use the Carbon API, while other portions were written in Objective-C and use the Cocoa API. It only took 24 hours to get BBEdit running on the Mac x86 platform. It helped that the files BBEdit works with, ASCII text files that consist of byte data, were Endian neutral.

However, BBEdit's developers emphasize that although they got the program running quickly, getting every feature to work reliably in the new environment took several more weeks of work, much of it testing. Still, considering that the application was executing on a completely different platform in a short amount of time and without requiring a total code rewrite, this augurs well for many Mac OS X applications making the transition. In the end, time and developers will show us how well Apple managed the transition to the Intel x86 processor.

DDJ



Listing One

#define FAT_MAGIC   0xcafebabe
#define FAT_CIGAM   0xbebafeca  /* NXSwapLong(FAT_MAGIC) */

struct fat_header {
    uint32_t    magic;      /* FAT_MAGIC */
    uint32_t    nfat_arch;  /* number of structs that follow */
};
struct fat_arch {
    cpu_type_t  cputype;    /* cpu specifier (int) */
    cpu_subtype_t   cpusubtype; /* machine specifier (int) */
    uint32_t    offset;     /* file offset to this object file */
    uint32_t    size;       /* size of this object file */
    uint32_t    align;      /* alignment as a power of 2 */
};