Porting to 64-bit Platforms


Analysis

There are two goals to reach in order to make your program architecture-independent. One of them is internal consistency, that is, making sure that all the types used in the program remain consistent with one another regardless of whether the code is compiled on a 32-bit or a 64-bit machine. Let's look at the sizes of the basic types on both architectures (see Table 1).

Table 1: Base integer types in 32-bit and 64-bit systems.

The notable difference is the change in size of pointer types and of the long type. Because of these size changes, code built on implicit size assumptions might not migrate correctly. Most problems arise when values move between integer types of different widths, or between integers and pointers.
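To see these sizes on a particular compiler, a small probe along the following lines can help. It is only a sketch: the function names are made up for illustration, the labels ILP32, LP64, and LLP64 are the conventional names for the common data models rather than terms used in this article, and the code assumes 8-bit bytes.

```c
#include <stdio.h>

/* Returns the conventional name of the data model this program was
   compiled for. Assumes 8-bit bytes, so sizeof counts octets. */
const char *data_model_name(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";   /* typical 32-bit system */
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";    /* typical 64-bit Unix-like system */
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 8)
        return "LLP64";   /* 64-bit Windows */
    return "other";
}

/* Prints the size in bytes of each basic type, then the data model. */
void print_type_sizes(void)
{
    printf("char    %zu\n", sizeof(char));
    printf("short   %zu\n", sizeof(short));
    printf("int     %zu\n", sizeof(int));
    printf("long    %zu\n", sizeof(long));
    printf("void *  %zu\n", sizeof(void *));
    printf("size_t  %zu\n", sizeof(size_t));
    printf("data model: %s\n", data_model_name());
}
```

Calling print_type_sizes() from a build for each target shows at a glance which types changed width between the two.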

Discrepancies of Integer Sizes

On a 32-bit system, int and long are both 32 bits wide, so programmers may have used the two types interchangeably. Furthermore, C converts silently between integer types, so a long argument can be passed to a function that takes an int parameter without an error. While this is not good coding practice, such code might work flawlessly for years. On a 64-bit system where long is 64 bits wide (the LP64 model used by most Unix-like systems), int and long have different sizes, and that code may no longer work correctly after migration.

Furthermore, assigning long values to int variables may cause truncation. The following code works on 32-bit systems, but on 64-bit systems the assignment can silently lose the upper 32 bits.


 int sum;
 long val1, val2;
 sum = val1 + val2;  // possible truncation in 64-bit system
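One way to guard such an assignment is to check the range before narrowing. The sketch below is an illustration, not part of the original code: long_fits_in_int and add_into_int are hypothetical helper names, and the addition itself is assumed not to overflow long.

```c
#include <limits.h>

/* Returns 1 if value can be stored in an int without loss, 0 otherwise. */
int long_fits_in_int(long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}

/* Adds two longs and stores the result in *sum only if it fits in an int.
   Returns 1 on success, 0 if the result would be truncated.
   Assumes val1 + val2 itself does not overflow long. */
int add_into_int(long val1, long val2, int *sum)
{
    long wide = val1 + val2;
    if (!long_fits_in_int(wide))
        return 0;
    *sum = (int)wide;   /* explicit cast documents the narrowing */
    return 1;
}
```

On a system where int and long have the same width the check always succeeds, so the same source serves both architectures.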

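The following code will probably trigger warnings in some compilers. It works in 32-bit mode, but in 64-bit mode the long arguments no longer match the size of the int parameters: with the prototype in scope they are silently converted to int and may be truncated, and without a prototype the sizes of the values passed to the function are simply wrong.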


 int add(int parm1, int parm2);
 // .....
 long val1, val2;
 long sum = add(val1,val2);    // mismatch of parm size on 64-bit machine
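If the callers really do work with long values, the simplest repair is usually to make the interface say so. A minimal sketch, assuming the function can be changed (add_long is a hypothetical name, not something from the original code):

```c
/* Hypothetical replacement for add(): parameters and return type are long,
   matching what the callers actually pass, so no narrowing takes place. */
long add_long(long parm1, long parm2)
{
    return parm1 + parm2;
}
```

With this signature the call long sum = add_long(val1, val2); involves no conversion on either architecture.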

Some other subtle things may happen that would affect the integrity of the code. For example, the compile-time operator sizeof yields a result of type size_t, an unsigned type that on typical 64-bit platforms is as wide as a pointer. Assigning this value to an int may lead to possible (though unlikely) truncation.


 int mySize = sizeof(MyStructure);  // possible size truncation
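Keeping the value in a size_t avoids the narrowing altogether. The following sketch assumes a C99 compiler (for the %zu format); struct MyStructure here is a made-up stand-in, and size_fits_in_int is a hypothetical helper for the cases where an int is unavoidable.

```c
#include <stddef.h>
#include <stdint.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical structure standing in for MyStructure. */
struct MyStructure {
    char name[32];
    long value;
};

/* Returns 1 if n can be stored in an int without truncation. */
int size_fits_in_int(size_t n)
{
    return n <= (size_t)INT_MAX;
}

/* Stores the size in a size_t and prints it with the matching format. */
void print_structure_size(void)
{
    size_t mySize = sizeof(struct MyStructure);   /* no narrowing */
    printf("size = %zu\n", mySize);
}
```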

Another, even more subtle error is sign extension of bit fields. For example:


 struct Mask {
   unsigned int flags:20, width:12;
 };
 
 struct Mask myMask;
 myMask.flags = 0x80000;
 unsigned long bigFlags = (myMask.flags << 12);

On a 32-bit machine, unsigned value 0x80000 would turn into 0x80000000 after a shift, and after an assignment to unsigned long, it would produce the same value 0x80000000.

On a 64-bit machine, you would therefore expect the value 0x80000000 to be assigned to the unsigned long and become 0x0000000080000000.

However, the resulting value will be 0xFFFFFFFF80000000. It doesn't seem to make much sense, but that's what the sequence of statements above will actually produce.

Since there is no inherent 20-bit integer type, myMask.flags is promoted automatically to a 32-bit type. Because every 20-bit unsigned value fits in an int, the integer promotion rules of ISO C promote flags to int, not to unsigned int. The shift therefore happens in signed arithmetic and sets the sign bit, and when the negative result is later assigned to a 64-bit unsigned long, the sign bit is propagated into the upper 32 bits.

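While we cannot change the default behavior of the C compiler, we can be aware of it and prevent this type of error with an explicit cast. Casting the operand before the shift keeps the whole operation unsigned, which also avoids shifting a bit into the sign position of an int (formally undefined behavior in ISO C):


  unsigned long bigFlags = ((unsigned int)myMask.flags << 12);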
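The cast can be wrapped in a small function to make the intent testable. This is a sketch under the article's assumption that unsigned int is 32 bits wide; shifted_flags is a hypothetical name, and the cast is applied to the bit field before the shift so that the arithmetic is unsigned throughout.

```c
struct Mask {
    unsigned int flags:20, width:12;
};

/* Stores flags20 in the 20-bit field and returns it shifted left by 12.
   The cast makes the shift unsigned, so no sign bit exists to propagate
   when the result widens to unsigned long.
   Assumes unsigned int is 32 bits wide. */
unsigned long shifted_flags(unsigned int flags20)
{
    struct Mask myMask;
    myMask.flags = flags20;
    myMask.width = 0;
    return (unsigned int)myMask.flags << 12;
}
```

With this version, shifted_flags(0x80000) yields 0x80000000 whether unsigned long is 32 or 64 bits wide.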

Casting Between Pointers and Integers

While casting between pointers and integers is not considered a good programming practice (except for traditional casting of number 0 to a NULL pointer), it is still widely used. Casting between pointers and integers is error-prone and could create portability problems when moving from 32-bit to 64-bit architectures.

There is a way of converting integers to pointers and back without casting an integer directly to a pointer. The following macros cast only the constant 0 to a pointer and do the rest with pointer arithmetic.


  #define INT2PTR(type,n) ((type *)(((char*)0)+(n)))
  #define PTR2INT(p) (((char*)(p))-((char*)0))   // argument parenthesized so expressions are safe

  char* p = INT2PTR(char,160);     // creating a char pointer with address 160
  size_t displ = PTR2INT(&obj1);   // getting numeric value of the address

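These macros use pointer arithmetic and don't make any assumptions about the relative sizes of pointers and integer types, so they behave the same on common 32-bit and 64-bit platforms with a flat address space. Strictly speaking, ISO C does not define arithmetic on a null pointer, so the technique relies on behavior that compilers for such platforms provide in practice. Where stdint.h is available, the types intptr_t and uintptr_t are the standard-defined way to hold a pointer value in an integer. Using macros of this kind eliminates most pointer-related issues, reducing the remaining portability problems to the integer issues described in the previous section.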

If it is impractical to get rid of direct casts between integers and pointers, many of the integer issues described above also apply to pointers.
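Where direct casts have to stay, routing them through uintptr_t from stdint.h keeps the integer wide enough for the pointer on either architecture. The sketch below is an alternative to the macros above, not the article's own technique; the function names are hypothetical, and uintptr_t is formally an optional type, although mainstream C99 compilers provide it.

```c
#include <stdint.h>

/* Converts a pointer to an unsigned integer wide enough to hold it. */
uintptr_t ptr_to_uint(const void *p)
{
    return (uintptr_t)p;
}

/* Converts a value produced by ptr_to_uint back to a pointer. */
void *uint_to_ptr(uintptr_t n)
{
    return (void *)n;
}

/* Returns 1 if p survives the trip through uintptr_t unchanged. */
int pointer_round_trips(const void *p)
{
    return uint_to_ptr(ptr_to_uint(p)) == p;
}
```

Because the integer type follows the pointer width, the same source works on 32-bit and 64-bit builds, which a cast through int or long cannot promise.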

Structures for Inter-program Communication

While memory sizes and layouts may be different between different architectures, the same code may work properly on each of the architectures. Pointers would be 4 bytes long on 32-bit machines and 8 bytes long on 64-bit machines, but as long as those sizes are relevant only to the running processes, there is no problem.

However, sometimes programs need to communicate with the outside world using binary data. Data written by a 32-bit program might be read by a 64-bit program. In such cases, the input/output binary structures should contain only types whose size does not change in the 32-bit to 64-bit migration.

There are two kinds of integer types: "conceptual" types and "physical" types. The first group, conceptual, is mostly defined in sys/types.h. It includes types like uint, u_char, u_long, size_t, uid_t, and many others. The purpose of these types is to express the logic and meaning of a value without committing to physical characteristics such as size or signedness. While these types are quite useful within one system, they should not be used when binary data is passed to a different system.

For example, if a structure written to a file has binary fields, those fields must have a fixed size regardless of which system is going to read them. Structures shared between systems MUST explicitly specify the size of each field. For this purpose there is the header inttypes.h (which in C99 includes stdint.h), providing types with explicit sizes. These types might be implemented using different base types on each system, but they guarantee the same field sizes regardless of the system. They include uint8_t, uint16_t, uint32_t, uint64_t, etc. All output structures should use only these types to make the code immune to migration.
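As an illustration, a shared record might be declared as follows. The structure and its field names are hypothetical; the fields are ordered from a layout point of view so that no padding is needed on common ABIs, and the compile-time check assumes a C11 compiler (_Static_assert).

```c
#include <stddef.h>
#include <inttypes.h>

/* Hypothetical on-disk record: every field has an explicit width. */
struct FileRecord {
    uint32_t id;       /* bytes 0..3  */
    uint16_t kind;     /* bytes 4..5  */
    uint16_t flags;    /* bytes 6..7  */
    uint64_t offset;   /* bytes 8..15 */
};

/* Fails the build if a compiler lays the record out differently. */
_Static_assert(sizeof(struct FileRecord) == 16,
               "FileRecord must be exactly 16 bytes");

/* Returns the number of bytes one record occupies in the file. */
size_t file_record_size(void)
{
    return sizeof(struct FileRecord);
}
```

Had offset been declared as long or size_t, the record would be 12 bytes in a typical 32-bit build and 16 in a 64-bit one, and the two programs could not read each other's files.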

Rules for Padding

While not strictly a 32/64-bit issue, a related problem involves structures and padding. Structures are normally padded so that each field starts on a suitable boundary. For example, if a one-byte character field is followed by a four-byte word, the compiler adds three slack bytes so that the word has a displacement that is a multiple of four. Alignment is determined by the platform ABI and can normally be adjusted through compiler options or pragmas. Therefore, structures that are shared between 32-bit and 64-bit machines must be built with matching alignment settings.
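The padding a compiler actually inserts can be measured with offsetof. The sketch below uses a hypothetical structure matching the example in the text; the amount of padding it reports depends on the platform's alignment rules and on any compiler options in effect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure: a one-byte field followed by a four-byte word. */
struct Padded {
    char     tag;
    uint32_t word;
};

/* Number of slack bytes the compiler placed between tag and word. */
size_t padding_before_word(void)
{
    return offsetof(struct Padded, word) - sizeof(char);
}

/* Number of slack bytes anywhere in the structure. */
size_t total_padding(void)
{
    return sizeof(struct Padded) - (sizeof(char) + sizeof(uint32_t));
}
```

Comparing these numbers across the 32-bit and 64-bit builds is a quick way to confirm that both sides agree on the layout before exchanging binary data.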

