Compile and Resolve Diagnosed Issues
Compiling the application in big-endian mode still results in a mixed-endian application because the application interacts with system libraries whose byte order is typically dependent on the target architecture, which in this case is little-endian. With the BEC technology it is possible to execute mixed-endian code. One of the requirements when employing the BEC technology is for every type used in the program to have its byte order defined. When the byte order of a type associated with a variable, a parameter, or a structure is undeclared, the compiler uses the default byte order as specified earlier. C and C++ are not type-safe languages and therefore present some challenges in the compiler implementation.
For instance, when a function's prototype is not declared, the function executes correctly as long as the byte order of the types used in the function definition and the function call are the same. If the byte orders do not match, the function may not execute properly because its argument may not be in the expected byte order. For these cases, the compiler is enhanced to detect the byte order inconsistency and issue a diagnostic. Example 3 shows a case where the function foo calls an undeclared function zee. Assume zee is defined to expect one argument whose type is in little-endian byte order. In this case, zee may produce an incorrect result for foo when foo is compiled to expect big-endian types. In general, the use of function prototypes is essential for mixed-endian code. Warning #266 from the compiler encourages this programming practice.
int foo(int x)
{
    zee(x);
    return 0;
}
>$ icc -big-endian -c test.c
test.c(3): warning #266: function "zee" declared implicitly
zee(x);
Example 3: Undeclared function (Source: Intel Corporation).
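One way to resolve warning #266 in Example 3 is to make a prototype for zee visible before the call. The following is a minimal sketch in plain C, not Intel's code: the definition of zee and the variable zee_last_arg are hypothetical and exist only so that the fragment links and its behavior can be checked. In a mixed-endian build, the prototype could additionally be wrapped in the #pragma byte_order (push, ...) / (pop) pair used in Example 4 so that the byte order of the parameter is stated explicitly.

```c
/* Prototype visible before the call: the compiler now knows the
 * parameter type of zee and can check (and adjust) the argument. */
int zee(int x);

int foo(int x)
{
    zee(x);
    return 0;
}

/* Hypothetical definition of zee, for illustration only.
 * It records the argument it received. */
int zee_last_arg;

int zee(int x)
{
    zee_last_arg = x;
    return x;
}
```

Implicit function declarations were removed from the C language in C99, so supplying prototypes is good practice independent of byte order concerns.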
The byte order of a pointer has two aspects, the byte order of the pointer type itself and the byte order of the pointed-to data type. For example, Example 4 shows a case where the compiler generates a diagnostic when the byte order of the pointed-to data type is different in an assignment statement.
#pragma byte_order (push, bigendian)
int *z;
#pragma byte_order (pop)
int foo(int *x) {
    z = x;
    return 0;
}
>$ icc -c -little-endian test1.c
test1.c(6): warning #1696: implicit pointer conversion changes byte order of
the pointed-to types from "int" to "bigendian int"
z = x;
Example 4: Byte order difference of pointed-to types (Source: Intel Corporation).
Specifically, the pointer z points to a big-endian integer, as declared by the explicit pragma. The pointer x points to a little-endian integer whose byte order is set by the command line. Therefore, dereferencing the assigned pointer (z) may read the value in the opposite byte order from what was intended. The compiler emits warning #1696 in these cases, and it must be heeded in order to produce correctly executing code. To address these warnings, modify the code to convert the pointed-to data to a type with the same byte order before assigning.
For structures and bit fields the compiler also implements byte order-specific diagnostics. Example 5 lists a structure in which the big-endian and little-endian bit fields are allocated differently in their containers. Big-endian bit fields are allocated from high to low bit, while little-endian bit fields are allocated from low to high bit. As a result, big-endian and little-endian bit fields allocated to the same container could potentially overwrite each other. Therefore, the compiler issues an error when it detects a structure containing both big-endian and little-endian bit fields. To address these issues, ensure that all bit fields contained in the structure have the same byte order.
typedef __attribute__((bigendian)) int be_int;
typedef __attribute__((littleendian)) int le_int;
struct foo {
    be_int x:16;
    le_int y:16;
};
>$ icc -c -little-endian test2.c
test2.c(6): error #1700: adjacent bit fields have different byte order
le_int y:16;
Example 5: Byte order difference of bit fields (Source: Intel Corporation).
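The overlap behind error #1700 can be illustrated with plain bit masks. The sketch below is not how the compiler is implemented; it assumes a 32-bit container and assumes that each field is placed at the running bit offset using the allocation direction of its own byte order. The function names mask_high_to_low, mask_low_to_high, and low_ones are hypothetical.

```c
#include <stdint.h>

/* Returns `width` one-bits at the low end of a 32-bit word.
 * Precondition: 1 <= width <= 32. */
static uint32_t low_ones(unsigned width)
{
    return (width >= 32u) ? 0xFFFFFFFFu : (((uint32_t)1 << width) - 1u);
}

/* Big-endian rule: fields are allocated from the high bit downward,
 * so a field starts just below the `used` highest bits.
 * Precondition: used + width <= 32. */
uint32_t mask_high_to_low(unsigned used, unsigned width)
{
    return low_ones(width) << (32u - used - width);
}

/* Little-endian rule: fields are allocated from the low bit upward,
 * so a field starts just above the `used` lowest bits.
 * Precondition: used + width <= 32. */
uint32_t mask_low_to_high(unsigned used, unsigned width)
{
    return low_ones(width) << used;
}
```

Under these assumptions, field x of Example 5 (first field, 16 bits, big-endian) gets mask_high_to_low(0, 16), which is 0xFFFF0000, and field y (second field, 16 bits, little-endian) gets mask_low_to_high(16, 16), which is also 0xFFFF0000, so the two fields would occupy the same bits. When both fields use the same rule, for example mask_high_to_low(0, 16) and mask_high_to_low(16, 16), the masks do not overlap.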
The BEC technology allows a byte order to be attributed to the void type. This extension alleviates potential issues when casting through the void type where the original and final types are of opposite byte order. Example 6 shows a potential error case caused by the void type: a void pointer defined in a big-endian context is passed to a function that expects a pointer to a little-endian variable, so there is a risk that the void pointer actually points to a big-endian variable. To alleviate this problem, the compiler is enhanced to issue a diagnostic in this case. Because this check is off by default, use the option -diag-enable 2324 to turn it on. To address the issue, ensure that the pointed-to types of each function argument and its corresponding parameter have the same byte order.
#pragma byte_order (push, littleendian)
typedef int myleint;
void func(myleint *arg);
#pragma byte_order (pop)
#pragma byte_order (push, bigendian)
void *void_var1;
#pragma byte_order(pop)
int main() {
    func(void_var1);
    return 0;
}
>$ icc -c -little-endian -diag-enable 2324 test3.c
test3.c(10): warning: implicit pointer conversion (involving void) may change
byte order of the pointed-to types from "bigendian void" to "myleint"
func(void_var1);
Example 6: Byte order difference of void type (Source: Intel Corporation).
Finally, conversion between pointers to values of different sizes (for example, from int * to char *), while safe in code that employs little-endian types, may result in an incorrect pointed-to value in code that employs big-endian types and executes on a little-endian processor. Standard compiler diagnostics are extended to account for pointer casts between types of different sizes. One method of addressing this class of issue is to explicitly convert the source and destination pointed-to values to little-endian before the cast.
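The effect of such a narrowing pointer cast can be reproduced in portable C by modeling the two storage orders with explicit byte arrays. The sketch below is illustrative and does not use BEC extensions; the helper names store_be32, store_le32, load_be32, and first_byte are hypothetical.

```c
#include <stdint.h>

/* Write v into out[] with the most significant byte first. */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Write v into out[] with the least significant byte first. */
void store_le32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}

/* Read a value that was stored most significant byte first. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* What a cast to char * followed by a dereference would read:
 * the byte at the lowest address. */
unsigned char first_byte(const void *p)
{
    return *(const unsigned char *)p;
}
```

For the value 1, first_byte returns 0 on the image written by store_be32 and 1 on the image written by store_le32. Converting with store_le32(le, load_be32(be)) before the narrowing access therefore restores the result that little-endian code expects.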
Employ Symbol Consistency Checking
The second step in porting is to employ the symbol consistency checking utility and resolve identified incompatibilities between different compilation units. To perform symbol consistency checking, compile the sources with the -symcheck option and feed the resulting executable to the BEC symbol consistency checking tool.
Consider the cases where symbol incompatibility exists and why they cause problems for applications. As previously described, the compiler makes automatic code adjustments based on the byte order information available through type declarations and function prototypes. While the compiler checks for correctness within a compilation unit, it knows nothing about interactions across different units. The BEC symbol consistency checking tool verifies that global symbols with the same name have bi-endian compatible types across all compilation units compiled with -symcheck and reports incompatibilities if they exist. Two types are considered bi-endian compatible when they are compatible according to the C language specification and either have the same byte order or are byte order agnostic. Thus, if a global symbol has type A in one compilation unit and type B in a second, the two types must be bi-endian compatible to guarantee proper functioning of the application.
An example of bi-endian incompatible types is the case of two types with different byte order. Assume you have included header file "i.h" with a declaration of a global variable g in two compilation units compiled with different default byte order. As a result, the symbol g is attributed as little-endian in one compilation unit and big-endian in another. The BEC symbol consistency checking tool reports the error listed in Example 7.
To address the identified issue, the declarations would need to be modified to have the same byte order.
bepostld: error #32600: symbol 'g' type differs between modules
First declared at t2.c(2)
Later redeclared at i.h
included from t1.c(1)
Error reason:
incompatible endianess:
__attribute__((littleendian))
declared at t2.c(2)
is not compatible with __attribute__((bigendian))
declared at i.h
included from t1.c(1)
Example 7: Symbol consistency error example (Source: Intel Corporation).
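One way to keep g consistent across compilation units is to state its byte order in the shared header instead of relying on the default byte order of each unit. The sketch below reuses the attribute syntax shown elsewhere in this article; the macro BEC_BIG_ENDIAN and the __INTEL_COMPILER guard are illustrative assumptions that let the fragment compile with compilers lacking the attribute, and big-endian is chosen arbitrarily.

```c
/* i.h (sketch): pin the byte order of g in the declaration itself. */
#ifndef I_H
#define I_H

/* Hypothetical helper macro: expands to the byte order attribute only
 * when an Intel compiler is detected, and to nothing otherwise. */
#if defined(__INTEL_COMPILER)
#define BEC_BIG_ENDIAN __attribute__((bigendian))
#else
#define BEC_BIG_ENDIAN
#endif

/* Every unit that includes this header sees the same byte order for g,
 * regardless of its -big-endian or -little-endian default. */
extern int BEC_BIG_ENDIAN g;

#endif /* I_H */

/* t2.c (sketch): the definition repeats the same byte order. */
int BEC_BIG_ENDIAN g = 1;
```

With the byte order fixed in one place, the declaration in t2.c and the declaration reached through i.h no longer depend on how each file is compiled.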
The utility diagnoses other problems that may affect application correctness including mismatching type size, number of function arguments, and field offset. Typically, to help ensure correct execution of the application, one needs to address all the errors reported by the tool. These diagnostics serve a useful purpose in helping the developer correct issues before they result in difficult-to-debug execution time problems.
Manual Review and Debug
The third porting step is to conduct a manual review of the code and debug using BEC-enabled debuggers. There are parts of the code that cannot always be automatically checked for byte order consistency. For example, Example 8 illustrates a case where inlined assembler code that manipulates C variables receives automatic adjustment of the byte order by the compiler.
However, if the source code of the application contains assembler instructions that directly manipulate big-endian data in memory, the source code must be modified to assume the little-endian byte order of the target architecture.
#include <stdio.h>
int main(){
    int __attribute__((littleendian)) i = 1;
    int __attribute__((bigendian)) j = 0;
    // The asm code below sets j = i;
    asm ("movl %1, %%eax;"
         "movl %%eax, %0;"
         :"=r"(j)
         :"r"(i)
         :"%eax");
    printf("i = %d, j = %d\n", i, j);
    return 0;
}
$ icc -o test.exe -big-endian test.c
$ ./test.exe
i = 1, j = 1
Example 8: Assembly language byte order example (Source: Intel Corporation).
Other examples of problematic code that require manual review are: overlapping union fields specified to have different byte order, bit fields operated on in bulk, and other operations on data that disregard the specified data type.
To help users catch byte order-specific issues the Intel debugger (idb) is employed. It is capable of displaying values of various byte orders correctly. Consider two global variables of different byte order defined in a source file test.c as listed in Example 9.
int __attribute__((bigendian)) bi = 1;
int __attribute__((littleendian)) li = 1;
Example 9: Byte order of global variables (Source: Intel Corporation).
The option -debug biendian enables the compiler to produce additional debug information for correct display of big-endian data:
icc -g -debug biendian -big-endian test.c
After compilation the executable can be debugged using idb. A debug session will display the byte order and correct value as shown in Example 10.
(idb) whatis bi
type = int __be
(idb) p bi
$1 = 1
(idb) whatis li
type = int
(idb) p li
$2 = 1
(idb)
Example 10: Debugger console view of variables (Source: Intel Corporation).
Porting Effort
Table 1 details statistics on the effort to port various applications. Nineteen C/C++ applications from SPEC CPU2006 were ported by compiling them with the -big-endian option and following the steps detailed previously.
| Application | Total Lines of Code | Modified Lines of Code |
|---|---|---|
| 400.perlbench | 154034 | 77 |
| 401.bzip2 | 7057 | 0 |
| 403.gcc | 457947 | 185 |
| 429.mcf | 2057 | 0 |
| 433.milc | 13192 | 8 |
| 444.namd | 4589 | 8 |
| 445.gobmk | 174467 | 68 |
| 447.dealII | 176393 | 20 |
| 450.soplex | 36829 | 0 |
| 453.povray | 130551 | 48 |
| 456.hmmer | 32692 | 17 |
| 458.sjeng | 12273 | 10 |
| 462.libquantum | 3391 | 0 |
| 464.h264ref | 46046 | 19 |
| 470.lbm | 975 | 6 |
| 471.omnetpp | 40200 | 75 |
| 473.astar | 4467 | 1 |
| 482.sphinx3 | 22438 | 54 |
| 483.xalancbmk | 463252 | 86 |
Table 1: Porting statistics (Data source: Intel Corporation, 2011).
As shown, roughly half of the applications (9 of 19) required changes amounting to 10 lines of code or fewer. Larger applications require more changes, but these changes are still small relative to the overall size of the application. All but one of these applications executed correctly after completion of the first porting step (addressing the default bi-endian-specific compiler diagnostics); the remaining application required manual changes. Porting most of the applications took less than one hour each for an experienced BEC user, and the most complex cases took up to a day. Training a new user typically takes approximately three days.


