Cross-Platform Database Programming

By William Fairman and Randal Hoff, March 01, 1995

If you want to write software that's portable on platforms from supercomputers to embedded systems, you'll want to use the programming techniques presented here.

Function Wrappers

Another way to isolate your application from the specifics of the system or of third-party run-time libraries is to use function wrappers. These act as a layer between what your application needs to accomplish (say, adding a record to a database) and the particular function which will perform the desired action.

By placing all the wrapper functions in one module, you can change the underlying operations without affecting the many application modules which use these functions. However, while C++ makes it easy to modify the parameters used to invoke an action, C is more rigid. Therefore, to keep your application well insulated from the underlying functions, you must carefully select the parameters used in your wrapper functions. While you cannot ignore the parameter requirements of the underlying functions, you must make sure that the wrapper function parameters reflect the essential nature of your application and the function being called. A wrapper should not simply repeat the exact parameters used in the underlying function.

For example, in c-tree Plus, AddRecord uses a small integer value to identify the data file involved. You may prefer to refer to data files by symbolic names. In that case, you would pass the symbolic name to the wrapper function, which would in turn call your own function to translate the name to a c-tree Plus file number. The same translation function would be used by many of the wrapper functions that call the c-tree Plus library.

A carefully chosen naming convention for your wrapper functions makes them easier to locate if they must be modified. We suggest, for example, that all your database wrapper functions begin with dbw_, followed by the desired action; say, dbw_AddRecord for the function to add a record.
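A minimal sketch of such a wrapper follows. The file table, the names in it, dbw_FileNumber, and backend_add_record are all hypothetical; backend_add_record merely stands in for the underlying library call (AddRecord in c-tree Plus, whose exact signature should be taken from the library documentation).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table mapping symbolic file names to the small integer
 * file numbers the underlying library expects. */
struct dbw_file_entry {
    const char *name;
    short       file_no;
};

static const struct dbw_file_entry dbw_files[] = {
    { "inventory", 0 },
    { "vendor",    1 },
    { "customer",  2 },
};

/* Stand-in for the real library routine (an assumption for this sketch).
 * Replace the body with the actual library call. */
static int backend_add_record(short file_no, const void *record)
{
    (void)record;
    return file_no >= 0 ? 0 : -1;
}

/* Translate a symbolic name to a file number; returns -1 if unknown. */
short dbw_FileNumber(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof dbw_files / sizeof dbw_files[0]; i++) {
        if (strcmp(dbw_files[i].name, name) == 0)
            return dbw_files[i].file_no;
    }
    return -1;
}

/* Wrapper: the application passes a symbolic file name and a record.
 * Returns 0 on success, -1 on failure. */
int dbw_AddRecord(const char *file_name, const void *record)
{
    short file_no = dbw_FileNumber(file_name);
    if (file_no < 0)
        return -1;              /* unknown file name */
    return backend_add_record(file_no, record);
}
```

Application modules would call dbw_AddRecord("vendor", &rec) and never see a file number, so a change of library touches only this one module.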

Size and Alignment of Data Objects

The three most pressing issues related to moving data across platforms are structure alignment, size of data objects, and byte order of binary values.

Different hardware architectures and different C compilers enforce different alignment restrictions on various data types. An alignment restriction refers to the legitimate addresses at which a data object can be referenced. For instance, if a CPU can only address integers on even boundaries, integers are "word-aligned." Attempting to reference an integer on an odd boundary (that is, its beginning address is odd) would probably cause a system exception. Generally, a data object is at most restricted to an address boundary no larger than the data object itself. For instance, a 4-byte integer will at most be required to be aligned on a 4-byte (double-word aligned) boundary while a 2-byte integer on the same machine will, at worst, be restricted to a 2-byte boundary.
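The alignment a particular compiler enforces can be queried rather than guessed. The sketch below assumes a C11 compiler (for _Alignof) and the <stdint.h> fixed-width types, neither of which the article itself relies on; the function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if address p is a multiple of the given alignment. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Alignment this compiler requires for 2-byte and 4-byte integers. */
size_t align_of_int16(void) { return _Alignof(int16_t); }
size_t align_of_int32(void) { return _Alignof(int32_t); }
```

On a byte-aligned platform both functions return 1; on a double-word-aligned platform align_of_int32 returns 4. In either case the result does not exceed the size of the object itself.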

For information that exists only temporarily in memory, alignment restrictions are not a concern. But if your data structures are not carefully planned, information stored on disk may not be usable across platforms: the position of members within a data structure may change between platforms, the size of the data structure may differ, or both. To avoid these problems, we:

1. Create a set of constant-size typedefs for basic data items (as discussed earlier for COUNT and LONG).
2. Place members in structures to encourage "automatic" alignment, and use explicit padding between members as necessary.
3. Add padding to the end of a structure, if necessary, to keep the size of the structure a multiple of its largest-sized data type.

The first step implies that we discourage the use of natural integers as part of data structures used for permanent storage. If moving your data across platforms is not important, then this is not an issue. (Some application developers will be more than satisfied if the application is portable, with no regard to the portability of the data. They do not expect the data to be moved from platform to platform.)

The second step implies that the largest data items be placed first in the structure, or that shorter data objects be grouped together to form clusters whose size matches the most restrictive alignment requirement. For example, if the largest member of a data structure is a 4-byte integer, then the 4-byte integers should be at the beginning of the structure. If you wish to place shorter members at the beginning, group them in clusters whose sizes are multiples of four bytes. Note that character arrays are treated (along with individual characters) as the smallest data types, and should occur at the end of the structure.

The third step is necessary to ensure no size difference across platforms regardless of whether padding was required between structure members.

Two good examples of proper alignment techniques are shown in Example 2. (Note that UTEXT represents a 1-byte unsigned character and TEXT, a signed character.)

Example 2: Techniques for maintaining proper alignment.

(a)

typedef struct invent_record {
      LONG       invent_id;
      LONG       invent_level;
      LONG       invent_reorder;
      COUNT      invent_status;
      COUNT      invent_bin;
} INVENT_RECORD;

(b)

typedef struct vendor_record {
      COUNT      vendor_type;       /* The first three members  */
      UTEXT      vendor_status;     /* of this structure use    */
      UTEXT      vendor_reserved;   /* precisely four bytes.    */
      LONG       vendor_acc_pay;
      TEXT       vendor_name[58];
      TEXT       vendor_padding[2]; /* Keep struct multiple of 4*/
} VENDOR_RECORD;
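One way to confirm that a layout like this holds on each new platform is to let the compiler check it. The sketch below repeats the two structures with compile-time assertions; the typedefs built on <stdint.h> and the use of C11 _Static_assert are assumptions of this sketch, not the article's own definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed-size definitions; the article's own typedefs may differ. */
typedef int16_t COUNT;   /* 2-byte signed integer     */
typedef int32_t LONG;    /* 4-byte signed integer     */
typedef uint8_t UTEXT;   /* 1-byte unsigned character */
typedef char    TEXT;    /* 1-byte character          */

typedef struct invent_record {
    LONG  invent_id;
    LONG  invent_level;
    LONG  invent_reorder;
    COUNT invent_status;
    COUNT invent_bin;
} INVENT_RECORD;

typedef struct vendor_record {
    COUNT vendor_type;
    UTEXT vendor_status;
    UTEXT vendor_reserved;
    LONG  vendor_acc_pay;
    TEXT  vendor_name[58];
    TEXT  vendor_padding[2];
} VENDOR_RECORD;

/* Compilation fails if the compiler inserted any padding of its own. */
_Static_assert(sizeof(INVENT_RECORD) == 16, "INVENT_RECORD must be 16 bytes");
_Static_assert(sizeof(VENDOR_RECORD) == 68, "VENDOR_RECORD must be 68 bytes");
_Static_assert(offsetof(VENDOR_RECORD, vendor_acc_pay) == 4,
               "vendor_acc_pay must follow the 4-byte cluster");
_Static_assert(offsetof(VENDOR_RECORD, vendor_name) == 8,
               "vendor_name must start at byte 8");
```

The expected sizes are simply the sums of the member sizes (12 + 4 and 2 + 1 + 1 + 4 + 58 + 2), so any mismatch means hidden padding was added.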

If you do not follow this strategy, compilers on various platforms may be forced to insert padding bytes in front of some structure members to force the required alignment. Further, the size of the structures may vary from platform to platform. The structure in Example 3 may result in an 8-byte structure on a byte-aligned platform and a 12-byte structure on a double-word-aligned platform. On a double-word-aligned platform, three bytes of padding would be inserted before the customer_acc_rcv member and one byte of padding before the customer_zone member.

Example 3: The structure size in this example depends upon the byte alignment of the platform.

typedef struct customer_record {
     UTEXT     customer_status;
     LONG      customer_acc_rcv;
     UTEXT     customer_priority;
     COUNT     customer_zone;
} CUSTOMER_RECORD;
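The padding a given compiler inserts into this structure can be measured with offsetof. This is a sketch only: the typedefs are assumed to map onto the <stdint.h> fixed-width types, and the helper function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed-size definitions; the article's own typedefs may differ. */
typedef int16_t COUNT;
typedef int32_t LONG;
typedef uint8_t UTEXT;

typedef struct customer_record {
    UTEXT customer_status;
    LONG  customer_acc_rcv;
    UTEXT customer_priority;
    COUNT customer_zone;
} CUSTOMER_RECORD;

/* Bytes of padding the compiler placed in front of customer_acc_rcv. */
size_t padding_before_acc_rcv(void)
{
    return offsetof(CUSTOMER_RECORD, customer_acc_rcv)
         - (offsetof(CUSTOMER_RECORD, customer_status) + sizeof(UTEXT));
}

/* Bytes of padding the compiler placed in front of customer_zone. */
size_t padding_before_zone(void)
{
    return offsetof(CUSTOMER_RECORD, customer_zone)
         - (offsetof(CUSTOMER_RECORD, customer_priority) + sizeof(UTEXT));
}

/* Total padding: structure size minus the 8 bytes of actual data. */
size_t total_padding(void)
{
    return sizeof(CUSTOMER_RECORD)
         - (2 * sizeof(UTEXT) + sizeof(LONG) + sizeof(COUNT));
}
```

On a double-word-aligned platform these functions would return 3, 1, and 4 respectively; on a byte-aligned platform all three would return 0.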

Finally, we strongly suggest omitting pointers to other structures within data structures used for permanent disk storage. While the use of pointers within structures is a very powerful and useful technique in C programming, we discourage it for actual data-storage structures. The size of pointers varies across platforms from as small as two bytes to as large as eight bytes, and the values of address pointers lose their meaning once the structure is placed on disk.
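One common substitute for a stored pointer is a record number or key of fixed size. The structure, the NO_RECORD marker, and the lookup function below are hypothetical and serve only to illustrate the idea.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t LONG;            /* assumed 4-byte integer typedef */

#define NO_RECORD ((LONG)-1)     /* hypothetical "no link" marker  */

/* On-disk form: links are record numbers, not memory addresses. */
typedef struct order_disk {
    LONG order_id;
    LONG order_customer_id;      /* key of the related customer record      */
    LONG order_next;             /* record number of next order, or NO_RECORD */
} ORDER_DISK;

/* Follow a link by indexing into a table of records held in memory.
 * Returns NULL when there is no next record or the link is out of range. */
const ORDER_DISK *order_follow(const ORDER_DISK *table, LONG count,
                               const ORDER_DISK *cur)
{
    if (cur->order_next < 0 || cur->order_next >= count)
        return NULL;
    return &table[cur->order_next];
}
```

Because the link is a 4-byte integer, the structure has the same size everywhere and the link remains meaningful after the record is written to disk and read back on another machine.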

Binary Word Order

CPUs differ in the manner in which integers and floating-point values are stored in memory. On Little-endian machines, the lowest-order byte is stored in the first byte of the integer, and the most significant byte is stored last. Such CPUs include the Intel family of processors and the new DEC Alpha processors. On Big-endian machines, the highest-order byte is stored in the first byte, and the least-significant byte is stored last. These CPUs include the Motorola 68000 family of processors and the IBM RS/6000 family. (In some unusual circumstances, a binary value may be a mixture of these strategies.)
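The ordering used by the current machine can be detected at run time by examining the first byte of a known integer value. This is a minimal sketch that assumes the <stdint.h> fixed-width types; the function name is illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the lowest-order byte of an integer is stored first
 * (Little-endian), 0 otherwise. */
int is_little_endian(void)
{
    uint32_t value = 1;
    unsigned char first;

    memcpy(&first, &value, 1);   /* copy the byte at the lowest address */
    return first == 1;
}
```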

While most application code is totally independent of the internal word ordering, this difference does pose a problem when moving application data across platforms. Such a move results in invalid binary values if the binary word ordering is different. c-tree Plus uses two different strategies to deal with this problem. One is to store the binary data on disk in the same order regardless of the platform's internal order. c-tree Plus uses this approach for its nonserver implementations, and stores the data in the Little-endian order (because of the great preponderance of Intel processors). The second strategy, employed with client/server implementations of c-tree Plus, stores the data in the server's native ordering. This places the burden for transforming byte ordering onto the client processors, relieving the server processor of this overhead.
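The first strategy, a fixed on-disk order, can be sketched with a pair of helpers that assemble and disassemble a 4-byte value one byte at a time. These functions are illustrative and are not part of c-tree Plus; because they use shifts rather than memory copies, they behave the same regardless of the host's internal ordering.

```c
#include <stdint.h>

/* Store v into buf in Little-endian order: lowest-order byte first. */
void put_le32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v & 0xFF);
    buf[1] = (unsigned char)((v >> 8) & 0xFF);
    buf[2] = (unsigned char)((v >> 16) & 0xFF);
    buf[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Read a 4-byte Little-endian value from buf. */
uint32_t get_le32(const unsigned char *buf)
{
    return  (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

Writing every binary field through put_le32 and reading it back through get_le32 yields a file that any platform can interpret, at the cost of a conversion on machines that are not Little-endian.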

To permit c-tree Plus to automatically perform the byte-order transformations on application data, we take advantage of c-tree's ability to store resources in data files. c-tree Plus allows you to specify the field types of your data records in a resource stored within the data file. When the data is accessed, the field type information directs any necessary transformations. Also, if the data file is moved, it is still possible to interpret the data properly.
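The general idea of letting field descriptions drive the transformation can be sketched as follows. The field types and descriptor layout here are hypothetical and do not represent the actual c-tree Plus resource format.

```c
#include <stddef.h>

/* Hypothetical field types for this sketch. */
enum field_type { FT_INT16, FT_INT32, FT_CHARS };

struct field_desc {
    enum field_type type;
    size_t          offset;   /* byte offset of the field in the record */
    size_t          length;   /* byte length of the field               */
};

/* Reverse n bytes in place. */
static void reverse_bytes(unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n / 2; i++) {
        unsigned char t = p[i];
        p[i] = p[n - 1 - i];
        p[n - 1 - i] = t;
    }
}

/* Reverse the byte order of every binary field in the record.
 * Character fields are left untouched. */
void swap_record(unsigned char *record,
                 const struct field_desc *fields, size_t nfields)
{
    size_t i;
    for (i = 0; i < nfields; i++) {
        if (fields[i].type == FT_INT16 || fields[i].type == FT_INT32)
            reverse_bytes(record + fields[i].offset, fields[i].length);
    }
}
```

Because the descriptors travel with the data file, a reader on a platform with the opposite ordering has everything it needs to call a routine like swap_record on each record it reads.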

Summary

Careful organization and isolation of your application code from user and file-handling interfaces can significantly reduce the effort required to move your application code from one platform to another. Creating a test program sensitive to the platform-dependent elements of your application will further reduce the time and problems encountered in moving the code. With each port, you become more attuned to the issues of portability, and can further refine your strategy.

If you define basic computational data objects that are size-invariant across platforms and construct stable, well-organized data structures, your applications will even be able to share data across platforms or use data stored on other platforms.

William is the founder of FairCom Corp. and senior developer of c-tree, c-tree Plus, r-tree, and the FairCom Server. Randal is FairCom's director of technical operations.
