Channels ▼
RSS

Database

Memory-mapped File I/O

Source Code Accompanies This Article. Download It Now.


SP93: Memory-mapped File I/O

Memory-mapped File I/O

Fast executables and reduced development time for Windows NT

Doug Huffman

Doug, who is co-owner of FlashTek, was both the author of the 32-bit Zortech DOSX extender and the project leader for the FlashTek X-32VM DOS extender. He can be contacted at 208-476-4781 or doug@proto.com.


The full power of the 80386/486 processor is just beginning to be tapped by operating systems available today. Multitasking, virtual memory, huge address space, multiple levels of protection, and more are available either through completely new operating systems such as Windows NT, OS/2 2.0, or through DOS extenders.

So much emphasis has been placed on 386 features for operating-system designers to work with, we often overlook the features present in an 386 processor that can be used to make life easier for application designers. One such feature is the ability to remap linear address space so that reading and writing from memory becomes equivalent to reading and writing from a device such as a disk drive. This feature of the 386 was primarily intended for implementing virtual memory, whereby the operating system emulates additional memory by swapping portions of memory to disk and remapping memory to locations where the application is attempting to read or write to memory.

Virtual Memory

Writing an application that runs under a well-optimized virtual-memory (VM) manager can be a real high for the programmer first experiencing it. Suddenly, the free memory available is roughly equal to free disk space, and physical RAM is remapped, shuffled about and reinitialized from disk, transparent to the application. As the huge array of numbers in "memory" is crunched, altered, and processed by your application, the machine takes on a life of its own: The hard drive is busy retrieving and storing data from the less-commonly accessed portions of the linear address space, while the more-commonly accessed portions of code, data, and stack usually stay in memory and are seldom, if ever, swapped out.

It seems that we may never get enough memory to make us content, but we must sometimes stop and take a good look at just what we are doing with all of this memory. Where does all the information come from that fills up memory? In most applications, nearly all information originates on disk. Memory space is used to hold the executable code and the data the application is processing, both of which must be loaded from disk prior to the application utilizing it. With this in mind, it is apparent that providing an efficient link between the application and the disk drive is essential.

Virtual memory is one such link between the disk drive and the computer's memory. The application's main link to disk is typically in the form of reads from disk to memory or writes from memory to disk. In many situations, the application's efforts at reading and writing files are not well coordinated with the operating system's attempt to provide efficient virtual memory. For example, suppose the application is reading a large file into memory in preparation for processing data. If the size of the read exceeds the available physical memory, the VM manager will start swapping data to a swap file on disk. The net result is often silly because while the application is reading data from a file into memory, the VM manager is busy swapping the same data back out to a swap file. The net result can be that the data is read from the original file, written to a swap file, and then, when the application accesses the data, the operating system must read it from the swap file. This obviously is not optimum for either execution time or efficient use of disk space. If an application attempts to load files totaling several Mbytes in size into VM on a system with a nearly full hard drive, there simply won't be enough free disk space available for swap space. When used as just described, a conventional VM manager essentially stores duplicates of files read from disk in the swap file, soaking up disk space and execution time. Many databases today operate on machines on which the disk drive(s) are nearly filled with huge files, and it is clearly impractical to read all required files into VM since there isn't space for a swap file of sufficient size. Even if there were, the execution time would be severely impacted with this simple approach.

Because of the aforementioned limitations, most programmers working with large data files are forced to read in a small portion of each file as needed. To optimize the speed of this type of application, the programmer must keep records of which portions of which files have been read into memory, and must keep commonly accessed portions of files in memory permanently while overwriting less-commonly needed portions with other data. The program must keep records indicating which portions need to be written back to disk, and so on. This can be a challenging, time-consuming problem, and many an hour has been spent trying to optimize and debug programs of this nature.

MMFIO

A much simpler approach is memory-mapped file I/O (MMFIO). This is supported by modern 386 operating systems such as Windows NT and some versions of UNIX, and is also available to DOS programmers through 32-bit DOS extenders such as Phar Lap's 386|VMM and FlashTek's X-32VM DOS extender. MMFIO is a feature whereby a file can be mapped into a linear address space with a simple call to the operating system. It represents a cooperative effort between the VM manager and the application to optimize the speed of the application, reduce total disk I/O, and conserve valuable disk space. Perhaps the most important advantage of MMFIO is that it usually results in the simplest implementation of a disk I/O intensive application.

Once a file is mapped into memory, the application can read and write from that file simply by reading and writing to memory. The operating system's VM manager deals with the task of reading only those portions of the file that are accessed and writing back only those portions that are altered. Most 386 VM managers work with 4-Kbyte chunks of memory called pages. Commonly accessed pages of a file usually stay in memory continuously, while seldom-accessed pages are swapped to disk

if required. A well-optimized virtual- memory manager provides runtime optimization, controlling which pages of a file are kept in memory and which are swapped to disk. This takes into account the amount of physical RAM available on the machine and which portions of a file are frequently accessed. It provides a bias towards not swapping modified pages, since those require the extra step of writing the page to disk while unmodified pages already exist on disk. (The 386 processor automatically marks which pages are modified, or "dirty," in the operating system's page tables.) MMFIO not only saves the application designer many hours of hard work when attempting to process files larger than physical RAM, it can often result in faster execution time than is possible with conventional solutions to the problem.

An Example

To illustrate, I'll examine a hypothetical application called prog.exe--a 5-Mbyte executable that manages a customer list, reads and writes to two files on disk, and is operating on a system with 2 Mbytes or less of RAM. I'll map in a file called customer.dat, a large list of customers with names, addresses, phone numbers, and other data contained in thousands of structures (see Listing One, page 18). I'll also map in state.dat, which contains configuration information indicating how the application was last used: which menu was in use, which screen colors were selected, what macros were set up, and so on. state.dat is used essentially as nonvolatile RAM to store the state of the application.

I tested the code discussed here with both the Zortech and Watcom 32-bit compilers in combination with the FlashTek X-32VM DOS extender. The functions _x32_mmfio_open and _x32_

mmfio_flush are provided with the X-32VM DOS extender but similar functions exist for other operating systems supporting MMFIO. For example, refer to the Windows NT functions MapViewOfFile and FlushViewOfFile in the accompanying textbox entitled, "Memory-mapped Files for Windows NT."

When the operating system is initializing, it sets up a swap file to support the memory allocated by malloc and the stack space; I'll refer to it as swapfile.dat. X-32VM also keeps the .exe file open for VM purposes and uses it as a read-only file to access pages of code and static data that the application doesn't modify. It will automatically switch any pages modified by the application over to the swapfile, which is a read/write file. prog.exe calls the function map_files (see Listing Two, page 18) during initialization. This function opens the files and maps them into the linear address space. The function _x32_mmfio_open is passed an open file handle and the size of the address space to map in. Note that the requested address space can be larger than the actual file size. This allows the file to be expanded if more customers are to be added to the list. In this case, both files are opened in read/write mode, but read-only file handles can be used only if the application treats that region in memory as read only. _x32_mmfio_open returns a near pointer to the address space in which the file is mapped, this pointer is stored for future reference to the file.

Once map_files returns, we have a situation similar to the one shown in Figure 1. The operating system has four files open, supporting over 165 Mbytes of memory space. If this is running on a system with only 2 Mbytes of total RAM, it can't even hold the entire executable in physical memory. Therefore, we obviously have only a small portion of these files loaded into memory at any given time. In fact, when map_files returns, the operating system has only loaded a small portion of the exe, the portion containing initialization code, and has not yet loaded any portion of the files state.dat and customer.dat.

The function prnt_name in Listing Three (page 18) outputs a customer name to stdout using printf. The customer name is a zero-terminated ASCII string contained in the data structures within the file customer.dat. The function is passed an index number that indicates which customer we are currently working with, and it merely prints the customer's name and returns.

When the application first attempts to read or write to a page of memory in the MMFIO address space, the 386 issues a page fault to inform the operating system that the application has attempted to access a location that is "not present." The operating system handles the page fault by mapping a page of RAM into the desired address space and initializing the RAM with information from disk if it exists on disk. The same thing happens as the application branches to pages of executable code that have not yet been accessed: The operating system reads code into memory only as required. Once the operating system has all available physical RAM in use, it has to start swapping pages of RAM each time a page fault occurs.

The RAM containing initialization code that is normally only accessed once is a prime candidate to be swapped out. The operating system monitors how often each page of code and data is accessed, and attempts to optimize performance based on that information. By reading from disk only when required, only writing information back to disk when it has been modified, and only flushing data from memory when a page of RAM is needed elsewhere, applications based on MMFIO can sometimes run with startling speed. The first time I saw an application requiring over 1 Mbyte of address space run on a system with no available extended memory, I was shocked. It ran to completion in the wink of an eye--it normally takes longer than that just to load an exe file of that size. It took a few minutes for me to realize that only a handful of 4-Kbyte pages had actually been accessed in that particular run, and so only a small portion of the executable file had been loaded from disk, thus saving considerable time.

Before prog.exe terminates, all new information needs to be written to the MMFIO space. Since many of the changes made to the data in the

MMFIO space may exist only in RAM, operating systems supporting MMFIO must provide a means of flushing any modified pages to disk. Windows NT provides this feature via the function FlushViewOfFile while X-32VM provides the function _x32_mmfio_flush. These functions update specific MMFIO files but do not close or unmap the files. Only the modified pages are written to disk and then only those not already swapped to disk during normal VM swapping activity. Critical changes can be flushed to disk as they are made, thus ensuring file integrity in case of unexpected termination such as power failure.

MMFIO is not just useful in large applications when handling large files. It can also be used in small and medium-sized applications where enough memory is often available to read the entire file into RAM if required. If there is sufficient RAM to hold all required data, nothing is ever swapped out, and yet only those pages accessed will be read so it still often results in an optimum application.

Conclusion

Using MMFIO, an application running on a system with a few Mbytes of physical memory and a nearly full hard drive can map hundreds of Mbytes of files into linear memory and then read and write to each file as simply as reading and writing from an array in memory. This results in a fast and efficient application with a minimum of development effort.

Figure 1: MMFIO can be viewed as a link between memory and the system's hard drives.

Sorting Files with NT's Memory-mapped File I/O

Eric Bergman-Terrell

Eric is a freelance programmer and can be contacted at 8547 E. Arapahoe Road, Suite J-147, Greenwood Village, CO 80112.


Memory-mapped file I/O lets Windows NT programmers associate a region of virtual memory with a disk file, such that any access of the memory will access the associated file data. The program presented here, mmf_sort, uses memory-mapped file I/O to sort files containing fixed-length records. Mmf_sort (see Listing Four, page 19), which was compiled and tested with Version 0.00.3043d of the Microsoft NT C/C++ compiler and the March pre-release of Windows NT, requires a sort definition file that describes the fields of the file being sorted. For example, to sort a file containing DATA records as in Example 1(a), you could use the sort definition file in Example 1(b). Comments have the # character in the first column. Each non-comment line specifies a field name, sort order, size, alignment, count, a/d flag, and the name of an optional comparison function.

The field name is an arbitrary identifier for a record field. The sort order specifies the order in which fields are compared. A field with a sort order of 1 is compared first, followed by a field with a sort order of 2, and so on. Consider a file containing the records in Figure 2(a). If chr had a sort order of 1 and lng had a sort order of 2, the file would be sorted as in Figure 2(b). If lng had a sort order of 1 and chr had a sort order of 2, the file would be sorted as in Figure 2(c).

Many computer systems require that 2-byte integers are stored in even addresses and that 4-byte integers are stored in addresses that are multiples of 4. For example, Table 1 shows how the DATA structure in Example 1 is stored in memory on a 386/486 using the Microsoft NT C/C++ compiler. Notice that two padding bytes were inserted in the structure to force lng, a 4-byte integer, to be stored on an address that is a multiple of 4.

Use an alignment value of 2 for fields that must be stored in even addresses, an alignment value of 4 for fields that must be stored in addresses that are multiples of 4, and so on. Use an alignment value of 1 for fields that can be stored in any address. Most computer systems can store 1-byte quantities in any address. After padding the individual fields, mmf_sort automatically pads the record size to be a multiple of the largest alignment value.

The record fields sorted by mmf_sort can be arrays. The count entry specifies the number of elements in the field. For example, in Example 1 the cnt entry for chr is 2 since chr is a two-element array.

Fields can be sorted in ascending or descending order. Use an a/d entry of A to sort fields in ascending order, and an entry of D to sort fields in descending order. The a/d entry can be upper or lowercase.

Mmf_sort is run from the NT command prompt. The first argument specifies the name of the file to be sorted, and the second specifies the name of the sort definition file. The file to be sorted is opened and its size is calculated. After the sort definition file is read the record size is known and the program verifies that the file size is a multiple of the record size.

Calls to CreateFileMapping() and MapViewOfFile() associate the file_addr pointer variable with the contents of the file to be sorted. Any accesses of file_addr accesses the corresponding data in the file being sorted. The standard C library, qsort(), sorts the file using compare(), a comparison function based on the sort definition file.

The call to UnMapViewOfFile() breaks the association between file_addr and the file being sorted and also flushes all changes to the file being sorted. After the records-sorted-per-second is calculated and output, the program exits.

In sort_def.h (Listing Five, page 19), the FIELD_INFO structure contains all the data specified in one line of the sort definition file. The SORT_DEF structure contains all the data specified in the entire sort definition file as well as the number of fields and the record size. The SORT_DEF structure is used by the compare() function to sort records.

In sort_def.c (Listing Six, page 19) read_sort_def() stores the data from each sort definition file entry in the SORT_DEF structure. Lines beginning with # are comments. The comparison function entry in the sort definition file is optional. If specified, a pointer to the function is found by function get_cmp_fcn_ptr() and stored in the SORT_DEF structure. Otherwise a NULL function pointer is stored.

The max_alignment variable keeps track of the largest alignment value. C structures are padded to be a multiple of the largest alignment value of any structure field. Consequently, mmf_sort uses the max_alignment variable to pad the record size.

After the contents of the SORT_DEF structure are printed, it is sorted using qsort() so that fields will be compared in the correct order.

Listing Seven, page 19 (compare.h) and Listing Eight, page 20 (compare.c) contain functions that compare signed and unsigned characters, short integers, and long integers. Each comparison function returns:

   -1     if the first argument is less than the second argument.
    0     if the arguments are equal.
    1     if the first argument is greater than the second argument.

Since the comparison functions may be used to compare arrays, each function uses its count argument to compare every element of the field.

Compare() compares record fields based on the sort order specified in the sort definition file. Fields lacking a comparison function are ignored. If a field is sorted in descending order, the result of the comparison is reversed.

To add new comparison functions, add the function to compare.c and the cmp_fcn_info[] table.

error() is used to output an error message and then exit the program (see Listings Nine and Ten, page 20 and xx, respectively). In globvars.h (Listing Eleven, page 20) and globvars.c (Listing Twelve, page 20), the SORT_DEF data structure is a global variable since it is an implicit parameter to the compare() function called by qsort().

The makefile (Listing Thirteen, page 20) is used to compile mmf_sort. The makefile supports the Microsoft NT C/C++ compiler. It will have to be modified to support a different compiler.

Example 1: (a) Sample DATA records to be sorted; (b) sort definition file.

(a)

typedef struct
{
char  chr[2];
long  lng;
} DATA;

(b)

# field-name sort-order size alignment count a/d optional-comparison-function
chr 1 1 1 2 a uns_chr_cmp
lng 2 4 4 1 a uns_lng_cmp

Figure 2: Sample file containing records; (b) the file after it's sorted with chr having a sort order of 1 and lng a sort order of 2; (c) the file sorted with lng having a sort order of 1 and chr a sort order of 2.

(a)

chr        lng

ZX         100
AB         100
AT         200
AA         200


(b)

chr        lng

AA         200
AB         100
AT         200
ZX         100


(c)

chr        lng


AB         100
ZX         100
AA         200
AT         200


Table 1: How the DATA structure in Example 1 is stored in memory on 386/486 systems using the Microsoft NT C/C++ compiler.

<B>Address   Contents</B>
0           <I>chr[0]</I>
1           <I>chr[1]</I>
2           padding
3           padding
4           <I>lng</I>'s least significant byte
5           <B>_</B>
6           <B>_</B>
7           <I>lng</I>'s most significant byte

[LISTING ONE]

(Text begins on page 14.)

struct cust
{
  char name[40];      /* Both first and last names. */
  char address1[30];
  char address2[30];
  char address3[30];
  char address4[30];
  char phone1[20];
  char phone2[20];
}

[LISTING TWO]


#include <io.h>
#include <fcntl.h>
#include <structs.h>    /*      defines structure "cust"                */

struct cust *customer;  /*      pointer to customer.dat file    */
void *state;            /*      pointer to state.dat file       */
int c_handle,s_handle;

void *_cdecl _x32_mmfio_open(int fd,int size);

int map_files()
{
  if ((c_handle = open("customer.dat",O_RDWR)) == -1)
  {
    return -1;          /*      return failure if unable to open file   */
  }
  if ((s_handle = open("state.dat",O_RDWR)) == -1)
  {
    return -1;          /*      return failure if unable to open file   */
  }
/*      We will allow 150 megabytes of space for customer.dat even if the file
        is smaller than this.  This sets a maximum limit for the address space
        available for accessing this file.  If the file is currently smaller
        than 150 megabytes, this will allow the operating system to
        automatically expand the file up to 150 megabytes if more customers
        are added to the list.
*/
  if ((customer = _x32_mmfio_open(c_handle,150*1024*1024)) == 0)
  {
    return -1;          /*      return failure if unable to map file    */
  }

/*      State.dat is a fixed length file of 500 kilobytes       */
  if ((state = _x32_mmfio_open(s_handle,500*1024)) == 0)
  {
    return -1;          /*      return failure if unable to map file    */
  }

/*      The pointers customer and state have been successfully initialized,
        return success.
*/
  return 0;
}
</PRE>
<P>
<h4><a name="03d0_0010"><a name="03d0_0011"><B>[LISTING THREE]</B><a name="03d0_0011"></h4>
<P>
<pre>

#include <io.h>
#include <structs.h>            /* defines structure "cust"       */

extern struct cust *customer;   /* pointer to customer.dat file    */

/*      This function uses printf to output a customers name specified by
the customer index number which is passed to the function. This character
string resides in a mmfio region and will automatically be read from the file
customer.dat if it does not already exist in memory.
*/

void prnt_name(int cust_num)
{
  printf("\nCustomer name: %s",customer[cust_num].name);
  return;
}
</PRE>
<P>
<h4><a name="03d0_0012"><a name="03d0_0013"><B>[LISTING FOUR]</B><a name="03d0_0013"></h4>
<P>
<pre>

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "compare.h"
#include "sort_def.h"
#include "error.h"
#include "globvars.h"


int main (int argc, char *argv[])
{
clock_t  start = time(NULL);
HANDLE   hinput_file, hinput_map;
void     *file_addr;
DWORD    file_size;
size_t   recs_in_file;
int      secs;
if (argc != 3)
  fatal_err("usage:  mmf_sort <input filename> <def filename>");
printf("\nSorting file %s...\n\n", argv[1]);
/* Open file to be sorted. */
hinput_file = CreateFile(argv[1], GENERIC_READ | GENERIC_WRITE, 0, NULL,
                         OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hinput_file == INVALID_HANDLE_VALUE)
  fatal_err("cannot open file %s", argv[1]);
file_size = GetFileSize(hinput_file, NULL);
read_sort_def(argv[2]);

/* Make sure file is the correct size. */
if (file_size % Sort_def.rec_size != 0)
  fatal_err("file %s does not contain a whole number of %d byte records",
            argv[1], Sort_def.rec_size);
recs_in_file = file_size / Sort_def.rec_size;
hinput_map = CreateFileMapping(hinput_file, NULL, PAGE_READWRITE, 0, 0, NULL);
if (hinput_map == NULL)
  fatal_err("cannot create mapping for file %s", argv[1]);
file_addr = MapViewOfFile(hinput_map, FILE_MAP_WRITE, 0, 0, 0);
if (file_addr == (void *) NULL)
  fatal_err("cannot map view of %s", argv[1]);
/* Sort the file. */
qsort(file_addr, recs_in_file, Sort_def.rec_size, compare);
UnmapViewOfFile(file_addr);
CloseHandle(hinput_map);
CloseHandle(hinput_file);
secs = (int) (time(NULL) - start);

printf("Sorted %ld records per second\n", secs > 0 ? recs_in_file / secs :
       recs_in_file);
exit(EXIT_SUCCESS);
}
</PRE>
<P>
<h4><a name="03d0_0014"><a name="03d0_0015"><B>[LISTING FIVE]</B><a name="03d0_0015"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\SORT_DEF.H \c AnsiText|#define MAX_FIELDS       100
#define FIELD_NAME_LEN     8

#define CMP_FCN_NAME_LEN  16
typedef struct
{
char         field_name[FIELD_NAME_LEN + 1],cmp_fcn_name[CMP_FCN_NAME_LEN + 1];
size_t       size, align_size, offset;
int          sort_order, alignment, count, ascending;
CMP_FCN_PTR  cmp_fcn;
} FIELD_INFO;
typedef struct
{
int         num_fields;
size_t      rec_size;
/* first num_keys entries are keys, in order */
FIELD_INFO  field_info[MAX_FIELDS];
} SORT_DEF;
void read_sort_def(const char *def_filename);
</PRE>
<P>
<h4><a name="03d0_0016"><a name="03d0_0017"><B>[LISTING SIX]</B><a name="03d0_0017"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\SORT_DEF.C \c AnsiText|#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "compare.h"
#include "sort_def.h"
#include "error.h"
#include "globvars.h"

int sort_def_cmp(const void *a, const void *b)

/* Compare the sort order fields of the SORT_DEF records. */

{

return ((FIELD_INFO *) a)->sort_order -
       ((FIELD_INFO *) b)->sort_order;
}
void read_sort_def(const char *def_filename)
/* Read the contents of the sort definition file and store them in Sort_def. */
{
FILE        *input;
char        line[1024], *ptr;
int         i, line_nbr = 0, max_alignment = 0;
FIELD_INFO  *cur;
const char  *delims = " \n\t";
size_t      offset = 0;
if ((input = fopen(def_filename, "r")) == (FILE *) NULL)
  fatal_err("read_sort_def: cannot open file %s", def_filename);
Sort_def.num_fields = Sort_def.rec_size = 0;
cur = Sort_def.field_info;
/* Get fields from sort definition file. */
while (fgets(line, sizeof(line), input) != (char *) NULL)
  {
  /* Discard comment lines. */
  if (line[0] == '#')
    continue;
  line_nbr++;
  if (Sort_def.num_fields >= MAX_FIELDS)
    fatal_err("read_sort_def: too many entries in sort definition file");
  /* Get field name. */
  if ((ptr = strtok(line, delims)) == (char *) NULL)
    fatal_err("error in field name on line %d of sort def. file", line_nbr);
  memset(cur->field_name, '\0', sizeof(cur->field_name));
  strncpy(cur->field_name, ptr, sizeof(cur->field_name) - 1);
  /* Get sort order. */
  if ((ptr = strtok(NULL, delims)) == (char *) NULL)
    fatal_err("error in sort order on line %d of sort def. file", line_nbr);
  cur->sort_order = atoi(ptr);
  /* Get size. */
  if ((ptr = strtok(NULL, delims)) == (char *) NULL)
    fatal_err("error in size on line %d of sort def. file", line_nbr);
  if ((cur->size = atoi(ptr)) <= 0)
    fatal_err("invalid size on line %d of sort def. file", line_nbr);
  /* Get alignment. */
  if ((ptr = strtok(NULL, delims)) == (char *) NULL)
    fatal_err("error in alignment on line %d of sort def. file", line_nbr);
  if ((cur->alignment = atoi(ptr)) < 0)
    fatal_err("invalid alignment on line %d of sort def. file", line_nbr);
  /* Keep track of maximum alignment. */
  if (cur->alignment > max_alignment)
    max_alignment = cur->alignment;
  /* Get count. */
  if ((ptr = strtok(NULL, delims)) == (char *) NULL)
    fatal_err("error in count on line %d of sort def. file", line_nbr);
  if ((cur->count = atoi(ptr)) <= 0)
    fatal_err("invalid count on line %d of sort def. file", line_nbr);
  /* Get ascending/descending flag. */
  if ((ptr = strtok(NULL, delims)) == (char *) NULL)
    fatal_err("error in a/d flag on line %d of sort def. file", line_nbr);
  *ptr = toupper(*ptr);
  if (*ptr != 'A' && *ptr != 'D')
    fatal_err("invalid a/d flag on line %d of sort def. file", line_nbr);
  cur->ascending = (*ptr == 'A');
  /* Get comparison function. */
  if ((ptr = strtok(NULL, delims)) != (char *) NULL)
    {
    memset(cur->cmp_fcn_name, '\0', sizeof(cur->cmp_fcn_name));
    strncpy(cur->cmp_fcn_name, ptr, sizeof(cur->cmp_fcn_name) - 1);
    cur->cmp_fcn = get_cmp_fcn_ptr(ptr);
    }
  else
    {
    /* This field will not be compared. */
    strcpy(cur->cmp_fcn_name, "none");
    cur->cmp_fcn = (CMP_FCN_PTR) NULL;
    }
  while (offset % cur->alignment != 0)
    offset++;
  cur->offset = offset;
  cur->align_size = cur->size;
  while (cur->align_size % cur->alignment != 0)
    cur->align_size++;

  Sort_def.num_fields++;
  offset += cur->count * cur->align_size;
  cur++;
  }
Sort_def.rec_size = offset;
/* Entire record must be padded to maximum alignment. */
while (Sort_def.rec_size % max_alignment != 0)
  Sort_def.rec_size++;
/* Print out contents of sort definition file. */
printf("%-*.*s  Order   Size    Offset  Alignment  Count  A/D   "
       "Cmp. Fcn.\n\n",
       FIELD_NAME_LEN, FIELD_NAME_LEN, "Field");
for (i = 0, cur = Sort_def.field_info; i < Sort_def.num_fields; i++, cur++)
  printf("%-*.*s %6d %6d    %6d     %6d %6d  %s  %s\n",
         FIELD_NAME_LEN, FIELD_NAME_LEN, cur->field_name,
         (int) cur->sort_order, (int) cur->size, (int) cur->offset,
         (int) cur->alignment,
         (int) cur->count, cur->ascending ? "asc " : "desc",
         cur->cmp_fcn_name);
printf("\nRecord Size: %d bytes\n\n", (int) Sort_def.rec_size);
/* Sort the fields based on the user-specified sort order. */
qsort(Sort_def.field_info,Sort_def.num_fields,sizeof(FIELD_INFO)
 ,sort_def_cmp);
fclose(input);
}
</PRE>
<P>
<h4><a name="03d0_0018"><a name="03d0_0019"><B>[LISTING SEVEN]</B><a name="03d0_0019"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\COMPARE.H \c AnsiText|int compare(const void *a, const void *b);

typedef int (*CMP_FCN_PTR)
  (const char *a, const char *b, size_t align_size, int count);
CMP_FCN_PTR get_cmp_fcn_ptr(const char *cmp_fcn_name);




<a name="03d0_001a"><a name="03d0_001b">
[LISTING EIGHT]
<a name="03d0_001b">
{INCLUDE C:\\ARTICLES\\MMF_SORT\\COMPARE.C \c AnsiText|#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "compare.h"
#include "sort_def.h"
#include "error.h"
#include "globvars.h"

#define COMPARE(a, b) ((a > b) ? 1 : (a < b) ? -1 : 0)
int uns_chr_cmp(const char *a, const char *b, size_t align_size, int count)

/* unsigned comparison of characters */

{
int  i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
  cmp = COMPARE(*(unsigned char *) a, *(unsigned char *) b);
return cmp;
}
int sig_chr_cmp(const char *a, const char *b, size_t align_size, int count)
/* signed comparison of characters */
{
int  i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
  cmp = COMPARE(*(signed char *) a, *(signed char *) b);
return cmp;
}
int uns_shr_cmp(const char *a, const char *b, size_t align_size, int count)
/* unsigned comparison of short integers */
{
int  i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
  cmp = COMPARE(*(unsigned short int *) a, *(unsigned short int *) b);
return cmp;
}
int sig_shr_cmp(const char *a, const char *b, size_t align_size, int count)
/* signed comparison of short integers */
{
int  i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
  cmp = COMPARE(*(signed short int *) a, *(signed short int *) b);
return cmp;
}
int uns_lng_cmp(const char *a, const char *b, size_t align_size, int count)
/* unsigned comparison of long integers */
{
int  i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
  cmp = COMPARE(*(unsigned long int *) a, *(unsigned long int *) b);
return cmp;
}
int sig_lng_cmp(const char *a, const char *b, size_t align_size, int count)
/* signed comparison of long integers */
{
int  i, cmp = 0;
for (i = 0; i < count && cmp == 0; i++, a += align_size, b += align_size)
  cmp = COMPARE(*(signed long int *) a, *(signed long int *) b);
return cmp;
}
int compare(const void *a, const void *b)
/* Compare all fields in order.  Return: -1 if a < b, 1 if a > b, 0 if a = b.*/
{
int                  i, result = 0;
FIELD_INFO           *cur;
const unsigned char  *aa = a, *bb = b;
for (i = 0, cur = Sort_def.field_info; i < Sort_def.num_fields; i++, cur++)
  {
  /* Make sure current field should be compared. */
  if (*cur->cmp_fcn != (CMP_FCN_PTR) NULL)
    {
    result = (*cur->cmp_fcn) ((char *) &aa[cur->offset],
                              (char *) &bb[cur->offset],
                              cur->size, cur->count);
    /* Invert result if field is sorted in descending order. */
    if (!cur->ascending)
      result = -result;
    if (result != 0)
      return result;
    }
  }
return 0;
}
typedef struct
{
char         *cmp_fcn_name;
CMP_FCN_PTR  cmp_fcn_ptr;
} CMP_FCN_INFO;
static const CMP_FCN_INFO cmp_fcn_info[] =
{
  {"uns_chr_cmp", uns_chr_cmp},
  {"sig_chr_cmp", sig_chr_cmp},
  {"uns_shr_cmp", uns_shr_cmp},
  {"sig_shr_cmp", sig_shr_cmp},
  {"uns_lng_cmp", uns_lng_cmp},
  {"sig_lng_cmp", sig_lng_cmp}
};
CMP_FCN_PTR get_cmp_fcn_ptr(const char *cmp_fcn_name)
/* Return a pointer to the named function. */
{
int  i;
for (i = 0; i < sizeof(cmp_fcn_info) / sizeof(cmp_fcn_info[0]); i++)
  if (stricmp(cmp_fcn_name, cmp_fcn_info[i].cmp_fcn_name) == 0)
    return cmp_fcn_info[i].cmp_fcn_ptr;
fatal_err("cannot find function %s", cmp_fcn_name);
}
</PRE>
<P>
<h4><a name="03d0_001c"><a name="03d0_001d"><B>[LISTING NINE]</B><a name="03d0_001d"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\ERROR.H \c AnsiText|void
                                            fatal_err(const char *fmt,...);
</PRE>
<P>
<h4><a name="03d0_001e"><a name="03d0_001f"><B>[LISTING TEN]</B><a name="03d0_001f"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\ERROR.C \c AnsiText|#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include "error.h"

void fatal_err(const char *fmt,...)
/* Give a fatal error message and exit. */
{
char            buffer[1024];
va_list         argptr;
va_start(argptr, fmt);
vsprintf(buffer, fmt, argptr);

va_end(argptr);
fprintf(stderr, "** FATAL ERROR ** %s\n", buffer);
exit(EXIT_FAILURE);
}
</PRE>
<P>
<h4><a name="03d0_0020"><a name="03d0_0021"><B>[LISTING ELEVEN]</B><a name="03d0_0021"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\GLOBVARS.H \c AnsiText|extern
                                                        SORT_DEF    Sort_def;
</PRE>
<P>
<h4><a name="03d0_0022"><a name="03d0_0023"><B>[LISTING TWELVE]</B><a name="03d0_0023"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\GLOBVARS.C \c AnsiText|#include <stdlib.h>
#include "compare.h"
#include "sort_def.h"
#include "globvars.h"

SORT_DEF  Sort_def;
</PRE>
<P>
<h4><a name="03d0_0024"><a name="03d0_0025"><B>[LISTING THIRTEEN]</B><a name="03d0_0025"></h4>
<P>
<pre>

{INCLUDE C:\\ARTICLES\\MMF_SORT\\MAKEFILE. \c AnsiText|!include <ntwin32.mak>

OBJS=mmf_sort.obj error.obj globvars.obj sort_def.obj compare.obj

all: mmf_sort.exe

mmf_sort.exe: $(OBJS)

.C.obj:
    $(cc) $(cflags) -Ge $(cvars) $*.c

mmf_sort.exe: $(OBJS)
    $(link) $(conflags) -out:mmf_sort.exe shell32.lib $(OBJS) $(conlibs)

<B>End Listings</B>


Copyright © 1993, Dr. Dobb's Journal


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.