Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Database

Structured Programming


MAR88: STRUCTURED PROGRAMMING

Handling Huge Arrays

Let's face it, when IBM selected the 8088 as the engine for its first-generation PC, memory-segmented architecture officially became a bad idea whose time had come. Three-and-a-half generations of Intel chips later, we're still stuck with it, and there's no relief in sight. Credit for all the remarkable things done on PCs goes not to the architecture but to those with the ingenuity to find ways around the flaw in its addressing scheme: no single hunk of memory in a PC can be bigger than 64K.

Compilers for the PC have dealt poorly with the problem. You can have multiple code segments and in some cases multiple data segments, but the 64K limit still applies to any single extent. Probably someone will write to tell me that the latest release of Googlephonic Snobol or some such has made segmentation transparent, but I'm talking about the mainstream languages from the likes of Microsoft and Borland. They deal with memory segmentation by not dealing with it in any graceful fashion.

The solution is, of course, to keep all objects (code units, global variable sets, arrays, and so on) smaller than 64K. This is usually a workable solution because most objects are inherently smaller than the magic number. In some cases, though, it's simply not possible to remain within this arbitrary limit. Nobody gets hit harder in that regard than scientists and engineers, who routinely work with extremely large numeric arrays.

Many have tried the PC and found segmentation an insuperable barrier, even though their data would, given a decent addressing scheme, fit within the free memory of the average PC. This article is for them and for anyone else faced with the unsavory task of attempting to process arrays of several thousand elements.

Memory Segmentation

Memory segmentation works like this: It takes two elements to express a complete address. The first element is a base, or point of reference, and the second is an offset from the base. In the Intel chips, the segment registers (CS, DS, and ES) contain base addresses that remain relatively fixed so that the offsets of variables, entry points, and so on can be reliably measured from them.

A segment address is actually the upper four digits of a five-digit hex number. Therefore, each increment of the segment represents a 16-byte jump in real memory. In the endless quest for quaint terminology, someone has dubbed these 16-byte hunks "paragraphs." A segment always begins at a paragraph boundary, which is a multiple of 16 bytes.

The offset is another four-digit hex value that measures a distance in bytes toward the top of memory, using the segment as the reference point. Four hex digits, of course, can represent any of 64K values. Consequently, any addressing range in the PC is constrained to 64K from a paragraph boundary.

The CS register contains the base address for a code segment. When a program takes a long jump (or far call), the CS register changes to the base paragraph of the new code segment. A far return reinstates the old CS so that execution resumes in the original code space. Thus a program can have multiple code segments, each of up to 64K.

The DS (data segment) register is less flexible. Compilers usually enforce a stable DS throughout program execution so that globals are universally accessible, meaning that the total size of all globals cannot exceed 64K, lest they be beyond the range of the offset.

Auto variables-those allocated as locals within subprograms or passed as parameters-are even more constrained. They're allocated on the stack-another 64K-max structure so that space is limited to 64K less the stack space already occupied by other stuff.

So what's a body to do with a huge hunk of data? There are a couple of alternatives. One is to build and process array images on disk. That's an ugly way out. You'd better start a big matrix multiplication just before a long weekend. Another option, which I'll explore here, is using the heap.

Purists may quibble, but the heap is essentially all the memory that s left alter any software currently in memory has staked its claims. Unless specifically set up otherwise, .COM and .EXE programs consider the whole of uncommitted memory as their heap space. Given a moderately sized program and a couple of memory-resident utilities, a 640K machine can usually yield a heap of about 450K to 500K. That's a respectable amount of space: a quarter of a million integers, or upward of 60,000 to 80,000 reals depending on precision. The trick is to use it effectively.

What Can You Put into 64K?

Programming strategies for managing huge arrays inevitably involve some pencil sharpening. Let's first consider how much data can fit inside a 64K node allocated on the heap.

Compilers typically claim a few bytes of memory for record keeping. For a large array occupying a heap node, figure the overhead at 16 bytes, which leaves 65,521 for data. Divide that by the size of the array's data type to find out how many elements will fit, truncating any fractional result. Table 1, page 111, lists the sizes of some common data types.

Table 1: Some common data types and their storage sizes

   Type                     Bytes

Char, byte                  1
Integer, word               2
Long integer                4
Pointer                     4
Turbo Pascal real           6
IEEE single                 4
IEEE double                 8
IEEE extended              10
IEEE comp                   8

Taking Turbo Pascal reals as an example, 65,521/6 = 10,920 array elements. That's an approximately square matrix of 104 x 105, or one of 26 x 420, or any other combination of multiples and submultiples whose product is 10,920 or less. Similarly, a 64K array of IEEE doubles can have 8,190 elements, or 90 X 91, 182 x 45, 30 x 273, and soon.

Assuming that's enough for your matrix, it's easy to make the necessary declarations and allocate the space. You might define the array in Pascal as:

TYPE arrayPtr = ^bigArray;
    bigArray  =  ARRAY  [1. .90, 1. .91]  OF DOUBLE;

and declare a pointer variable as:

VAR arr1 : arrayPtr;

Allocation is then accomplished with the standard procedure NEW (arr1);, after which you can refer to elements with the notation arr1^ [row, col].

Pascal/Modula-2 pointer types such as arrayPtr are automatically 32-bit segment:offset objects. You can therefore allocate several 64K arrays on the heap and, using subscripted pointer notation as above, compare and combine them.

Listing One, page 86, illustrates this with a program named maddints.pas. It adds two integer matrices, each 180 X 180, and stores the results in a third heap node. An array of 180 X 180 encompasses 32,400 elements, or 64,800 bytes. There are three such arrays, occupying a total of 194,400 bytes of heap space.

Maddints is somewhat artificial in that it sets both addend arrays to the same values, where each element contains a quantity computed as:

D^  [row,  col]  : =  (row  *  10)  +  column;

Thus the elements of the first row of both A and B are 11, 12, 13, ..., of the second 21, 22, 23, ..., and so on, and the elements of the sum array C are double those of the sources (22, 24, 26, ...). To make this into a real-world program, revise the Acquire procedure to read the data from an external source such as a disk file and add a procedure to output the results to a suitable medium.

Matrices Bigger Than 64A?

Truly huge matrices require a bit more ingenuity to live within the means of 64K segments. In this case, each row of the matrix is a separate array of up to 64K in size, with the subscripted elements representing columns. A matrix of n rows consists of n such horizontal lists. The glue that holds it together is another array--a vertical list containing n pointers to the rows in sequence. Figure 1, this page illustrates this structure, where D points to an array of pointers, each of which points to an individual row.

Figure 1: Structure of a hugh array.

D -->    ^row1    ---> D[1, 1]       D[1, 2]        ...       D[1, z]
         ^row2    ---> D[2, 1]       D[2, 2]        ...       D[2, z]
         ^row3    ---> D[3, 1]       D[3, 2]        ...       D[3, z]
         ...
         ^rowN   ---> D[n, 1]       D[n, 2]        ...       D[n, z]

Conceptually the matrix elements can be regarded as D [1. .n, 1. .z], where n is the number of rows and z the number of columns. Because of pointer following, however, the actual Pascal/Modula-2 notation is more complex. The next couple of paragraphs develop the notation based on a model.

You can define the types for this matrix structure as:

TYPE dataObj = WORD; {or whatever type is appropriate}
      colPtr = ^colArray; { pointer to a row }
      colArray = ARRAY [1. .maxCols]

OF dataObj; { row }
      colNode = RECORD
      col : colPtr; { row pointer node }
           END;
      rowPtr = ^rowArray;
      rowArray = ARRAY [1. .maxRows] OF colNode; { list }

Figure 1: Structure of a huge array

The sanity of defining the colNode record with only one field will become apparent when you see the actual notation. Also, it seems contradictory to use the names colPtr, colArray, and so on in refering to a row, until you realize that you use these objects to identify a specific column within a given row using subscripts.

The variables declared from these type definitions are very simple:

VAR A, B, C : rowPtr;

That is, the only variables deriving from the types are pointers to the controlling lists for the three matri ces A, B, and C. After creating the matrices, you can refer to individual elements by following the pointer structure. For example:

A^ [25].col [287]

refers to the element in row 25, column 27.

An ever-present danger when working with huge arrays is that you'll run out of heap space. Most languages furnish an intrinsic function to inquire about space availability. Logitech Modula-2, for example, has an AVAILABLE function that returns a Boolean indicating if the requested number of bytes can be allocated; Turbo Pascal 4.0 has maxAvail, which returns the size of the largest contiguous block. Before attempting to allocate space for a huge array, inquire whether your request will be honored. If it won't be, terminate gracefully. This is preferable to simply letting the program crash with some arcane run-time error message.

Listing Two, page 86, is a program hugemats.pas, written in Turbo Pascal 4.0, that implements this discussion. It's similar to Listing One, except that the matrices are 250 rows by 300 columns, or 75,000 elements apiece. Because the data object is of type WORD (an unsigned integer), this comes to 150,000 byte per matrix for data, plus 1,000 byte for the 250 row pointers, for a total of 151,000 byte. There are three such matrices, claiming 453,000 byte of heap space.

That's too much data for the program to run in the Turbo Pascal interactive environment, which provides less than 300K of heap space on a 640K machine. Because of the space checking in the create procedure, the program will always terminate with an out-of-memory message if the Turbo Pascal environment is resident. On my machine, the compiled .EXE program running alone has only about 20K of heap space left after the three matrices are created, but that's enough for it to run.

Freeing Up More Space

Huge number crunchers such as those in Listings One and Two are typically bare-bones programs, avoiding slick user interfaces and other memory-gobblers. Remember, every litfle nicety in a program consumes precious memory and eats into the heap. If you're tight on space, cut out the gimmicks.

The stack is another place you can mine for memory. In Turbo Pascal 4.0, the default stack size is 8,192 byte; it's 8,000 byte in Logitech Modula-2. That's a waste of memory for programs such as those given here, which never have more than about 20 byte on the stack at once. Trim the stack down to a minimal size. The savings go into the heap, where you can use them productively.

Don't forget to free up dynamic space that's no longer needed. For structures such as those given in Listing One, you can loop through rowArray, calling freeMem in Turbo Pascal (or its equivalent in other dialects) to dispose of allocated nodes, and then free rowArray itself.

If your data requirements are still too huge to fit, you'll either have to give up on the PC or else resort to memory-stretching technologies such as EMS (which I'll discuss in a future column). The techniques shown here, however, should satisiy most requirements for working with huge arrays on the PC.

The suggestion for this month's column came from Dr. Don Walters, Professor of Physics at the U.S. Naval Postgraduate School in Monterey, Calif. If you have a programming issue you'd like to see addressed here, drop me a line (no calls, please) in care of DDJ. Or send an MCI message to KPORTER, Mountain View, Calif. No promises, but I'll tackle as many as I can.

[LISTING ONE]

<a name="0095_000b">


Program maddints;

  { This program illustrates addition of two 180 x 180 integer  }
  { matrices A and B, storing the results in a similar matrix C }

Const  max = 180;       { maximum columns/rows per square array }

Type   arrayPtr = ^bigArray;
       bigArray = array [1..max, 1..max] of integer;

Var    A, B, C : arrayPtr;
       x, y    : word;
{ ------------------------------------------------------------- }

Procedure acquire (var D : arrayPtr);

  { Allocates the array on the heap and fills it with data      }
  { This is a sample procedure for demo only. Replace it with   }
  {  the requisite code to acquire data from an external source }

Var   r, c : word;

Begin
  New (D);                                    { allocate a node }
  For r := 1 to max do
    For c := 1 to max do
      D^ [r, c] := (r * 10) + c;  { value of each array element }
End;
{ --------------------------- }

Begin     { main program }
  Acquire (A);                             { acquire array data }
  Acquire (B);
  New (C);                         { set aside space for result }
  For y := 1 to max do
    For x := 1 to max do
      C^ [y, x] := A^ [y, x] + B^ [y, x];   { add arrays into C }
  Writeln ('Proof:');
  Writeln ('A [1, 1] = ', A^ [1, 1]);
  Writeln ('B [1, 1] = ', B^ [1, 1]);
  Writeln ('C [1, 1] = ', C^ [1, 1]);
  Writeln;
  Writeln ('A [max, max] = ', A^ [max, max]);
  Writeln ('B [max, max] = ', B^ [max, max]);
  Writeln ('C [max, max] = ', C^ [max, max]);
End.



 <a name="0095_000c"><a name="0095_000c">
<a name="0095_000d">
[LISTING TWO]
<a name="0095_000d">

Program hugemats;

  { Demo program to add two huge matrices > 64K, giving a third }

Const  maxRows = 250;
       maxCols = 300;

Type   dataObj = word;
       colPtr = ^colArray;
       colArray = array [1..maxCols] of dataObj;
       colNode = record
           col : colPtr;
         End;
       rowPtr = ^rowArray;
       rowArray = array [1..maxRows] of colNode;

Var    A, B, C : rowPtr;
       x, y    : word;
       error   : Boolean;
{ ------------------------------------------------------------- }

Procedure create (var D     : rowPtr;
                  var error : Boolean);

  { Create huge array 'D' and pass back pointer to it }

Var   row : word;

Begin
  Error := false;
  If maxAvail > sizeof (rowArray) then     { if space available }
    GetMem (D, sizeof (rowArray))          { allocate row array }
  Else begin
    D := nil;
    Error := true;
  End;
  If D <> nil then
    For row := 1 to maxRows do begin        { allocate all rows }
      If not error then
        If maxAvail > sizeof (colArray) then         { if space }
          GetMem (D^ [row].col, sizeof (colArray))  { alloc row }
        Else
          Error := true;
    End;
End;
{ --------------------------- }

Procedure acquire (var D     : rowPtr;
                   var error : Boolean);

  { Load data into array 'D' after creating it }

Var   row, c : word;

Begin
  Create (D, error);
  If not error then
    For row := 1 to maxRows do
      For c := 1 to maxCols do
        D^ [row].col^ [c] := (row * 10) + c;   { modify to suit }
End;
{ --------------------------- }

Begin   { main program }
  Writeln ('Size of each array is ',
           sizeof (rowArray) + (sizeof (colArray) * maxRows),
           ' bytes');
  Writeln ('Initial heap space = ', memAvail);
  Writeln ('Setting up array A');
  Acquire (A, error);

  If not error then begin
    Writeln ('Remaining heap space = ', memAvail : 6);
    Writeln ('Setting up array B');
    Acquire (B, error);
  End;

  If not error then begin
    Writeln ('Remaining heap space = ', memAvail : 6);
    Writeln ('Creating target array C');
    Create (C, error);
  End;

  If not error then
    Begin
      Writeln ('Remaining heap space = ', memAvail : 6);
      Writeln ('Adding arrays');
      For y := 1 to maxRows do
        For x := 1 to maxCols do
          C^[y].col^[x] := A^[y].col^[x] + B^[y].col^[x];

      Writeln;
      Writeln ('Proof:');
      Write   ('A [1, 1] + B [1, 1] = C [1, 1] = ');
      Writeln (A^[1].col^[1] : 5, ' + ', B^[1].col^[1] : 5, ' = ',
               C^[1].col^[1] : 5);
      x := maxCols;
      y := maxRows;
      Write   ('A [m, n] + B [m, n] = C [m, n] = ');
      Writeln (A^[y].col^[x] : 5, ' + ', B^[y].col^[x] : 5, ' = ',
               C^[y].col^[x] : 5);
    End
  Else
    Begin
      Writeln ('Out of memory: program ended');
      Write (#7);                                     { beep }
    End;
End.









Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.