Barr uses high-performance computers to design pharmaceuticals for Schering-Plough Research Institute. He can be reached at 60 Orange St., B-1-3-85, Bloomfield, NJ 07003.
The primary barrier to using contemporary 386-based PCs for tackling large-data scientific and engineering problems is the artificial limitation imposed by conventional memory. All other factors considered, they are far faster, pack more memory and disk, and are substantially cheaper than standard platforms like the early Sun workstations and the MicroVAX-II of just a few years ago. Except for the lack of multitasking and virtual-memory support, even DOS is not a major limitation. Yet, the anachronistic 640-Kbyte conventional memory limit, a holdover from the original IBM-PC design, effectively blocks their use for all but the smallest of problems. You can access extended memory using a DOS extender, a PC version of UNIX, or Windows as a DOS extender for Microsoft Fortran 5.0--or you can use libraries like X-arRAY.
This article examines X-arRAY routines for handling megabyte-sized data arrays. The X-arRAY package is a tiny (84 Kbytes) Microsoft Fortran 5.0-compatible library of subroutines that manage access to extended memory and perform mathematical operations on data stored in arrays located within either extended or conventional memory. X-arRAY is therefore a combination of an extended-memory manager and a general-purpose array-manipulation package, which sets it apart from DOS extenders.
X-arRAY Memory Management
The first call to the X-arRAY memory-management routines places the program in protected mode; the details of protected-mode operation are handled entirely by X-arRAY. Extended-memory access is either through XMS via the Microsoft HIMEM.SYS driver (preferred) or the modified LIM control block. HIMEM.SYS is standard with DOS 5.0 or Windows 3.0, making it a convenient choice. X-arRAY can use whichever memory manager is available or be forced to use a specific manager.
The extended-memory management routines (see Table 1) operate in a manner analogous to those used for C memory management: Memory blocks are requested by size, referenced through a key that serves as a pointer to the allocated memory, and freed. In contrast to memory management in C, getxtd returns both an integer*4 handle and a modified integer*4 key associated with the successfully allocated extended-memory block. The handle is used by relxtd and endxtd to free the memory allocation. The key is the absolute address of the first byte of the allocated memory block, with bits 30 and 31 set to mark it as a legitimate key referencing extended memory. All of the library routines use this key to access and manipulate extended memory. The key itself behaves like a pointer and can be conveniently manipulated by address arithmetic. The maximum allocation is 1 Gbyte, which should be enough for most applications. (If you really need huge amounts of memory, you ought to seriously consider relocating your application to a more appropriate computer.)
Table 1: X-arRAY extended-memory access routines.
Routine        Description
------------------------------------------------------------
getxtd         Allocate blocks of available extended memory
bufxtd         Allocate memory in memory-mapped hardware
inqxtd         Report status of extended memory allocations
relxtd         Free a single allocation
endxtd         Free all allocations
rzmxtd         Restore linkage to existing allocation(s)
a2axtd         Array-to-array copy
a2fxtd         Extended-memory allocation to file copy
f2axtd         File to extended-memory allocation copy
sgtrnm         Get a real*4 from extended memory
sgtcnm         Get a complex*8 from extended memory
igt[1/2/4]im   Get an integer*[1/2/4] from extended memory
sptrnm         Put a real*4 into extended memory
sptcnm         Put a complex*8 into extended memory
ipt[1/2/4]im   Put an integer*[1/2/4] into extended memory
flashr         Flash extended-memory access on console

*[1/2/4] means either 1, 2, or 4 at that position in the name, corresponding to the variable type employed.
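The key layout described above (an absolute address with bits 30 and 31 set) can be modeled with ordinary 32-bit integers. The following is only a sketch of that description, written in standard free-form Fortran rather than the Microsoft Fortran dialect of the listings. The names make_key, is_key, and key_address are hypothetical and are not X-arRAY routines, and the sketch assumes the address itself fits in the low 30 bits, which is consistent with the 1-Gbyte limit.

```fortran
! Sketch only: models the key layout described in the text.
! make_key, is_key, and key_address are hypothetical names,
! not part of the X-arRAY library.
module key_sketch
   use iso_fortran_env, only: int32
   implicit none
contains
   ! Mark an address (assumed to fit in bits 0-29) as a key
   ! by setting bits 30 and 31.
   pure function make_key(address) result(key)
      integer(int32), intent(in) :: address
      integer(int32) :: key
      key = ibset(ibset(address, 30), 31)
   end function make_key

   ! A key is recognized by having both marker bits set.
   pure function is_key(key) result(ok)
      integer(int32), intent(in) :: key
      logical :: ok
      ok = btest(key, 30) .and. btest(key, 31)
   end function is_key

   ! Clear the marker bits to recover the address.
   pure function key_address(key) result(address)
      integer(int32), intent(in) :: key
      integer(int32) :: address
      address = ibclr(ibclr(key, 30), 31)
   end function key_address
end module key_sketch
```

Under these assumptions, adding a byte offset to a key (as in key1 = key + 2048) advances the underlying address by the same amount and leaves both marker bits set, provided the resulting address stays below 1 Gbyte. This is why plain integer addition is enough for the address arithmetic used later in the article.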
Allocation size can be specified by indicating the array dimensionality, width of each dimension (passed as an array), and the size of the variable in bytes. Alternatively, you can simply specify the total number of bytes desired. For example, the two getxtd calls in Figure 1 are equivalent. Both allocate enough extended memory for a 512x512 array of real*4 variables. The actual allocated memory is structureless--that is, not associated with any array dimensionality or variable type. Structure and variable types are imposed by the manipulation routines that themselves can use either mode to address specific array elements or subarrays in the allocated block. This turns out to be very handy (and makes accessing extended memory straightforward) when retention of array addressing is important. Also, the array can be manipulated in portions using address arithmetic.
Figure 1: Equivalent calls using getxtd.
      call getxtd(0,0,1048576,0,ihandle,key,kbytes,iret,ier)

and

      iwidth(1) = 512
      iwidth(2) = 512
      call getxtd(2,iwidth,4,0,ihandle,key,kbytes,iret,ier)
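The equivalence of the two calls is simple arithmetic: the product of the dimension widths times the element size gives the byte count. A minimal sketch in standard free-form Fortran follows; block_bytes is a hypothetical helper for illustration, not an X-arRAY routine.

```fortran
! Sketch only: block_bytes is a hypothetical helper, not part of X-arRAY.
module alloc_size_sketch
   use iso_fortran_env, only: int32
   implicit none
contains
   ! Bytes needed for an array with the given dimension widths
   ! and element size in bytes.
   pure function block_bytes(widths, element_bytes) result(nbytes)
      integer(int32), intent(in) :: widths(:)
      integer(int32), intent(in) :: element_bytes
      integer(int32) :: nbytes
      nbytes = product(widths)*element_bytes
   end function block_bytes
end module alloc_size_sketch
```

For a 512x512 array of real*4 values, block_bytes([512_int32, 512_int32], 4_int32) evaluates to 1048576, the byte count passed directly in the first call of Figure 1.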
Unlike C, program termination does not automatically deallocate extended-memory blocks. In fact, allocated memory blocks persist intact, including their data, until deallocated by another program or machine reboot. Memory allocations are under the control of the XMS or LIM memory manager, which is external to the program. endxtd provides convenient end-of-program allocation cleanup and ensures that all blocks are freed; see Listing Five and Listing Six (page 114).
The persistence of extended-memory allocations beyond program termination can be used to advantage. rzmxtd reestablishes the linkage to extended memory previously allocated by an earlier program. rzmxtd uses a snapshot of the active handles and keys (provided by inqxtd) passed between the programs in a binary file. inqxtd also reports free and allocated memory, the memory manager in use, and other useful data. Although I do not have a specific example of this, I can envision a large-data/large-code Fortran program broken into smaller modules that each operate on the data passed between the modules in extended memory.
Routines are provided to shuttle data between extended memory and conventional memory, either as blocks or as individual variables (Table 1). The block-copy routine a2axtd determines the data type by its size in bytes, while the individual element routines are specific to the variable types. Routines are also provided to copy data between extended memory and binary files.
a2axtd uses extended-memory keys and/or conventional memory array names to specify source and destination, thus requiring the MS-Fortran "interface to" directive in order to pass keys by value and to properly declare real and complex arrays. The multiple contexts for a2axtd in a program that shuttles blocks of data between extended and conventional memory created a problem that was solved by interfacing a2axtd twice. The first version of a2axtd was interfaced at the top of the example for copying data from extended memory (via a key) into a real*4 array in conventional memory. The second version of a2axtd was aliased by the subroutine putback in a separate source file (see Listing Four, page 113) and interfaced to copy from a real*4 array in conventional memory to extended memory pointed to by the key. Yes, Fortran has no alias (but should), so putback merely passes its arguments through to the different versions of a2axtd. When you see putback in the examples, think a2axtd.
Finally, data stored in extended memory can be manipulated in extended memory using a number of unary and binary routines (Table 2). The routines ssmrnm (array scaling) and smprnm (element-by-element product of two arrays) are used in Listing One (page 112). Note that the element-by-element product is not the normal array (matrix) product. Each routine operates on specific variable types, currently limited to integer*1, integer*2, integer*4, real*4, and complex*8. Of interest to those who do fast Fourier transforms, for which X-arRAY is finely tuned, are access routines to handle floating-point numbers in a decimated form and to manipulate the bits of array elements. As with a2axtd, the keys must be passed by value, necessitating the use of the "interface to" directive.
Table 2: Extended-memory data-manipulation array routines.
Routine        Description
----------------------------------------------------------------------
sabcnm         Absolute value of a complex*8 array
scjcnm         Conjugate a complex*8 array
szicnm         Zero the imaginary part of a complex*8 array
szrcnm         Zero the real part of a complex*8 array
sngrnm         Negate a real*4 array
sngcnm         Negate a complex*8 array
ssmrnm         Scalar multiply a real*4 array
ssmcnm         Scalar multiply a complex*8 array
ism[1/2/4]sm   Scalar multiply a signed integer*[1/2/4] array
ism[1/2/4]um   Scalar multiply an unsigned integer*[1/2/4] array
imn[1/2/4]sm   Location and value of min element of signed integer*[1/2/4] array
imn[1/2/4]um   Location and value of min element of unsigned integer*[1/2/4] array
imx[1/2/4]sm   Location and value of max element of signed integer*[1/2/4] array
imx[1/2/4]um   Location and value of max element of unsigned integer*[1/2/4] array
sadrnm         Element-by-element sum of real*4 arrays
sadcnm         Element-by-element sum of complex*8 arrays
iad[1/2/4]im   Element-by-element sum of integer*[1/2/4] arrays
smprnm         Element-by-element product of real*4 arrays
smpcnm         Element-by-element product of complex*8 arrays
imp[1/2/4]sm   Element-by-element product of signed integer*[1/2/4] arrays
imp[1/2/4]um   Element-by-element product of unsigned integer*[1/2/4] arrays
ssbrnm         Element-by-element difference of real*4 arrays
ssbcnm         Element-by-element difference of complex*8 arrays
isb[1/2/4]im   Element-by-element difference of integer*[1/2/4] arrays
sflcnmp        Product of dissimilar complex*8 arrays
iln[1/2/4]sm   Arbitrary linear combination of signed integer*[1/2/4] arrays
iln[1/2/4]um   Arbitrary linear combination of unsigned integer*[1/2/4] arrays

*[1/2/4] means either 1, 2, or 4 at that position in the name, corresponding to the variable type employed.
Extended-memory Strategy
X-arRAY arrays located in extended memory are not arrays from a conventional Fortran-array point of view. The elements are stored in extended memory structured like an array, but cannot be manipulated except through the supplied access routines. One approach might be to replace all array element references with sgtrnm and sptrnm calls in your algorithm to shuttle element values into conventional memory for processing. Although this preserves algorithm structure, data stored in multidimensional arrays is generally accessed by nested loops, in which array-element access occurs in the innermost loop, and large arrays (the reason for using extended memory) will often have many iterations. The overhead of the repeated sgtrnm or sptrnm calls is cumulative, and its effect on performance is lethal.
A better strategy is to move blocks of array elements between extended and conventional memory. This dramatically diminishes the overhead, even though each block move done with a2axtd takes longer to complete than a single-element access. Because Fortran stores data in column-major order, the ideal unit of movement is a column vector. A 512x512 array in extended memory is read into conventional memory with 512 calls to a2axtd, each moving the nth column vector (,n) of 512 elements, rather than 262,144 calls to sgtrnm. The temporary array receiving the column vector is small enough not to tax the available conventional memory, but the use of a temporary array and pieces of the total array will force an algorithm change that might have to be made anyway for data arrays exceeding the size of conventional memory. Vector supercomputers use this same scheme to boost performance, the difference being that column-vector movement is from conventional memory into an array of special CPU registers. The savings, however, still accrue from moving groups rather than individual elements.
The block-move strategy is implemented smoothly using the X-arRAY primitives. The 2-D summation in Figure 2(a) becomes that shown in Figure 2(b). The extended memory can be conveniently and temporarily redimensioned from the viewpoint of a2axtd to access 1-D arrays of 512 real*4 elements. The address arithmetic is analogous to that routinely done in C--key1 points to the start of the next column vector to be accessed by the loop. This is perfectly legal as long as key1 points to a legitimate extended-memory allocation and the requested block resides within the allocation; otherwise, a2axtd reports an error.
Figure 2: (a) Summing a two-dimensional array; (b) using the block-move strategy to sum a two-dimensional array.
(a)
      sum = 0.0
      do i = 1,512
         do j = 1,512
            sum = sum + arr(i,j)
         enddo
      enddo

(b)
      iwidth(1) = 512
      iwidth(2) = 512                ! declared as a 2D array
      call getxtd(2,iwidth,4,0,ihandle,key,kbret,iret,ier)
         :
      sum = 0.0
      key1 = key                     ! used for address arithmetic
      ichunk = 4 * 512               ! size of 512 real*4 elements
      do i = 1,512                   ! loop over column vectors
         call a2axtd(1,512,4,key1,temp,iret,ier)  ! bring in as 1D
         do j = 1,512                ! loop down temp array doing sum
            sum = sum + temp(j)
         enddo
         key1 = key1 + ichunk        ! advance to the next column vec
      enddo
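The structure of Figure 2(b) can be tried without the library. The sketch below, in standard free-form Fortran, uses a flat conventional-memory array (xmem) to stand in for the extended-memory allocation and a subroutine (fetch_block) to stand in for a2axtd. Both names are assumptions made for illustration, and a 1-based element offset replaces the byte-addressed key.

```fortran
! Sketch only: xmem stands in for an extended-memory allocation and
! fetch_block stands in for a2axtd. Neither is an X-arRAY name.
module block_move_sketch
   implicit none
   integer, parameter :: n = 512
   real, save :: xmem(n*n)            ! pretend extended memory, column-major
   integer, save :: fetch_calls = 0   ! counts simulated block moves
contains
   ! Copy "count" contiguous elements starting at 1-based offset "first".
   subroutine fetch_block(first, count, dest)
      integer, intent(in) :: first, count
      real, intent(out) :: dest(count)
      dest = xmem(first:first + count - 1)
      fetch_calls = fetch_calls + 1
   end subroutine fetch_block

   ! Sum the n-by-n array one column vector at a time.
   function column_vector_sum() result(total)
      real :: total
      real :: temp(n)
      integer :: first, i, j
      total = 0.0
      first = 1                       ! plays the role of key1
      do i = 1, n                     ! loop over column vectors
         call fetch_block(first, n, temp)
         do j = 1, n                  ! sum down the temporary array
            total = total + temp(j)
         end do
         first = first + n            ! advance to the next column vector
      end do
   end function column_vector_sum
end module block_move_sketch
```

With xmem filled with 1.0, column_vector_sum() returns 262144.0 after 512 block moves; fetching the same data one element at a time would take 262,144 calls.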
Listing Two (page 112) tests this by performing the same summation twice, first by column-vector moves and second by individual-element accesses. The results are dramatic. The column-vector pass processes the 1-Mbyte array in 3.16 seconds and produces sum=3.436025E+10. The individual-element pass, done in row order so that the second index is associated with the inner loop and accesses are to noncontiguous array elements, requires 126.4 seconds and produces sum=3.434290E+10. These results are from a 16-MHz 386/387SX computer. Clearly, the column-vector approach works well with only a small restructuring of the algorithm.
The different sums produced are normal for floating-point calculations, but are also a concern. The difference is due to different cumulative round-off errors that are the result of elements being summed in a different order. Reverse the indexes in Listing Two into column order for the individual-element summation and it gives an answer identical to the column-vector version. Note that we are not talking about a correct or pure answer; the reality of floating-point calculations is that they have an unavoidable round-off error that manifests differently, depending on the order of calculations. If you need the same answer independent of method, be sure to process the array elements in column order to produce the same round-off error. Column ordering in arrays is, in my opinion, a flaw in Fortran (or the teaching of Fortran) because most programmers write multidimensional arrays with the index order following loop nesting; see Figure 3.
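The order dependence can be reproduced in conventional memory alone. The following sketch, in standard free-form Fortran, loads the same values as Listing Two and sums them in single precision in column order and in row order, with a double-precision total for reference. The module and function names are assumptions for illustration, and no X-arRAY calls are involved.

```fortran
! Sketch only: ordinary arrays, no X-arRAY calls.
! Module and function names are illustrative assumptions.
module sum_order_sketch
   use iso_fortran_env, only: real32, real64
   implicit none
   integer, parameter :: n = 512
contains
   ! Load the same values Listing Two uses: a(k,j) = k + n*(j-1).
   subroutine fill(a)
      real(real32), intent(out) :: a(n, n)
      integer :: j, k
      do j = 1, n
         do k = 1, n
            a(k, j) = real(k + n*(j - 1), real32)
         end do
      end do
   end subroutine fill

   ! Column order: the first index varies fastest (contiguous elements).
   function sum_column_order(a) result(s)
      real(real32), intent(in) :: a(n, n)
      real(real32) :: s
      integer :: i, j
      s = 0.0_real32
      do j = 1, n
         do i = 1, n
            s = s + a(i, j)
         end do
      end do
   end function sum_column_order

   ! Row order: the second index varies fastest (stride of n elements).
   function sum_row_order(a) result(s)
      real(real32), intent(in) :: a(n, n)
      real(real32) :: s
      integer :: i, j
      s = 0.0_real32
      do i = 1, n
         do j = 1, n
            s = s + a(i, j)
         end do
      end do
   end function sum_row_order

   ! Double-precision reference; exact for these integer-valued data.
   function sum_reference(a) result(s)
      real(real32), intent(in) :: a(n, n)
      real(real64) :: s
      integer :: i, j
      s = 0.0_real64
      do j = 1, n
         do i = 1, n
            s = s + real(a(i, j), real64)
         end do
      end do
   end function sum_reference
end module sum_order_sketch
```

After call fill(a), sum_reference(a) is 34359869440. The two single-precision totals will typically differ from that value and from each other; the exact digits depend on the compiler and the floating-point hardware, so they need not match the figures reported above.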
Figure 3: A multidimensional array with the index order following loop nesting.
      do outer = 1,n
         do inner = 1,n
            sum = sum + a(outer,inner)
         enddo
      enddo
The loop in Figure 3 does not access array elements in the order in which they are stored contiguously in memory. For maximum performance, array indexing should be a(inner,outer): the inner loop then references contiguous array elements stored in memory, and the outer loop references the column vector. This facilitates easy conversion to the column-vector transfer strategy discussed here. It also makes vectorization and parallelization possible, but that is a story for another day.
A triply nested lower triangular array (see Listing Three, page 113) in which the inner-loop bounds depend on the current value of an outer-loop index presents a challenge. Although only one array is used, two column vectors are manipulated, and the number of elements used in the column vector varies. The strategy is similar to that in Listing Two. Two column vectors (,k) and (,j) must be moved into their corresponding temporary arrays and processed. Then the (,j) column vector is put back into its original place in the array in extended memory. This is shown schematically in Figure 4 and completely in Listing Three.
Figure 4: Two column vectors (,k) and (,j) must be moved into their corresponding temporary arrays and processed. Then the (,j) column vector is put back into its original place in the array in extended memory.
      keyj = key
      do j = 1,512
         ! get column vector (,j) pointed to by keyj into arrj()
         keyk = key        ! restart at the first column of the same array
         do k = 1,j-1
            ! get column vector (,k) pointed to by keyk into arrk()
            do i = k+1,512
               arrj(i) = arrj(i) + arrk(i)*arrj(k)
            enddo
            ! increment keyk to next column vector
         enddo
         ! put arrj() back into extended memory pointed to by keyj
         ! increment keyj to next column vector
      enddo
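The schematic in Figure 4 can be checked against the original triply nested loop on a small array held entirely in conventional memory. In the sketch below, written in standard free-form Fortran, a flat array (xmem) again stands in for the extended-memory allocation and whole-column array copies stand in for the a2axtd and putback calls. The subroutine names and the element offsets offj and offk are assumptions for illustration.

```fortran
! Sketch only: xmem stands in for extended memory; the column copies
! stand in for a2axtd/putback. Names are illustrative assumptions.
module triangular_sketch
   implicit none
contains
   ! Reference: the original triply nested loop on an ordinary 2-D array.
   subroutine update_direct(arr, n)
      integer, intent(in) :: n
      real, intent(inout) :: arr(n, n)
      integer :: i, j, k
      do j = 1, n
         do k = 1, j - 1
            do i = k + 1, n
               arr(i, j) = arr(i, j) + arr(i, k)*arr(k, j)
            end do
         end do
      end do
   end subroutine update_direct

   ! Same update, but xmem is touched only through whole-column copies.
   subroutine update_by_columns(xmem, n)
      integer, intent(in) :: n
      real, intent(inout) :: xmem(n*n)
      real :: arrj(n), arrk(n)
      integer :: i, j, k, offj, offk
      offj = 0                               ! plays the role of keyj
      do j = 1, n
         arrj = xmem(offj + 1:offj + n)      ! get column vector (,j)
         offk = 0                            ! plays the role of keyk
         do k = 1, j - 1
            arrk = xmem(offk + 1:offk + n)   ! get column vector (,k)
            do i = k + 1, n
               arrj(i) = arrj(i) + arrk(i)*arrj(k)
            end do
            offk = offk + n                  ! next column vector
         end do
         xmem(offj + 1:offj + n) = arrj      ! put column vector (,j) back
         offj = offj + n                     ! next column vector
      end do
   end subroutine update_by_columns
end module triangular_sketch
```

Running both routines on the same starting data (for example, an 8x8 array copied into xmem with reshape) should give matching results to within round-off, which is a quick way to confirm that the restructuring has not changed the algorithm.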
The address arithmetic is kept simple by copying entire column vectors, even though only part of a vector may be used for any given iteration. Improved performance might be eked out by moving only the required portion of the column vector, but at the price of more overhead from the additional address arithmetic. Listing Three runs as expected, steadily slowing as the simulation proceeds, but still completing within 13 minutes. Note that the basic algorithm structure was not mangled beyond recognition.
The shuttling of array blocks into conventional memory for processing breaks down when the algorithm is fatally row oriented, as in the case of an array inversion using Gaussian elimination. I was interested in a megabyte-sized array-inversion routine for reconstructing 2-D stereo graphics projections into 3-D, as an example. The inverter I created was sadly too slow, due to the large amount of single-element shuttling to and from extended memory. The basic algorithm also became unrecognizable. When this happens, the best bet is to use a DOS extender, in my case, the Windows version of Microsoft Fortran 5.1; X-arRAY manipulation of extended memory should be targeted at contiguous array elements for the best performance, as demonstrated in Listing Two.
Manipulating Data in Extended Memory
Clearly, shuttling portions of a megabyte-sized array in and out of conventional memory for processing is feasible, even efficient. It is, however, far more desirable to manipulate the data directly in extended memory wherever possible. Consider a case in which a megabyte-sized array is duplicated in extended memory, all members of the duplicate array are multiplied by a scale factor, and then the two arrays are multiplied element-by-element with the results placed into a third array. This was done in Listing One with the added wrinkle that the array copy was done by copying the source array from extended memory directly to a binary file, and then reading the file directly into the newly allocated destination in extended memory. I also used inqxtd to assess available extended memory and determine which extended-memory manager was active at the start of the example. All phases of the resulting program were quick: one to three seconds, even on my relatively slow 386SX.
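The arithmetic of that example is easy to verify with ordinary arrays. The sketch below, in standard free-form Fortran, uses whole-array operations as stand-ins for the ssmrnm and smprnm steps; scale_in_place, elementwise_product, and listing_one_check are hypothetical names, and the file copy of Listing One is replaced by a plain assignment.

```fortran
! Sketch only: conventional-memory stand-ins for the steps of Listing One.
! The routine names are illustrative assumptions, not X-arRAY calls.
module xmem_ops_sketch
   implicit none
contains
   ! Stand-in for the ssmrnm step: scale every element in place.
   subroutine scale_in_place(b, factor)
      real, intent(inout) :: b(:)
      real, intent(in) :: factor
      b = factor*b
   end subroutine scale_in_place

   ! Stand-in for the smprnm step: c(i) = a(i)*b(i), element by element.
   subroutine elementwise_product(a, b, c)
      real, intent(in) :: a(:), b(:)
      real, intent(out) :: c(:)
      c = a*b
   end subroutine elementwise_product

   ! Repeat the arithmetic of Listing One for an n-by-n array of 1.0.
   function listing_one_check(n) result(total)
      integer, intent(in) :: n
      real :: total
      real, allocatable :: a(:), b(:), c(:)
      allocate(a(n*n), b(n*n), c(n*n))
      a = 1.0                          ! load array a
      b = a                            ! copy a into b (via a file in the listing)
      call scale_in_place(b, 5.0)      ! scale b by 5.0
      call elementwise_product(a, b, c)
      total = sum(c)                   ! sum all members of c
   end function listing_one_check
end module xmem_ops_sketch
```

For n = 512, listing_one_check returns 1310720.0 (512 x 512 x 5.0), the value Listing One prints as the correct sum.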
Conclusion
Frankly, the ability to access extended memory from within a DOS program free of DOS extenders was refreshing. Compared to the Windows extensions to Microsoft Fortran, X-arRAY addresses more extended memory, memory can be managed in a manner familiar to C programmers, and the resulting programs run faster and are independent of Windows. I liked the performance delivered by X-arRAY even though effort was required to incorporate the extended-memory routines into programs. That effort will often lead to optimizations that might otherwise be overlooked. What I would like to see in future versions of the X-arRAY library is an expanded list of array primitives such as a true-array product, determinant, array inverter, swap elements or columns, and fill with value; all of course supporting all Fortran data types. I would even like to see this functionality in a C-language library.
Incorporation of X-arRAY into applications will depend on the application. I have found that programs ultimately intended for UNIX computers can be successfully developed and tested with their full-sized (multimegabyte) arrays using the Windows version of Microsoft Fortran. Performance is not great, but that is not the point of cross-platform program development. On the other hand, a large-memory, array-based Fortran application undergoing a one-way port onto a DOS-based PC will benefit from incorporation of X-arRAY routines.
Products Mentioned
X-arRAY 1.0 Release 2
Davis Associates Inc.
43 Holden Road
West Newton, MA 02165
617-244-1450
$99.00
Minimum requirements: 80386 with 387 math coprocessor; MS-DOS 2.0 or higher; Microsoft Fortran 5.0
_ACCESSING LARGE ARRAYS WITH X-ARRAY_
by Barr E. Bauer

[LISTING ONE]
* Extended memory manipulation using X-arRAY Fortran Library.
* Does the following:
*   1. allocates a 1 Mbyte real*4 array a(512,512);
*   2. loads array a with real*4 values;
*   3. saves the data in array a to disk;
*   4. allocates two 1 Mbyte real*4 arrays b and c;
*   5. loads data from file (step 3) into array b;
*   6. scales all members of array b by 5.0;
*   7. does an element-by-element array multiplication of arrays a and b,
*      results into array c;
*   8. sums all members of array c, reports results.
* Compile with Microsoft Fortran 5.1 using:
*   fl /FPi87 /G2 example1.for putback.for bagit.for /link xarray
* B. E. Bauer 3/20/92

      interface to subroutine a2axtd(i1,i2,i3,i4[VALUE],r1,i5,i6)
      integer*4 i1,i2,i3,i4,i5
      integer*2 i6
      real*4 r1
      end

      interface to subroutine sgtrnm(i1,i2,i3[VALUE],i4,r1,i5)
      integer*4 i1,i2,i3,i4
      integer*2 i5
      real*4 r1
      end

      interface to subroutine sptrnm(i1,i2,i3[VALUE],i4,r1,i5)
      integer*4 i1,i2,i3,i4
      integer*2 i5
      real*4 r1
      end

      interface to subroutine smprnm(i1,i2,i3[VALUE],i4[VALUE],
     +                               i5[VALUE],i6)
      integer*4 i1,i2,i3,i4,i5
      integer*2 i6
      end

      interface to subroutine ssmrnm(i1,i2,i3[VALUE],r1,i4)
      integer*4 i1,i2,i3
      real*4 r1
      integer*2 i4
      end

      include 'bagit.inc'           ! error codes and other symbols

      integer*4 kb_total, kb_unallocated, number_allocations
      integer*4 memory_manager, required_memory, shortage
      integer*4 handle_array(1), key_array(1)
      integer*4 ARRAY_SIZE(ARRAY_DIM), allocated_array(1)
      integer*4 handle, key, key1, kb_allocated
      integer*4 bytes_moved, increment
      integer*4 keyb, keyc, handleb, handlec
      real*4 temp, a(SIZE)
      integer*2 return_status, eflag
      character*13 tempfile

      data tempfile /'tempfile.dat'C/          ! C string format
      data ARRAY_SIZE / SIZE, SIZE /

* enable extended memory routine flashing
      call flashr(ON,LOWER_RIGHT,eflag)
      if (eflag .ne. 0) call bagit(FLASHR_ERROR)

      required_memory = 3*SIZE*SIZE*REAL4/1024      ! need 3 Mbytes

* determine status of extended memory
      call inqxtd(kb_total, kb_unallocated, number_allocations,
     +            memory_manager, handle_array, key_array,
     +            allocated_array, return_status, eflag)
      if (eflag .ne. 0) call bagit(INQXTD_ERROR)

      if ((memory_manager .eq. 0) .or.
     +    (memory_manager .gt. 2)) then
         call bagit(WRONG_MMANAGER)
      else if (memory_manager .eq. 1) then
         print *,'XMS in use'
      else
         print *,'Modified LIM in use'
      endif

      print *,'Extended memory available ',kb_unallocated,' kb'
      if (kb_unallocated .lt. required_memory) then
         shortage = required_memory - kb_unallocated
         print *,'insufficient memory, need',shortage,'kb'
         call bagit(STOPPING)
      endif

* enough memory present, allocate memory for 1st array
      print *,'just ahead of memory allocation'
      ! allocate a 2D array of real*4 dimensioned 512 by 512
      call getxtd(ARRAY_DIM,ARRAY_SIZE,REAL4,XMS,handle,key,
     1            kb_allocated,return_status, eflag)
      if (eflag .ne. 0) call bagit(GETXTD_ERROR)

* load extended memory array (X,Y) with 1.0 using column vector approach
      print *,'at loading stage'
      key1 = key
      temp = 0.0
      increment = SIZE*REAL4
      do j = 1,SIZE
         do k = 1,SIZE
            a(k) = 1.0              ! fills the 1D array with values
         enddo
         ! move the 1D into extended memory by columns
         ! putback is a2axtd interfaced for
         ! conventional -> extended memory transfers
         call putback(1,SIZE,REAL4,a,key1,bytes_moved,eflag)
         if (eflag .ne. 0) call bagit(PUTBACK_ERROR)
         if (bytes_moved .ne. increment) then
            call bagit(PUTBACK_BADCNT)
         endif
         key1 = key1 + increment
      enddo

* save a copy of this array to disk
      print *,'saving array to file'
      call a2fxtd(ARRAY_DIM,ARRAY_SIZE,REAL4,tempfile,key,
     +            ibytes_moved,eflag)
      if (ibytes_moved.ne.SIZE*SIZE*REAL4) then
         call bagit(A2FXTD_BADCNT)
      endif
      if (eflag.ne.0) call bagit(A2FXTD_ERROR)

* allocate extended memory for arrays b and c
      call getxtd(ARRAY_DIM,ARRAY_SIZE,REAL4,XMS,handleb,keyb,
     +            kb_allocated,return_status, eflag)
      if (eflag .ne. 0) call bagit(GETXTD_ERROR)
      call getxtd(ARRAY_DIM,ARRAY_SIZE,REAL4,XMS,handlec,keyc,
     +            kb_allocated,return_status, eflag)
      if (eflag .ne. 0) call bagit(GETXTD_ERROR)

* read file into extended memory for array b
      print *,'reading tempfile'
      call f2axtd(ARRAY_DIM,ARRAY_SIZE,REAL4,tempfile,keyb,
     1            ibytes_moved,eflag)
      if (eflag.ne.0) call bagit(F2AXTD_ERROR)
      if (ibytes_moved.ne.SIZE*SIZE*REAL4) then
         call bagit(F2AXTD_BADCNT)
      endif

* scale array b by 5.0
      print *,'scaling array b elements by 5.0'
      call ssmrnm(ARRAY_DIM,ARRAY_SIZE,keyb,5.0,eflag)
      if (eflag.ne.0) call bagit(SSMRNM_ERROR)

* element-by-element mult of a and b, results to c
      print *,'ahead of array multiplication'
      call smprnm(2,ARRAY_SIZE,key,keyb,keyc,eflag)
      if (eflag .ne. 0) call bagit(SMPRNM_ERROR)

* sum all elements of array c to check results by using column vectors to
* bring data from extended into conventional memory, where sum is performed.
      key1 = keyc
      temp = 0.0
      increment = SIZE*REAL4
      do j = 1,SIZE
         call a2axtd(1,SIZE,REAL4,key1,a,bytes_moved,eflag)
         if (eflag.ne.0) call bagit(A2AXTD_ERROR)
         if (bytes_moved.ne.increment) call bagit(A2AXTD_BADCNT)
         do i=1,SIZE
            temp = temp + a(i)
         enddo
         key1 = key1 + increment    ! advance to next column vector
      enddo
      print *,'done, sum = ',temp,' (correct = 1310720.000000)'

* done, remove all allocations through ENDXTD in bagit
      call bagit(DONE)
      stop
      end

[LISTING TWO]
* Performs a sum reduction first using column vector moves then individual
* element accesses
* Compile with Microsoft Fortran 5.1
*   fl /FPi87 /G2 example1.for putback.for bagit.for /link xarray
* B. E. Bauer 3/20/92
*
      interface to subroutine a2axtd(i1,i2,i3,i4[VALUE],r1,i5,i6)
      integer*4 i1,i2,i3,i4,i5
      integer*2 i6
      real*4 r1
      end

      interface to subroutine sgtrnm(i1,i2,i3[VALUE],i4,r1,i5)
      integer*4 i1,i2,i3,i4
      integer*2 i5
      real*4 r1
      end

      interface to subroutine sptrnm(i1,i2,i3[VALUE],i4,r1,i5)
      integer*4 i1,i2,i3,i4
      integer*2 i5
      real*4 r1
      end

      include 'bagit.inc'

      integer*4 kb_total, kb_unallocated, number_allocations
      integer*4 memory_manager, required_memory, shortage
      integer*4 handle_array(1), key_array(1), allocated_array(1)
      integer*4 ARRAY_SIZE(2)
      integer*4 handle, key, key1, kb_allocated, increment
      integer*4 bytes_moved, index(2), keyj
      integer*4 chunk               ! keeps key arithmetic in integers
      real*4 temp, a(SIZE), arrj(SIZE)
      integer*2 return_status, eflag

      data ARRAY_SIZE / SIZE, SIZE /      ! 2D 512x512 array used

* enable console flashing when extended memory is accessed
      call flashr(1,3,eflag)
      if (eflag .ne. 0) call bagit(FLASHR_ERROR)

      required_memory = SIZE*SIZE*REAL4/1024

* check for adequate XMS memory, quit if inadequate
      call inqxtd(kb_total, kb_unallocated, number_allocations,
     +            memory_manager, handle_array, key_array,
     +            allocated_array, return_status, eflag)
      if (eflag.ne.0) call bagit(INQXTD_ERROR)
      if (required_memory .gt. kb_unallocated) call bagit(NOT_ENOUGH)

* allocate a 512 by 512 array of real*4
      print *,'just ahead of memory allocation'
      call getxtd(2,ARRAY_SIZE,REAL4,XMS,handle,key,
     1            kb_allocated,return_status, eflag)
      if (eflag .ne. 0) call bagit(GETXTD_ERROR)

* load extended memory array (X,Y) using column vectors
      print *,'at loading stage'
      key1 = key
      temp = 0.0
      increment = SIZE*REAL4
      do j = 1,SIZE
         do k = 1,SIZE
            a(k) = float(k) + float(SIZE*(j-1))
         enddo
         call putback(1,SIZE,REAL4,a,key1,bytes_moved,eflag)
         if (eflag .ne. 0) call bagit(PUTBACK_ERROR)
         if (bytes_moved .ne. increment) then
            call bagit(PUTBACK_BADCNT)
         endif
         key1 = key1 + increment
      enddo

* column vector summation
      print *,'start column vector sum reduction'
      sum_col = 0.0
      chunk = SIZE*REAL4
      do j=1,SIZE
         keyj = key + chunk*(j-1)          ! address arithmetic
         ! put (,j) into arrj
         call a2axtd(1,SIZE,REAL4,keyj,arrj,bytes_moved,eflag)
         if (eflag.ne.0) call bagit(A2AXTD_ERROR)
         if (bytes_moved.ne.chunk) call bagit(A2AXTD_BADCNT)
         do k=1,SIZE                       ! process the column vector
            sum_col = sum_col + arrj(k)
         enddo
      enddo
      print *,'done with column vector sum reduction'

* individual element access
      print *,'start individual access sum reduction'
      sum_ind = 0.0
      do i=1,SIZE
         do j=1,SIZE
            index(1)=i                     ! row of element
            index(2)=j                     ! column of element
            ! get the element into retval
            call sgtrnm(2,ARRAY_SIZE,key,index,retval,eflag)
            if (eflag.ne.0) call bagit(SGTRNM_ERROR)
            sum_ind = sum_ind + retval
         enddo
      enddo
      print *,'done with individual access sum reduction'
      print *,'column sum =',sum_col,', individual sum =',sum_ind

      call bagit(DONE)
      stop
      end

[LISTING THREE]
* Triangular array manipulation of a single 1 Mbyte real*4 array arr(512,512)
* using X-arRAY routines
* Does the following:
*   do j=1,512
*     do k = 1, j-1
*       do i = k+1, 512
*         arr(i,j) = arr(i,j) + arr(i,k) * arr(k,j)
*       enddo
*     enddo
*   enddo
* Compile in Microsoft Fortran 5.1 using:
*   fl /FPi87 /G2 example2.for putback.for bagit.for /link xarray
* B. E. Bauer 3/20/92
*
      interface to subroutine a2axtd(i1,i2,i3,i4[VALUE],r1,i5,i6)
      integer*4 i1,i2,i3,i4,i5
      integer*2 i6
      real*4 r1
      end

      interface to subroutine sgtrnm(i1,i2,i3[VALUE],i4,r1,i5)
      integer*4 i1,i2,i3,i4
      integer*2 i5
      real*4 r1
      end

      interface to subroutine sptrnm(i1,i2,i3[VALUE],i4,r1,i5)
      integer*4 i1,i2,i3,i4
      integer*2 i5
      real*4 r1
      end

      include 'bagit.inc'

      integer*4 kb_total, kb_unallocated, number_allocations
      integer*4 memory_manager, required_memory
      integer*4 handle_array(1), key_array(1), allocated_array(1)
      integer*4 ARRAY_SIZE(ARRAY_DIM)
      integer*4 handle, key, key1, kb_allocated, increment
      integer*4 bytes_moved, index(2), keyj, keyk
      integer*4 chunk               ! keeps key arithmetic in integers
      real*4 temp, a(SIZE), arrj(SIZE), arrk(SIZE)
      integer*2 return_status, eflag

      data ARRAY_SIZE / SIZE, SIZE /

      call flashr(ON,LOWER_RIGHT,eflag)

      required_memory = SIZE*SIZE*REAL4/1024
      call inqxtd(kb_total, kb_unallocated, number_allocations,
     +            memory_manager, handle_array, key_array,
     +            allocated_array, return_status, eflag)
      if (eflag.ne.0) call bagit(INQXTD_ERROR)
      if (kb_unallocated .lt. required_memory) then
         call bagit(NOT_ENOUGH)
      endif

* allocate 1 Mbyte of extended memory
      print *,'just ahead of memory allocation'
      call getxtd(ARRAY_DIM,ARRAY_SIZE,REAL4,XMS,handle,key,
     +            kb_allocated,return_status, eflag)
      if (eflag .ne. 0) call bagit(GETXTD_ERROR)

      print *,'loading extended memory'
      key1 = key
      temp = 0.0
      increment = SIZE*REAL4
      do j = 1,SIZE
         do k = 1,SIZE
            a(k) = 0.00025
         enddo
         call putback(1,SIZE,REAL4,a,key1,bytes_moved,eflag)
         if (eflag .ne. 0) call bagit(PUTBACK_ERROR)
         if (bytes_moved .ne. increment) call bagit(PUTBACK_BADCNT)
         key1 = key1 + increment
      enddo

* process triangular array
      print *,'processing triangular array'
      keyj = key
      keyk = key
      chunk = SIZE*REAL4
      do j=1,SIZE
         print *,'outer loop j = ',j
         ! get arr(x,j) from extended into arrj(x)
         call a2axtd(1,SIZE,REAL4,keyj,arrj,bytes_moved,eflag)
         if (eflag.ne.0) call bagit(A2AXTD_ERROR)
         if (bytes_moved.ne.chunk) call bagit(A2AXTD_BADCNT)
         do k=1,j-1
            keyk = key + (k-1)*chunk
            ! get arr(x,k) from extended into arrk(x)
            call a2axtd(1,SIZE,REAL4,keyk,arrk,bytes_moved,eflag)
            if (eflag.ne.0) call bagit(A2AXTD_ERROR)
            if (bytes_moved.ne.chunk) call bagit(A2AXTD_BADCNT)
            ! do the manipulation
            do i=k+1,SIZE
               arrj(i) = arrj(i) + arrk(i)*arrj(k)
            enddo
         enddo
         ! put arrj(x) back to extended memory
         call putback(1,SIZE,REAL4,arrj,keyj,bytes_moved,eflag)
         if (eflag.ne.0) call bagit(PUTBACK_ERROR)
         if (bytes_moved.ne.chunk) call bagit(PUTBACK_BADCNT)
         keyj = keyj + chunk
      enddo

* sample selected members of the array in extended memory
      do i=1,SIZE,125
         do j=1,SIZE,125
            index(1)=i
            index(2)=j
            call sgtrnm(ARRAY_DIM,ARRAY_SIZE,key,index,retval,eflag)
            if (eflag.ne.0) call bagit(SGTRNM_ERROR)
            print *,i,j,retval
         enddo
      enddo

      call bagit(DONE)
      stop
      end

[LISTING FOUR]
* putback.for--interface a2axtd for conventional to extended memory block moves
* B. E. Bauer 3/20/92
*
      interface to subroutine a2axtd(i1,i2,i3,r1,i4[VALUE],i5,i6)
      integer*4 i1,i2,i3,i4,i5
      integer*2 i6
      real*4 r1
      end

      subroutine putback(i1,i2,i3,r1,i4,i5,i6)
      integer*4 i1, i2, i3, i4, i5
      real*4 r1(*)
      integer*2 i6

      call a2axtd(i1,i2,i3,r1,i4,i5,i6)
      return
      end

[LISTING FIVE]
* bagit.inc--symbols and declarations used for error handling and the examples.
* B. E. Bauer 3/20/92
*
      integer*4 INQXTD_ERROR,WRONG_MMANAGER,STOPPING,GETXTD_ERROR
      integer*4 PUTBACK_ERROR,PUTBACK_BADCNT,A2AXTD_BADCNT
      integer*4 A2AXTD_ERROR,A2FXTD_BADCNT,A2FXTD_ERROR
      integer*4 F2AXTD_ERROR,F2AXTD_BADCNT,SSMRNM_ERROR
      integer*4 SMPRNM_ERROR,NOT_ENOUGH,SGTRNM_ERROR
      integer*4 FLASHR_ERROR,DONE
      integer*4 ARRAY_DIM,REAL4,XMS,SIZE,ON,LOWER_RIGHT

* error codes: each value must be unique because bagit.for uses them
* as select case labels
      parameter (INQXTD_ERROR=1)
      parameter (WRONG_MMANAGER=2)
      parameter (STOPPING=3)
      parameter (GETXTD_ERROR=4)
      parameter (PUTBACK_ERROR=5)
      parameter (PUTBACK_BADCNT=6)
      parameter (A2AXTD_BADCNT=7)
      parameter (A2AXTD_ERROR=8)
      parameter (A2FXTD_BADCNT=9)
      parameter (A2FXTD_ERROR=10)
      parameter (F2AXTD_ERROR=11)
      parameter (F2AXTD_BADCNT=12)
      parameter (SSMRNM_ERROR=13)
      parameter (SMPRNM_ERROR=14)
      parameter (NOT_ENOUGH=15)
      parameter (SGTRNM_ERROR=16)
      parameter (FLASHR_ERROR=17)
      parameter (DONE=99)

      parameter (ARRAY_DIM = 2)       ! 2D array
      parameter (REAL4 = 4)           ! size of real*4
      parameter (XMS = -1)            ! use available mmanager
      parameter (SIZE = 512)          ! size of array
      parameter (ON = 1)              ! convenient symbol
      parameter (LOWER_RIGHT = 3)     ! where flashr flashes

[LISTING SIX]
* bagit.for--error handler. Prints an appropriate message then calls endxtd
* to ensure allocations are freed.
* B. E. Bauer 3/20/92
*
      subroutine bagit(iflag)
      integer*4 iflag
      integer*2 return_status, eflag
      include 'bagit.inc'

      select case (iflag)
      case (INQXTD_ERROR)
         print *,'error reported by inqxtd'
      case (WRONG_MMANAGER)
         print *,'XMS or Modified LIM memory manager not found'
      case (STOPPING)
         print *,'stopping...'
      case (GETXTD_ERROR)
         print *,'error reported by getxtd'
      case (PUTBACK_ERROR)
         print *,'error in putback(a2axtd)'
      case (PUTBACK_BADCNT)
         print *,'wrong number of bytes moved by putback(a2axtd)'
      case (A2AXTD_BADCNT)
         print *,'wrong number of bytes moved by a2axtd'
      case (A2AXTD_ERROR)
         print *,'error in a2axtd'
      case (A2FXTD_BADCNT)
         print *,'wrong number of bytes moved by a2fxtd'
      case (A2FXTD_ERROR)
         print *,'error in a2fxtd'
      case (F2AXTD_ERROR)
         print *,'error in f2axtd'
      case (F2AXTD_BADCNT)
         print *,'wrong number of bytes moved by f2axtd'
      case (SSMRNM_ERROR)
         print *,'error in ssmrnm (scalar multiply)'
      case (SMPRNM_ERROR)
         print *,'error in smprnm (el-by-el multiply)'
      case (NOT_ENOUGH)
         print *,'inadequate extended memory available'
      case (SGTRNM_ERROR)
         print *,'error in sgtrnm (real*4 get)'
      case (FLASHR_ERROR)
         print *,'error in flashr'
      case (DONE)
         print *,'freeing extended memory'
      end select

      call endxtd(return_status, eflag)
      stop 'done, exiting...'
      end
Copyright © 1992, Dr. Dobb's Journal