Inserting Elements Into a Basic Integer Array

Mixing assembly language with Basic yields some stunning performance improvements. And, as Bruce explains, it's not even that hard to do.


November 01, 1988
URL:http://www.drdobbs.com/parallel/inserting-elements-into-a-basic-integer/184408027

NOV88: INSERTING ELEMENTS INTO A BASIC INTEGER ARRAY

Bruce Tonkin develops and sells software for TRS-80 and MS-DOS/PC-DOS computers. Although he writes mostly in Basic, Bruce also writes some software in C and assembly language. You may reach him at T. N. T Software Inc. 34069 Hainesville Rd., Round Lake, IL 60073.


If you do much serious programming in Basic, you ought to use some assembler routines. Assembly language is widely used to enhance the performance of C, but it has even greater importance in Basic.

Although Basic is the most popular programming language in use today, Basic compilers generally lack the kind of optimization more easily provided in "small" languages like C. Many times, that optimization isn't necessary; Basic has enough built-in commands (already highly optimized) to make typical business applications programs run at a very acceptable speed.

No language contains all possible commands, however, when a needed command is not available, the programmer must add it via a subroutine. Because most Basics don't optimize very well (or at all), such added features are often slow. Therefore, it's an even bigger advantage to use selected assembler routines in Basic than in languages such as C, where high optimization is relatively common. I'll present an example of this from my personal experience with Microsoft's QuickBasic 4.0.

QuickBasic 4.0 is typical of modern Basics. It has hundreds of commands and functions built in, but some of the ones you most often need seem to be lacking. In particular, there is no command to allow quick insertion of a value into an array.

In both C and Basic, it's possible to write a loop that will move individual elements of an array up one position to make room for a new element. In C, there's a good chance that the core of such a loop will be optimized by the compiler to an efficient MOVSB or MOVSW instruction. That's not so in QuickBasic 4.0 or in Microsoft's Basic 6.0.

Recently I wrote a sort/merge program. That's probably old stuff to most readers of this magazine, but this particular program turned out to be very interesting. The sort had two critical parts: a search routine (I used binary search to make things fast) and an index insertion routine. There wasn't very much I could do to speed up a binary search, but index insertion proved different.

The indices were in a single-dimensioned integer array that indexed an array of strings to be sorted. When a new string was added, its index was added to the pointer array by inserting the value at the position determined by the binary search. To insert a new element at position K, you might write something like this:

   for i=lastelement+1 to k+1 step -1
         array(i)=array(i-1)
   next
   array(k)=insertvalue

For speed, all variables will be integers, of course. Still, this short routine isn't very fast; the compiled code multiplies repeatedly to calculate each array position in turn before moving any data in the array. That's inefficient, because the loop is doing nothing more than moving the array elements (starting with the last element and moving down to element K) up two positions in memory; it's an ideal use for the REP MOVSW instruction, but neither the QB4 nor the Basic 6.0 compiler generate the obvious code.

Once I identified the bottleneck in my sort, I decided to write an assembler routine to perform the move. To keep the routine generally useful, I decided to write the routine so that it would move elements in either a static or dynamic array (via segmented addresses). I was very conservative with all the registers used, because I wanted the result to be as portable as possible for future (and other) Basic compilers. See Listing One.

I was prepared for a modest speedup, but I was shocked when I prepared a simple benchmark. The performance increase was astounding! Table 1, this page, provides a summary.

Table 1: Time to do 10,000 insertions at position of 1 of a 10, 000 element integer array.

                         Benchmark Results
     QB4:                                     5765.63 seconds ( 3 tests)
     Assembler:                                 41.48 seconds ( 4 tests)
     Relative time, QB4/assembler         139.0

     (Times are for Tandy 4000 with 16MHz 80386)

If you're not familiar with assembler or just plain afraid of it, take a look at the listing of Iinsert (Listing Two). I've commented it heavily so that you can see what it does.

If you do serious programming in Basic you could probably benefit from this and similar assembler tools. A routine to move long integers or floating-point numbers should be pretty easy to write, given this example. Usually, the results won't be as striking as they were in this case. Typical speed-ups in my assembler routines average about a factor of 10, but in critical parts of your programs a factor of even 2 or 3 is nothing to sneeze at.

_INSERTING ELEMENTS INTO A BASIC INTEGER ARRAY_ by Bruce Tonkin

[LISTING ONE]


                        The Test Program

defint a-z
rem $dynamic
dim a(10000)
for i = 1 to 10000: a(i) = i: next i
t! = timer
for i = 1 to 10000
   b = 9999
   call iinsert(seg a(1), b)
   a(1) = -i
next i
t1! = timer - t!
for i = 1 to 10000: print a(i); : next i
print
t! = timer
for i = 1 to 10
   for j = 10000 to 2 step -1: a(i) = a(i - 1): next j
   a(1) = i
next i
t2! = timer - t!
print t1!; "seconds for 10,000 assembler insertions."
print t2!; "seconds for 10 QB4 insertions."
END




[LISTING TWO]


                     Iinsert Program Listing

 .MODEL MEDIUM
 .CODE
 PUBLIC IINSERT
;Iinsert by Bruce W. Tonkin on 8-12-88 for QB 4.0 & MASM 5.0
;Iinsert will move, in descending order, elements of any
;1-dimensional integer array to allow a new element to be
;inserted.  It is called with:
;call iinsert (seg arg1,count)
;where arg1 is the element where the new value is to be inserted,
;and count is the integer number of elements to move.  e.g.:
;call iinsert(seg x%(5),y%)
;will move y% elements up one within the array, and you can then
;assign a new value to x%(5).  The value of x%(5) will be in
;x%(6) and x%(6) in x%(7), and so on for a count of y% elements.
;If the last element were x%(1000), then y% should be 1000-5=995,
;since the 999th element will go to the 1000th and elements 5
;through 999 will move up to elements 6 through 1000.  This
;routine works with either static or dynamic arrays, regardless
;of the array's location in memory-- the DS register that
;indicates the current BASIC data segment is irrelevant.

Iinsert PROC
    push bp        ;save old BP
    mov  bp,sp     ;Set framepointer to old stack
    push ds        ;save data segment--altered by routine
    push es        ;and extra segment--altered by routine
    push di        ;and destination index--altered by routine
    push si        ;and source index--altered by routine
    push ss        ;and stack segment--just to be safe
    mov  bx,[bp+6] ;address of the count parameter
    mov  cx,[bx]   ;count is now in CX
    shl  cx,1      ;multiply that by two for the moment
    les  di,[bp+8] ;segmented address of destination parameter
    lds  si,[bp+8] ;segmented address of source parameter, too
    add  di,cx     ;es:di points to end of destination
    add  si,cx     ;and ds:si points to end of source
    inc  di        ;move up one element for destination for an
    inc  di        ;integer insert--that's two bytes
    shr  cx,1      ;restore original cx value
;default destination is es:di, default source is ds:si for movsw
    std            ;set direction flag for decremented copy
    rep  movsw     ;move word at a time, since elements are words
    movsw          ;and move the original element up one, too
    pop  ss        ;restore saved registers
    pop  si
    pop  di
    pop  es
    pop  ds
    cld            ;clear direction flag--a well-mannered routine
    pop  bp        ;restore old base pointer
    ret  0ah       ;clear 10 bytes of parameters on return
Iinsert ENDP
    END









Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.