MS-DOS Assemblers Compared


JAN89: EXAMINING ROOM

Mike Schmit is the president of Quantum Software. He can be reached at 19855 Stevens Creek Blvd., Ste. 154, Cupertino, CA 95014.


I'll admit it. I am biased about languages; I write almost all of my code in assembly language. I do this for several reasons, but mostly I do it because the resultant code is both compact and fast. In this article, however, I avoid a debate over the merits of assembly language versus those of the various high-level languages. Rather, I explore the choices you have after you've already made the decision to program in assembly.

Programmers who are more familiar with high-level languages may be surprised to find out that modern assembly languages include an equally rich set of features. These often include powerful macros, data structures, defined procedures, dynamic stack variables, several memory models, floating point support, and a host of conditional assembly directives.

But to most programmers it isn't a matter of whether it can be done, but how easy it is to accomplish the end result. For years I have contended that assembly language need not be as tedious and difficult to learn as a typical bit-banger might lead you to believe. Today's crop of assemblers has lent much support to this contention. New products from Borland International and SLR Systems, and new versions from Microsoft, have gone far toward making assembly language programming more approachable, and the time spent doing it more productive.

This article compares three MS-DOS assemblers: Microsoft's Macro Assembler 5.1 (MASM), Borland's Turbo Assembler 1.0 (TASM), and SLR Systems' OPTASM 1.5. MASM is the well-established standard, the baseline against which the others must be measured. In recognition of this fact, it's important to note that both OPTASM and TASM claim almost 100 percent compatibility with MASM.

I will assume some familiarity with MASM, and will not dwell too long on assembly language itself. Rather, I'll concentrate on the differences between the assemblers, their performance, and the enhancements offered in OPTASM and TASM.

Even though all three assemblers can translate the same source files, their internal workings are radically different. MASM is a conventional two-pass assembler, and TASM performs only one pass and then fixes forward references. OPTASM, on the other hand, is an n-pass assembler, performing as many passes as required to eliminate phase errors and extraneous NOPs.

Let's look at each separately.

MASM

Because of what might be best described as the least-common-denominator effect, most magazine articles have standardized on MASM 4.0 for example code. With no serious competition, Microsoft was, until recently, slow to add features and power to MASM. But now that's changed. With the 5.x series Microsoft has added a number of nice features. For example, it added a simplified system for declaring segments that works for standalone assembly language programs and for interfacing to high-level languages. Other enhancements include better performance, automatic allocation for stack variables, local and global labels, continuation lines, OS/2 support, and an HLL-like interface for Basic, C, Fortran, and Pascal.

A number of problems arise naturally because of the way that different languages pass arguments on the stack. MASM handles all of this automatically, but there is no easy way to allow for an assembly language routine to do this, unless the HLL model is duplicated exactly. It should be noted, however, that argument passing convention problems are not unique to MASM, because the calling conventions are defined by the HLL.

As a two-pass assembler, MASM must guess on its first pass how much space to reserve for any instruction that refers to a label it has not yet seen. The reason is that some instructions, such as JMP, have a variable number of bytes. A JMP can have an operand of 1, 2, or 4 bytes, corresponding to destination labels that are SHORT, NEAR, or FAR. It's even more complicated for a large segment on the 80386, because a NEAR label can have a 4-byte offset and a FAR label can have a 6-byte offset.

This design was apparently modeled after the original 8086 assembler (Intel's ASM-86), which assumes the most likely size for the operand (2 bytes). If only 1 byte is needed, then a NOP is filled into the other byte. If more than 2 bytes are needed, then an error is generated. ASM-86 keeps track of every assumption, however, and generates an error message corresponding to the exact instruction that is in error. The MASM designers took a shortcut and the assumptions are checked only indirectly by ensuring that labels have the same offset on each pass. If they don't, MASM generates the message "Phase error between passes."

This message has ended the assembly language programming careers of many programmers. The instruction with the bad assumption can be anywhere between the instruction where the message was generated and the previous label. This could be one instruction or a thousand. (You can examine a pass-one listing to help find the problem, but if you're experienced enough to know about pass-one listings, then you probably don't need help finding the problem.)
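
The size guessing and the indirect check described above can be illustrated with a small model. The sketch below is written in Python rather than assembly so that it can be run anywhere. It is a toy, not MASM's actual algorithm: the program representation, the `assemble` function, and the rule that every JMP is treated as a forward reference are all assumptions made for illustration.

```python
# Toy model of a two-pass assembler's size guessing (not MASM's real code).
# A program is a list of ("label", name), ("op", size) or ("jmp", target).
# Assumption: every JMP is a forward reference, so pass 1 always guesses.
SHORT, NEAR, FAR = 2, 3, 5            # total JMP sizes in bytes on the 8086

def assemble(program, far_labels=()):
    """Return (total_bytes, nops_inserted) or raise on a phase error."""
    # Pass 1: assume every JMP is NEAR and record where each label lands.
    pass1_labels, reserved, pc = {}, {}, 0
    for i, (kind, arg) in enumerate(program):
        if kind == "label":
            pass1_labels[arg] = pc
        elif kind == "op":
            pc += arg
        else:
            reserved[i] = NEAR
            pc += NEAR

    # Pass 2: real sizes are known; labels must land where pass 1 put them.
    pc, nops = 0, 0
    for i, (kind, arg) in enumerate(program):
        if kind == "label":
            if pc != pass1_labels[arg]:
                raise ValueError(f"phase error between passes at {arg}")
        elif kind == "op":
            pc += arg
        else:
            disp = pass1_labels[arg] - (pc + SHORT)
            if arg in far_labels:
                need = FAR
            elif -128 <= disp <= 127:
                need = SHORT
            else:
                need = NEAR
            pad = max(reserved[i] - need, 0)   # unused byte becomes a NOP
            nops += pad
            pc += need + pad
    return pc, nops

# A short forward jump: 3 bytes reserved, 2 needed, so one NOP is inserted.
print(assemble([("jmp", "done"), ("op", 10), ("label", "done")]))  # (13, 1)
```

If the jump turns out to need more room than was reserved (here, a label listed in the hypothetical `far_labels` set), the model raises its error at the next label it meets, not at the jump itself, which mirrors why the real message is so hard to trace.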

The manuals that come with MASM include, among others, a programmer's guide, a guide to CodeView, and a guide to utilities. But the one that I use most is the 150-page spiral-bound reference guide. This guide includes a list of every directive, instruction, and math coprocessor instruction; it also includes syntax, timings, brief explanations, flags affected, and more. The programmer's guide is quite an improvement over the version 4.0 manual in that there are examples for almost every directive and instruction. There are also a numerical error message listing and an excellent index.

MASM 5.1 comes with a number of utilities, including the Microsoft Linker, Librarian, Cross-Reference utility, and Make utility. New with 5.1 is the Microsoft Editor, a programmer's editor that allows compiling (or assembling) from the editor. One major feature of MASM is a BIND utility, which allows the creation of a program that runs under DOS or OS/2.

TASM

TASM was originally designed for internal use only, providing Borland with a competitive advantage. Many of the familiar Borland products made heavy use of TASM, including Turbo Pascal and Quattro. Perhaps because of its roots, the Turbo Assembler is a departure from Borland's standard language product strategy; it is not an integrated environment like the familiar Turbo Pascal or Turbo C. Rather it is a standalone, command line program like MASM or OPTASM.

TASM is a single-pass assembler with forward reference resolution (which accounts for much of its speed advantage over MASM) and a number of enhancements that exceed the capabilities of MASM. In fact, TASM and OPTASM both claim to be more compatible with MASM than MASM is, on the grounds that they also support previous versions of MASM. This is important if you are supporting older code or frequently use code from magazines, bulletin boards, or the like.

TASM handles forward references in roughly the same way as MASM, but with better error reporting for references that cannot be resolved. The manual, however, admits that "The truth of the matter is that all sorts of forward references can cause problems for Turbo Assembler, so you should avoid forward references--that is, references to labels farther on in the code--whenever possible." (According to Borland, however, forward references in TASM only cause problems when the programmer makes explicit use of the two-pass nature of MASM. --Ed.)

One nuisance that a programmer must deal with when writing code for the Intel processors is that conditional jumps have only a 1-byte displacement (+127 or -128 bytes). (Note: the 80386 has a 2-byte displacement, but most code is still written for DOS [8086] or OS/2 [80286].)

This means that MASM programmers get a lot of "relative jump out of range" error messages. If you can't adjust your design, then you must opt for the inelegant 5-byte sequence (see Figure 1), which consists of two jumps. Every veteran MASM programmer has done this hundreds of times. TASM has a new JUMPS directive that, in most cases, automatically handles this situation.

Figure 1: Expanded conditional jump example

  problem:
                 ...
                 cmp  ax, 1
                 je   near_label
                 ... > 128 bytes of code
  near_label:
                 ...

  correction:

                 ...
                 cmp  ax, 1
                 jne  $ + 5        ; skip the 3-byte JMP ($ is the start of this 2-byte JNE)
                 jmp  near_label
                 ... > 128 bytes of code
  near_label:
                 ...
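
The expansion in Figure 1 is mechanical enough to sketch in a few lines. The Python below models the decision an assembler could make when it expands such jumps automatically. It is an illustration only, not Borland's or SLR's implementation: `encode_jcc` and its parameters are hypothetical names, and only JE and JNE are included. The opcode values are the standard 8086 encodings.

```python
# Sketch: encode an 8086 conditional jump, expanding it when out of range.
JCC_OPCODES = {"je": 0x74, "jne": 0x75}   # short Jcc opcodes (small subset)
JMP_NEAR = 0xE9                           # JMP with a 16-bit displacement

def encode_jcc(mnemonic, here, target):
    """Return the bytes for `mnemonic target` assembled at offset `here`."""
    opcode = JCC_OPCODES[mnemonic]
    disp = target - (here + 2)            # relative to the next instruction
    if -128 <= disp <= 127:
        return bytes([opcode, disp & 0xFF])
    # Out of range: invert the condition (flip the low opcode bit) and hop
    # over a 3-byte near JMP that can reach the target.
    disp16 = (target - (here + 5)) & 0xFFFF
    return bytes([opcode ^ 1, 3, JMP_NEAR, disp16 & 0xFF, disp16 >> 8])

print(encode_jcc("je", 0x100, 0x110).hex())   # 740e
print(encode_jcc("je", 0x100, 0x300).hex())   # 7503e9fb01
```

The second call produces the 5-byte form: a JNE that skips 3 bytes, followed by a near JMP to the original target.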

One of the most notable features of TASM is that it has two modes: MASM and Ideal. The MASM mode can also be MASM51, which then allows some of the new features in MASM 5.1. Based on the names you can guess which mode the Borland programmers consider to be the better one.

Ideal mode, as defined by Borland, makes the expression parser accept only a more rigid, type-checked syntax. Advantages and disadvantages of this mode are listed in Table 1. One key feature is that the same source file may switch back and forth between MASM and Ideal modes. Borland claims a 30 percent speedup when using the Ideal mode. I believe that this is because the parser can isolate assembly directives and addressing modes more quickly in the Ideal mode. I also think that the assembler was written for Ideal mode and MASM compatibility was added later; thus MASM compatibility was not part of the primary design goal. All tests were run using the default MASM mode.

Table 1: Highlights of Turbo Assembler's MASM and Ideal Modes

Description               Examples                                    Comments
----------------------------------------------------------------------------------------------------
Directives that begin     .model small           ;MASM
   with a period have     .code
   been renamed           model small            ;TASM Ideal
                          codeseg

Directives such as PROC,  func1 PROC near        ;MASM                - Causes all directives to
   ENDP, SEGMENT, and     PROC func1 near        ;TASM Ideal            be the first token except
   ENDS are reversed                                                    data declarations, EQUs,
                                                                        and =s

Square brackets []        mov  ax, var1          ;MASM                - In some instances this
   required for memory    mov  cl, array1[bx]                           removes ambiguity, but also
   references             mov  ax, [var1]        ;TASM Ideal            prevents writing code that
                          mov  cl, [bx+array1]                          looks like an HLL array index

Structure fields are      mov  ax, [bx].field_1  ;MASM                - Allows the re-use of
   not global             mov  cx, [(struc_a PTR bx).field_1]           structure field names
                                                 ;TASM Ideal          - Allows UNIONs (two STRUCs
                                                                        for the same data items)
                                                                      - Messy notation when you want
                                                                        to overlay a structure onto
                                                                        data not explicitly declared
                                                                        (i.e., data allocated from DOS)

EQUs are always text      A = 1                                       - Fixes an inconsistency
   based (MASM does not   B = 2                                         in MASM
   handle EQUs and =s     C EQU A + B
   in a consistent        B = 3
   manner)                var DW C               ;MASM = 3
                                                 ;TASM Ideal = 4

SIZE returns actual       msg1 DB 'message 1', 0                      - Makes the SIZE operator
   size of first item     len DB SIZE msg1       ;MASM = 1              much more useful
   in a list                                     ;TASM Ideal = 9

Floating point            num1 DT 1E7            ;in MASM could       - Prevents confusion if you
   constants must                                ;be a radix 16 value   use radix 16 in programs
   include a decimal      num2 DT 1.0e7          ;required for          using floating point
   point to prevent                              ;TASM Ideal
   ambiguous values
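
The EQU row of Table 1 is easier to follow when the two behaviors are run side by side. The Python sketch below is a toy model, not either assembler's code: `final_value` is a hypothetical helper that handles only `name = expr` and `name EQU expr` lines whose expressions are sums of names and integers. With `text_equ` off, an EQU is evaluated where it is defined; with it on, the text is stored and evaluated where it is used.

```python
# Toy model of numeric versus text-based EQU, using the lines from Table 1.
def final_value(lines, symbol, text_equ):
    """Process the lines in order, then return the value of `symbol`."""
    numeric, text = {}, {}

    def evaluate(expr):
        total = 0
        for term in expr.split("+"):
            term = term.strip()
            if term in text:                  # text equate: expand at use
                total += evaluate(text[term])
            elif term in numeric:
                total += numeric[term]
            else:
                total += int(term)
        return total

    for line in lines:
        name, op, expr = line.split(None, 2)
        if op == "EQU" and text_equ:
            text[name] = expr                 # keep the text, evaluate later
        else:
            numeric[name] = evaluate(expr)    # evaluate immediately
    return evaluate(symbol)

source = ["A = 1", "B = 2", "C EQU A + B", "B = 3"]
print(final_value(source, "C", text_equ=False))   # 3
print(final_value(source, "C", text_equ=True))    # 4
```

The only difference between the two runs is when `A + B` is evaluated: before or after B is redefined as 3.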
                           
                       

One of the key features of Ideal mode is a consistent syntax for the use of directives, such as the declaration of a procedure. The Ideal syntax is:

  PROC     Func1        near
  ...
  ENDP     Func1        ; repeating Func1 is optional

This allows a faster assembler, and the syntax makes sense because the first token on the line always defines the directive type, unlike the MASM syntax. But you could make an equally sensible argument for the MASM syntax. The PROC directive defines a program label, just as a label followed by a colon does, or a label followed by a DW defines a data word. It would make no sense to reverse the order of these. Then again, the MASM syntax for ENDP doesn't fit into any of these rules.

Perhaps the biggest difference between MASM and Ideal is that Ideal uses square brackets ([]) in expressions, as shown in Figure 2. In Ideal mode you use square brackets to reference memory. This is easy to remember and makes a lot of sense. (Ideal mode doesn't require square brackets, but it will warn you if you don't use them. However, this warning can be disabled with the NOWARN directive. --Ed.) For example, in MASM you would use:

  item_1  DW  10
  item_2  equ 20
  array1  DB   256 dup(0)

  mov   ax, item_1
  mov   dx, item_2
  mov   cx, 3[bx]
  mov   al, array1[bx]

Without the data declaration and equate in clear view, it is difficult to tell the difference between the first two MOVs. There are times when you may not care, but precise control is what assembly language is all about. Mathematically the third MOV looks like three times something, when in fact it is not. The fourth MOV looks just like what it is, using BX to index into an array.

Figure 2: Turbo Assembler's Ideal mode example

  title MASM mode example program  ; comment in title
  .model small

  stdin   = 0
  stdout  = 1
  buf_len = 128

  dosint MACRO function     ; macro to invoke a DOS interrupt 21h function
    mov ah, function
    int 21h
  ENDM

  .code

  main PROC

    mov  ax, @data     ; load data segment
    mov  ds, ax
    mov  es, ax
    mov  dx, OFFSET inbuf     ; dest for input
    mov  bx, stdin            ; source of input
    mov  cx, buf_len          ; len for input
    dosint 3fh                ; read file
    cmp  ax, 2
    jle  fin
    mov  bx, ax
    mov  inbuf[bx-2], 0       ; null at end (remove CR LF)
    mov  cx, ax               ; len for example
    sub  cx, 2                ; removes CR LF
    call  lower_line
    mov  dx, OFFSET outbuf    ; source of output
    mov  bx, stdout           ; dest for output
    dosint 40h                ; write file
  fin:
    dosint 4ch                ; exit to DOS

  main ENDP

  lower_line PROC near

    push  cx
    mov  si, OFFSET inbuf
    xor  di, di               ; for example no STOS for DI
    cld
  loop1:
    lodsb                     ; read a char
    or  al, 20h               ; convert to lower case
    mov  outbuf[di], al       ; store in outbuf
    inc  di
    loop  loop1               ; loop til end of string
    pop  cx
    ret

  lower_line ENDP

  .data

  inbuf  DB buf_len DUP (?)
  outbuf  DB buf_len DUP (?)

  stk SEGMENT STACK           ; reserve space for stack
   db 100 dup (0)             ; using old segment method
  stk ENDS

  END main                    ; specify starting address

  ---------------------------------------------

  IDEAL
  %title  "TASM IDEAL mode example program"   ; comment not in title
                              ; all directives affecting listing
                              ; file begin with a percent sign (%)

  model small                 ; no periods in Ideal directives

  stdin   = 0
  stdout  = 1
  buf_len = 128

  MACRO dosint function       ; label and MACRO reversed
    mov ah, function
    int 21h
  ENDM

  codeseg                     ; segmentation directive renamed

  PROC main                   ; label and PROC reversed

    mov  ax, @data
    mov  ds, ax
    mov  es, ax
    mov  dx, OFFSET inbuf
    mov  bx, stdin
    mov  cx, buf_len
    dosint 3fh
    cmp  ax, 2
    jle  fin
    mov  bx, ax
    mov  [bx+inbuf-2], 0      ; memory reference must be in []
    mov  cx, ax
    sub  cx, 2
    call  lower_line
    mov  dx, OFFSET outbuf
    mov  bx, stdout
    dosint  40h
  fin:
    dosint  4ch

  ENDP main                   ; label and ENDP reversed

  PROC  lower_line near

    push  cx
    mov  si, OFFSET inbuf
    xor  di, di
    cld
  @@loop1:
    lodsb
    or  al, 20h
    mov  [di+outbuf], al      ; memory reference in []
    inc  di
    loop  @@loop1             ; local labels available
    pop  cx
    ret

  ENDP                        ; matching PROC name optional

  dataseg                     ; segmentation directive renamed

  inbuf DB buf_len DUP (?)
  outbuf DB buf_len DUP (?)

  SEGMENT stk STACK           ; label and SEGMENT reversed
   db 100 dup (0)
  ENDS                        ; matching SEG name optional

  END main

By contrast, in Ideal mode you would use:

  item_1  DW  10
  item_2  equ 20
  array1  DB   256 dup(0)

  mov   ax, [item_1]
  mov   dx, item_2
  mov   cx, [bx+3]
  mov   al, [bx+array1]

Keep in mind that MASM allows many of the Ideal mode expressions, but TASM (in Ideal mode) requires them, possibly preventing an accidental construct that is legal but does not get the intended results. MASM veterans may have a hard time accepting the Ideal mode. Beginners will probably learn assembly language more quickly.

There are about 25 differences between the Ideal and MASM modes (see Table 2). The manual gives a description and an example of each difference, covering about 30 pages in all.

Table 2: Features comparison

                                             MASM 5.1     OPTASM     TASM
-------------------------------------------------------------------------

  MASM 3.x compatible                                        x
  MASM 4.x compatible                                        x         x
  MASM 5.x compatible                          x             1         x

  80286 code                                   x             x         x
  80287 code                                   x             x         x
  80386 code                                   x                       x
  80387 code                                   x                       x
  8087 emulation                                             x         x

  Expand conditional jumps > 128 bytes                       x         x
   away by adding an extra jump
  Expand LOOP's > 128 bytes away by                          x
   adding two additional jumps
  Remove extra NOPs                                          x
  Eliminates phase errors                                    x

  Externals can include size info                                      x
  Global symbols (combines Public/Extrn)                     2         x
  UNION directive (nested STRUC's)                                     x
  Additional modes                                           3         4
  Length of data item                                        5         6
  Generation of Group offsets (7)                            x         x
  Local labels                                 8             9        10
  Multiple PUSH and POP pseudo-op                            x
  Short Extrn directives (i.e. EXTB)                         x

  Wildcard filenames on cmd line                                       x
  Built-in make                                              x

  Predefined symbols
     time                                                    x         x
     date                                                    x         x
     filename                                                          x
     version                                                           x

Notes:

1) OPTASM supports up to MASM 5.0 and some of MASM 5.1 features.

2) OPTASM has a Soft Extrn directive that allows an internal definition to override the Extrn.

3) OPTASM has about 40 features that can be individually enabled or disabled to override the normal MASM features.

4) TASM defaults to MASM 4.0 and has directives for MASM 5.0, MASM 5.1, Quirks, and an Ideal mode (see text).

5) OPTASM has an option to allow the LENGTH operator to return the total length of all items defined for a given label.

6) In Ideal mode the SIZE operator returns the total size for the first data item in a data list.

7) When using the SEG operator and the OFFSET operator, MASM can generate incorrect addresses when using simplified segmentation.

8) MASM allows local labels to be defined as "@@:" which can then be referenced as @F (forward to next @@:) or @B (backwards to previous @@:). In addition, when using an HLL model all labels are local, unless defined with two colons.

9) OPTASM allows local labels by prefixing any label with a # sign or suffixing it with a dollar ($) sign.

10) TASM allows local labels by prefixing with two @ signs (@@).

There is an additional submode called Quirks (see Table 3), which, according to the manual, "allows you to assemble a source file that makes use of one of the true MASM bugs." Although this statement is a shot at Microsoft, more important is the fact that it refers to enabling several well-documented features that Borland was apparently reluctant to include. The manual lists three main quirks:

    1. Local labels are defined with @@ and referred to with @F and @B.
    2. There is a redefinition of variables inside PROCs.
    3. C language PROCs are all PUBLIC with leading underscores.

I agree that the first Quirk is a quirk. In my opinion, MASM's implementation of local labels is brain damaged (both TASM and OPTASM have different implementations). I believe that the last two quirks are actually a benefit for anyone writing HLL utilities in assembly language. The second quirk allows you to use the same names for arguments being passed from an HLL. The third quirk is that all PROCs are declared public. When MASM51 is turned on and the C language is specified as part of the .MODEL statement, leading underscores are prepended. This serves a quirk of the C language itself, which requires that functions have a hidden leading underscore in the public name passed to the linker. (See Figure 3.)

The Turbo Assembler documentation includes a 580-page user's guide and a 300-page reference guide. The user's guide includes a 200-page tutorial that is one of the best beginner's guides to the Intel architecture that I have read. There are also chapters on the specifics of interfacing with the other Borland languages: Turbo C, Turbo Basic, Turbo Pascal, and Turbo Prolog. Much of this information is common with other languages.

The reference guide is primarily a description of all the directives and operators. Each directive and operator is marked as to which modes (MASM or Ideal) it is available in, along with any differences in syntax. But it is not possible to tell which parts of the MASM mode syntax go beyond Microsoft's MASM. In other words, TASM's MASM mode is a superset: it is possible to write code in it that will not assemble with MASM, and the manual does not readily note the differences. This is not a major concern if you are planning on using TASM exclusively, but would be a problem if you are sharing code with other programmers not using TASM.

Turbo Assembler comes with a number of additional programs. One program is the Debugger. Turbo Debugger is a source-level debugger that supports debugging from a remote system and has multiple windows, menus, on-line help, and virtual 8086 debugging on an 80386 system. It is a significant product in its own right; any further discussion is beyond the scope of this article.

Turbo Assembler also provides a utility that allows source debugging of files linked for CodeView debugging. Additional utilities include Turbo Linker, Turbo Librarian, TCREF (a cross-reference utility), and MAKE. The Turbo Linker is not a complete replacement for the Microsoft Linker. The manual states, "TLINK is lean and mean; ... it lacks some of the bells and whistles of other linkers."

OPTASM

OPTASM, by SLR Systems, is a high-performance optimizing assembler that is nearly 100 percent compatible with MASM. It also accommodates the incompatibilities among MASM versions 3, 4, and 5. OPTASM is the clear winner in terms of performance and code size, but, as is always the case, has some limitations.

These limitations are either minimal or severe, depending upon your needs. For example, OPTASM does not support 80386 instructions, nor does it support some of the MASM features gained in the jump from version 5.0 to 5.1. Another drawback is that no linker is provided. Although MASM doesn't provide a linker either, the high performance of OPTASM leads you to expect that it will. Instead, after assembling at lightning speed, you must wait for the Microsoft Linker. This is like using a 20-MHz 80386 and then shifting to a 4.77-MHz PC. (Note: OPTASM Version 1.6, which will be released this month, includes a linker and debugger. No changes have been made to the assembler.)

Table 3: TASM operating modes and quirks

a) Normal MASM        Emulates MASM 4.0, 5.0 without minor quirks.
b) QUIRKS             Emulates MASM 4.0, 5.0 with minor quirks in those
                      versions.
c) MASM51             Emulates those features of MASM 5.1 that conflict
                      with MASM 4.0 and 5.0 operation but do not conflict
                      with the operation of Borland's extensions that
                      perform similar functions.
d) MASM51 and Quirks  Emulates MASM 5.1 fully.

Quirks as explained in the TASM HELPME!.DOC disk file

Mode                  Operations
Quirks                Allows FAR jumps to be generated as NEAR or SHORT if
                      CS assumes agree.
                      Allows all instruction sizes to be determined in a
                      binary operation solely by a register, if present.
                      Destroys OFFSET, segment override, etc., information
                      on '=' or numeric 'EQU' assignments.
                      Forces EQU assignments to expressions with "PTR"
                      or ":" in them to be text.
MASM51                Instr, Catstr, Substr, Sizestr, and "\" line
                      continuation are all enabled.
                      EQU's to keywords are made TEXT instead of ALIASes.
                      Leading whitespace is not discarded on %textmacro in
                      macro arguments.
MASM51 and Quirks     Everything listed under Quirks above.
                      Everything listed under MASM51 above.
                      @@, @F, and @B local labels are enabled.
                      Procedure names are PUBLICed automatically in
                      extended MODELs.
                      Near labels in PROCs are redefinable in other PROCs.
                      "::" operator is enabled to define symbols that can
                      be reached outside of current proc.
MASM51 and Ideal      Ideal mode syntax and the MASM51 text macro
                      directives are supported, i.e., Instr, Catstr,
                      Substr, and Sizestr.

Figure 3: New MASM to C calling interface

  Old method (MASM 5.0 and before):

           Public _sample_func
           _sample_func proc near
           push   bp
           mov    bp, sp
           push   bx
           push   cx
           push   dx

           mov    bx, [bp+4]  ; get ptr to var1
           ...

           pop   dx
           pop   cx
           pop   bx
           pop   bp
           ret
           _sample_func endp

  New method:

           .model small, C
           .code

           sample_func proc near USES BX CX DX, var1:PTR

           mov   bx, var1 ; get ptr to var1
           ...

           ret
           sample_func endp

SLR Systems was the only company willing to discuss future products with me. It is working on 80386 support, a linker, and full MASM 5.1 compatibility.

As mentioned before, OPTASM is an n-pass assembler. This unique design allows OPTASM to perform as many passes as required to prevent all phase errors, and it never inserts extra NOPs into your code. Although SLR calls OPTASM an optimizing assembler, it really does not do optimizations in the sense, for example, that a C compiler does optimizations. The truth is that MASM and TASM perform unoptimizations by inserting NOPs where the programmer did not ask for them; all OPTASM does in the way of optimizations is not to make this mistake.
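
The article does not say how OPTASM decides when to stop making passes. One common way to get the same effect is to start every jump at its shortest form and repeat the layout until no jump has to grow. The Python sketch below models that idea. It is an assumption about the general technique, not SLR's algorithm, and `relax` and its program format are hypothetical.

```python
# Sketch of an n-pass layout: start every JMP short, grow only when needed.
# A program is a list of ("label", name), ("op", size) or ("jmp", target).
def relax(program):
    """Repeat layout passes until no jump changes size.

    Returns (sizes, passes), where sizes maps the index of each JMP in
    `program` to 2 (short) or 3 (near) bytes.
    """
    sizes = {i: 2 for i, (kind, _) in enumerate(program) if kind == "jmp"}
    passes = 0
    while True:
        passes += 1
        # Lay out the program using the current size of every jump.
        labels, starts, pc = {}, {}, 0
        for i, (kind, arg) in enumerate(program):
            starts[i] = pc
            if kind == "label":
                labels[arg] = pc
            elif kind == "op":
                pc += arg
            else:
                pc += sizes[i]
        # Grow any short jump whose target is out of 8-bit range.
        changed = False
        for i in list(sizes):
            disp = labels[program[i][1]] - (starts[i] + 2)
            if sizes[i] == 2 and not -128 <= disp <= 127:
                sizes[i] = 3
                changed = True
        if not changed:
            return sizes, passes

# Growing the second jump pushes label "a" out of range of the first one,
# so a third pass is needed before the layout settles.
program = [("jmp", "a"), ("op", 125), ("jmp", "b"),
           ("label", "a"), ("op", 128), ("label", "b")]
print(relax(program))   # ({0: 3, 2: 3}, 3)
```

Because a jump only ever grows, the loop must terminate, and because sizes are exact when it does, no byte is ever reserved and then filled with a NOP.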

OPTASM also contains a number of extensions to the basic MASM language. But it's important to note that every feature can be either enabled or disabled.

The Make capability in OPTASM is not really a Make utility in the traditional sense, because the MAKE routines are built into the assembler. There are both advantages and disadvantages to this approach. You can still use your old Make in the same way, or you can use your old Make along with the OPTASM Make.

In its simplest form the OPTASM Make will just process a file that has a list of filenames. Actually you are required to enter the command tail, just as it would appear in a DOS batch file, minus the OPTASM program name. The advantage here is that OPTASM does not need to be reloaded by DOS between each file. OPTASM is also smart enough to keep include files in memory, if possible. Just as with a regular Make, you can add dependencies, so that files are assembled only if they are out of date.
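
The out-of-date test at the heart of any Make, built-in or not, is small. The Python sketch below shows that test only. It does not reproduce OPTASM's file format, which the article does not describe. The function names are hypothetical, and timestamps are plain numbers, where a real tool would read them from the file system.

```python
# Sketch of a Make-style dependency check with timestamps as plain numbers.
def out_of_date(object_time, source_time, include_times=()):
    """True if the object file is missing or older than any of its inputs."""
    if object_time is None:                 # no object file exists yet
        return True
    return any(t > object_time for t in (source_time, *include_times))

def files_to_assemble(entries):
    """entries: (name, object_time, source_time, include_times) tuples."""
    return [name for name, obj, src, incs in entries
            if out_of_date(obj, src, incs)]

entries = [
    ("main.asm", 100, 90, (80,)),    # object newer than every input
    ("io.asm",   100, 90, (120,)),   # an include file changed afterwards
    ("new.asm",  None, 50, ()),      # never assembled
]
print(files_to_assemble(entries))    # ['io.asm', 'new.asm']
```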

Version 1.0 of OPTASM came with no other utilities, but version 1.5 is now shipped with a unique utility: OPTHELP. This is a memory-resident utility that provides help on the instruction set and OPTASM. The help is fairly detailed, including bit encodings for the various instructions. For example, the MOV instruction has 15 pages of help.

OPTASM does not come with a librarian, but OPTLIB is available separately. OPTLIB is ten times faster than the Microsoft Librarian.

OPTASM comes with one 320-page spiral-bound manual. The manual does not include a tutorial, but is a complete description of the language, with all features unique to OPTASM appearing in boxes marked as OPTASM features. When reading about any feature, there is no doubt as to whether the exact syntax is also available in MASM.

Local Labels

Each of the three assemblers implements local labels in a different manner. (See Figure 4, below.) They are all incompatible with each other. In other words, if you write a program that uses any of the methods for local labels, then the other assemblers will not allow it. The only exception is that TASM will accept the MASM methods if you use both the MASM and Quirks directives.

Figure 4: Local labels examples

                ; MASM example

                @@:
                    inc   ax
                    cmp   ax, bx
                    je    @f
                    loop  @b

                @@:
                    ret

                ; TASM example

                @@loop:
                    inc   ax
                    cmp   ax, bx
                    je    @@exit
                    loop  @@loop
                @@exit:
                    ret

                ; OPTASM example

                #1:
                    inc   ax
                    cmp   ax, bx
                    je    #2
                    loop  #1

                #2:
                    ret

Local labels are a new feature in MASM 5.1. The idea is simple: you define a local label with two at signs (@@) followed by a colon. To reference the nearest preceding local label, use @B (back); to reference the next local label, use @F (forward). MASM 5.1 also provides a much more structured feature for assembly language routines used in HLL programs. The manual describes it as a local variable scope, which is enabled when you use the extended form of the .MODEL directive. Labels ending with a single colon (and procedure arguments passed on the stack and local stack variables) are considered local to the procedure where they are defined. To make a label available from another procedure, it must be defined with two colons.

This is an excellent feature of MASM: it promotes well-structured code and highlights any labels that are used external to a procedure. The bad news is that you must be using the extended form of the .MODEL directive, which is not always appropriate. With TASM, on the other hand, local symbols are available whether you are using the extended form of the .MODEL statement or not.
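The @@/@F/@B rule described above is easy to state as a lookup: @B means the nearest @@: before the reference, and @F means the nearest one after it. The Python sketch below applies that rule to the MASM listing from Figure 4. It is a model of the rule only, with a hypothetical function name and a simplified one-statement-per-line input, not a description of how MASM is implemented.

```python
def resolve_anonymous(lines):
    """Map the index of each line using @F or @B to the index of its @@: line."""
    anchors = [i for i, text in enumerate(lines) if text.strip() == "@@:"]
    targets = {}
    for i, text in enumerate(lines):
        words = text.split()
        operand = words[-1].lower() if words else ""
        if operand == "@b":
            candidates = [a for a in anchors if a < i]
            if not candidates:
                raise ValueError(f"line {i}: no preceding @@ label")
            targets[i] = candidates[-1]   # nearest one behind
        elif operand == "@f":
            candidates = [a for a in anchors if a > i]
            if not candidates:
                raise ValueError(f"line {i}: no following @@ label")
            targets[i] = candidates[0]    # nearest one ahead
    return targets

# The MASM example from Figure 4, one statement per line.
masm_example = ["@@:", "inc ax", "cmp ax, bx", "je @f", "loop @b", "@@:", "ret"]
print(resolve_anonymous(masm_example))
```

The result maps the `je @f` line to the second @@: and the `loop @b` line to the first, which is exactly the pairing the named labels express in the TASM and OPTASM versions of the same loop.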

As stated earlier, TASM can emulate the MASM local labels. But in addition you can define a local label by preceding a label with two at signs (@@). The scope of these local labels extends between any two regular labels. This feature is automatically available in the Ideal mode and can be enabled in the MASM mode with the Locals directive. The two at signs can be replaced with any other characters that can start a label by using the Locals directive.
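One way to picture this scoping is to imagine each @@ label being qualified by the regular label that precedes it. The Python sketch below does that rewriting. The `scope.name` form is purely an illustration and is not how TASM stores symbols; the `prefix` parameter stands in for the replaceable prefix mentioned above, and the input is assumed to be one whitespace-separated statement per line.

```python
def qualify_locals(lines, prefix="@@"):
    """Prefix every local symbol with the nearest preceding regular label."""
    scope = ""
    out = []
    for text in lines:
        words = text.split()
        # A regular label (ends with ':' and lacks the prefix) opens a new scope.
        if words and words[0].endswith(":") and not words[0].startswith(prefix):
            scope = words[0][:-1]
        out.append(" ".join(
            scope + "." + w if w.startswith(prefix) else w for w in words))
    return out

source = ["first:", "@@loop:", "loop @@loop",
          "second:", "@@loop:", "loop @@loop"]
for line in qualify_locals(source):
    print(line)
```

The two `@@loop` labels end up with different qualified names, which is why the same local name can be reused after every regular label without a conflict.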

OPTASM 1.5 does not support the MASM local labels. However, OPTASM has a unique feature called Procedure Local Labels. There are two formats:

   # n label
   n label $

The first format begins with a # sign and is followed by a series of digits (n) or a standard label (label). The second format begins with a series of digits (n), is optionally followed by a standard label (label), and is terminated by a dollar sign ($). The assembler ignores the # and $ characters when evaluating a label, and thus #10 and 10$ are considered the same. All these labels are automatically local only to the procedure in which they are defined. Therefore the same labels can be reused in other procedures.
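The equivalence of the two formats amounts to stripping the marker character before the name is looked up, with the lookup keyed by procedure. A minimal Python sketch of that rule follows; the function names and the table layout are hypothetical, and only the behaviour stated above (# and $ are ignored, labels are local to their procedure) is modeled.

```python
def local_key(label):
    """Strip a leading '#' or a trailing '$' so that #10 and 10$ compare equal."""
    if label.startswith("#") and len(label) > 1:
        return label[1:]
    if label.endswith("$") and label[:1].isdigit():
        return label[:-1]
    raise ValueError(f"not a procedure local label: {label!r}")

def define(table, procedure, label, address):
    """Record a label under its procedure; duplicates within one procedure fail."""
    key = (procedure, local_key(label))
    if key in table:
        raise ValueError(f"{label} already defined in {procedure}")
    table[key] = address

table = {}
define(table, "proc_a", "#10", 0x0100)
define(table, "proc_b", "10$", 0x0200)   # same name, different procedure: allowed
print(local_key("#10") == local_key("10$"), sorted(table))
```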

The Results

I conducted a number of tests on all three assemblers on several machines, including a 4.77-MHz XT, a 10-MHz AT, and a 16-MHz 80386. The results were proportional for each class of machine, so the figures reported here are based only on the 10-MHz AT clone with a Seagate ST-251 drive (40 Mbyte, 38-ms access time).

The first test was based on a program that consisted of 15 source files, ranging in size from 2K to 100K, totaling 535K. There were also four include files totaling 16K. Based on how many times the include files were read, the assembler had to read more than 600K of source to build the object files. All times were in seconds and did not include linking or generating listing files.

The second set of tests consisted of assembling six 100K source files. The third and fourth tests used the unique features of TASM and OPTASM to obtain more speed. These tests were the same as the first two, except that I used TASM's wildcard feature and OPTASM's internal make. The effect was that both TASM and OPTASM obtained a speed advantage, because reloading the assembler between files was not required. See Figure 5 for the test results.

Figure 5

Both Borland's Turbo Assembler and SLR Systems' OPTASM provided clear advantages over Microsoft's Macro Assembler. The TASM/Debugger package seems to be the best value, but both were much faster than MASM, with OPTASM clearly the fastest. Used as shipped, OPTASM also produced the most compact code, perhaps 1 percent smaller than the others. While this is significant, view these results in context. For example, with a bit of additional work, compaction could have been improved in TASM through the use of overrides. Likewise, using a different linker, like Phoenix's PLINK, is likely to shrink code size.

Figure 5: Test data

  Test description                         MASM  TASM  OPTASM

  test 1 (15 files, 600K total)              85    50     35
  test 2 (6 files, 100K each)                80    40     26
  test 3 (test 1 with wildcards/make)        85    34     24
  test 4 (test 2 with wildcards/make)        80    32     22

  Notes:
        1) Times are in seconds
        2) All tests on 10 MHz AT clone,
           zero wait states, Seagate
           ST-251 hard disk, no cache

There were four test cases run, as described in the article. Each test consisted of just three data points, one for each of the assemblers.
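The timings in Figure 5 are easier to compare as ratios against MASM. The short Python calculation below uses only the numbers from the table; the dictionary layout and function name are just a convenience for the arithmetic.

```python
# Assembly times in seconds, copied from Figure 5.
TIMES = {
    "test 1": {"MASM": 85, "TASM": 50, "OPTASM": 35},
    "test 2": {"MASM": 80, "TASM": 40, "OPTASM": 26},
    "test 3": {"MASM": 85, "TASM": 34, "OPTASM": 24},
    "test 4": {"MASM": 80, "TASM": 32, "OPTASM": 22},
}

def speedup(test, assembler):
    """How many times faster than MASM the assembler ran in the given test."""
    return round(TIMES[test]["MASM"] / TIMES[test][assembler], 2)

for test in TIMES:
    print(test, speedup(test, "TASM"), speedup(test, "OPTASM"))
```

On these figures TASM ran between 1.7 and 2.5 times as fast as MASM, and OPTASM between roughly 2.4 and 3.6 times as fast, with the larger ratios coming from the wildcard and internal make runs.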

Both TASM and OPTASM also provided a number of nice extras, such as fixing conditional jumps that were out of range by putting in two jumps. OPTASM clearly outperformed TASM based on the fact that it never inserted extra NOPs and it handled all combinations of conditional jumps and loops. TASM, on the other hand (if automatic jump sizing is used with forward references), automatically assumed that the jump is going to be far and reserved 5 bytes for the jump, padding the code with extra NOPs.
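The "two jumps" fix works by inverting the condition: the out-of-range conditional jump becomes the opposite conditional jump over an unconditional JMP to the real target. The Python sketch below produces that replacement text for a handful of 8086 conditions. It is an illustration of the technique rather than either vendor's implementation; the function name and the skip label are hypothetical, and LOOP is left out because it has no inverse instruction to swap in.

```python
# Each 8086 condition paired with its logical opposite.
INVERSE = {
    "je": "jne", "jne": "je",
    "jb": "jae", "jae": "jb",
    "ja": "jbe", "jbe": "ja",
    "jl": "jge", "jge": "jl",
    "jg": "jle", "jle": "jg",
}

def expand_jump(mnemonic, target, skip_label):
    """Rewrite an out-of-range conditional jump as two jumps.

    The inverted short jump is 2 bytes and the near JMP is 3, which
    accounts for the 5 bytes of the long form on an 8086.
    """
    inverse = INVERSE[mnemonic.lower()]
    return [
        f"    {inverse} {skip_label}",  # condition false: step over the JMP
        f"    jmp {target}",            # condition true: go to the real target
        f"{skip_label}:",
    ]

print("\n".join(expand_jump("je", "done", "skip1")))
```

If the target later turns out to be within short range, only the original 2-byte jump is needed; an assembler that has already reserved 5 bytes must fill the remaining 3 with NOPs, which is the padding described above.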

It is also worth noting that both MASM and TASM support 80386 code, as will OPTASM sometime in the future. For the present, at least, MASM is the only assembler that runs in OS/2 protected mode.

Now that there is finally some competition in the macro assembler market, it's about time users began sending their wish lists to the manufacturers. Keep in mind that a job is only as easy as the tools make it.


Copyright © 1989, Dr. Dobb's Journal

