NOV89: LINKING WHILE THE PROGRAM IS RUNNING

Andrew is a software engineer doing CD-ROM and network programming at a large software firm in Cambridge, Mass. He can be reached at 32 Andrew St., Cambridge, MA 02139.


"In theory anyway linking to a target <t> can be achieved at the earliest whenever it becomes feasible to make <t> known. It is, in fact, even more interesting to consider until how late linking can be postponed."

-- Elliott I. Organick The Multics System: An Examination of its Structure (1972)

"The evolution of loaders is interesting because it is an example of a trend common to many areas of both software and hardware, the trend to delay binding as long as possible."

-- Robert M. Graham Principles of Systems Programming (1975)

Once upon a time, there was no such thing as linking, or at least no such thing as a "linkage editor." As long as there was no separate or independent compilation, there was no need for a program that combines fragments of programs. But as soon as there were "linkers," it became interesting to see how long linking could be deferred. ("Never do today what you can put off until tomorrow.")

Every professional programmer has heard that OS/2 is built on "dynamic linking," a standardized mechanism for attaching add-ins to the operating system. OS/2, itself, consists largely of add-ins; for example, the Presentation Manager's (PM) graphical windowed environment is a collection of OS/2 dynamic-link libraries (DLLs), as is the OS/2 kernel application program interface (API).

The operation of dynamic linking is largely transparent to the programmer. As David Cortesi explained in his December 1987 DDJ article "Dynamic Linking in OS/2: Built-in facilities for the third-party extension of OS/2," a programmer using a procedure in a DLL declares it just as he would declare any other external reference; calls to such a procedure look the same as calls to a routine in a "normal" static-link .LIB library. The only immediately visible difference is a smaller executable file. The actual code for the procedure is kept in the DLL, not copied into your executable file as in normal linking. OS/2 does all the work of dynamically linking to the DLL code.

But there is another form of dynamic linking that is not transparent. Called "run-time dynamic linking," this method of linking requires you to carry out "by hand" the work normally done by LINK.EXE and OS/2. In exchange, you get to defer linking until very late -- in fact, linking takes place while the program is running!

Why Would You Want to Do That?

Because computing consists largely of "solutions in search of a problem," and of building features that no one has yet asked for, you frequently encounter facilities that are nifty but not useful. So, before we get into how they work, we'd better determine what run-time dynlinks are for in the first place.

The ability to defer linking so that it occurs during run-time can be used in any application in which users can do programming while the application is running. Examples include any interpreted language, an Emacs text editor with an embedded language, a database manager or spreadsheet with a LOAD/CALL add-in facility, or a debugger with an "execute statement" menu option. A straight object-code compiler is not an example because the programming is done before the compiler is run.

There is a close parallel between "delayed linking" and what in object-oriented programming (OOP) is called "late binding." (The two are not synonymous, however.) With OOPs, late (delayed) binding turns a selector into the actual method to handle a message. In OS/2, delayed linking takes commands a user issues at run-time and turns them into actual function calls.

In both OOPs and OS/2, it is definitely not the program's responsibility to maintain the table that associates symbolic names with actual code, because that would constitute early binding. Your program's responsibility is to pass the symbolic names through, without interpreting them. This means that with both OOPs and OS/2 run-time dynamic linking, a program can use facilities that didn't even exist when the program was compiled.

There is more than an analogy between OS/2 delayed linking and OOPs delayed binding. Some object-oriented systems, such as the Andrew Toolkit and Class C preprocessor developed at Carnegie-Mellon University, use delayed linking in order to implement dynamic demand loading and linking of classes in running applications. (For a discussion of the connection between dynamic binding and dynamic linking, see Philippe Gautron and Marc Shapiro, "Two Extensions to C++: A Dynamic Link Editor and Inner data," Usenix Proceedings C++ Workshop, Santa Fe, New Mexico 1987). Because of its support for run-time dynlinks, OS/2 is a perfect platform for building OOPs environments.

The Importance of Being ASCIIZ

"Our future plans include: ... Programmatic interface: Some programs, particularly based on interpretive languages such as Lisp, can dynamically generate dynamic references. We would like to support the handling of such references through a common mechanism, and thus wish to provide a program-accessible interface to the services now provided invisibly."

-- Robert A. Gingell, et al., Shared Libraries in SunOS (1987).

OS/2's programmatic interface for dynamic linking under program control is built upon the functions listed in Table 1, which together make up the OS/2 module manager. Of these, DosLoadModule and DosGetProcAddr are the crucial functions. In fact, these two alone give us access to the others, because with them we can reach any function in an OS/2 DLL.

Table 1: The OS/2 module manager

        DosLoadModule
        DosGetProcAddr
        DosFreeModule
        DosGetModName
        DosGetModHandle

Those of you familiar with Microsoft Windows will detect the similarity to LoadLibrary( ) and GetProcAddress( ), and those familiar with the Macintosh can consider OS/2's DosGetProcAddr a combination of the toolbox's GetNamedResource( ) and GetTrapAddress( ). Programmatic access to shared libraries is also planned for the forthcoming release 4.0 of Unix System V (Unix Review, August 1989).

Run-time dynlinks illustrate the importance of ASCII strings in OS/2. Nearly any object in OS/2 can have an ASCIIZ (zero-terminated ASCII string) name associated with it. The well-known "named pipes" of the LAN Manager are one example. Just as most systems make it easy to access a file if you know its name, OS/2 also makes it easy to access pipes, semaphores, queues, or shared memory blocks if you know their names (see Ray Duncan, "Interprocess Communications in OS/2," June 1989 DDJ). Likewise, with run-time dynlinks, if you have the ASCIIZ name of a routine, you can load the code for the routine and derive its equivalent function pointer.

Given the ASCIIZ name of a DLL, DosLoadModule loads the DLL and returns a handle to it. This module handle can then be passed, along with the ASCIIZ name of a procedure exported from the DLL, to DosGetProcAddr, which returns the function pointer corresponding to the procedure name.

A function which, given an ASCIIZ string, returns a function pointer, is typical of interpreters and debuggers. Here we find it at the core of an operating system.

Furthermore, OS/2 is providing more than just a functional interface to a symbol table. Ralph Griswold, inventor of the string-oriented programming languages SNOBOL and Icon, makes the point that run-time dynamic-linking "takes two forms: 1. Connecting the string name of a function with resident code for it -- that is, doing at run-time what usually is done at compile-time, and; 2. Actually loading non-resident code for a named function. The latter is, of course, more difficult to implement than the former." OS/2 provides the second variety of delayed linking, which is a superset of the first.

While OS/2 provides a method for turning ASCIIZ names into function pointers, possibly by loading the code for the function, it does not provide any method for actually calling the function. It provides "directory assistance," but won't dial the number for you. Instead, the function pointer returned from DosGetProcAddr should be passed, together with any arguments the procedure expects, to whatever facility your language provides for indirect far calls.

Using the OS/2 Module Manager

CALLDLL1.C (Listing One, page 102) is a short example of how this works. When this rather contrived example runs, it links to the file ALIAS.DLL. (ALIAS is a handy command-line editor for OS/2, written by Andrew Estes, and modelled after Chris Dunford's CED for MS-DOS.) CALLDLL1.C calls two procedures provided by ALIAS: First, a command-line synonym is added to ALIAS's synonym table, then we ask ALIAS to display the table.

If DosLoadModule returns anything other than 0, it was unable to load the requested module. It will also fill a buffer with the name of the offending module if you provide such a buffer (in CALLDLL1.C, I didn't). This sounds silly, because the offending module would seem to be the same as the module you asked to be loaded -- but that's not always the case; DLLs can call other DLLs. Note also that DosLoadModule should be passed the name of the module (that is, ALIAS), not the name of the file (for example, ALIAS.DLL), except when accessing a DLL not located along your LIBPATH. In that case use the full pathname, for example:

     "C:\\OS2\\ALIAS\\ALIAS.DLL"

In Listing One, I'm using hard-wired strings instead of hard-wired function pointers (LIST_SYN instead of List_Syn( )). This helps to illustrate how run-time dynamic linking works, but generally it is pointless to trade one form of inflexibility for another. Instead of embedding string literals like ALIAS and LIST_SYN directly in the code, usually we'll use a string-valued variable or an expression that yields a string.

This example isn't quite as foolish as it first appears. Had this program accessed List_Syn( ) using load-time dynamic linking, OS/2 would refuse to even load the program on a machine that didn't have ALIAS: "SYS1804: The system cannot find the file ALIAS." But by using run-time dynamic linking, the program itself detects the absence of the DLL and can respond in some appropriate way. This is an additional benefit of run-time dynamic linking over load-time ("eager") dynamic linking. Load-time dynamic links are hard-wired; they're actually not all that dynamic.

There is one "gotcha" involved with calling DosGetProcAddr, which trips everybody up at least once: It is case-sensitive. In Listing One, I passed the function name ADDSYN to DosGetProcAddr. DosGetProcAddr would have failed on "AddSyn" or "addsyn," returning ERROR_PROC_NOT_FOUND. The name you use must match exactly the name exported from the DLL. To get a listing of these names, you can use Microsoft's EXEHDR utility. Unfortunately there is no OS/2 API call to enumerate the procedures a DLL exports (though you can write enumproc ( ) yourself).

Generally, routines that use the Pascal calling convention will be exported in ALL CAPS, though Modula-2 compilers for OS/2 produce export names that look like Module$ProcName or Module_ProcName. DLL routines using the cdecl calling convention will generally be exported in _lowercase, with a leading underscore.
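The naming habits just described can be turned into a small helper that guesses the export name to try first. This is only a sketch in portable C: guess_export_name( ) and the convention enum are hypothetical names, not part of OS/2 or of the listings, and the rule it applies is the "generally" above, so EXEHDR remains the authority on what a DLL really exports.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: not an OS/2 call, just the rule of thumb in code. */
enum convention { CONV_PASCAL, CONV_CDECL };

/* Writes the likely export name for `name` into `out` (capacity `cap`).
   Pascal routines: name folded to upper case.
   cdecl routines:  leading underscore, case left as written.
   Returns 0 on success, -1 if the buffer is too small. */
int guess_export_name(const char *name, enum convention conv,
                      char *out, size_t cap)
{
    size_t len, i, prefix;

    assert(name != NULL && out != NULL);
    len = strlen(name);
    prefix = (conv == CONV_CDECL) ? 1 : 0;
    if (len + prefix + 1 > cap)
        return -1;
    if (conv == CONV_CDECL)
        out[0] = '_';
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) name[i];
        out[prefix + i] = (char) (conv == CONV_PASCAL ? toupper(c) : c);
    }
    out[prefix + len] = '\0';
    return 0;
}
```

With this, "AddSyn" under the Pascal convention becomes ADDSYN, and "printf" under cdecl becomes _printf. Names produced by Modula-2 compilers follow neither rule and would still have to be looked up.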

There's one other trick to using DosGetProcAddr. Unfortunately, while the OS/2 kernel masquerades as DOSCALLS.DLL, this pseudomodule does not export ASCIIZ names. Every other OS/2 module (including KBDCALLS and VIOCALLS) exports them, but DOSCALLS provides only "ordinal numbers."

Once you know the ordinal number for a DOSCALLS function, you can pass it to DosGetProcAddr in one of two forms. Take the example of DosGetProcAddr itself, whose ordinal number is 45:

     DosGetProcAddr(module, "#45",
         &dosgetprocaddr);
     dosgetprocaddr(module,
         MAKEP(0,45), &dosgetprocaddr);
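A front end that accepts either a name or an ordinal has to tell the two apart before it decides which form to hand to DosGetProcAddr. The sketch below is one possible way to do that in portable C. parse_ordinal( ) is a hypothetical helper; treating a bare decimal number as an ordinal, rejecting 0, and limiting ordinals to 16 bits are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper. Returns the ordinal encoded in `s` ("#45" or "45"),
   or 0 if `s` should be treated as an ordinary procedure name. */
unsigned parse_ordinal(const char *s)
{
    const char *p, *q;
    unsigned long value;

    assert(s != NULL);
    p = (*s == '#') ? s + 1 : s;      /* skip the optional '#' prefix */
    if (*p == '\0')
        return 0;                     /* "" or "#" alone is not an ordinal */
    for (q = p; *q != '\0'; q++)
        if (!isdigit((unsigned char) *q))
            return 0;                 /* any non-digit means it is a name */
    value = strtoul(p, NULL, 10);
    /* Assumption: ordinals fit in 16 bits and 0 is never valid. */
    return (value >= 1 && value <= 65535UL) ? (unsigned) value : 0;
}
```

A caller would pass the result through MAKEP(0, ordinal) when it is nonzero, and pass the original string unchanged otherwise.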

Aside from the annoying special case of DOSCALLS, DosGetProcAddr is, in effect, a "named" equivalent to the MS-DOS GetVect operation. But while DOS GetVect is complemented by SetVect, and likewise GetTrapAddress on the Macintosh is complemented by SetTrapAddress, in OS/2 there is no DosSetProcAddr. Instead, individual modules are responsible for supplying their own mechanism for plug-in replacements. The KBD-, MOU-, and VIO-based subsystems in OS/2 can be replaced using KbdRegister, MouRegister, and VioRegister. There is no such thing as DosRegister.

When finished with a module a program can call DosFreeModule. This is unnecessary if the program is about to terminate anyway. DosFreeModule decrements a reference count that OS/2 keeps for DLLs: When the count goes down to zero, the DLL is released from memory (remember that multiple applications may be using a DLL at the same time). After calling DosFreeModule, a program still has the function pointers returned from DosGetProcAddr, but they are no longer valid.
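The reference counting can be pictured with a toy model. Nothing here is OS/2 code: struct module, module_load( ), and module_free( ) are hypothetical stand-ins that only mimic the bookkeeping described above, under the assumption that each successful load adds one to the count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a DLL as the system might track it (hypothetical). */
struct module {
    const char *name;
    unsigned    refs;      /* number of outstanding loads */
    int         resident;  /* nonzero while the code is in memory */
};

/* Models a successful load: the first user brings the code into memory. */
void module_load(struct module *m)
{
    assert(m != NULL);
    if (m->refs == 0)
        m->resident = 1;
    m->refs++;
}

/* Models a free: returns 1 if this call released the module from memory,
   0 if other users still hold it. */
int module_free(struct module *m)
{
    assert(m != NULL && m->refs > 0);
    m->refs--;
    if (m->refs == 0) {
        m->resident = 0;   /* pointers obtained earlier are now dangling */
        return 1;
    }
    return 0;
}
```

In the model, two loads followed by one free leave the module resident; only the second free releases it. That is the point at which previously fetched function pointers stop being safe to call.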

While C is generally used to illustrate run-time dynlinks, similar programs can be written in any other language for which an OS/2 version is available. CALLDLL.MOD (Listing Two, page 102) is a Modula-2 equivalent to Listing One. I used JPI TopSpeed Modula-2 here. Because of the non-standardization of Modula-2, equivalent code for Stony Brook Modula-2 or Logitech Modula-2 would look slightly different. CALLDLL.LSP (Listing Three, page 102) is the same program again, this time written in OS2XLISP. This version is so short not only because of the expressive power of Lisp, but because OS2XLISP itself is built around run-time dynamic linking.

DosLoadModule and DosGetProcAddr are equally important. Usually the function name is viewed as more of a variable than the module name, but in fact it often works the other way around; the function name can stay the same while the module name varies. For instance, you might have the same function BitBlt( ) implemented in a VGA.DLL, an EGA.DLL, and so on. There is an analogy here to polymorphism in object-oriented programming, where the same generic operation can be applied to objects of different classes in an inheritance chain: The OS/2 module is like an OOPs class, and the functions within a module are like the selectors implemented by that class. Just as in OOPs, the run-time system transforms a class/selector pair into a single output, the method, so OS/2 run-time dynlinks transform a module/function name pair into a single output, the function pointer.
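The module/function pair can be imitated in portable C with two nested tables. The sketch is an analogy only: the "modules" are static arrays rather than DLLs, resolve( ) is a hypothetical name, and the two BITBLT stand-ins merely return how many bytes a run of pixels would occupy so that the difference between them is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*proc_fn)(int);

struct export { const char *name; proc_fn fn; };
struct module { const char *name; const struct export *exports; size_t count; };

/* Stand-ins for the same operation implemented by two different modules. */
static int vga_bitblt(int pixels) { return pixels; }            /* assumes 8 bits per pixel */
static int ega_bitblt(int pixels) { return (pixels + 1) / 2; }  /* assumes 4 bits per pixel */

static const struct export vga_exports[] = { { "BITBLT", vga_bitblt } };
static const struct export ega_exports[] = { { "BITBLT", ega_bitblt } };

static const struct module modules[] = {
    { "VGA", vga_exports, 1 },
    { "EGA", ega_exports, 1 },
};

/* Resolves a module/function name pair to a pointer, or NULL.
   Matching is case-sensitive, as with DosGetProcAddr. */
proc_fn resolve(const char *module, const char *proc)
{
    size_t m, e;

    assert(module != NULL && proc != NULL);
    for (m = 0; m < sizeof modules / sizeof modules[0]; m++) {
        if (strcmp(modules[m].name, module) != 0)
            continue;
        for (e = 0; e < modules[m].count; e++)
            if (strcmp(modules[m].exports[e].name, proc) == 0)
                return modules[m].exports[e].fn;
    }
    return NULL;
}
```

The calling code keeps "BITBLT" fixed and varies only the module string, for instance resolve("VGA", "BITBLT") on one machine and resolve("EGA", "BITBLT") on another, which is the class/selector relationship in miniature.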

A Higher Level

It's convenient to say that DosGetProcAddr takes an ASCIIZ name and returns the corresponding function pointer but, as Listings One and Two show, that's not exactly how it looks. Instead, as with all OS/2 kernel functions, DosGetProcAddr returns only an error code. Because most high-level languages provide only one retval, any other information, such as the function pointer we are actually interested in, must be passed back in VAR parameters.

We think of DosGetProcAddr in this way:

   FUNCPTR DosGetProcAddr
              (HANDLE module, ASCIIZ procname);

But the way OS/2 provides DosGetProcAddr actually looks like this:

   ERRCODE DosGetProcAddr
              (HANDLE module, ASCIIZ procname, FUNCPTR *p_procaddr);
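Bridging the two shapes takes only a thin wrapper. The following is a portable sketch, not the code in the listings: raw_get_proc_addr( ) is a made-up stand-in for an OS/2-style call that returns a status and delivers its result through a pointer, the error constant is a placeholder, and get_proc_addr( ) shows the wrapper that turns it into the shape we would rather think in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*funcptr)(void);
typedef unsigned errcode;

#define ERR_NONE            0u
#define ERR_PROC_NOT_FOUND  127u   /* placeholder value for this sketch */

static void hello(void) { }

/* Stand-in for the raw API: status comes back as the return value,
   the interesting result goes out through `out`. */
errcode raw_get_proc_addr(unsigned module, const char *name, funcptr *out)
{
    assert(name != NULL && out != NULL);
    if (module == 1 && strcmp(name, "HELLO") == 0) {
        *out = hello;
        return ERR_NONE;
    }
    return ERR_PROC_NOT_FOUND;
}

/* The shape "we think of": the pointer itself is returned, NULL on failure. */
funcptr get_proc_addr(unsigned module, const char *name)
{
    funcptr fn = NULL;

    return raw_get_proc_addr(module, name, &fn) == ERR_NONE ? fn : NULL;
}
```

The cost of the friendlier shape is that the specific error code is lost; a caller that needs it would have to keep using the raw form.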

Compared with MS-DOS functions, OS/2 functions look high-level. Whereas DOS functions are invoked by stuffing registers and doing an INT 21, OS/2 functions are invoked by putting arguments on the stack and doing a FARCALL. But this should not delude us into thinking that the OS/2 API actually is high-level; only the parameter-passing mechanism is.

There is no rule saying we have to use the OS/2 facilities in the "raw" form in which Microsoft and IBM provide them, however. In fact, aside from demo programs like those in Listings One and Two, OS/2 facilities should never be used directly. For reasons of portability, readability, maintainability, and various other "bilities," which I've forgotten, code that directly depends on OS/2 should be isolated from application-level code. Present company excluded, distrust any programmer who puts #include "os2.h" in the main module of their program. OS/2 may have that HLL look, but putting DosGetProcAddr( ) in the middle of main( ) is the same as putting int86(0x21, &r, &r) in the middle of main( ).

PROC1.C (Listing Four, page 102) provides a level of abstraction on top of the OS/2 Module Manager. This file #includes "os2.h," but a program that uses run-time dynlinks doesn't. Instead, it #includes "procaddr.h" (see Listing Five, page 102).

While the functions in Listing Four are only two lines each, they add considerable expressive power over the native OS/2 versions. Now, we can say things as shown in Example 1.

Example 1: Expressive functions

       WORD alias = loadmodule("ALIAS");
       PFN listsyn = (PFN) getprocaddr (alias, "LIST_SYN");
       if (listsyn)
           (*listsyn) ();
       freemodule (alias);

or:

       PFN listsyn;
       if (listsyn = (PFN) procaddr ("ALIAS", "LIST_SYN"))
           (*listsyn) ();

PFN is the typedef that OS/2 provides for pascal function pointers. Why does getprocaddr() in Listing Four instead return a ULONG (4-byte unsigned number)? The answer is: Partially as a reminder that a far function pointer is just a 4-byte number, but mainly because the Microsoft C compiler dislikes casting pascal function pointers to cdecl or vice versa, though it accepts casting between a function pointer and a ULONG (it correctly warns about "different levels of indirection").

The C program in Listing Six, page 102 (CALLDLL2.C) uses the higher-level routines in Listing Four; it dynamically links to _printf in CRTLIB.DLL, the "C run-time library in a DLL" provided with Microsoft C 5.1. The file size is a sure sign that something strange is going on here: We're calling printf() but CALLDLL2.EXE is only 1817 bytes! One odd property of CRTLIB.DLL is that you can run-time dynamically link to it only if you are also load-time dynamically linked to it; this is because an undocumented function in CRTLIB.DLL (_CRT_INIT) is used to initialize the C run-time library.

Note that CALLDLL2.C does not #include <stdio.h>, even though it calls printf( ). printf( ) is coming to us only at run-time. There's no point in telling the linker about it, much less the compiler and preprocessor! This point takes a while to sink in: the only reason CALLDLL2.C is able to use printf( ) is that it passed the string CRTLIB to loadmodule( ) and the string _printf to getprocaddr( ).

Some of the lines in CALLDLL2.C are numbered to make discussion easier. Line 1 shows the simple call (*printf)( ). Line 2 uses the newer ANSI C style for calling through a function pointer, in which pfn( ) is the same as (*pfn)( ). This matches Modula-2, in which the syntax for indirect calling through a procedure pointer is indistinguishable from a "normal" call. In Line 3, printf displays its own address, using %Fp. printf( ) returns the number of characters printed. Line 4 makes sure that we really can get this retval.

String Invocation

"The general format of a procedure call is

     expression(parameters)

Usually we use the declared procedure name as the expression, but there is nothing to prevent us from writing an arbitrarily complicated expression."

-- Martin Richards and Colin Whitby-Strevens BCPL: The Language and its Compiler (1980)

Line 5 of Listing Six is a little odd, but it best represents what we're actually doing:

  ((CFN) getprocaddr(loadmodule ("CRTLIB"),"_printf"))("Goodbye");

The pointer to printf is just the retval from getprocaddr; we call printf through this function retval. The assembly equivalent is shown in Example 2. What's going on here is "string invocation." Essentially, we're not calling printf through a pointer (which is just a temporary unnamed value on the stack), we're calling printf through its ASCIIZ name. The string function names that appear in "string invocation" are part of the larger class of objects, "executable strings" (the DDE_EXECUTE strings used in OS/2 and Windows Dynamic Data Exchange are executable strings). Syntactic sugar for this might be: "CRTLIB._printf" ("Goodbye");

Example 2: Assembly language equivalent

  PUSH "Goodbye"
  PUSH "_printf"
  PUSH "CRTLIB"
  CALL loadmodule
  ; loadmodule consumed "CRTLIB"
  ; and produced handle to crtlib
  CALL getprocaddr
  ; getprocaddr consumed crtlib-handle and "_printf"
  ; and produced pointer to printf on top of stack
  ; "Goodbye" is still on stack
  CALL [top of stack]
  POP retval from _printf

Because () is a binary operator between a function and the set of its arguments, this weird syntax could be implemented in C++ by overloading the ( ) function-invocation operator. Actually, the Icon programming language (successor to SNOBOL) offers just this syntax, in which "write" ("Goodbye") is equivalent to write("Goodbye").

At this point, we had better ask again, "Of what possible use is that?" Because run-time dynamic linking means that an ASCIIZ string can be used in place of the name of a procedure, any expression that yields such a string can also be used. The procedure to be invoked is just as much a variable as the arguments it takes.

The module name and function name are just strings, so why hard-wire them into the executable? Why not pass them on the command line? Let argv[1] be passed to loadmodule( ) and argv[2] be passed to getprocaddr( )! Let argv[3..argc] hold any arguments expected by the function whose name we've put in argv[2]. The result is a general-purpose program in which the function to be invoked and the arguments to be passed to it are completely up in the air until the program runs. This is exactly what we're going to do: Write a mini C interpreter, in very few lines of code, using OS/2 run-time dynlinks.

Mini Tiny Small C

The mini-interpreter's syntax is shown in Figure 1, where [args . . .] are zero or more arguments to the function, and each one is either a string, a character, an unsigned word, a long, or a float. CALLDLL.C (Listing Seven, page 102) uses some dumb, but generally effective, rules to figure out the type of an argument. [%mask] is an optional printf( ) mask that both designates the type of the function's return value, and displays that return value. For instance, %s tells CALLDLL that a function returns a 4-byte value, which should be printed out as a string. The default retval mask is %u, which gets a two-byte retval and displays it as an unsigned word. Example 3 lists some legal calls to the interpreter.

Figure 1: Syntax for the mini-interpreter

  calldll <module name> <function name or ordinal number> [args...] [%mask]

Example 3: Legal calls to the interpreter

  calldll viocalls VIOWRTTTY "hello world" 11 0
  calldll doscalls DosBeep 2000 300
  calldll doscalls 50 2000 300                     ; DOSBEEP
  calldll doscalls DosMkDir \foobar 0L
  calldll doscalls DosRmDir \foobar 0L
  calldll pmwin WINQUERYACTIVEWINDOW 1L 0 %lu
  calldll crtlib _printf "goodbye world: %lu" 666L
  calldll crtlib SQRT -1.0 %f
  calldll crtlib _toupper 'b' %c
  calldll jpilib FIO$Exists 12 CALLDLL.EXE

This is truly a general-purpose program: It can write to the screen, beep the speaker, get the square root of -1, get the HWND of the active window in PM, or perform pretty much any action put in a DLL.

For the most part, CALLDLL is an ignorant "pass through." The intelligence for associating the strings VIOCALLS and VIOWRTTTY with the function VioWrtTty( ), for example, is located entirely within OS/2's module manager. CALLDLL blindly passes the first two arguments on the command-line to DosLoadModule and DosGetProcAddr.

Ignorance is bliss: The more ignorant the pass-through is, the more powerful. Because CALLDLL does no interpretation on module and function names, it can be used to call functions that didn't even exist when CALLDLL was compiled. This is the same flexibility as provided by "late binding" in OOPs: Instead of using a switch( ) statement associating specific actions with various run-time symbols, a program simply passes the symbol through and the system determines what action to call.

Now, if CALLDLL represents the ideal dumb pass-through, what are those switch( ) statements in Listing Seven?

While the module and function names are passed through to OS/2, the function's args and retval are a different matter. Once OS/2 has given us back a function pointer, we're on our own. Almost all the code in CALLDLL.C is devoted to pushing arguments on the stack before the function is called, and getting back a retval afterwards.

Here we run into a problem with all high-level languages: There is no completely general procedure pointer. C (actually, pre-ANSI C without prototypes) can easily handle functions with different numbers and types of arguments gated through the same function pointer, but even C won't multiplex functions with different types of return values through the same function pointer. Pushing a function's arguments, calling the function and getting its return value are too tightly coupled.

In the underlying assembly language there is a completely general function pointer, because the various components of function calling are clearly separated:

   PUSH param(s)
   PUSH func
   CALL [top of stack]
   RETRIEVE retval(s)

Forth is probably the only HLL that naturally splits function calls up in this way. In order to write the mini-interpreter, I needed to simulate this in C.

Jiggling the Stack

The first step is to write a function to push its argument on the stack and to leave it there. In his book, The Programmer's Essential OS/2 Handbook, David Cortesi shows how to write such a function for Pascal: "The gimmick is simple: return without clearing the stack!" To do this in C, the function uses the Pascal calling convention. Here is the entire function:

     VOID NEAR PASCAL push( ) {}

Given C's relaxed type checking, this can be called with any type of argument. The compiler dutifully pushes the argument onto the stack and there it stays.

To handle multiple arguments, CALLDLL.C uses a "push loop:" push( ) is called for each command-line argument, working upwards for the Pascal calling convention and downwards for cdecl. Note that push() must be called from within the same function that's going to make the indirect call to consume the arguments on the stack. For this reason, CALLDLL.C calls push( ) from within a PUSH_ARG( ) macro that gets expanded inline.
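The direction of the push loop is easy to get backwards, so it may help to see it in isolation. This sketch does no stack manipulation at all; push_order( ) is a hypothetical helper that only computes the sequence in which argument indices would be pushed, on the usual understanding that Pascal pushes left to right and cdecl right to left.

```c
#include <assert.h>
#include <stddef.h>

enum convention { CONV_PASCAL, CONV_CDECL };

/* Fills order[0..n-1] with argument indices in the sequence they
   would be pushed for the given calling convention. */
void push_order(enum convention conv, size_t n, size_t *order)
{
    size_t i;

    assert(order != NULL || n == 0);
    for (i = 0; i < n; i++)
        order[i] = (conv == CONV_PASCAL) ? i : n - 1 - i;
}
```

For three arguments this yields 0, 1, 2 for Pascal and 2, 1, 0 for cdecl.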

Now the arguments are on the stack. To get the correct return value, our generic function pointer f must be cast to the appropriate type of function pointer. It is called with no arguments because its arguments are already on the stack as shown in Example 4.

Example 4: Arguments already on the stack

  switch (retval_typ)
  {
      case typ_string:    printf(mask, ((STRFN) f) ()); break;
      case typ_word:      printf(mask, f()); break;
      ...
  }

CALLDLL does nothing with f( )'s return value except print it (using whatever printf mask the user supplied). The invocation of f( ) takes place as an argument to printf.
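The cast-before-call half of this trick survives in standard C, even though the pre-pushed arguments do not. The sketch below stores functions with different return types behind one generic pointer type and casts back to the matching type at the call site. All names in it are hypothetical, and the two sample functions take no arguments precisely because portable C offers no way to leave arguments on the stack ahead of time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*generic_fn)(void);          /* type-erased function pointer */
typedef const char *(*str_fn)(void);
typedef unsigned (*word_fn)(void);

enum ret_type { RET_STRING, RET_WORD };

static const char *greeting(void) { return "hello"; }
static unsigned answer(void)      { return 42u; }

/* Calls f through the pointer type that matches its real return type
   and formats the result into buf. Returns what snprintf returns,
   or -1 for an unknown type tag. */
int call_and_format(generic_fn f, enum ret_type type, char *buf, size_t cap)
{
    assert(f != NULL && buf != NULL);
    switch (type) {
    case RET_STRING: return snprintf(buf, cap, "%s", ((str_fn) f)());
    case RET_WORD:   return snprintf(buf, cap, "%u", ((word_fn) f)());
    }
    return -1;
}
```

The type tag plays the role of the user's printf mask: it is the only thing that tells the caller which cast is the right one, and a wrong tag gives undefined behavior rather than a diagnostic.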

Finally, it is the responsibility of the caller of a cdecl function to pop the arguments off the stack, so we need a pop( ) function. I couldn't write this in C, so it is supplied by POP.ASM (Listing Eight, page 104).

Naming DOSCALLS

I said earlier that the DOSCALLS pseudomodule (the API exported by the OS/2 kernel) is an anomaly in that its routines (such as DosGetProcAddr, DosAllocSeg, or DosOpen) do not have ASCIIZ names present at run-time.

You can easily correct this deficiency. Instead of: calldll DOSCALLS 50 2000 300, you can say: calldll DOSCALLS DosBeep 2000 300.

PROC2.C (Listing Nine, page 104) shows a new version of the higher-level run-time dynamic-linking routines. Solving the DOSCALLS problem is another reason to have our own level on top of OS/2 -- we can use it to iron out such inconsistencies. PROC2.H is the external interface to PROC2.C, and is shown in Listing Ten (page 106).

In the new version of getprocaddr( ), if an ASCIIZ name is passed in for a function in DOSCALLS, the function getdoscall( ) does a binary search of a table that associates ASCIIZ names with function pointers.
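A lookup of that kind can be sketched with the standard library's bsearch( ). This is not the code in Listing Nine: the table holds three dummy entries instead of the real DOSCALLS functions, get_doscall( ) is a hypothetical name, and matching here is case-sensitive for simplicity, which may not be how the real getdoscall( ) behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned (*doscall_fn)(void);

struct doscall { const char *name; doscall_fn fn; };

/* Dummy stand-ins so the sketch compiles without OS/2 headers. */
static unsigned fake_beep(void)  { return 1; }
static unsigned fake_mkdir(void) { return 2; }
static unsigned fake_open(void)  { return 3; }

/* Must stay sorted by name in strcmp order for bsearch to work. */
static const struct doscall table[] = {
    { "DosBeep",  fake_beep  },
    { "DosMkDir", fake_mkdir },
    { "DosOpen",  fake_open  },
};

/* bsearch comparator: the key is a C string, the element a table row. */
static int compare_name(const void *key, const void *elem)
{
    return strcmp((const char *) key, ((const struct doscall *) elem)->name);
}

/* Returns the function registered under `name`, or NULL if absent. */
doscall_fn get_doscall(const char *name)
{
    const struct doscall *hit;

    assert(name != NULL);
    hit = bsearch(name, table, sizeof table / sizeof table[0],
                  sizeof table[0], compare_name);
    return hit ? hit->fn : NULL;
}
```

Because the table is generated mechanically, keeping it sorted is the generator's job; an unsorted table would make bsearch( ) miss entries silently.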

Where does this table come from? (Warning! Extremely boring material ahead!) Listing Eleven (page 106) is an AWK script that massages the file BSEDOS.H into a table of DOSCALLS. The AWK output, DOSCALLS.C, is not reproduced here, because it is boring. If you don't have AWK but want to produce your own DOSCALLS.C, you can use any decent text editor on a copy of BSEDOS.H. The thing is to turn lines like:

  USHORT APIENTRY DosGetProcAddr(HMODULE, PSZ, PPFN);

into lines like:

"DosGetProcAddr", DosGetProcAddr,
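For readers without AWK, the same transformation is a few lines of C. The sketch below is not a translation of Listing Eleven; prototype_to_entry( ) is a hypothetical function that simply takes the identifier standing immediately before the opening parenthesis and emits it twice, once quoted. It makes no attempt to skip prototypes that do not belong to DOSCALLS.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Extracts the identifier immediately before '(' in a prototype line and
   writes a table entry of the form  "Name", Name,  into out.
   Returns 0 on success, -1 if no name is found or out is too small. */
int prototype_to_entry(const char *line, char *out, size_t cap)
{
    const char *paren, *end, *start;
    int len, written;

    assert(line != NULL && out != NULL);
    paren = strchr(line, '(');
    if (paren == NULL)
        return -1;                         /* not a prototype line */
    end = paren;
    while (end > line && isspace((unsigned char) end[-1]))
        end--;                             /* tolerate "Name (" */
    start = end;
    while (start > line &&
           (isalnum((unsigned char) start[-1]) || start[-1] == '_'))
        start--;                           /* walk back over the identifier */
    if (start == end)
        return -1;
    len = (int) (end - start);
    written = snprintf(out, cap, "\"%.*s\", %.*s,", len, start, len, start);
    return (written >= 0 && (size_t) written < cap) ? 0 : -1;
}
```

Run over each line of a copy of BSEDOS.H, with failures ignored, this produces the body of the table; sorting the output is a separate step.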

Unfortunately this table introduces a touch of "early binding" into an otherwise wait-until-the-last-possible-moment-to-do-it program.

A DLL for Handling DLLs

There is one last step in the development of the routines in PROC2.C: Put them in a DLL. Listing Twelve (page 106) shows PROCADDR.DEF, which defines the DLL for the OS/2 linker. All the data in PROCADDR.DLL is shared -- multiple clients of PROCADDR.DLL use the same version of the DOSCALLS table.

Once PROC2.C is in a DLL, we can even call these routines from the CALLDLL command line. For those who like this sort of thing, this is the sort of thing that they like:

C>calldll PROCADDR PROCADDR PROCADDR PROCADDR %Fp
044F:00A0

This calls the function procaddr( ) in PROCADDR.DLL, passing it the parameters PROCADDR and PROCADDR, so that the address of procaddr( ) itself is printed out.

Because it is extremely stupid, CALLDLL is general-purpose and powerful. On the other hand, because CALLDLL knows nothing about the routines that it calls, it can't check the arguments passed on the command-line. For example, CALLDLL DOSCALLS DOSGETINFOSEG 1L 2L is wrong. But because OS/2 runs under protected mode this isn't too bad. Instead of crashing your machine, OS/2 just terminates CALLDLL and displays the ever-popular GP Fault dump.

If this were a genuine interpreter that tried to execute more than one line of user code, this would be a big problem. Even though the error lies in the user's code, OS/2 terminates the interpreter because, as far as it's concerned, that's what caused the general-protection violation. It would seem the only choice for OS/2 interpreters is either to verify all user input, or to let OS/2 close you down because of bad user input. In fact, there is a way to catch GP faults in OS/2, using DosPTrace( ), but this is a subject for another article.

Other improvements could be made to CALLDLL: Interpreting multiple lines of code from a file, putting function results into variables, taking the address of variables (for example, how would one properly call DosGetInfoSeg?). These are left as an exercise for the reader.

Extensible OS/2

Richard Stallman's original article on Emacs ("EMACS: The Extensible, Customizable, Self-Documenting Display Editor," Proceedings of the ACM SIGPLAN SIGOA Symposium on Text Manipulation, June, 1981) is sort of the manifesto of extensible systems. One of the requirements he sets forth is that "an on-line extensible system must be able to accept, and then execute, new code while it is running." Stallman goes on to argue that what one needs is a good language (Lisp, for example) rather than a good operating system with dynamic linking (such as Multics). In fact, there is no conflict here, because OS/2 is as extensible an operating system as Lisp is a language.

Compiling and linking are processes of throwing away information. But dynamic linking relies on the presence at run time of names usually discarded in compiling or linking. Keeping the ASCIIZ names of functions around in DLLs means that executables under OS/2 have moved a little closer to the type of "object code" found in interpreted environments and debuggers.

Should OS/2 succeed, run-time dynlinks will play an important role in the construction of extensible languages and of products with embedded languages and add-in facilities. For a system built around add-in DLLs, run-time dynamic linking is the ultimate Add-In Manager.

_LINKING WHILE THE PROGRAM IS RUNNING_ by Andrew Schulman

[LISTING ONE]


/*
calldll1.c -- run-time dynamic linking to Estes's ALIAS.DLL
cl -Lp calldll1.c
*/

#include <stdlib.h>
#include <stdio.h>
#define INCL_DOSMODULEMGR
#include "os2.h"

#define NIL             ((void far *) 0)

void fail(char *msg)    { puts(msg); exit(1); }

void main()
{
    unsigned (far pascal *addsyn)(char far *msg);
    unsigned (far pascal *listsyn)(void);
    unsigned alias;

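    if (DosLoadModule(NIL, 0, "ALIAS", &alias) != 0)
        fail("can't find ALIAS");
    /* look up both entry points by name; don't call through a bad pointer */
    if (DosGetProcAddr(alias, "ADDSYN", &addsyn) != 0 ||
        DosGetProcAddr(alias, "LIST_SYN", &listsyn) != 0)
        fail("can't find ADDSYN or LIST_SYN");
    (*addsyn)("ep \\os2\\eps\\epsilon");
    (*listsyn)();
    DosFreeModule(alias);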
}






[LISTING TWO]

MODULE calldll;
(* JPI TopSpeed Modula-2 for OS/2 *)
(* run-time dynamic linking to Estes's ALIAS.DLL *)

FROM InOut IMPORT WriteString, WriteLn;
IMPORT Dos;

PROCEDURE fail (msg : ARRAY OF CHAR);
BEGIN
    WriteString(msg); WriteLn; HALT;
END fail;

VAR
    addsyn : PROCEDURE (ADDRESS) : CARDINAL;
    listsyn : PROCEDURE () : CARDINAL;
    alias : CARDINAL;
    ret : CARDINAL;     (* ignored retval *)

BEGIN
    IF (Dos.LoadModule(NIL, 0, "ALIAS", alias) # 0) THEN
        fail("can't find ALIAS");
    END;
    ret := Dos.GetProcAddr(alias, "ADDSYN", PROC(addsyn));
    ret := Dos.GetProcAddr(alias, "LIST_SYN", PROC(listsyn));

    (* In the next line, the string _must_ be passed as an
    ADDRESS, not as an ARRAY OF CHAR:  Modula-2 passes open
    arrays as _six_ bytes on the stack -- two bytes for the
    length, followed by the address of the array itself --
    but OS/2 DLLs generally expect only the string itself
    (zero-terminated of course). *)

    ret := addsyn(ADR("ep \os2\eps\epsilon"));
    ret := listsyn();
    ret := Dos.FreeModule(alias);
END calldll.







[LISTING THREE]

; calldll.lsp
; OS2XLISP run-time dynamic linking to Estes's ALIAS.DLL

(define alias (loadmodule "ALIAS"))
(if (zerop alias)
    (error "can't find ALIAS"))
(call (getprocaddr alias "ADDSYN") "ep \os2\eps\epsilon")
(call (getprocaddr alias "LIST_SYN"))
(freemodule alias)






[LISTING FOUR]

/*
proc1.c -- implements higher-level access to OS/2 run-time dynlinks
cl -c -Lp proc1.c
*/

#define INCL_DOSMODULEMGR
#include "os2.h"
#include "procaddr.h"

#define NIL         ((void far *) 0)

WORD loadmodule(ASCIIZ name)
{
    WORD h;
    return DosLoadModule(NIL, 0, name, (PHMODULE) &h) ? 0 : h;
}

ULONG getprocaddr(WORD module, ASCIIZ name)
{
    ULONG pf;
    return DosGetProcAddr(module, name, (PPFN) &pf) ? 0 : pf;
}

ULONG procaddr(ASCIIZ module, ASCIIZ name)
{
    return getprocaddr(loadmodule(module), name);
}

BOOL freemodule(WORD h)
{
    return (! DosFreeModule(h));
}






[LISTING FIVE]

/*
procaddr.h -- higher-level access to OS/2 run-time dynlinks
*/

typedef unsigned WORD;
typedef unsigned short BOOL;
typedef unsigned long ULONG;
typedef char *ASCIIZ;

WORD loadmodule(ASCIIZ name);
ULONG getprocaddr(WORD module, ASCIIZ name);
ULONG procaddr(ASCIIZ module, ASCIIZ name);
BOOL freemodule(WORD handle);








[LISTING SIX]

/*
calldll2.c -- run-time dynamic linking to CRTLIB.DLL, using PROC1.C
requires MSC 5.1 CRTEXE.OBJ

cl -AL -c calldll2.c proc1.c
link /nod/noi crtexe.obj calldll2 proc1,calldll2,,crtlib.lib os2;

output:
    Hello from calldll2
    Hello again, using new ANSI C style
    _printf lives at 03EF:1098
    printf returned 27
    Goodbye
*/

#include "procaddr.h"

typedef ULONG (far cdecl *CFN)();

main(int argc, char *argv[])
{
    WORD (far cdecl *printf)();
    WORD crtlib;
    WORD ret;

    crtlib = loadmodule("CRTLIB");
    printf = (CFN) getprocaddr(crtlib, "_printf");
    (*printf)("Hello from %s\n", argv[0]);                            /* 1 */
    printf("Hello again, using new ANSI C style\n");                  /* 2 */
    ret = printf("_printf lives at %Fp\n", printf);                   /* 3 */
    printf("printf returned %d\n", ret);                              /* 4 */
    ((CFN) getprocaddr(loadmodule("CRTLIB"),"_printf"))("Goodbye");   /* 5 */
    freemodule(crtlib);
}







[LISTING SEVEN]

/*
calldll3.c -- run-time dynamic linking from the command-line
requires MSC 5.1 CRTEXE.OBJ, uses proc2.obj or procaddr.dll
doesn't include "os2.h"

to use proc2.obj:
cl -AL -c -Gs2 -Ox -W2 calldll3.c proc2.c
link /nod/noi crtexe.obj calldll3 proc2,calldll3.exe,,crtlib.lib os2.lib;

to use procaddr.dll (IMPLIB procaddr.lib):
cl -AL -c -Gs2 -Ox -W2 calldll3.c
link /nod/noi crtexe.obj calldll3,calldll3,,procaddr.lib crtlib.lib os2.lib;

to run:
calldll3 <module name> <function name or ordinal number> [args...] [%mask]
examples:
calldll3 VIOCALLS VIOWRTTTY "hello world" 5 0
calldll3 doscalls DosMkDir \foobar 0L
calldll3 doscalls DosRmDir \foobar 0L
calldll3 DOSCALLS DosBeep 2000 300
calldll3 DOSCALLS 50 2000 300                       ; DosBeep
calldll3 CRTLIB _printf "goodbye world: %lu" 666L " [%d]"
calldll3 CRTLIB ACOS -1.0 %.15f
calldll3 CRTLIB SQRT -1.0 %f
calldll3 CRTLIB _toupper 'b' %c
calldll3 PROCADDR LOADMODULE PROCADDR %X
*/

#include <mt\stdlib.h>
#include <mt\stdio.h>
#include <mt\string.h>

#include "local.h"
#include "proc2.h"

typedef enum { typ_string, typ_byte, typ_word, typ_long, typ_float } TYPE;

TYPE NEAR type(char *arg);
TYPE NEAR retval_type(char *s);

VOID fail(char *msg) { puts(msg); exit(1); }

/*
    push() : see Cortesi, Programmer's Essential OS/2 Handbook, pp.136-137
*/
VOID NEAR PASCAL push() { }
extern WORD pop(void);

#define PUSH_ARG(arg)   \
{   \
    switch (type(arg))  \
    {   \
        case typ_string:    push(arg);          c += 2; break;  \
        case typ_byte:      push(arg[1]);       c += 1; break;  \
        case typ_word:      push(atoi(arg));    c += 1; break;  \
        case typ_long:      push(atol(arg));    c += 2; break;  \
        case typ_float:     push(atof(arg));    c += 4; break;  \
    }   \
}

#define SYNTAX_MSG  \
    "syntax: calldll3 <module name> <func name or ord#> [args...] [%mask]"

main(int argc, char *argv[])
{
    FN f;
    TYPE retval_typ = typ_word;
    char *mask = "%u";
    WORD module;
    BOOL is_cdecl;
    int i, c;

    if (argc < 3)
        fail(SYNTAX_MSG);

    /* handle optional printf mask */
    if (strchr(argv[argc-1], '%'))
        retval_typ = retval_type(mask = argv[--argc]);

    if ((module = loadmodule(argv[1])) == 0)
        fail("can't load module");

    /* pass ASCIIZ string or ordinal number (cast so both arms of ?: agree) */
    f = getprocaddr(module, isdigit(argv[2][0])
                                ? (ASCIIZ) atol(argv[2]) : (ASCIIZ) argv[2]);
    if (! f)
        fail("can't get function");

    is_cdecl = ! (strcmp(strupr(argv[1]), "CRTLIB"));

    /* push in reverse order for cdecl */
    if (is_cdecl)
    {
        for (i=argc-1, c=0; i>=3; i--)
            PUSH_ARG(argv[i]);
    }
    else
    {
        for (i=3; i<argc; i++)
            PUSH_ARG(argv[i]);
    }

    /* args are on the stack : call (*f)() and print retval */
    switch (retval_typ)
    {
        case typ_string:    printf(mask, ((STRFN) f)()); break;
        case typ_byte:      printf(mask, ((BYTEFN) f)()); break;
        case typ_word:      printf(mask, f()); break;
        case typ_long:      printf(mask, ((LONGFN) f)()); break;
        case typ_float:     printf(mask, ((FLOATFN) f)()); break;
    }

    if (is_cdecl)
        for (i=0; i<c; i++)
            pop();

    freemodule(module);
    return 0;
}

/*
    type() uses some dumb rules to determine the type of an argument:
        if first character of arg is a digit or '-'
            and if arg contains '.' then it's a floating-point number
            else if last character is an 'L' then it's a long
            else it's an unsigned word
        else if first character is an apostrophe
            it's a single-byte character
        otherwise
            it's a string
*/
TYPE NEAR type(char *arg)
{
    if (isdigit(arg[0]) || (arg[0] == '-' && isdigit(arg[1])))
    {
        char *p = arg;
        while (*p)
            if (*p++ == '.')
                return typ_float;
        return (*--p == 'L') ? typ_long : typ_word;
    }
    else
        return (arg[0] == '\'') ? typ_byte : typ_string;
}

/*
    retval_type() uses a printf() mask (e.g., %s or %lX) to determine
    type of return value
*/
TYPE NEAR retval_type(char *s)
{
    while (*s)
    {
        switch (*s)
        {
            case 's' :  return typ_string; break;
            case 'c' :  return typ_byte; break;
            case 'p' : case 'l' : case 'I' : case 'O' : case 'U' :
                        return typ_long; break;
            case 'e' : case 'E' : case 'f' : case 'g' : case 'G' :
                        return typ_float; break;
        }
        s++;
    }

    /* still here */
    return typ_word;
}






[LISTING EIGHT]

; pop.asm

DOSSEG
.MODEL large
.CODE pop_text
PUBLIC _pop, _sp

_pop proc far
    ; save away far return address
    pop cx
    pop bx
    ; pop word off stack and return it in AX
    pop ax
    ; push far return address back on stack
    push bx
    push cx
    ret
_pop endp

; useful for testing
_sp proc far
    mov ax,sp
    ret
_sp endp

end





[LISTING NINE]

/*
proc2.c

to make procaddr.dll:
cl -Alfu -c -Gs2 -Ox -W2 -DDLL proc2.c
link /nod/noi proc2,procaddr.dll,,llibcdll.lib os2,procaddr.def;
implib procaddr.lib procaddr.def
copy procaddr.dll \os2\dll
*/

#include <string.h>

#ifdef DLL
int _acrtused = 0;
#endif

#define INCL_DOS
#include "os2.h"

#include "local.h"
#include "proc2.h"

typedef struct {
    char *name;
    USHORT (APIENTRY *f)();
    } DOSCALLS;

/*
    include table generated from BSEDOS.H with AWK script DOSCALLS.AWK
    table looks like:
        LOCAL DOSCALLS NEAR dos[] = {
            "", 0,
            ...
            "DosGetHugeShift", DosGetHugeShift,
            "DosGetInfoSeg", DosGetInfoSeg,
            ...
            } ;
    DOSCALLS.C also contains #define NUM_DOSCALLS
*/
#include "doscalls.c"

LOCAL FN NEAR getdoscall(ASCIIZ name);
LOCAL USHORT NEAR doscalls = 0;

WORD pascal loadmodule(ASCIIZ name)
{
    WORD h;
    return DosLoadModule((void far *) 0, 0, name, (PHMODULE) &h) ? 0 : h;
}

/*
    if name is actually a four-byte ordinal number, use it as is
    otherwise if module is not DOSCALLS, use it as is
    otherwise if module is DOSCALLS, get ordinal number and use it instead
*/
FN pascal getprocaddr(WORD module, ASCIIZ proc)
{
    FN f;

    if (! doscalls) doscalls = loadmodule("DOSCALLS");

    if ((module == doscalls) && FP_SEG(proc))
        return getdoscall(proc);
    else
        return DosGetProcAddr(module, proc, (PPFN) &f) ? 0 : f;
}

FN pascal procaddr(ASCIIZ module, ASCIIZ proc)
{
    return getprocaddr(loadmodule(module), proc);
}

BOOL pascal freemodule(WORD h)
{
    return (! DosFreeModule(h));
}

/*
    do binary search of table, looking for name, returning function ptr
 */
LOCAL FN NEAR getdoscall(ASCIIZ name)
{
    signed cmp, mid;
    signed base = 1, top = NUM_DOSCALLS+1;

    name = strupr(name);

    for (;;)
    {
        mid = (base + top) / 2;
        cmp = strcmp(name, strupr(dos[mid].name));

        if      (cmp == 0)      return (FN) dos[mid].f;
        else if (mid == base)   return 0;
        else if (cmp < 0)       top = mid;
        else if (cmp > 0)       base = mid;
    }
}





[LISTING TEN]

/*
proc2.h
*/

extern WORD pascal loadmodule(ASCIIZ name);
extern FN pascal getprocaddr(WORD module, ASCIIZ proc);
extern FN pascal procaddr(ASCIIZ module, ASCIIZ proc);
extern BOOL pascal freemodule(WORD handle);






[LISTING ELEVEN]

# doscalls.awk
# creates doscalls.c from bsedos.h
# doscalls.c is #included by proc2.c
# C>sort -b +2 \os2\inc\bsedos.h | awk -f doscalls.awk > doscalls.c

# bsedos.h contains prototypes such as:
#   USHORT APIENTRY DosCreateThread(PFNTHREAD, PTID, PBYTE);
# doscalls.awk turns these into string name/function ptr pairs:
#   "DosCreateThread", DosCreateThread,

BEGIN                           { init() }

END                             { fini() }

$2 ~ /APIENTRY/ && $3 ~ /Dos/   { doscall($3) }

function init() {
    print "/* doscalls.c */"
    print "LOCAL DOSCALLS NEAR dos[] = {"
    print "\"\",\t0,"
    }

function fini() {
    print "} ; "
    print "#define NUM_DOSCALLS\t", num_doscalls
    }

function doscall(s) {
    gsub(/\(/, " ", s)                  # replace open paren with space
    split(s, arr)                       # tokenize
    print "\"" arr[1] "\", " arr[1] "," # print with and without quotes
    num_doscalls++
    }







[LISTING TWELVE]

; procaddr.def

LIBRARY PROCADDR
DESCRIPTION 'Run-Time Dynamic Linking'
DATA SINGLE SHARED
PROTMODE
EXPORTS
    LOADMODULE
    GETPROCADDR
    PROCADDR
    FREEMODULE




[LISTING THIRTEEN]

/*
local.h -- miscellaneous definitions
*/

typedef unsigned short WORD;
typedef unsigned short BOOL;
typedef char far *ASCIIZ;
typedef unsigned long ULONG;
typedef double FLOAT;
typedef WORD (far *FN)();
typedef ASCIIZ (far *STRFN)();
typedef char (far *BYTEFN)();
typedef WORD (far *WORDFN)();
typedef ULONG (far *LONGFN)();
typedef FLOAT (far pascal *FLOATFN)();

#define FP_SEG(p)   ((WORD) ((ULONG) (p) >> 16))
#define FP_OFF(p)   ((WORD) (p))

#define isdigit(c)  ((c) >= '0' && (c) <= '9')

#ifndef NEAR
#define NEAR        near
#define PASCAL      pascal
#define VOID        void
#endif

#define LOCAL       static


Example 1: Expressive functions

        WORD alias = loadmodule("ALIAS");
        PFN listsyn = (PFN) getprocaddr(alias, "LIST_SYN");
        if (listsyn)
            (*listsyn)();
        freemodule(alias);

or:

        PFN listsyn;
        if (listsyn = (PFN) procaddr("ALIAS", "LIST_SYN"))
            (*listsyn)();



Example 2: Assembly language equivalent

        PUSH "Goodbye"
        PUSH "_printf"
        PUSH "CRTLIB"
        CALL loadmodule
        ; loadmodule consumed "CRTLIB"
        ; and produced handle to crtlib
        CALL getprocaddr
        ; getprocaddr consumed crtlib-handle and "_printf"
        ; and produced pointer to printf on top of stack
        ; "Goodbye" is still on stack
        CALL [top of stack]
        POP retval from _printf


Example 3: Legal calls to the interpreter

        calldll viocalls VIOWRTTTY "hello world" 11 0
        calldll doscalls DosBeep 2000 300
        calldll doscalls 50 2000 300                       ; DOSBEEP
        calldll doscalls DosMkDir \foobar 0L
        calldll doscalls DosRmDir \foobar 0L
        calldll pmwin WINQUERYACTIVEWINDOW 1L 0 %lu
        calldll crtlib _printf "goodbye world: %lu" 666L
        calldll crtlib SQRT -1.0 %f
        calldll crtlib _toupper 'b' %c
        calldll jpilib FIO$Exists 12 CALLDLL.EXE


Example 4: Arguments already on the stack

        switch (retval_typ)
        {
            case typ_string:    printf(mask, ((STRFN) f)()); break;
            case typ_word:      printf(mask, f()); break;
            ...
        }


Example 5: Associating ASCIIZ names with function pointers

        LOCAL DOSCALLS NEAR dos[] = {
            "", 0,
            ...
            "DosGetProcAddr", DosGetProcAddr,
            "DosGetPrty", DosGetPrty,
            ...
            } ;











Copyright © 1989, Dr. Dobb's Journal

