A DynaCall() Function for Win32

By Ton Plooy, August 01, 1998


You can use LoadLibrary() and GetProcAddress() to dynamically load a DLL, obtain the address of one of its functions, and call that function:
/* MessageBoxA is a __stdcall function, so the pointer type must say WINAPI */
typedef int (WINAPI *MSGBOX)(HWND,LPCSTR,LPCSTR,UINT);
HINSTANCE Lib = LoadLibrary("user32.dll");
FARPROC   Func= GetProcAddress(Lib, "MessageBoxA");
MSGBOX    MsgBox = (MSGBOX)Func;
MsgBox(NULL, "Dynamic message!", "test", MB_OK);
But you must know in advance what function you plan to call in order to correctly declare its arguments; that’s the purpose of the typedef in this example. That’s not normally an important restriction, since most programs don’t need to make a runtime call to a function that was not anticipated at compile time.

There is one class of application that needs to make calls to functions that were unknown at compile time: interpreted Windows environments. For example, both WordBasic (the interpreted language that Microsoft Word supports) and WinHelp (which supports a very simple interpreted language) let you declare and call functions in external DLLs. Those interpreters must execute runtime calls to DLL functions without knowing in advance their names and arguments.

Most Windows programmers won’t create their own interpreted Windows language, but you might want to add the ability to call external DLL functions to an existing interpreted language. The best example of this is VBScript, a subset of Visual Basic that programmers can incorporate into their Windows programs without paying license fees. VBScript lacks the ability to declare and call external DLL functions on the fly, but you could add this ability — if you could construct arbitrary function calls at runtime.

Calling arbitrary DLL functions with arbitrary arguments at runtime requires you to do work that the compiler normally does for you — you must manually place arguments on the stack in the correct order and, depending on the calling sequence of the target function, clean up the stack after the call. This is work that is most conveniently done in assembly language, since C and C++ do not provide explicit enough control over the stack. This article describes how to perform that task, and provides a DLL with a reusable function called DynaCall() that uses inline assembly language to take a function pointer and an array of arguments and execute a call to that function.

Overview

The goal is to write a function that, given a function pointer and an array of arguments of varying sizes (e.g., char, int, double, and so on), pushes the arguments onto the stack in the form and order the callee expects, invokes the function, and then obtains its return value. Under the calling sequences discussed here, arguments are always passed on the stack, so DynaCall() must move data from its input array of arguments onto the stack, adjusting the stack pointer appropriately. This work is specific to each call, since the number of pushes and the data pushed depend on the function being called. The way a function returns information also depends on the type of its return value. Return values up to four bytes in size are always passed through the (32-bit) EAX register, but other types require different handling, and DynaCall() needs specific code for each case.

Unfortunately, these issues are compiler-dependent to some degree. I analyzed the involved mechanisms for 32-bit Microsoft and Borland compilers. If you need support for calling functions from DLLs generated by other compilers, you need to handle any differences yourself. Most standard items are handled the same way though, so I don’t expect to see major implementation differences in this area. Most languages that can generate DLLs with exported functions will try to be compatible with the ad hoc standard of the Win32 API itself (which is exposed as a set of DLL functions).

Here’s an overview of the various issues involved in function calling.

Parameter Passing. This is rather straightforward. All parameters are pushed onto the stack. Parameters are aligned on a four-byte boundary, so, for example, a char parameter uses four bytes of stack space. Floating-point and structure parameters are copied directly to the stack; they each consume some multiple of four bytes.
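
As a rough illustration of the four-byte alignment rule, the following sketch computes how much stack space an argument of a given width would occupy and totals it over an argument list. The names stack_slot_size() and total_stack_bytes() are invented for this example and are not part of the DynaCall() sources.

```c
#include <assert.h>
#include <stddef.h>

/* Round an argument's byte width up to the next multiple of four,
   which is the stack space it occupies under the rule described above. */
size_t stack_slot_size(size_t width)
{
    assert(width > 0);
    return (width + 3u) & ~(size_t)3u;
}

/* Total stack space needed for a list of argument widths. */
size_t total_stack_bytes(const size_t *widths, size_t count)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        total += stack_slot_size(widths[i]);
    return total;
}
```

For a call taking a char (1 byte), an int (4 bytes), and a double (8 bytes), the total comes to 4 + 4 + 8 = 16 bytes.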

Calling Sequence. DynaCall() supports the two types of calling sequences used by the Win32 API: __stdcall (WINAPI) and __cdecl (WINAPIV). Figure 1 shows two functions that are identical except for calling sequence, along with a disassembly of the code that calls them. The calling sequence specifies three things: argument order, stack cleanup, and naming convention.

First, under Win32, both __stdcall and __cdecl require arguments to be pushed on the stack from right to left — reverse order, in other words. As you can see in Figure 1, when you pass arguments a and b to a function, the compiler generates code that pushes first b and then a onto the stack.

Second, __stdcall requires the callee to clean up the stack (pop the pushed parameters off), while __cdecl requires the caller to handle stack cleanup. Because of this, __cdecl can support functions that take a variable number of arguments (e.g., printf()), since in that case the callee cannot know at compile time how many parameters need to be popped. In Figure 1, you can see that the __stdcall function ends in a RET 8 instruction (popping two four-byte parameters), while the __cdecl function ends in a simple RET (since the caller has to adjust the stack pointer).
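
The division of cleanup work between caller and callee can be summarized in a few lines of C. This is only a sketch of the rule just described; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the two calling sequences discussed above. */
typedef enum { CALL_STDCALL, CALL_CDECL } CallSequence;

/* Operand of the callee's RET instruction: a __stdcall function pops
   its own arguments (RET n), a __cdecl function uses a plain RET. */
size_t callee_ret_operand(CallSequence seq, size_t arg_bytes)
{
    assert(arg_bytes % 4 == 0);   /* arguments occupy whole dwords */
    return seq == CALL_STDCALL ? arg_bytes : 0;
}

/* Number of bytes the caller must add to ESP after the call returns. */
size_t caller_cleanup_bytes(CallSequence seq, size_t arg_bytes)
{
    assert(arg_bytes % 4 == 0);
    return seq == CALL_CDECL ? arg_bytes : 0;
}
```

With two four-byte parameters, the __stdcall case yields RET 8 and no caller cleanup, while the __cdecl case yields a plain RET and eight bytes of caller cleanup.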

Finally, the calling sequence affects the actual exported name generated by the compiler and linker, and Microsoft and Borland use different conventions for this. Microsoft mangles __stdcall names by default, so a function declared as:
int __stdcall Foo(int a)
would by default get assigned a name of “_Foo@4” by Visual C++, but would be named “Foo” by Borland C++. With __cdecl, the same function would be named “Foo” by Visual C++, but “_Foo” by Borland C++. With both compilers, you can use a .def file to alias the exported function name to whatever you want, and most DLLs intended for wide reuse follow the convention of the Win32 API: __stdcall function names are not mangled, and __cdecl functions (the less common case) have a “_” prepended to their name.

This discussion of mangling assumes you’re either compiling with C, or are exporting functions from C++ using the extern "C" directive. Otherwise, C++ will impose its own name mangling scheme.
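
If you have to look up a Visual C++ __stdcall export that was not aliased in a .def file, you could build the decorated name yourself. The sketch below assumes the default scheme described above (an underscore, the name, "@", and the number of argument bytes); msvc_stdcall_name() is a hypothetical helper, not part of DynaCall().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the default Visual C++ name of a __stdcall function:
   "_" + name + "@" + number of argument bytes.
   Returns 1 on success, 0 if the output buffer is too small. */
int msvc_stdcall_name(const char *name, unsigned arg_bytes,
                      char *out, size_t out_size)
{
    int n;

    assert(name != NULL && out != NULL);
    assert(strlen(name) > 0);
    n = snprintf(out, out_size, "_%s@%u", name, arg_bytes);
    return n >= 0 && (size_t)n < out_size;
}
```

The resulting string could then be passed to GetProcAddress() in place of the undecorated name.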

Returned Values. Most C compilers return smallish data types in a machine register, and larger types (such as structures) by requiring the caller to pass a hidden pointer to an appropriate amount of memory. The Win32 API is no different in this respect. Integer and pointer return values of up to four bytes are returned through the EAX register, and eight-byte integer values (__int64) are returned through the EAX/EDX register pair. Floating-point return values are handled differently: they are returned on the math coprocessor’s stack, and a special instruction is needed to move the return value into main memory. You might wonder what happens on a machine that doesn’t have a floating-point coprocessor installed. Even in this (rather theoretical) case the floating-point instructions are used. Windows NT and Windows 95 provide floating-point emulation in the Win32 kernel, so the application itself doesn’t have to deal with it.

For non-integer, large return types (structures), Microsoft’s compiler pushes a hidden argument on the stack, which is a pointer to a temporary buffer (located on the stack). The function itself then copies the return value to this buffer. Upon return, additional code in the caller copies the temporary buffer to the assigned variable. Borland’s code is more efficient; it passes the pointer to the assigned variable directly so there’s no need for a temporary structure and second copying. There’s one notable exception to these general rules. The Microsoft compiler recognizes structures that are less than or equal to eight bytes and uses EAX/EDX register passing instead of the stack data copying mechanism in these cases. (Ironically, it was Borland and not Microsoft that implemented this additional optimization under Win16 — the two compilers have managed to remain incompatible under Win32 by copying each other’s Win16 behavior.)
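
The structure-return rules can be condensed into a single test for whether a hidden pointer argument is needed. The following is a minimal sketch that takes the rules exactly as stated above; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag for the compiler that built the callee. */
typedef enum { BUILT_BY_MICROSOFT, BUILT_BY_BORLAND } CalleeCompiler;

/* Returns nonzero if a function returning a structure of struct_size
   bytes expects a hidden pointer argument for its return value. */
int needs_hidden_return_pointer(CalleeCompiler compiler, size_t struct_size)
{
    assert(struct_size > 0);
    if (compiler == BUILT_BY_BORLAND)
        return 1;            /* Borland: always a pointer to the destination */
    return struct_size > 8;  /* Microsoft: up to 8 bytes come back in EAX/EDX */
}
```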

Implementing DynaCall()

The issues of __stdcall versus __cdecl and Borland C++ versus Visual C++ all play a part in the implementation of DynaCall(). The prototype for DynaCall() is:
RESULT DynaCall(
    int      Flags,
    DWORD    lpFunction,
    int      nArgs,
    DYNAPARM DynaParm[],
    LPVOID   pRet,
    int      nRetSize
    );
RESULT is a union of standard types, ranging from a four-byte int to an eight-byte double. If you need to call a function that returns a structure, then you have to use pRet and nRetSize to supply a buffer to hold the return value.

The Flags parameter specifies various options: whether the return value is a floating-point value (since that requires fetching the value from the math coprocessor), whether the calling sequence of the target function is __stdcall or __cdecl, and whether the callee is a Borland C++ or Visual C++ function (which affects how some structures are returned, as described previously). In most cases, the return value will be a simple data type, in which case you don’t have to know whether the callee was compiled with Visual C++ or Borland C++.

You need to specify the function’s address (retrieved via GetProcAddress(), for example) in the lpFunction parameter.

The third and fourth parameters contain the argument count and an array of DYNAPARM structures that describe the individual function arguments:
typedef struct DYNAPARM {
    DWORD       dwFlags;        // Parameter flags
    int         nWidth;         // Byte width
    union {                     //
        DWORD   dwArg;          // 4-byte argument
        void   *pArg;           // Pointer to argument
    };
} DYNAPARM;
Each DYNAPARM structure contains argument flags, the argument size, and the argument itself. Currently there’s only one flag, which specifies that the argument is supplied by reference (through pArg) instead of directly (in dwArg). If a parameter is larger than four bytes, you must supply it by reference.

docall.c (Listing 1) contains some sample calls that demonstrate how to use DynaCall(). Be sure to specify all parameters and options correctly. One wrong value can misbalance your own application stack and (especially on Win95/98) crash the system. With a normal function call, the compiler can verify that the caller and callee agree on how the call should take place, but DynaCall() relies on you to get it right.

The Implementation

DynaCall() is declared in dynacall.h (Listing 2) and implemented in dynacall.c (Listing 3). This month’s code archive includes a DLL that exports DynaCall(). The code has been compiled and tested with Visual C++ v5.0, though you can call the function from any language that lets you access external DLL functions.

The main chore DynaCall() has to perform is to get all the arguments pushed onto the stack. Pushing parameters onto the stack is done in (Intel) assembler code with a PUSH <operand> instruction. However, you can’t just go ahead and start pushing all specified parameters on the stack. Why not? Because DynaCall() is written in inline assembler mixed with C, it has to worry about what the generated C code might be doing with the stack. For example, the compiler would be completely within its rights to allocate space for variables inside the for loop in DynaCall() by adjusting the stack pointer down (x86 stacks grow downward) upon entering the loop, and back up after leaving the loop. If you’re pushing data on the stack at the same time the compiler-generated code is tinkering with the stack pointer, disaster is likely.

You could avoid this problem by constructing the desired stack image in a separate buffer, then copying that buffer to the stack in one fell swoop (with no intervening C code). Instead, I use a small trick. After saving the current stack position in a pointer variable, I lower the stack pointer by 256 bytes (“allocating” 256 bytes). The generated C code then uses the stack below that point and won’t overwrite anything I store in the 256-byte area. Using the saved pointer, I copy all arguments to the stack in the correct order and with proper four-byte alignment. Next, if the function requires a hidden pointer parameter for the return value (Borland structure return or Microsoft structure return larger than eight bytes), I add that hidden parameter to the stack. Finally, it’s time to call the function, so I move the stack pointer back up by 256 bytes minus the total size of the parameters pushed, so that it points at the last item stored.
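
To make the layout concrete, here is a portable sketch of the alternative mentioned above: building the stack image in a separate buffer. It is not the code from dynacall.c. SketchParm is a simplified stand-in for DYNAPARM, and SKETCH_BYREF and build_stack_image() are hypothetical names. Because arguments are pushed right to left onto a stack that grows downward, the first argument ends up at the lowest address, which is the order the buffer uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BYREF 0x1u   /* hypothetical flag: data is read through ptr */

/* Simplified stand-in for DYNAPARM (not the real layout). */
typedef struct {
    unsigned    flags;
    size_t      width;   /* byte width of the argument */
    uint32_t    value;   /* direct argument, at most four bytes */
    const void *ptr;     /* by-reference argument data */
} SketchParm;

/* Lay out the arguments in image[] as they would appear in memory at the
   moment of the call: first argument at the lowest address, each argument
   padded to a multiple of four bytes. Returns the number of bytes used,
   or 0 if the image does not fit in cap bytes. */
size_t build_stack_image(const SketchParm *parms, size_t count,
                         unsigned char *image, size_t cap)
{
    size_t used = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        const SketchParm *p = &parms[i];
        size_t slot = (p->width + 3u) & ~(size_t)3u;

        assert(p->width > 0);
        if (slot > cap - used)
            return 0;
        memset(image + used, 0, slot);              /* zero the padding */
        if (p->flags & SKETCH_BYREF) {
            assert(p->ptr != NULL);
            memcpy(image + used, p->ptr, p->width); /* copy the pointed-to data */
        } else {
            assert(p->width <= sizeof p->value);
            memcpy(image + used, &p->value, sizeof p->value); /* whole dword */
        }
        used += slot;
    }
    return used;
}
```

An int, a double passed by reference, and a char would produce a 16-byte image: the int at offset 0, the double at offset 4, and the char (widened to four bytes) at offset 12.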

When the function returns, it may have set EAX or EAX:EDX to a return value, so the problem of interference from the compiler-generated code arises again: the compiler might generate code that uses either of those registers. To avoid that problem, I immediately save EAX and EDX to stack variables, making sure there is no intervening C code between the function call and the code that saves these registers.

After saving EAX and EDX, I must adjust the stack pointer again if the called function had a __cdecl calling type. The only thing left to do is obtain the correct return value. I check for various cases, beginning with floating-point return values. These are returned on the FPU stack, and the called function used the FLD instruction to push a value onto the stack. Since floating-point types can have different sizes, I have to specify the correct type with the FSTP instruction to pop the return value from the FPU stack.

For standard return values, I just copy the EAX/EDX register pair to the eight-byte RESULT union. Four-byte or shorter types (char, short, int, and long) are always returned in EAX, so these automatically end up at the correct place in the RESULT union.
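
The following sketch shows why shorter types land in the right place. ResultSketch is a hypothetical stand-in for the RESULT union (the real member names are in dynacall.h), and the Reg member exists only to model the saved registers. The 64-bit reading assumes a little-endian machine such as the x86.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RESULT: every member starts at offset 0. */
typedef union {
    int32_t  Int;
    int64_t  Int64;
    float    Float;
    double   Double;
    uint32_t Reg[2];   /* Reg[0] models EAX, Reg[1] models EDX */
} ResultSketch;

/* Store the saved register pair into the eight-byte union. */
ResultSketch result_from_registers(uint32_t eax, uint32_t edx)
{
    ResultSketch r;

    assert(sizeof(ResultSketch) == 8);
    r.Reg[0] = eax;   /* low dword: overlaps Int */
    r.Reg[1] = edx;   /* high dword: only meaningful for 8-byte results */
    return r;
}
```

A caller expecting an int simply reads the Int member and ignores whatever EDX happened to contain; a caller expecting an __int64 reads all eight bytes.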

Summary

DynaCall() lets you dynamically access almost any exported DLL function, correctly handling the most common calling sequences, return types, and compiler incompatibilities. You could use this function to create a small interactive environment for trying out calls to unfamiliar Win32 routines, for example. A future WDJ article will show how to use DynaCall() to create an OLE automation object that can be extended at runtime to call arbitrary DLL functions. That object, in turn, can be used to give VBScript the ability to easily declare and access arbitrary DLL functions.

Ton Plooy is an independent software developer working on system and programming utilities at his company Crunch Technologies in The Netherlands. You may email Ton at [email protected].
