Spying on COM Objects


July 1999/Spying on COM Objects

Dmitri Leman


Intercepting and tracing messages and function calls is very important for debugging applications under Windows. Most Windows programmers use Spy to trace messages. Other popular tools, such as BoundsChecker, are able to trace messages and API function calls. They may also perform parameter validation, check for resource leaks, and do other debugging tasks. A variety of more specialized spy tools is also available, such as serial port monitors, network sniffers, Internet tracers, disk monitors, and registry monitors.

Sometimes, these ready-to-use tools are insufficient for solving real-world problems. Additional trace filtering may be necessary to eliminate thousands of unwanted lines of output. Some programs may require specific parameter validation or resource handling. In other cases, it is necessary to redirect the trace from the screen to some other device (a debug monitor, a file, another computer, etc.). Fortunately, source code examples of some tracing tools are available. The Win32 SDK includes a sample Spy application. Matt Pietrek presents a sample API spy program in his Microsoft Systems Journal article [1] and in his book [2]. In most cases, these samples provided a good starting point for building a specific tracing solution. Unfortunately, the situation changed because of COM. More and more APIs are implemented as COM interfaces instead of traditional DLL exports. For example, a regular API spy can help trace a program that draws on the screen using GDI, but is useless for tracing another application that uses DirectDraw.

Some commercial applications, such as BoundsChecker, can trace COM interface calls, but I have not yet seen any articles or books presenting this technique. The Microsoft Systems Journal article "Building a Lightweight COM Interception Framework" [4] presents a method of intercepting COM interfaces. But since this method requires changes in the source code, there is still a need for a generic tracing tool that, like the original API Spy, can work with other applications without modifying them. In this article I will present a COM tracing tool that can load other Win32 applications, and intercept and trace calls to COM interface functions as well as to traditional exported functions.

DLL Exports vs. COM

A traditional method of providing an API under Windows uses DLLs and exported functions; for example, the Win32 API is provided this way. Dynamic link libraries (DLLs) for Win32 are modules containing functions and data. A DLL is loaded at runtime by other modules (EXEs or DLLs). When a DLL is loaded, it is mapped into the address space of the calling process. A DLL may provide a set of exported functions that can be called by other modules. The calling can be done with either load-time or runtime dynamic linking. The load-time linking requires building the caller module with an import library. An import library supplies the operating system with the information needed to load the DLL and locate the exported DLL functions when the caller module is loaded. In runtime linking, a caller uses LoadLibrary() or LoadLibraryEx() to load the DLL and GetProcAddress() to get a pointer to the exported function. To support both these methods, DLLs have export tables that contain data sufficient for obtaining exported functions' addresses by their names or ordinals. To support load-time linking, the caller module should also have an import table containing the names of DLLs, names (or ordinals) of imported functions, and pointers to these functions. Windows prepares these pointers during module load, and uses them to perform calls.

COM defines mechanisms that let software components interact as objects. A software object consists of data and functions for using that data. You access a COM object's data via a set of related functions called an interface. An instance of an interface implementation is a pointer to an array of pointers to the methods specified in the interface. Interfaces may inherit from other interfaces. If one interface inherits from another, it includes all the methods that the other interface defines. All interfaces inherit from the important interface IUnknown, which contains three vital methods: QueryInterface(), AddRef(), and Release(). All methods of a derived interface are addressed through one array of pointers. The first several array members correspond to the base interface, and pointers to the new methods are added after them. This means that the derived interface can be passed to code that expects only the base interface. COM objects may reside in a separate thread apartment, in another application's address space, or on a separate computer from the client code that uses these objects. In any case, a client accesses the object through in-process pointers. If the object cannot be reached by a direct call, the call is handled by what is called a "proxy" object provided either by COM itself or by the object.
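The layout just described can be modeled in portable C++ without any COM headers. The following is only a sketch: the names UnknownVtbl, CounterVtbl, Object, and the helper functions are hypothetical and are not part of COM or of the tracer. It shows that the base interface's slots come first in the array, so code that knows only the base interface can still call through a derived interface pointer.

```cpp
#include <cassert>

struct Object;  // forward declaration; an interface pointer points at this

// Base interface: three slots, modeled loosely on IUnknown.
struct UnknownVtbl {
    long (*QueryInterface)(Object* self, void** out);
    unsigned long (*AddRef)(Object* self);
    unsigned long (*Release)(Object* self);
};

// Derived interface: the base slots come first, new methods follow.
struct CounterVtbl {
    UnknownVtbl base;
    int (*Increment)(Object* self);
};

// The object's first member is the pointer to the array of method pointers.
struct Object {
    const UnknownVtbl* vtbl;
    unsigned long refs;
    int value;
};

long QueryInterfaceImpl(Object* self, void** out) { *out = self; ++self->refs; return 0; }
unsigned long AddRefImpl(Object* self) { return ++self->refs; }
unsigned long ReleaseImpl(Object* self) { return --self->refs; }
int IncrementImpl(Object* self) { return ++self->value; }

const CounterVtbl kCounterVtbl = {
    { QueryInterfaceImpl, AddRefImpl, ReleaseImpl }, IncrementImpl
};

Object makeCounter() { return Object{ &kCounterVtbl.base, 1, 0 }; }

// Code that knows only the base interface uses the first three slots.
unsigned long addRefThroughBase(Object* obj) {
    assert(obj && obj->vtbl);
    return obj->vtbl->AddRef(obj);
}

// Code that knows the derived interface views the same table as the longer
// one. The cast is valid because 'base' is the first member of CounterVtbl.
int incrementThroughDerived(Object* obj) {
    assert(obj && obj->vtbl);
    const CounterVtbl* v = reinterpret_cast<const CounterVtbl*>(obj->vtbl);
    return v->Increment(obj);
}
```

A C++ compiler normally generates an equivalent table for a class with virtual functions; building it by hand here just makes the slot order visible.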

There are several ways to obtain an interface pointer to a given object. The first method is using an API function (in the COM library or some other DLL) that creates an object (or connects to an existing one) of a predetermined type. In other words, the function will return a pointer to only one specific interface for a specific object class; for example, CoGetMalloc(). The second method to obtain an interface pointer to a given object is using an API function that can create an object based on a class identifier (CLSID). This API function returns any type of interface pointer requested by an interface identifier (IID), for example CoCreateInstance(). The third method is using some interface that creates or connects to another object and returns an interface pointer on that separate object (it may be of a predetermined type or specified by a CLSID and IID). The fourth method is an interface implemented by a client through which other objects pass their interface pointer to the client directly. In any case, once the first interface to an object is obtained, pointers to other interfaces can be obtained by calling QueryInterface().

The Tracer Design

Since COM interfaces are just arrays of pointers to functions, it should be easy to replace all or some of these pointers with pointers to tracing functions. (Another solution might be placing breakpoints at the beginning of each method.) Such interceptor functions should perform necessary tracing and call the original interface method. It is important to take control again after the original method completes to record the return value as well as values returned through parameters. You can do this by using another interceptor function to replace a return address on the stack. To preserve the original return address, the tracer should maintain its own stack of return addresses for every thread in the process. For more details on this technique see [1], [2], and [4]. One of the dangers of using this method is its vulnerability to exceptions, which may jump out of the called method without executing the interceptor code. This will lead to the tracer's stack overflowing. Fortunately, COM interface methods do not normally throw an exception (unless a crash happens) but prefer to return an error code or a special exception structure (like IDispatch::Invoke()). Therefore, the exception issue is not handled in the tracer presented in this article.
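The pointer replacement itself can be sketched in a few lines. This is an illustration under simplifying assumptions, not the tracer's code: the Device interface, ScaleImpl(), ScaleTrace(), and interceptScale() are hypothetical, the table is a writable hand-built array, and the wrapper regains control through an ordinary C++ call rather than by replacing the return address on the stack as described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Device;
struct DeviceVtbl {
    int (*Scale)(Device* self, int factor);
};
struct Device {
    DeviceVtbl* vtbl;  // writable here so the slot can be patched
    int base;
};

// The original method implementation.
int ScaleImpl(Device* self, int factor) { return self->base * factor; }

std::vector<std::string> g_log;                     // stands in for the trace output
int (*g_originalScale)(Device*, int) = nullptr;     // saved original slot value

// Interceptor: log the call, run the original, then log the return value.
int ScaleTrace(Device* self, int factor) {
    g_log.push_back("Scale(" + std::to_string(factor) + ")");
    int result = g_originalScale(self, factor);
    g_log.push_back("Scale returned " + std::to_string(result));
    return result;
}

// Replace the slot in the table with the interceptor, remembering the original.
void interceptScale(DeviceVtbl* vtbl) {
    assert(vtbl->Scale != ScaleTrace);  // do not intercept the same table twice
    g_originalScale = vtbl->Scale;
    vtbl->Scale = ScaleTrace;
}
```

In a real process the method table often lives in a read-only page, so the page protection would have to be changed before the write. A dedicated wrapper per method, as in this sketch, does not scale to many interfaces.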

Because of the large variety of COM interfaces and methods, it is impossible to make separate interceptor functions for each COM method. A simpler solution is to provide a single interceptor function capable of tracing any COM method. The tracer should accept a configuration file describing all interfaces and methods the user wants to trace, along with their parameters. The interceptor function should use this information to print values of input and output parameters, as well as return values from the method. For the single interceptor function to be able to determine which method of which interface it is handling, I allocate a small block of data (a stub) for each method being intercepted. This block of data contains information about the interface and the method name, parameters, and a small piece of code that transfers control to the single interceptor function. The interceptor function can take all necessary data from the stub, print interface names, the method, and parameters from the thread's stack according to the parameter information encoded in the stub. Then it can replace the return address (as described in the previous paragraph) and call the original method. After the return interceptor function regains control, it can print the return value and return values passed through parameters using the parameter descriptor from the stub.
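To make the idea of a data-driven interceptor concrete, here is a simplified model. Everything in it is an assumption made for illustration: the Stub structure, ParamKind, and interceptEntry() are hypothetical names, the "stack frame" is an ordinary array of machine words, and the stub holds only data, whereas a real stub would also contain a few machine instructions that pass its own address to the common interceptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class ParamKind { Dword, OutDword };

// Per-method descriptor: names, parameter types, and the original target.
struct Stub {
    std::string interfaceName;
    std::string methodName;
    std::vector<ParamKind> params;
    std::uint32_t (*original)(std::uintptr_t* frame);
};

std::vector<std::string> g_trace;  // stands in for the output log

// The single interceptor: all it knows about the call comes from the stub.
std::uint32_t interceptEntry(const Stub& stub, std::uintptr_t* frame) {
    assert(stub.original != nullptr);
    std::string line = stub.interfaceName + "::" + stub.methodName + "(";
    for (std::size_t i = 0; i < stub.params.size(); ++i) {
        if (i != 0) line += ", ";
        if (stub.params[i] == ParamKind::Dword) line += std::to_string(frame[i]);
        else line += "out";  // value is not meaningful until the call returns
    }
    g_trace.push_back(line + ")");

    std::uint32_t result = stub.original(frame);

    // After the call: print the return value and the OUT parameters.
    std::string ret = "  returned " + std::to_string(result);
    for (std::size_t i = 0; i < stub.params.size(); ++i) {
        if (stub.params[i] != ParamKind::OutDword) continue;
        const std::uint32_t* p = reinterpret_cast<const std::uint32_t*>(frame[i]);
        ret += ", out[" + std::to_string(i) + "]=" + std::to_string(*p);
    }
    g_trace.push_back(ret);
    return result;
}

// Example target: frame[0] + frame[1] is stored through the pointer in frame[2].
std::uint32_t addImpl(std::uintptr_t* frame) {
    std::uint32_t* sum = reinterpret_cast<std::uint32_t*>(frame[2]);
    *sum = static_cast<std::uint32_t>(frame[0] + frame[1]);
    return 0;
}
```

The point of the design is that adding a new method to the trace requires only a new descriptor, not new code.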

Now that I know how to intercept a method of an interface, I next need to figure out when and where to do it. In order to replace function pointers, the interceptor should reside in the address space of the target process (another solution, involving breakpoints, does not require that). It is better to intercept an interface as early in an object's life as possible to provide a more complete trace. Unfortunately, COM object creation is a private business of the server, unlike DLLs, which are loaded at known moments by the loader. C++ code calls the new operator to allocate an instance of a class. Initialization can be done either in the constructor or in some other way. In any case, it is not clear how to intercept a COM object at its inception without changing its source code. Therefore, the solution is to intercept interfaces when a client obtains them.

I presented several ways of obtaining COM interface pointers in the previous section. Most of them return a COM interface from an exported DLL function or another COM interface function. The last one passes an interface as a parameter to the client's interface. The key to COM interception is to start by intercepting some exported DLL functions. The best-known functions are CoCreateInstance() and CoGetClassObject() exported from ole32.dll, the COM API's main DLL. But other functions in this and other DLLs may return interfaces. The tracer should be prepared to intercept any exported DLL function and examine the values returned through the function's parameters for an interface pointer. After the interface is obtained, it should be intercepted as requested by the configuration file. Some methods of the interface may return interface pointers also (for example, QueryInterface()). The tracer should be prepared to handle this as well. To make the tracer flexible, specify information about functions or methods capable of returning interfaces in the configuration file.

Assuming that COM tracers should be able to intercept (and trace) exported DLL functions as well as COM interfaces, I can now recycle the ideas in the original API spy program. This includes using the Win32 debug API to start the target application and waiting for debug events. When an exception event signals the beginning of program execution, the tracer injects a tiny piece of code into the address space of the application and forces the application to execute this code by changing the instruction pointer. This piece of code loads the tracing DLL, which reads the configuration file and intercepts imported API functions in the executable module and other loaded DLLs. It also intercepts GetProcAddress() to prevent the application from obtaining a real, non-intercepted address of an API function. When other DLLs are loaded later during program execution, the debug API generates another event for the debugger code. The debugger code then places a breakpoint at the entry point of the DLL. When the breakpoint is triggered, the debugger code changes the instruction pointer to point to a function in the tracing DLL. This function performs the interception process for the new DLL, then transfers control to the original entry point.

Implementation

The tracer consists of an executable and a DLL. The executable has a simple dialog box with controls that let the user enter the path to the target application and a configuration file. WinMain() and the dialog implementation are located in comxtrc.cpp. (All the source files mentioned in this article are in this month's code archive.) Once the dialog completes, the parser processes the configuration file specified by the user. The parser is located in parsecfg.cpp and it stores information about interfaces, functions, and their parameters in collection objects defined in InterMap.cpp. After the parsing completes, the program creates a memory-mapped file where it stores all necessary data (including the processed configuration data) to be used by the spy DLL.

Next, the debugger code (in debugger.cpp) starts waiting for debug events. It handles almost all debug events. When CREATE_PROCESS_DEBUG_EVENT, EXIT_PROCESS_DEBUG_EVENT, CREATE_THREAD_DEBUG_EVENT, and EXIT_THREAD_DEBUG_EVENT arrive, the debugger records the process and thread handles for later use. Thread handles are stored in the ThreadStrorage class implemented in threads.cpp. EXCEPTION_DEBUG_EVENT signals that the target process has hit a breakpoint. This happens first when the process begins execution, which makes it the best time to inject the spy DLL using bInjectAgent(). bInjectAgent() searches the target process address space for a writable page and writes an agent to it. The agent is a small piece of code prepared in the structure ASM_Agent. This code uses LoadLibrary() to load the spy DLL. Immediately after the call to LoadLibrary(), the agent contains a breakpoint instruction. After injecting the agent, the debugger changes the instruction pointer of the target process to start execution at the beginning of the agent. Then the debugger continues its loop until the next breakpoint, which must be the breakpoint at the end of the agent. Then vOnAgentFinished() is called, which restores the modified page and restores the execution point to the original location.

The debugger also handles LOAD_DLL_DEBUG_EVENT to be able to intercept functions in newly loaded DLLs. bOnLoadDLLEvent() determines the DLL's entry point and places a breakpoint instruction there. The original byte, which was replaced by the breakpoint, is stored in a table. When the target application hits this new breakpoint, EXCEPTION_DEBUG_EVENT arrives again. The debugger removes the breakpoint and restores the original byte. Then the debugger changes the target thread's execution point to the address of vHookModule() inside the spy DLL, which intercepts the newly loaded DLL. The debugger pushes the DLL entry point onto the target's stack so that, after the interceptor function completes, control returns to the DLL's original entry point and execution continues normally.

The spy DLL's main module is comxtrcd.cpp, which contains DllEntryPoint(), the DLL's main entry point. To reduce the possibility of harmful interference with the target application, the DLL does not use the C library. This means that the DLL avoids using exception handling and global C++ objects with constructors or destructors. The DLL keeps its global data in the SpyDllGlobalData class. bComXTrcDLLInit() initializes this data in response to the DLL_PROCESS_ATTACH event, which involves opening the memory-mapped file created by the main executable. After reading the necessary parameters from the file, bHookAllModules() is called, which enumerates all loaded DLLs and intercepts functions in these DLLs (according to the configuration data). vHookModule() (also in comxtrcd.cpp) is used by the debugger code to intercept new DLLs, which may be loaded later.

The actual interception happens in intrcpt.cpp. It has several procedures for intercepting functions imported from DLLs and other functions for COM interface methods. In any case, interception occurs by replacing some address in memory with a pointer to a stub. Each function (or method) to be intercepted has a corresponding stub (in the structure ASM_APIFunctionStub), which contains pieces of code and data. The data describes the function and its parameters, and points to the original function location. The code redirects execution to a common entry point of the logging code in log.cpp (Logger::InterceptEntry(); see Figure 3). Originally, all these stubs are prepared by the configuration file parser in the executable and are stored in the memory-mapped file. Each function imported from a certain DLL requires exactly one stub, even if this function is called many times from many DLLs. But a function in an interface may require more than one stub because each interface may have many implementations. For example, there are many implementations of IUnknown::AddRef(). Because the stub can store only one original function location, it is necessary to allocate additional stubs when new implementations are discovered later. This is done in Interceptor::bInterceptVTableFunction() (in intrcpt.cpp; see Figure 2).
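The need for one stub per implementation can be illustrated with a small lookup table. This is a hedged sketch rather than the code of Interceptor::bInterceptVTableFunction(): StubPool, MethodStub, and the two sample AddRef implementations are hypothetical, and the stubs hold only data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

using MethodFn = unsigned long (*)(void* self);

// A stub can remember exactly one original function location.
struct MethodStub {
    std::string name;   // for example "IUnknown::AddRef"
    MethodFn original;  // the single implementation this stub forwards to
};

class StubPool {
public:
    // Returns the stub for this (method, implementation) pair and allocates
    // an additional stub the first time a new implementation is seen.
    MethodStub* stubFor(const std::string& method, MethodFn impl) {
        assert(impl != nullptr);
        Key key(method, reinterpret_cast<std::uintptr_t>(impl));
        auto result = stubs_.emplace(key, MethodStub{method, impl});
        return &result.first->second;  // existing entry if the key was known
    }
    std::size_t size() const { return stubs_.size(); }

private:
    using Key = std::pair<std::string, std::uintptr_t>;
    std::map<Key, MethodStub> stubs_;  // node-based, so stub addresses stay stable
};

// Two unrelated implementations of the same interface method.
unsigned long addRefA(void*) { return 10; }
unsigned long addRefB(void*) { return 20; }
```

Looking up by the original address also prevents the same implementation from being wrapped twice when its interface pointer is seen again.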

After all necessary functions are intercepted, the DLL starts seeing calls. Code in one of the stubs is executed first and transfers control to the main entry of the interceptor, Logger::InterceptEntry(). This function accepts two parameters from the stub: a pointer to the stub itself and a pointer to a stack frame where parameters of the original function are stored. Then ParamProcContext::iProcAllParams() (in log.cpp; see Figure 3) is called to process these parameters. Each parameter is processed according to its type encoded in the stub. Values that are passed from the caller to the function (IN parameters) should be formatted and logged to the output log. Pointers to the return values (OUT parameters) should be preserved in my private stack until the function returns. Pointers to COM interfaces should be passed to Interceptor::bProcPInterface() (in intrcpt.cpp — see Figure 2) to intercept their methods. After all this processing is finished, execution returns to the original function. Before that, though, ReturnMgr::pInterceptFunctionReturn() (in return.cpp — see Figure 1) is called to replace the return address in order to take control after the return from the original function. The control is returned to ReturnMgr::NakedReturnPoint() (in return.cpp — see Figure 1). Logger::ProcessReturn() performs parameter processing similar to the job done during the time of call. The difference is that now pointers to the return values should be restored from my private stack, and output parameters processed, including interception of all returned COM interfaces. Then the control is returned to the original caller.

Debugging the Tracer

The executable comxtrc.exe can be debugged with the integrated Visual C++ debugger. The spy DLL, on the other hand, requires a system-level debugger such as SoftICE. To simplify debugging the tracer, its code contains numerous trace statements. These statements use the macro MYTRACE, whose first parameter is a DWORD value with a single bit flag set and whose second parameter is a printf()-style format; the list of arguments follows. The bit flag is different for different groups of trace statements. All possible flags are listed in enum TraceFlags in trace.h. These DWORD values are matched at runtime against a DWORD mask specified in comxtrc.ini. This allows me to enable and disable different trace groups by changing values in the INI file without recompiling. The INI file contains four different masks: PrintFileMask= enables printing of the trace messages to the output log file; FlushFileMask= forces log file flushing after printing the corresponding trace messages; OutputDebugMask= enables or disables printing to a debug monitor; and MessageBoxMask= works only in the EXE module and invokes a message box for each MYTRACE statement that has a matching flag. To receive the trace of intercepted function calls, you must set the flag with value 1 (the lowest bit) in PrintFileMask= or OutputDebugMask=. Other bits are useful for debugging.
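The mask matching can be sketched as follows. The flag names and values, the TraceMasks structure, and the macro name MYTRACE_SKETCH are assumptions made for this example (the real flags are in trace.h), and two in-memory vectors stand in for the log file and the debug monitor.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical flag values: one bit per group of trace statements.
enum TraceFlags : std::uint32_t {
    TRACE_CALLS    = 0x1,  // assumed: intercepted function calls
    TRACE_REFCOUNT = 0x4,  // assumed: reference-count checker
    TRACE_PARSER   = 0x8   // assumed: an internal debugging group
};

// Masks that would be read from the INI file at startup.
struct TraceMasks {
    std::uint32_t printFile = 0;
    std::uint32_t outputDebug = 0;
};

TraceMasks g_masks;
std::vector<std::string> g_fileLog;   // stands in for the log file
std::vector<std::string> g_debugLog;  // stands in for the debug monitor

void traceImpl(std::uint32_t flag, const char* fmt, ...) {
    assert(fmt != nullptr);
    // Skip the formatting work when no mask enables this group.
    if (((g_masks.printFile | g_masks.outputDebug) & flag) == 0) return;
    char buf[256];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);
    if (g_masks.printFile & flag) g_fileLog.push_back(buf);
    if (g_masks.outputDebug & flag) g_debugLog.push_back(buf);
}

#define MYTRACE_SKETCH(flag, ...) traceImpl((flag), __VA_ARGS__)
```

With printFile set to 5, statements tagged with flag 1 or flag 4 reach the file log, while a statement tagged with flag 8 is dropped.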

Reference Count Checking

One source of frequent errors in COM programming is failure to follow reference-counting rules, which may produce lost COM objects. It is usually difficult to detect these leaks. As an extension of the tracer, I implemented a simple reference-count checker. It operates by keeping a list of all COM interfaces it has seen and their reference counts. It increments and decrements reference counts in response to intercepted AddRef() and Release() calls. After the program terminates, it prints lost objects and reports errors when Release() was called too many times. Unfortunately, this simple solution has a serious defect: it is impossible to determine the true reference count of an object when it is first encountered by the tracer. This implementation assumes it to be 1, which is correct for many newly created objects. But other objects (e.g., IMalloc returned from CoGetMalloc()) are returned with a reference count of 2 or more (because IMalloc is a global object). This makes the checker see more Release() calls for IMalloc than it expected and report an error. Still, the checker can be used to check reference counting on some interfaces. The checker uses flag 4 in its trace statements, so it is necessary to specify PrintFileMask=5 in order to see its output along with the main log of intercepted functions. To enable the checker, put CheckRefCounts=1 in comxtrc.ini.
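The bookkeeping behind such a checker fits in a small class. The following is a minimal sketch with hypothetical names (RefCountChecker and its methods), not the tracer's implementation; it reproduces both the useful behavior and the defect described above, because every interface is assumed to start with a count of 1.

```cpp
#include <cassert>
#include <map>
#include <vector>

class RefCountChecker {
public:
    // First sighting of an interface pointer: assume its count is 1.
    void onInterfaceSeen(const void* iface) { entry(iface); }

    void onAddRef(const void* iface) { ++entry(iface); }

    void onRelease(const void* iface) {
        long& n = entry(iface);
        if (n == 0) { ++overReleases_; return; }  // more releases than expected
        --n;
    }

    // Interfaces whose tracked count is still above zero at program exit.
    std::vector<const void*> leaked() const {
        std::vector<const void*> result;
        for (const auto& kv : counts_)
            if (kv.second > 0) result.push_back(kv.first);
        return result;
    }

    int overReleases() const { return overReleases_; }

private:
    long& entry(const void* iface) {
        assert(iface != nullptr);
        // emplace leaves an existing count untouched
        return counts_.emplace(iface, 1L).first->second;
    }

    std::map<const void*, long> counts_;
    int overReleases_ = 0;
};
```

An object that really arrives with a count of 2 and is then released twice produces one over-release report, which is the false error described for IMalloc.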

Usage

To use the tracer, you must first prepare a configuration file containing descriptions of all functions and COM interfaces that should be intercepted. To intercept COM interfaces, at least a few API functions should be described. It is usually necessary to describe CoCreateInstance() and CoGetClassObject() to intercept new COM objects. This description should specify function parameters and which of those parameters will return interfaces. There are many other functions in ole32.dll capable of returning COM interfaces. It may be necessary to describe them as well if they are expected to generate an object that must be intercepted and traced.

comxtrc.cfg contains descriptions of most functions defined in OLE2.H. Other DLLs may contain functions capable of generating COM objects. For example, DirectDrawCreate() (exported from ddraw.dll) is used to create DirectDraw interfaces. It may be helpful to describe DirectDrawCreate() in the configuration file in order to trace these interfaces.

Besides exported functions, the configuration file should also describe interfaces. The description of the interface includes the name of the interface, the IID, the optional base interface, and a list of methods. Not all methods of an interface need to be listed. Each method should specify its ordinal number in the interface vtable. It is especially important to describe methods capable of returning interfaces (like QueryInterface()).

After the configuration file is prepared, start comxtrc.exe. You'll need to specify the names of the executable to debug and of the configuration file in the dialog. You can then use the "Run" button to start the process. The log file's name will be comxtrc.log, and it will be created in the same directory as comxtrc.exe. You can change its name in comxtrc.ini, which may contain the following entries:

CommandLine — the path of the executable.

ConfigFile — the path to the configuration file.

DllLogFileName — the output log file.

CheckRefCounts — using values of 1 or 0, this enables or disables the reference-count checker.

PrintFileMask — a bit mask to enable or disable logging to the file.

FlushFileMask — a bit mask to force flushing of the file after some traces.

OutputDebugMask — a bit mask to enable output to a debug monitor.

MessageBoxMask — a bit mask to show some traces in a messagebox.

PrintThreadID — using the values 1 or 0, adds a thread ID to the output log (or not).

MaxStrLen — limits the string size (function parameters) printed to the log file.

Configuration File Syntax

This is the syntax for describing imported functions:

EXP:ModuleName:FunctionName
Param1
..
ParamN

This is the syntax for an interface:

INTERFACE:Name:{IID}
{
Method1
..
MethodN
}

or

INTERFACE:Name:{IID}:BaseInterfaceName
{
Method1
..
MethodN
}

The syntax for each method is:

FUNC:MethodName:Ordinal
Param1
..
ParamN

where the ordinal specifies the place of the function in the vtable. The parameters will be one of the following:

DWORD — a 4-byte integer.

WORD — a 2-byte integer.

BYTE — a 1-byte integer.

DWORD64 — an 8-byte integer.

LPSTR — a pointer to an ASCII string.

LPWSTR — a pointer to a Unicode string.

LPDATA — a pointer.

HANDLE — a handle (4-byte integer).

HWND — a window handle (4-byte integer)

BOOL — a Boolean (4-byte integer)

LPCODE — a pointer to a code.

LPIUNK — a pointer to a COM interface (whose type is determined at runtime by the IID).

LPINTERF — a pointer to a COM interface (whose type is specified in the configuration file).

REFGUID — a pointer to a GUID.

BSTR: a BSTR-type string.

VARIANT — a variant structure.

If a function has no parameters, the keyword VOID should be specified in place of the first parameter. Each parameter may have the following space-separated modifiers specified right after the parameter and on the same line:

*: The parameter is a pointer. For example, "DWORD *" means that the address of the DWORD is passed.

OUT: The parameter is passed from the function to the caller. This parameter should be a pointer (e.g., REFGUID or DWORD *).

IN: The parameter is regular and is passed from the caller to the function. This is the default type. The IN keyword is meaningful only if added to OUT to declare parameters passed in both directions.

iid_is number: A declaration that must be added to the LPIUNK type of parameter to specify which other parameter contains the IID. The number is an ordinal number of another parameter in the parameter list of the same function.

TYPE InterfaceName: must be added to the LPINTERF parameter type to specify the interface type.
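To show how a parameter line with these modifiers might be read, here is a small parser sketch. It is not the code from parsecfg.cpp: ParamDesc and parseParamLine() are hypothetical, type names are not validated, and the number after iid_is is stored as written without interpreting its base.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct ParamDesc {
    std::string type;            // DWORD, LPIUNK, LPINTERF, ...
    bool pointer = false;        // "*" modifier
    bool in = true;              // IN is the default direction
    bool out = false;            // "OUT" modifier
    int iidParam = -1;           // from "iid_is number"; -1 if absent
    std::string interfaceName;   // from "TYPE InterfaceName"
};

// Parses one line such as "LPIUNK OUT iid_is 4".
// Returns false on an unknown modifier or a missing modifier argument.
bool parseParamLine(const std::string& line, ParamDesc& desc) {
    std::istringstream words(line);
    ParamDesc d;
    if (!(words >> d.type)) return false;  // empty line

    bool sawIn = false;
    std::string tok;
    while (words >> tok) {
        if (tok == "*") d.pointer = true;
        else if (tok == "OUT") d.out = true;
        else if (tok == "IN") sawIn = true;
        else if (tok == "iid_is") { if (!(words >> d.iidParam)) return false; }
        else if (tok == "TYPE") { if (!(words >> d.interfaceName)) return false; }
        else return false;
    }
    // OUT alone means output only; IN together with OUT means both directions.
    d.in = sawIn || !d.out;
    desc = d;
    return true;
}
```

For example, parsing "DWORD * IN OUT" yields a pointer parameter that is logged both at the time of the call and after the return.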

For examples, see Figure 4. As a simple example, I wrote a small program that creates a file link (shortcut file) by using IShellLink and IPersistFile. This program is called shelllnk.exe. The sample includes shelllnk.cfg, which describes the necessary interfaces. comxtrc.ini provides settings for running the tracer.

Shortcomings

The greatest difficulty in using the tracer is the requirement to write large configuration files to describe all the functions and the interfaces. This job could be automated by extracting the information from header files, browser information, or type libraries.

The tracer does not support structures as function parameters, making it impossible to see interfaces returned by CoCreateInstanceEx(). This can be easily fixed, if necessary.

The tracer lacks support for OLE automation. To overcome this, you could create special handling of IDispatch methods, especially Invoke().

Because the tracer uses a single named memory-mapped file, running two or more tracers simultaneously is impossible. You can fix this by creating separate files and passing their handles to the spy DLL.

References

[1] "Learn System-Level Win32 Coding Techniques by Writing an API Spy Program." Matt Pietrek, Microsoft Systems Journal, December 1994.

[2] Windows 95 System Programming Secrets. Matt Pietrek, IDG Books, 1995.

[3] "Load Your 32-Bit DLL into Another Process's Address Space Using INJLIB." Jeffrey Richter, Microsoft Systems Journal, May 1994.

[4] "Building a Lightweight COM Interception Framework." Keith Brown, Microsoft Systems Journal, January and February 1999.

Dmitri Leman is a software engineer in Silicon Valley. He has been developing Windows, Windows NT, and DOS applications and device drivers for 8 years. You can contact Dmitri at [email protected].

