COM Object Reference Counting


Mar01: COM Object Reference Counting

Finding unbalanced reference count of COM objects

Noam is chief technology officer at Vsoft. He can be contacted at [email protected].


COM Cyclic Reference Counting


Using COM objects involves maintaining the reference count for each object. Smart pointers or Visual Basic can save you from this labor, but for those of us who use raw pointers to COM interfaces, manual reference counting is required. The penalty for incorrect reference counts is severe. If an object is never released (that is, there are more AddRef calls than Release calls), a memory leak (and possibly resource leaks) results. If there are more Release calls than AddRef calls, you crash.

In simple systems, reference counting is manageable by intuition. But when there are more than a few objects, problems such as cyclic reference can occur (see the accompanying text box entitled "COM Cyclic Reference Counting").
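To make the cyclic case concrete, here is a minimal sketch in portable C++ rather than real COM. The `Node` class and the `LiveAfterRelease` function are hypothetical names invented for this illustration; they only mimic the AddRef/Release contract.

```cpp
#include <cassert>

// Hypothetical reference-counted object that may hold one counted
// reference to a peer (illustration only, not a real COM class).
class Node {
public:
    explicit Node(int& liveCount) : live_(liveCount) { ++live_; }

    unsigned long AddRef() { return ++refs_; }

    unsigned long Release() {
        assert(refs_ > 0 && "more Release calls than AddRef calls");
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }

    // Take a counted reference to the new peer, then drop the old one.
    void SetPeer(Node* peer) {
        if (peer) peer->AddRef();
        Node* old = peer_;
        peer_ = peer;
        if (old) old->Release();
    }

private:
    ~Node() {
        if (peer_) peer_->Release();
        --live_;
    }

    int& live_;               // counts objects that are still alive
    unsigned long refs_ = 1;  // the creator owns the first reference
    Node* peer_ = nullptr;
};

// Returns how many objects are still alive after the creator has
// released both of its references.
int LiveAfterRelease(bool breakCycleFirst) {
    int live = 0;
    Node* a = new Node(live);
    Node* b = new Node(live);
    a->SetPeer(b);  // a holds a reference to b
    b->SetPeer(a);  // b holds a reference to a: a cycle
    if (breakCycleFirst) a->SetPeer(nullptr);
    a->Release();
    b->Release();
    const int stillAlive = live;
    if (stillAlive != 0) {
        // Cleanup so that the demonstration itself does not leak:
        // take a temporary reference, break the cycle, release.
        a->AddRef();
        a->SetPeer(nullptr);
        a->Release();
    }
    return stillAlive;
}
```

With the cycle intact, each object keeps the other's count at one, so neither is ever destroyed. Breaking the cycle before the final releases lets both counts reach zero.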

During the development and debugging of a complex subsystem of Vsoft's VideoClick server, it occurred to me that life would be easier if I had a log of each call to AddRef/Release. Consequently, I built just such a tool, which I call "RefCatcher" (see Figure 1). For each COM object I build, RefCatcher (source code available electronically; see "Resource Center," page 5) tracks the number of AddRef/Release calls, the source file, line number, and function it was called from (if available). When the application terminates, RefCatcher reports unbalanced reference counts.

To build the tool, I first had to resolve a few issues, including how to:

  • Hook into AddRef/Release.
  • Get the function name, line number, and source-file name of the caller of AddRef/Release.
  • Test the count balance at the program's end.

Hooking Into AddRef/Release

The general solution requires a wrapper around the interface, which is tedious work indeed. However, it was relatively easy for us to implement the wrapper because most of our COM objects are created in-house and are built using a framework based on classes from Dale Rogerson's Inside COM (Microsoft Press, 1997).

As Listing One shows, all of our COM classes inherit from CUnknown, so all I had to do is modify the AddRef/Release implementation.
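The benefit of a common base class can be sketched in portable C++. This is an illustration under assumptions, not the article's CUnknown: `TracedBase`, `RefEvent`, and `Demuxer` are hypothetical names, and the log is a plain vector owned by the caller.

```cpp
#include <cassert>
#include <vector>

// Hypothetical log entry: which object, and which call.
struct RefEvent {
    const void* object;
    bool isAddRef;
};

// Hypothetical base class: every derived class inherits the logging
// AddRef/Release without any change of its own.
class TracedBase {
public:
    explicit TracedBase(std::vector<RefEvent>& log) : log_(log) {}

    unsigned long AddRef() {
        log_.push_back({this, true});
        return ++refs_;
    }

    unsigned long Release() {
        assert(refs_ > 0 && "more Release calls than AddRef calls");
        log_.push_back({this, false});
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }

protected:
    virtual ~TracedBase() = default;

private:
    std::vector<RefEvent>& log_;  // owned by the caller, outlives the object
    unsigned long refs_ = 1;      // the creator owns the first reference
};

// A derived class needs no tracing code at all.
class Demuxer : public TracedBase {
public:
    using TracedBase::TracedBase;
};
```

Because the log lives outside the object, its entries survive the final Release that destroys the object.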

Getting Symbolic Information

The fun part is actually getting the symbolic information. Visual C++ lets you store debugging information in a separate file, the program database (PDB). You can then use the PDB at run time to map addresses to symbolic information. Contrary to common belief, generating this information does not degrade performance, and it is available in release builds as well. To get symbolic information, I used DbgHelp.dll (documented in MSDN).

Testing at the End of the Application

You have a couple of alternatives when calling a function at program end: calling atexit() or using the destructor of a global object. I used the latter. You only need to make sure that this global object is allocated before any COM objects are instantiated.
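One way to get the ordering right is sketched below in portable C++. It is an assumption-laden illustration, not RefCatcher's code: `BalanceReporter` and `Reporter()` are hypothetical names, and the report is reduced to a single counter. A function-local static is constructed on first use, and static objects are destroyed in reverse order of construction, so a reporter that is first touched before other static objects are constructed outlives them.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical reporter: counts outstanding references and complains
// from its destructor, which runs during normal program termination.
class BalanceReporter {
public:
    void NoteAddRef() { ++outstanding_; }

    void NoteRelease() {
        assert(outstanding_ > 0 && "more Release calls than AddRef calls");
        --outstanding_;
    }

    long Outstanding() const { return outstanding_; }

    ~BalanceReporter() {
        if (outstanding_ != 0)
            std::fprintf(stderr, "unbalanced reference count: %ld\n",
                         outstanding_);
    }

private:
    long outstanding_ = 0;
};

// Constructed on the first call, destroyed at program exit.
BalanceReporter& Reporter() {
    static BalanceReporter instance;
    return instance;
}
```

A tracked object would call `Reporter().NoteAddRef()` and `Reporter().NoteRelease()` from its own AddRef and Release.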

Participating Classes

As Figure 2 illustrates, implementing the tool requires three classes:

  • The COM object itself, which is modified to register the calls to AddRef/Release (see Listing Two).
  • A SymbolHandler, which is responsible for supplying symbolic information (function name, line number, and so on).
  • RefCountingRot, a singleton object that keeps a repository of the calls and tests at the end of the program for a mismatch between them. (ROT is short for "Running Object Table," a term borrowed from the COM standard.)

Finding Who Called AddRef/Release

Looking at the stack is not for the faint of heart. In this case, however, the task at hand is simple: The return address is always the last value pushed onto the stack when a function is called. Supplying the return address to the SymbolHandler object yields the source-file name, line number, and function name.

The magic lies in getting the ebp register. I used assembly code for this:

unsigned int _ebp;

__asm { mov _ebp, ebp }

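After the standard function prologue (push ebp; mov ebp, esp), the ebp register holds the value the stack pointer had on function entry, just after the caller's ebp was saved. The stack content just above that slot, at address _ebp + 4 in 32-bit code, is the return address. This holds only when the function sets up a frame pointer, that is, when frame-pointer omission is not in effect.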
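Inline assembly of this kind is specific to 32-bit Visual C++. As an alternative to the technique above (not the article's method), compiler intrinsics can return the caller's address directly: `_ReturnAddress()` in Visual C++ and `__builtin_return_address(0)` in GCC and Clang. The macro names and `WhoCalledMe` below are hypothetical, chosen for this sketch.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#include <intrin.h>
#pragma intrinsic(_ReturnAddress)
#define NOINLINE __declspec(noinline)
#define CALLER_ADDRESS() _ReturnAddress()
#else
#define NOINLINE __attribute__((noinline))
#define CALLER_ADDRESS() __builtin_return_address(0)
#endif

// Hypothetical stand-in for AddRef(): returns the address at which its
// caller resumes. It must not be inlined, otherwise the "caller" would
// be whoever called the function it was inlined into.
NOINLINE const void* WhoCalledMe() {
    const void* address = CALLER_ADDRESS();
    assert(address != nullptr);
    return address;
}
```

The resulting address plays the same role as the value read from the stack: it is what a symbol handler would translate into a function name and line number.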

Before symbol information can be looked up, the symbol table for the module has to be loaded by calling SymLoadModule(). This function can be called any time after the executable is loaded. Because I was dealing with in-process COM objects (DLLs), I called it on the DLL_PROCESS_ATTACH event in DllMain(); see Listing Three.

In the data acquired by RefCountingRot, each object is identified by its pointer value. A report is much more informative if, instead of "object with this == 0x2761C30," its title reads "TS demuxer (object # 4)." This can be achieved if the object implements another interface, INamedObject2. This optional interface lets you query the object for its user-friendly name, CLSID, and unique object ID; see Listing Five. When RefCountingRot adds object data for the first time, it checks to see if the object implements this interface. If it does, the interface is used to get the name data. If it does not, the this pointer is used as the object name.
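The fallback logic can be sketched in portable C++. All names here (`Object`, `INamed`, `DisplayName`, `TitleFor`) are hypothetical, and `dynamic_cast` merely stands in for the QueryInterface() call that real COM code would make.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-ins for IUnknown and the optional naming interface.
struct Object {
    virtual ~Object() = default;
};

struct INamed {
    virtual ~INamed() = default;
    virtual const char* DisplayName() const = 0;
};

// Use the friendly name when the object offers one; otherwise fall
// back to the pointer value.
std::string TitleFor(const Object* obj) {
    assert(obj != nullptr);
    if (const INamed* named = dynamic_cast<const INamed*>(obj))
        return named->DisplayName();
    std::ostringstream title;
    title << "object with this == " << static_cast<const void*>(obj);
    return title.str();
}

// Example classes: one opts in to naming, one does not.
class Demuxer : public Object, public INamed {
public:
    const char* DisplayName() const override { return "TS demuxer"; }
};

class Anonymous : public Object {};
```

Keeping the naming interface optional means existing objects need no changes; only objects that want readable reports implement it.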

Internal Data Structure

Each COM object is identified by a pointer. The current implementation simply uses the this pointer as the key in a map. A more precise solution is to always use the IUnknown pointer, because COM requires QueryInterface() for IUnknown to return the same pointer value throughout the lifetime of an object. Imagine an object with several interfaces, all calling AddRef() at some time: Without a canonical key, how would RefCountingRot know it is the same object? Figure 3 illustrates the structure of the value in the map, while Listing Four shows how the map is constructed.
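A minimal portable sketch of such a table follows. It is modeled loosely on Listing Four, but `CallSite`, `TrackedObject`, and `BalanceTable` are hypothetical names. It also assumes that every reference, including the first one, is taken through a recorded AddRef call; a framework that creates objects with a count of one would need to account for that.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of one call site.
struct CallSite {
    std::string function;
    std::string file;
    unsigned long line;
};

// Everything known about one key (one object pointer).
struct TrackedObject {
    std::vector<CallSite> addRefed;
    std::vector<CallSite> released;
};

class BalanceTable {
public:
    void RecordAddRef(const void* object, const CallSite& site) {
        assert(object != nullptr);
        objects_[object].addRefed.push_back(site);
    }

    void RecordRelease(const void* object, const CallSite& site) {
        assert(object != nullptr);
        objects_[object].released.push_back(site);
    }

    // Returns the keys whose AddRef and Release counts differ.
    std::vector<const void*> Unbalanced() const {
        std::vector<const void*> result;
        for (const auto& entry : objects_) {
            if (entry.second.addRefed.size() != entry.second.released.size())
                result.push_back(entry.first);
        }
        return result;
    }

private:
    std::map<const void*, TrackedObject> objects_;
};
```

Storing the full list of call sites, rather than a bare counter, is what lets the final report show which calls were never matched.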

Performance Considerations

RefCatcher is meant to be used only during the debugging phase. Although it can be used in release builds, it is not intended to be part of the final product.

To use RefCatcher, define the macro DBG_REF_COUNT in the project settings. This macro enables the tracking versions of DllMain() and of parts of the CUnknown class declaration. You then rebuild all COM objects participating in the project. Next, link with VsoftDbgCom.lib, which replaces the usual DllMain and CUnknown implementations. Finally, run the program. When it ends, if there are interface leaks, a message box opens showing all calls to the offending interfaces.

Future Enhancements

Among the enhancements you could make to RefCatcher are:

  • Handling inlined AddRef/Release. If the compiler inlines these functions, the return-address calculation is invalidated. The simple cure is to force these functions not to be inlined; after all, performance is not an issue at this stage.
  • Reporting after a crash. RefCatcher reports unbalanced AddRef/Release calls only at normal termination, but if there are more Release calls than AddRef calls, the application usually crashes after referencing an invalid interface pointer. RefCatcher can be augmented to report the current list of calls by calling RefCountingRot::CheckBalance() from a crash handler. (A crash handler is a function called when the application crashes; it is set by calling SetUnhandledExceptionFilter().) With this technique, you get the reports even if the application crashes.
  • Writing the information to a file immediately, which avoids data loss during crashes.
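The last enhancement can be sketched in a few lines of portable C++. `EventLog` and its `Write` method are hypothetical names for this illustration; the point is only that each record is flushed as soon as it is written, so it is already on its way to disk if the process dies later.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical append-only log that flushes after every record.
class EventLog {
public:
    explicit EventLog(const std::string& path)
        : out_(path.c_str(), std::ios::app) {}

    // Writes one line such as "AddRef 0x2761C30" and flushes it.
    // Returns false if the file could not be written.
    bool Write(const void* object, const char* what) {
        assert(what != nullptr);
        out_ << what << ' ' << object << '\n';
        out_.flush();
        return static_cast<bool>(out_);
    }

private:
    std::ofstream out_;
};
```

Flushing on every call is slow, but as noted above, performance is not the concern while debugging.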

DDJ

Listing One

// code snippet from CUnknown.h from "Inside COM" by Dale Rogerson, 
// Microsoft Press.
// Modified by Noam Cohen.
///////////////////////////////////////////////////////////
class CUnknown : public INondelegatingUnknown
{
public:
    // Nondelegating IUnknown implementation
    // ...
protected:
    // ...
    // helper functions to register the parameters of the caller
    // of AddRef and Release()
    void RegisterAddRef(DWORD _ebp);
    void RegisterRelease(DWORD _ebp);
private:
    // ...
    // Reference count for this object
    // helper function. get information on the caller of 
    // the function whose ebp == _ebp
    HRESULT ParamsOfCaller(IN DWORD _ebp, OUT ULONG &line,
            OUT char* szSymbolName, IN  UINT    nSymNameLen,
            OUT char* szFileName, INT nFileNameLen,
            OUT ULONG &displacement);
}; // CUnknown


Listing Two

///////////////////////////////////////////////////////////
// Delegating IUnknown
//   - Delegates to the nondelegating IUnknown, or to the
//     outer IUnknown if the component is aggregated.
#define DECLARE_IUNKNOWN                                     \
    virtual HRESULT __stdcall                                \
        QueryInterface(const IID& iid, void** ppv)           \
    {                                                        \
        return GetOuterUnknown()->QueryInterface(iid,ppv) ;  \
    } ;                                                      \
    virtual ULONG __stdcall AddRef()                           \
    {                                                          \
        unsigned int _ebp;                                   \
        __asm {mov _ebp, ebp}                                \
        RegisterAddRef(_ebp);                                \
        return GetOuterUnknown()->AddRef() ;                 \
    } ;                                                      \
    virtual ULONG __stdcall Release()                          \
    {                                                          \
        unsigned int _ebp;                                   \
        __asm {mov _ebp, ebp}                                \
        RegisterRelease(_ebp);                               \
        return GetOuterUnknown()->Release() ;                \
    } ;


Listing Three

BOOL APIENTRY DllMain(HINSTANCE hModule, DWORD dwReason, void* /*lpReserved*/)
{
    if (dwReason == DLL_PROCESS_ATTACH)
    {
        CFactory::s_hModule = hModule ;
        // even if we fail, business as usual
        (void)ghSym.LoadSymbols(); 
    }
    return TRUE;
}


Listing Four

// ObjData contains all the data relevant to one key (the interface pointer)
typedef struct {
    ObjectName objName;
    ParamList addRefed;
    ParamList released;
} ObjData;
// the container of all the <interface pointer, value> associations
typedef std::map<void*, ObjData> ObjectsMap;


Listing Five

/* This interface can be exposed by a COM object that exposes 
some identifying properties. */
interface INamedObject2 : IUnknown
{
    /*------------------------------------------------------------------
    Get a name for this object's class (e.g. "TS demuxer")
    Return: 
        null terminated string
    */
    virtual const char* GetClassName() = 0;
    /*-------------------------------------------------------------------
    Get the CLASS id of this COM object.
    It is recommended to return the CLSID used when creating this object
    Return: 
        S_OK
        E_NOTIMPL
    */
    virtual HRESULT GetClassId(GUID &classId) = 0;  // out parameter, so not const
    /*-------------------------------------------------------------------
    Get a name for this object ( e.g. "object in server 34" )
    Return: 
        null terminated string
    */
    virtual const char* GetObjectName() = 0;
    /*-------------------------------------------------------------------
    Get the INSTANCE id of this COM object.
    Return: 
        S_OK
        E_NOTIMPL
    */
    virtual HRESULT GetObjectId(GUID &objectId) = 0;  // out parameter, so not const
}; // INamedObject2

// helper macro for those lazy people who need minimal functionality
#define CLASS_NAME_IMPL(szClassName) \
    const char* GetClassName(){return szClassName;} \
    HRESULT GetClassId(GUID &){ return E_NOTIMPL;}\
    const char* GetObjectName(){return "";}\
    HRESULT GetObjectId(GUID &){ return E_NOTIMPL;}
#endif



