Callback Enumeration APIs & the Input Iterator Concept

Matthew uses collaborative context switching when adapting callback enumeration APIs to an STL Iterator Concept.


February 01, 2004
URL:http://www.drdobbs.com/callback-enumeration-apis-the-input-ite/184401766

February 04:

Input Iterators

Performance

Win32 Fibers


When adapting callback enumeration APIs to an STL Iterator concept, you usually have no choice but to enumerate the entire sequence results into a temporary storage container, then iterate through that container. However a novel approach, which enumerates only those items as are requested by operator ++(), advance(), or similar, is to use collaborative context switching. This approach works by conducting the callback enumeration in one context and the decision making (that is, completion or abandonment of the range iteration) in the context in which the client code is executing.

In this article, I demonstrate this approach using a technique based on the Win32 Fiber API. This technique provides an example container and associated iterator class along with a client code program to show how the technique can be implemented. The source code at http://www.cuj.com/code/ presents an example use of predicates and functionals applied to the adapted sequence, thereby demonstrating the utility of STL compliance.

Enumeration Models

The are many types of enumeration models, including Get-First/Get-Next (for instance, opendir()/readdir()), Get-Nth (RegEnumValue()), and Reset/Get-Next (such as COM's IEnumXXXX), to mention a few. All have trade-offs in terms of convenience, flexibility, and efficiency. An efficient but inflexible model is the callback enumeration model (or callback model).

The callback model is made up of an enumeration function and a callback function type. It is used by passing a function matching the callback function type and (usually) some search information to the enumeration function, which then conducts its internal enumeration (by whatever manner dictated by implementation and/or efficiency requirements) and, for each enumerated item, calls back the given callback function. In many implementations, the callback function is able to indicate (via the return value or by adjusting flag references) that the enumeration should either continue or be cancelled, thereby letting the caller control the extent of the enumeration. Also, it is common for the enumeration function to take an additional user-data parameter that is passed through to each invocation of the callback function, so that the caller may pass some context information (for example, a buffer in which to write information obtained from each individual enumerated element).

Callback enumeration APIs are easy to use and can be efficient. Here's a typical use, which simply prints all the local peer hosts to stdout:

bool CALLCONV ip_list_proc(char const *address, void *user_data)
{
  FILE *stm = reinterpret_cast<FILE*>(user_data);
  fprintf(stm, "%s\n", address);
  return true;
}
int main(...)
{
  enum_peer_hosts(ip_list_proc, IPPEERHOSTF_LOCAL, stdout);
}

Adapting to STL Iterators

Since the callback model works by returning results to the caller until told to do otherwise (in contrast with other models where each result is explicitly requested by the caller), it is not a straightforward matter to adapt it to the STL concept of iterators. There are usually two options to handle this. The first option (Listing 1) is to enumerate throughout the full range, storing items in a container that is passed via the user-data parameter. The implementation is simple, but can be inefficient because it requires the full sequence to be enumerated and stored. When the sequence is large, this can be a waste of time and space and the code using the container may only interrogate the first few elements in the sequence. Also, by taking a snapshot of the iteration sequence, it is possible to present an out-of-date set to the caller by the time the full enumeration is completed. On the other hand, this technique can, depending on the container type, support Forward, Bidirectional, and Random-access Iterator concepts (see Generic Programming and the STL: Using and Extending the C++ Standard Template Library, by Matthew Austern, Addison-Wesley, 1998).

The second option is to enumerate only as far as the current iterator position in the sequence, and hold enumeration state in a custom wrapper container class that exposes the container concept semantics. Each time its increment operator is called, the whole callback enumeration operation is restarted and run until 1 past the current position, or until no more elements are available. This solution can also be inefficient when the sequence is large and (in contrast to the first option) most or all of the sequence will be iterated. Furthermore, if the underlying sequence can change between enumeration function calls, then it is possible for the container to behave erroneously and miss the (now deleted) current position value, arriving at end() prematurely (see the sidebar entitled "Input Iterators").

Using Fibers

A novel solution to the problem was presented by John Robbins in his article "A COM Symbol Engine Aids Debugging" (MSDN Magazine, August 2000), which described a technique for adapting a callback enumeration API to a COM enumeration model. This technique is based on collaborative (nonpreemptive) context switching. The approach I present here similarly uses the Win32 Fibers API (see the accompanying sidebar "Win32 Fibers").

The implementation is straightforward; see the child_window_ sequence class in Listing 2. The container's begin() method creates an enumeration context structure (shared between the fibers) that relates the fibers and the current iteration point, then passes it as user data to the CreateFiber() call to create the worker fiber. After this, the implementation of begin() and the iterator's operator ++() are identical. They call SwitchToFiber() to pass control to the worker, and test for the end-of-sequence marker (enumeration HWND is NULL) to delete the enumeration context structure once control returns.

The bulk of the work is carried out in the static WorkerProc() and EnumChildProc() methods of the container. WorkerProc() — the worker fiber procedure — calls the Win32 function EnumChildWindows(), after which it sets the end-of-sequence marker and returns to the main fiber. Inside EnumChildWindows(), the child windows are enumerated and passed back to EnumChildProc(). It is this function that updates the enumeration context structure and switches control to the main fiber (returning either inside the begin() or operator ++() methods), for each enumerated child window.

In this way, the execution switches between the worker and main fibers, resulting in precise input iterator semantics. When EnumChildWindows() completes its iteration, it returns control to the main fiber one last time. The iterator at that point contains the end-of-sequence marker (as does that returned by end()), resulting in correct semantics for the != operator.

In effect, fibers can be used to assist in the unified expression of a variety of enumeration models in the concepts and techniques of the STL.

The Problem with Win32's Fibers

All this sounds great, and it is — up to a point. Unfortunately, there is a drawback. I was implementing a version of this technique for the WinSTL project (http://winstl.org/) and discovered that, unfortunately, there is no way to close all the fibers without closing the thread (that is, there is no CloseThreadFiber() function). In other words, you cannot "de-fiber-ize" the thread. It is possible (at least on Windows 2000) to ignore a previous initialization and call ConvertThreadToFiber() again, but the previously allocated fiber-management block remains allocated. Although the MSDN Knowledgebase article Q185231 implies that you can simply call LocalFree() on this block, this is sailing far too close to the wind for me to recommend.

Neither is it possible to call GetFiberData() to determine whether fiber support has been initialized, as this raises an access violation if it has not (and I don't like the hack of catching that exception as the indicator), rather than returning NULL. Even if it were, the problems of interposing one fiber subsystem on another would be inhibitive.

The hitch is: What if some other part of the process is already using fibers, or starts to use them during/after the initialization of your use of them? The simple (and unappealing) answer is that it crashes. If two fiber subsystems run concurrently (within the scope defined by ConvertThreadToFiber() and the exit from the calling thread), then the second one to be initialized destroys the execution contexts of the first. If they run consecutively, then there is a memory leak (although it is only experience, as opposed to explicit documentation on the subject, that lends this understanding).

In the child_window_sequence implementation, this problem is partially ameliorated by having the main fiber context (obtained from calling ConvertThreadToFiber()) in a static method, which enables multiple instances of this class to be used. However, this is still vulnerable to failure if any other code in the thread makes use of the Fiber API for its own purposes. Indeed, as implemented in the form shown here, the class would fail if instances were used in two or more threads.

Listing 3, an example client program for the class, demonstrates use of the child_window_sequence class in an STL-compliant fashion. It uses a conversion shim to extract the window text from the window handle (see my article "Generalized String Manipulation: Access Shims and Type Tunnelling," CUJ, August 2003). The program works perfectly in translating the callback model to the sequential one, but only because it respects the fragilities described. (It's worth noting that another reason this approach is not widely used is that it based on the Win32 Fibers API, and fibers are not available on Windows 95/98/Me.)

Summary

This article has demonstrated a novel approach to the adaptation of callback enumeration APIs to the STL Input Iterator concept. Although the inadequacies of the Win32 Fiber API prevent production use at this time, there is no in-principal reason why this technique could not be rendered robust and efficient given the availability of suitable cooperative multitasking infrastructure.


Matthew Wilson is a software development consultant for Synesis Software and author of the STLSoft libraries and the upcoming Imperfect C++ (Addison-Wesley, 2004). He can be contacted via [email protected], or at http://stlsoft.org/.


February 04:

Listing 1: Enumerating through the full range, then storing items in a container.

#include <windows.h>
#include <list>

typedef std::list<HWND>   HWND_list_t;
BOOL CALLBACK ChildWindowProc(HWND hwnd, HWND_list_t &wndlist)
{
  return (wndlist.push_back(hwnd), TRUE);
}
void main()
{
  HWND_list_t  wndlist;
  if(::EnumChildWindows(0, (WNDENUMPROC)ChildWindowProc, (LPARAM)&wndlist))
  {
    HWND_list_t::const_iterator  begin = wndlist.begin();
    HWND_list_t::const_iterator  end   = wndlist.end();
    for(; begin != end; ++begin)
    {
      fprintf(stdout, "HWND: 0x%08x\n", *begin);
    }
  }
}




February 04: 

Listing 2: child_window_sequence class.

/* /////////////////////////////////////////////////////////////
 * Extracts from child_window_sequence_test.cpp
 * www:   http://www.synesis.com.au/winstl
 *        http://www.winstl.org/
 * Copyright (C) 2002, Synesis Software Pty Ltd.
 * (Licensed under the Synesis Software Standard Source License:
 *  http://www.synesis.com.au/licenses/ssssl.html)
 * ////////////////////////////////////////////////////////// */

/* Holds state of enumeration and relationship between fibers */
struct cws_fiber_data
{
  LPVOID  main_fiber, worker_fiber;
  HWND    hwndParent, hwndChild;

  cws_fiber_data()
    : main_fiber(0), worker_fiber(0)
    , hwndParent(0), hwndChild(0)
  {}
};
/* The iterator class */
class child_window_sequence_const_iterator
{
public:
  typedef child_window_sequence_const_iterator  class_type;
// Construction
public:
  child_window_sequence_const_iterator(cws_fiber_data *data)
    : m_data(data)
  {}
  // Only Input-Iterator Concept supported, so  implement move semantics.
  child_window_sequence_const_iterator(class_type &rhs);
  class_type &operator =(class_type &);
  ~child_window_sequence_const_iterator()
  {
    // Clean up here, in case enumeration not completed.
    if(m_data != 0)
    {
      ::DeleteFiber(m_data->worker_fiber);
      delete m_data;
    }
  }
  HWND operator *() const
  {
    return m_data->hwndChild;
  }
  child_window_sequence_const_iterator &operator ++()
  {
    // Switch to the enumeration fiber
    ::SwitchToFiber(m_data->worker_fiber);
    // Having switched back, hwndChild will contain next
    // child, or NULL, in which case enumeration complete.
    if(m_data->hwndChild == 0)
    {
      ::DeleteFiber(m_data->worker_fiber);
      delete m_data;
      m_data = 0;
    }
    return *this;
  }
// Comparison
public:
  bool operator ==(class_type const &rhs) const
  {
    // == if both NULL, or both non-NULL and have same child.
    return  ( m_data == 0 && 
              rhs.m_data == 0) || 
            ( m_data != 0 && 
              rhs.m_data != 0 && 
              m_data->hwndChild == rhs.m_data->hwndChild);
  }
  bool operator !=(class_type const &rhs) const;
// Members
protected:
  cws_fiber_data  *m_data;
};
class child_window_sequence
{
public:
  typedef child_window_sequence                 class_type;
  typedef child_window_sequence_const_iterator  const_iterator;
// Construction
public:
  child_window_sequence(HWND hwnd)
    : m_fiberMain(GetMainFiber()), m_hwnd(hwnd)
  {}
// Iteration
public:
  const_iterator begin() const
  {
    // (i) Create a shared area
    cws_fiber_data  *data = new cws_fiber_data;
    // (ii) Create the callback fiber
    data->main_fiber   = m_fiberMain;
    data->worker_fiber = ::CreateFiber(0, WorkerProc, data);
    data->hwndParent   = m_hwnd;
    // (iii) Switch to fiber and start the enumeration
    ::SwitchToFiber(data->worker_fiber);
    // (iv) set the first window and the fiber data into the iterator
    if(data->hwndChild == 0)
    {
      ::DeleteFiber(data->worker_fiber);
      delete data;
      data = 0;
    }
    return child_window_sequence_const_iterator(data);
  }
  const_iterator end() const
  {
    return const_iterator(0);
  }
// Implementation
protected:
  // This reduces fragility of Fibers somewhat, by ensuring all
  // instances of sequence class share a particular main fiber.
  static LPVOID GetMainFiber()
  {
    static LPVOID s_fiberMain = ::ConvertThreadToFiber(0);
    return s_fiberMain;
  }
  static void WINAPI WorkerProc(PVOID lpParam)
  {
    // This makes enumeration call, then scopes enumeration via fiber 
    // exchanges between this fiber and main fiber (where iterators are used)
    cws_fiber_data  &data = *(cws_fiber_data*)lpParam;
    // Start the enumeration
    ::EnumChildWindows(data.hwndParent, EnumChildProc, (LPARAM)lpParam);
    // End enumeration by setting child to NULL ...
    data.hwndChild = 0;
    // ... and switch back for the last time (main will delete this fiber).
    ::SwitchToFiber(data.main_fiber);
  }
  static BOOL CALLBACK EnumChildProc(HWND hwnd, LPARAM lParam)
  {
    cws_fiber_data  &data = *(cws_fiber_data*)lParam;
    // Place this result in the data ...
    data.hwndChild = hwnd;
    // ... and switch back to main fiber
    ::SwitchToFiber(data.main_fiber);
    return true;
  }
// Members
protected:
  LPVOID  m_fiberMain;  // The main (/ ctor caller) fiber
  HWND    m_hwnd;       // Window whose children are enumerated
};




February 04: 

Listing 3: Client program that demonstrates the use of child_window_sequence in an STL-compliant fashion.

#include <stlsoft_simple_algorithms.h>
#include <winstl_window_functionals.h>
#include <winstl_string_access.h>

struct trace_text
  : public std::unary_function<HWND, void>
{
  void operator ()(HWND hwnd)
  {
    // Print out the window handle, and the caption.
    printf("0x%08x: \"%s\"\n", hwnd,
           static_cast<char const*>(winstl::c_str_ptr(hwnd)));
  }
};
int main()
{
  child_window_sequence wnds(::GetDesktopWindow());
  // (i) Use in explicit loop
  child_window_sequence::const_iterator begin(wnds.begin());
  child_window_sequence::const_iterator end(wnds.end());
  for(; begin != end; ++begin)
  {
    if(is_visible()(*begin))
    {
      trace_text()(*begin);
    }
  }
  // (ii) Use via algorithm
  stlsoft::for_each_if( wnds.begin(), wnds.end(), trace_text(), is_visible());
}




February 04: Input Iterators

Input Iterators

An input iterator has many expected qualities — comparable, dereferenceable, incrementable — that are commonly associated with iterators. The important distinction that is often missed is that an input iterator may support only single-pass algorithms. SGI (http://www.sgi.com/tech/stl/InputIterator.html) states that "after executing ++i, it is not required that copies of the old value of i be dereferenceable or that they be in the domain of operator ==()" and "it is not guaranteed that it is possible to pass through the same input iterator twice." It is this single-pass nature that is an artifact of the technique described here. Therefore, conducting an inner iteration within the outer would lead to unintended behavior, as in:


container::iterator begin(c.begin());
container::iterator end(c.end());
for(; begin != end; ++begin)
{
  if(xyz(*begin))
  {
    container::iterator it = find_if(begin, end, . . .);
    ...
  }

}

The find_if() loop would cause the outer iteration to be advanced, thereby failing to do the search in the desired manner.

Back to Article

February 04: Performance

Performance

One concern with the technique I present here involves performance. Even though fibers are not as heavyweight as threads, there is still a cost involved in switching contexts. I conducted a simple test (the code is available at http://www.cuj.com/code/) to quantify the costs of fiber switching. The test comprises incrementing an integer via fiber switching and function calls. The results show that switching fiber context is roughly 5-10 times the cost of a function call. Therefore, concerns over performance wouldn't appear to invalidate the technique, though they would be a factor in performance-critical scenarios.

Back to Article

February 04: Win32 Fibers

Win32 Fibers

Fibers were introduced into the Win32 API to facilitate the porting of UNIX programs containing user-controlled threading models to Windows, thereby providing an intermediate porting stage prior to moving such programs over to the Win32 threading architecture. Once one is used to the Fiber API, it is remarkably simple to use, and can also be useful for Win32-native applications.

When incorporated into an application, fibers provide a simple, explicitly controlled lightweight threading model. Each fiber has its own stack and execution context, but unlike threads a fiber must call SwitchToFiber() to pass control to another fiber.

Fiber execution is commenced by turning a thread into a fiber using the ConvertThreadToFiber() function. What this actually means is that the system allocates and returns a fiber-management block (whose nature is opaque to the caller), including associating a caller-supplied 32-bit value given in the function call to the fiber.

Subsequent fibers are created by calling the CreateFiber() function (which also takes a 32-bit parameter and returns a pointer to an opaque fiber-management block). These fibers may then be destroyed by calling the DeleteFiber() function from another fiber or, if all the fibers are to be terminated, from within the fiber to be closed, since this ends the thread by calling ExitThread().

Back to Article

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.