A Semaphore With Priorities For Win32

By Thomas Becker, August 01, 1999

Yes, you can add priorities to Win32 semaphores, but they aren't easy to get right.

Introduction

More and more Windows applications these days take advantage of the multithreading capabilities of Microsoft's 32-bit operating systems. The virtues of multithreading should be obvious to readers of this magazine. However, professional developers frequently underestimate the special problems and pitfalls arising from multithreaded programming.

The purpose of this article is twofold. On the one hand, I present a ready-to-use class that extends the capabilities of Win32 semaphores. You can use this class as is, without knowing about its inner workings. On the other hand, I discuss the source code of the class, thereby helping you to sharpen your awareness of the synchronization issues that arise in multithreaded programming. Needless to say, I will be addressing only a few of the many critical issues in Windows multithreaded programming. In addition to Jeffrey Richter's book [1], which has become the standard reference for low-level Win32 programming, I very strongly recommend Jim Beveridge and Robert Wiener's book [2] to anybody who wishes to embark on a serious multithreaded Windows project.

Accessing Resources with Priorities

Suppose your application uses a resource that can be accessed by only a limited number of threads at a time. An example would be a database for which you own a limited number of licenses. In that situation, you would typically use a Win32 semaphore to enforce the limitation. The initial count of the semaphore is set to the maximum number of threads that may access the resource at a time. Before a thread uses the resource, it waits on the semaphore. If the current count of the semaphore is greater than zero, the wait operation completes successfully, and the count of the semaphore is decremented by one. If the current count of the semaphore is zero, the wait operation blocks until either the wait timeout is reached or another thread increments the current count of the semaphore.
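The license-limiting pattern just described can be sketched as follows. This is a minimal, portable illustration built on the C++ standard library rather than on the Win32 API. The class name CountingSemaphore and its members are hypothetical stand-ins for a semaphore handle used with WaitForSingleObject and ReleaseSemaphore.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical portable stand-in for a Win32 semaphore (not a Win32 API).
class CountingSemaphore {
public:
    CountingSemaphore(long initialCount, long maxCount)
        : count_(initialCount), max_(maxCount) {
        assert(initialCount >= 0 && initialCount <= maxCount);
    }

    // Comparable to waiting on a semaphore handle with a timeout: returns
    // true if a slot was acquired, false if the timeout elapsed first.
    bool Wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cv_.wait_for(lock, timeout, [this] { return count_ > 0; }))
            return false;
        --count_;  // a successful wait decrements the count by one
        return true;
    }

    // Comparable to ReleaseSemaphore: fails, changing nothing, if the
    // count would exceed the maximum.
    bool Release(long releaseCount = 1) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (releaseCount <= 0 || count_ + releaseCount > max_)
                return false;
            count_ += releaseCount;
        }
        cv_.notify_all();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    long count_;
    const long max_;
};
```

With two database licenses, one would construct CountingSemaphore licenses(2, 2), call Wait before touching the database, and call Release afterwards. A third concurrent caller blocks until a slot is released or its timeout expires.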

Now assume that you wish to give certain threads a higher priority than others when waiting on the semaphore. This could be necessary in a client/server situation in which certain clients are blessed with higher privileges than others. But even in an egalitarian world, the necessity for such prioritization arises. A client request is often handled by first making one short trip to a database just to offer a list of options to the client. When the client has made a choice, the server accesses the database again to retrieve the "real" data. In this sort of situation, the initial short trip should obviously be given higher priority, because it does not hold things up in a significant way, and clients should not have to wait just to see their options. In other words, there is still a need for a semaphore as explained above, but this semaphore should release waiting threads according to different priorities.

Before I discuss my solution to this problem, I need to discuss how Windows NT handles the scheduling of threads that are blocking on a synchronization object such as a Win32 semaphore. On the lowest level, the scheduler places these threads in a FIFO (First In/First Out) queue. On the application level, the order in which the threads are released is not always FIFO. The reason for this is that a waiting thread may be scheduled to perform an APC (asynchronous procedure call) such as an I/O completion routine. When the thread returns from the APC, it will once again be placed at the tail end of the queue of threads waiting for the synchronization object. If you have a situation in which it is critical for threads to be released in the order in which they arrived, you must implement your own queue on the application level. This will of course incur overhead as well as programming complexity and should be avoided if possible.

For the problem of releasing threads according to priorities, it is necessary to know how NT's thread priorities affect the order in which waiting threads are released. The answer is: not at all. The scheduler's priority system kicks in only when a thread emerges from the wait state, not while it is waiting. You can easily verify this by letting several threads with different priorities wait on a Win32 semaphore. If you release them one by one, they will emerge in the order that they were queued, regardless of the thread priority (assuming no APCs). However, if you release them all at once with a single call to the ReleaseSemaphore API, they will be scheduled according to their thread priorities. Note that when performing such scheduling experiments, you should not run your program under a debugger, because the debugger has its own mind about suspending and releasing threads. For more information on NT thread scheduling, see Chapter 4 in David A. Solomon's book on the NT operating system [3].

The Class CPrioritySemaphore

The upshot of all this is that thread priorities cannot be used to obtain a semaphore that releases threads by priority level. Instead, this functionality must be implemented on the application level. Class CPrioritySemaphore (see Figures 2 and 3) wraps a Win32 semaphore with the additional option to specify a priority when waiting on the semaphore. The interface is modeled after the Win32 functions that operate on handles to semaphores. After construction, a CPrioritySemaphore object must be properly initialized with the Create member function. In addition to the initial and maximum count of the semaphore, the caller must specify the number of priority levels that the semaphore object will support. The Release member function increments the count of the semaphore object just like the Win32 function ReleaseSemaphore.

The additional functionality of the priority semaphore is provided by the Wait member function. In addition to the wait timeout, which can be a finite number of milliseconds or the Win32 constant INFINITE, Wait takes as an argument the desired priority level, where priorities range from zero to one less than the number of priority levels. When the current count of the semaphore object is greater than zero, the object behaves exactly like a Win32 semaphore: the wait operation completes immediately, and the current count of the semaphore decreases by one. When the count has dropped down to zero, Wait blocks just like Win32's WaitForSingleObject would. However, when more than one thread is blocking on a call to Wait and the semaphore object becomes signaled (i.e., some thread calls the Release member function), the semaphore now releases the waiting threads in the order of their wait priorities. More precisely, threads are released according to what is known as the "round robin scheme." As long as there are threads waiting with priority level X, no thread on a priority level less than X can be released, regardless of the order in which the threads arrived.
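The release rule can be stated in a few lines of code. The helper below is a hypothetical illustration, not a member of CPrioritySemaphore: given the number of threads currently waiting on each level, it returns the only level from which the next thread may be released.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// waiting[i] is the number of threads blocked with priority level i,
// where a larger index means a higher priority.
// Returns the level eligible for release, or -1 if nobody is waiting.
int NextLevelToRelease(const std::vector<int>& waiting) {
    for (std::size_t i = waiting.size(); i-- > 0;) {
        assert(waiting[i] >= 0);
        if (waiting[i] > 0)
            return static_cast<int>(i);  // highest level that has waiters
    }
    return -1;
}
```

For example, with two threads on level 0 and one on level 2, the function returns 2. Level 0 is not considered until every higher level has emptied.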

As always when a round robin scheme is employed for scheduling, there is the danger of "starving" threads on low priority levels. To see how this happens, assume that one or more threads are blocking on the Wait function with the highest wait priority. Assume further that more threads are calling Wait with the highest wait priority, and they are coming in at a rate that is greater than or equal to the rate at which they are being released. The number of threads waiting with the highest priority level will always be greater than zero. While this situation lasts, threads that are waiting with a lower priority level cannot be released. Whether or not this phenomenon becomes a problem depends upon the context in which CPrioritySemaphore is used.

To help get around the "starving" problem, CPrioritySemaphore provides a member function GetNumWaitingThreads. This function returns the internal count of waiting threads for a given wait priority level. This function is of course to be taken with a grain of salt, because by the time you look at its return value, the number of waiting threads may have changed dramatically due to thread preemption. In particular, you must not under any circumstances use this function for synchronization purposes, as in the following flawed scenario: the main thread has stopped creating new threads that wait on the semaphore. A call to GetNumWaitingThreads returns zero. The program concludes that no more threads will emerge from waiting on the semaphore with this priority level. This is an invalid conclusion, because a thread may have been preempted just around the time it entered the semaphore's Wait function. Although the program is not creating any more threads and the number of currently waiting threads is zero, there is still a thread out there that is going to acquire the semaphore.

However, you can still reasonably use the GetNumWaitingThreads function to avoid backlogs of starving threads. Before you accept a client with high priority, you can check if the number of waiting threads with lower priority exceeds an acceptable backlog. If so, you turn away the client with a "try again later" message.
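A backlog check of this kind might look like the sketch below. The function name, the maxBacklog parameter, and the idea of passing a snapshot vector are assumptions made for illustration. In a real program the vector would be filled by calling GetNumWaitingThreads once per priority level, and the result is only advisory because the counts may already be stale.

```cpp
#include <cassert>
#include <vector>

// Hypothetical admission check. waitingPerLevel[i] is a snapshot of the
// number of threads waiting with priority level i (higher index = higher
// priority). Returns false if the client should be told to try again later.
bool ShouldAcceptClient(const std::vector<int>& waitingPerLevel,
                        int clientLevel, long maxBacklog) {
    assert(clientLevel >= 0 &&
           clientLevel < static_cast<int>(waitingPerLevel.size()));
    long backlog = 0;
    for (int level = 0; level < clientLevel; ++level)
        backlog += waitingPerLevel[level];  // only lower levels can starve
    return backlog <= maxBacklog;
}
```

The check is a throttling heuristic only. It must not be used to decide that no thread will acquire the semaphore, for the reasons given above.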

There are more sophisticated ways to avoid thread starvation when using a round robin scheduling scheme. NT's thread scheduler, for example, uses priority boosting to keep low-priority threads from starving. However, furnishing CPrioritySemaphore with such capabilities would require a separate thread that does nothing but administer the queue of waiting threads. This kind of overhead should certainly be avoided unless it is really necessary.

If you want to use the class CPrioritySemaphore without looking at its inner workings, you are all set. The complete code comes with an HTML file available on the CUJ ftp site (see p. 3 for downloading instructions) that documents the class and its member functions.

CPrioritySemaphore Internals

Now I show how CPrioritySemaphore achieves the conditional releasing of threads according to wait priority levels. Conceptually, the class members are:

a Win32 semaphore,
a set of Win32 manual reset events, one for each priority level but the lowest,
a set of ints to keep track of the number of waiting threads, one for each priority level but the lowest,
an auxiliary critical section whose purpose is explained below.

All the good stuff is in the Wait member function. Figure 1 shows pseudo-code for this function. Take a look at the wait operation in the middle of the pseudo-code first. Here, the function performs a simultaneous wait on the semaphore and the events that are associated with all higher priority levels. Each of these events will be signaled if and only if there are no waiting threads on the corresponding priority level. Hence, the waiting thread will be released if and only if the semaphore is signaled and there are no waiting threads on higher priority levels.

What remain to be discussed are the two blocks surrounding the multiple wait operation in the pseudo-code of Figure 1. These blocks ensure that each event is signaled when there are no more threads waiting on the corresponding priority level, and unsignaled otherwise. Not considering synchronization issues, this is simple enough to implement. Keep track of the number of waiting threads by incrementing a count before entering the wait state and decrementing it after leaving the wait state. Furthermore, reset the event before entering the wait state, and set it when the count of waiting threads has dropped to zero.

The important part is to lump the thread counting operation and the event operation together into one atomic operation by bracing them with a critical section. Without this enclosure, the following scenario would be possible:

1. Thread 1 completes a wait operation on some priority level X that is not the lowest priority.

2. Thread 1 decrements the count of threads waiting with priority level X and evaluates the if-condition. Assume that this condition evaluates to true — that is, the thread count for this level now equals zero.

3. At this point, the thread is preempted by Thread 2, which enters the Wait function with the same priority level.

4. Thread 2 dutifully increments the count of waiting threads for this level to 1, resets the corresponding event, and goes to sleep on the wait operation.

5. While Thread 2 is asleep, Thread 1 gets scheduled again. Since it has already evaluated the if-condition and found it to be true, it sets the event associated with this priority level.

Oops, there goes the round robin scheme! A waiting thread exists on level X, but the event associated with level X is signaled. This will allow threads waiting on priority levels less than X to complete their wait operations.

Using the critical section eliminates all problems of this nature. All that remains is to consider the possibility of threads being preempted just before and just after the multiple wait operation. The total number of possible scenarios is certainly too large to discuss here. The following two observations preclude the need to systematically analyze all possibilities. First, there are no nested wait operations here. This rules out the classical type of deadlock, in which two threads try to obtain exclusive access to two resources, and they do so in opposite order. Second, when a thread runs through the Wait function, it locks and unlocks the wait operation for threads that wait with lower priorities. Its own wait operation, however, depends only on the semaphore and the events controlled by threads that wait with higher priorities. Hence, no two threads can block each other.
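To make the structure of Wait concrete, here is a portable sketch of the same idea. It is not the CPrioritySemaphore source, and it swaps the Win32 primitives for a std::mutex and a std::condition_variable. The single mutex plays the role of the critical section, and "the event for level L is signaled" is modeled as waiting_[L] == 0, so the count and the event state can never disagree. All names except Wait, Release, and GetNumWaitingThreads are hypothetical, and the signatures of those three are assumed.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

class PrioritySemaphoreSketch {
public:
    PrioritySemaphoreSketch(long initialCount, long maxCount, int numLevels)
        : count_(initialCount), max_(maxCount), waiting_(numLevels, 0) {
        assert(numLevels > 0 && initialCount >= 0 && initialCount <= maxCount);
    }

    bool Wait(std::chrono::milliseconds timeout, int priority) {
        assert(priority >= 0 && priority < static_cast<int>(waiting_.size()));
        std::unique_lock<std::mutex> lock(mutex_);
        ++waiting_[priority];  // corresponds to "count up, reset the event"
        // The combined wait: the semaphore AND all higher-level "events".
        const bool acquired = cv_.wait_for(lock, timeout, [&] {
            return count_ > 0 && NoHigherWaiters(priority);
        });
        if (acquired) --count_;
        --waiting_[priority];  // at zero this "sets the event"
        lock.unlock();
        cv_.notify_all();      // lower levels must re-evaluate their wait
        return acquired;
    }

    bool Release(long releaseCount = 1) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (releaseCount <= 0 || count_ + releaseCount > max_) return false;
            count_ += releaseCount;
        }
        cv_.notify_all();
        return true;
    }

    int GetNumWaitingThreads(int priority) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return waiting_[static_cast<std::size_t>(priority)];
    }

private:
    bool NoHigherWaiters(int priority) const {
        for (std::size_t l = static_cast<std::size_t>(priority) + 1;
             l < waiting_.size(); ++l)
            if (waiting_[l] > 0) return false;
        return true;
    }

    mutable std::mutex mutex_;
    std::condition_variable cv_;
    long count_;
    const long max_;
    std::vector<int> waiting_;
};

// Demo: one low-priority and one high-priority thread block on an empty
// semaphore; the slots are then released one at a time.
std::vector<int> DemoReleaseOrder() {
    PrioritySemaphoreSketch sem(0, 1, 2);
    std::vector<int> order;
    std::mutex orderMutex;
    auto worker = [&](int priority) {
        if (sem.Wait(std::chrono::seconds(5), priority)) {
            std::lock_guard<std::mutex> lock(orderMutex);
            order.push_back(priority);
        }
    };
    std::thread low(worker, 0);
    std::thread high(worker, 1);
    while (sem.GetNumWaitingThreads(0) != 1 || sem.GetNumWaitingThreads(1) != 1)
        std::this_thread::yield();  // test-only polling until both wait
    sem.Release();  // only the level-1 waiter is allowed to take this slot
    high.join();
    sem.Release();  // level 1 is now empty, so level 0 may proceed
    low.join();
    return order;
}
```

Because every update to the waiter counts happens under the one mutex, the interleaving of Thread 1 and Thread 2 described above cannot occur in this model. The price is that all levels contend for the same lock, which matches the single-critical-section choice discussed next.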

I have used a single critical section to enclose the thread counting operation and the event operation for all priority levels. It would actually be possible to use a different critical section for each priority level. That would obviously mean more overhead in terms of operating system resources. In the presence of massive numbers of threads, however, a critical section for each priority level could improve the performance.

An Example

Figure 4 shows an example of how to use the semaphore in a multithreaded database server. Here, the assumptions are that each client is served in a separate thread, and that only a limited number of concurrent threads may access the database. Furthermore, there are "lightweight" and "heavyweight" client requests, and the lightweight requests are unconditionally given preference over the heavyweight ones. The starvation problem is not addressed in this example.

Conclusion

The combination of the thread count with an event that gets signaled when that count equals zero represents a pattern that occurs quite frequently in multithreaded programming. It would certainly be desirable to have a primitive synchronization object that achieves this. Such an object would be a kind of inverse semaphore. It would keep a count that could be incremented and decremented, and it would become signaled when that count dropped down to zero.
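Such an inverse semaphore is easy to sketch in portable C++. The class below is a hypothetical illustration built on the standard library, not an existing Win32 object: it keeps a count and lets threads wait until that count reaches zero.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical "inverse semaphore": signaled exactly when the count is zero.
class ZeroCountEvent {
public:
    void Increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++count_;  // object becomes (or stays) unsignaled
    }

    void Decrement() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(count_ > 0);
        if (--count_ == 0)
            cv_.notify_all();  // count hit zero: wake every waiter
    }

    // Returns true if the count was (or became) zero within the timeout.
    bool WaitForZero(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return count_ == 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    long count_ = 0;
};
```

The count and the signaled state change together under one lock, which is the same atomicity that the critical section provides in CPrioritySemaphore.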

References

[1] Jeffrey Richter. Advanced Windows, Third Edition (Microsoft Press, 1997).

[2] Jim Beveridge and Robert Wiener. Multithreading Applications in Win32 (Addison-Wesley Developers Press, 1997).

[3] David A. Solomon. Inside Windows NT, Second Edition (Microsoft Press, 1998).

Thomas Becker works as a senior software engineer for Zephyr Associates, Inc. in Zephyr Cove, NV. He can be reached at [email protected].
