
Advanced .NET Debugging: Synchronization


Mario Hewardt is a senior design engineer with Microsoft and has worked extensively in the Windows system-level development area for the last seven years. Mario is the author of Advanced .NET Debugging, on which this article is based. Courtesy Addison-Wesley Professional. All rights reserved.


Editor's Note. The source code accompanying this article is available at the author's Advanced .NET Debugging website.

The Windows operating system is a preemptive, multithreaded operating system. Multithreading refers to the capability to run any number of threads concurrently. On a single-processor machine, Windows creates the illusion of concurrent thread execution by allowing each thread to run for a short period of time (known as a "time quantum"). When that time quantum is exhausted, the thread is put to sleep and the processor switches to another thread (known as a "context switch"), and so on. On a multiprocessor machine, two or more threads can run truly concurrently (one thread per physical processor).

Because the system is preemptive, every active thread must be able to yield control of the processor to another thread at any point in time. Given that the operating system can take control away from a thread at any moment, developers must take care that their code is always in a state where control can safely be taken away.

If all applications were single threaded, or if all threads ran in isolation, synchronization would not be a problem. Alas, for efficiency's sake, dependent multithreading is the norm today and is also the source of a lot of bugs in applications.

Dependent multithreading occurs when two or more threads need to work in tandem to complete a task. Code execution for a given task may, for example, be broken up between two or more threads (with or without shared resources), and hence the threads need to "communicate" with each other regarding the order of thread execution. This communication is referred to as "thread synchronization" and is crucial to any multithreaded application.

Thread Synchronization Primitives

Internally, the Windows operating system represents a thread in a data structure known as the "thread environment block" (TEB). This data structure contains various attributes such as the thread identifier, last error, local storage, and so on. Listing 1 shows an abbreviated output of the different elements of the TEB data structure.


0:000> dt _TEB
ntdll!_TEB
  +0x000 NtTib : _NT_TIB
  +0x01c EnvironmentPointer : Ptr32 Void
  +0x020 ClientId : _CLIENT_ID
  +0x028 ActiveRpcHandle : Ptr32 Void
  +0x02c ThreadLocalStoragePointer : Ptr32 Void
  +0x030 ProcessEnvironmentBlock : Ptr32 _PEB
  +0x034 LastErrorValue : Uint4B
  +0x038 CountOfOwnedCriticalSections : Uint4B
   …
  +0xfca RtlExceptionAttached : Pos 9, 1 Bit
  +0xfca SpareSameTebBits : Pos 10, 6 Bits
  +0xfcc TxnScopeEnterCallback : Ptr32 Void
  +0xfd0 TxnScopeExitCallback : Ptr32 Void
  +0xfd4 TxnScopeContext : Ptr32 Void
  +0xfd8 LockCount : Uint4B
  +0xfdc ProcessRundown : Uint4B
  +0xfe0 LastSwitchTime : Uint8B
  +0xfe8 TotalSwitchOutTime : Uint8B
  +0xff0 WaitReasonBitMap : _LARGE_INTEGER

Listing 1: Abbreviated output of the TEB data structure

All in all, on a Windows Vista machine, the TEB data structure contains around 98 different elements. Although most of these elements aren't typically used when debugging .NET synchronization problems, it is important to be aware that Windows carries a lot of information about any given thread to accurately schedule execution. Much in the same way that Windows includes a thread data structure to maintain the state of a thread, so does the CLR. The CLR's version of the thread data structure is, not surprisingly, called Thread. The internals of the Thread class are not made public.

One very useful command is the threads command, which outputs a summary of all the CLR threads currently in the process as well as individual state for each thread:

Although the threads command gives us some insight into the CLR representation of a thread (such as the thread state, CLR thread ID, OS thread ID, etc.), the internal CLR representation is far more extensive. Even though the internal representation is not made public, we can use the Rotor source code to gain some insight into the general structure of the Thread class. The Rotor source files of interest are threads.h and threads.cpp located under the sscli20\clr\src\vm folder. Listing 2 shows a few examples of data members that are part of the Thread class.


class Thread
{
   …
   volatile ThreadState m_State;
   DWORD m_dwLockCount;
   DWORD m_ThreadId;
   LockEntry *m_pHead;
   LockEntry m_embeddedEntry;
   …
}

Listing 2: Abbreviated version of the CLR Thread class

The m_State member contains the state of the thread (such as alive, aborted, etc.). The m_dwLockCount member indicates how many locks are currently held by the thread. The m_ThreadId member corresponds to the managed thread ID, and the last two members (m_pHead, m_embeddedEntry) correspond to the reader/writer lock state of the thread. If we need to take a closer look at a CLR thread (including the members above), we have to first find a pointer to an instance of a Thread class. This can easily be done by first using the threads command and looking at the ThreadOBJ column, which corresponds to the underlying Thread instance:

The threads command shows that the first thread pointer is located at address 0x003b4528. We can then use the dd command to dump the memory that the pointer refers to. What if we want to find out the contents of the m_State member? To accomplish this, we first have to figure out the offset of this member in the object's memory layout. A couple of different strategies can be used. The first strategy is to look at the class definition and see whether any members in close proximity have a value you already know. If so, you can dump the contents of the object until you find the known member and then locate the target member by its relative offset. The other strategy is to walk through all the members in the class definition and find the offset of the target member by adding up the sizes of all the members that precede it.
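The second strategy, adding up member sizes, can be sketched in a few lines of Python. This is only an illustration: the layout below is hypothetical (it is not the real Thread class), and it assumes the common compiler rule that each member is placed at the next multiple of its alignment. Compiler-generated hidden members and padding are the usual reasons a naive sum comes out wrong, so both appear in the sample.

```python
def member_offsets(members):
    """Return {name: offset} for (name, size, alignment) tuples laid out in order.

    Assumes each member is placed at the next address that is a multiple
    of its alignment, which is how C and C++ compilers typically lay out
    a class. Real layouts can differ (packing pragmas, base classes).
    """
    offsets = {}
    cursor = 0
    for name, size, alignment in members:
        # Round the cursor up to the member's alignment boundary (padding).
        cursor = (cursor + alignment - 1) // alignment * alignment
        offsets[name] = cursor
        cursor += size
    return offsets


# Hypothetical 32-bit layout used only for illustration.
SAMPLE_LAYOUT = [
    ("hidden_pointer", 4, 4),  # e.g. a compiler-generated pointer
    ("flag", 1, 1),            # one-byte member
    ("counter", 4, 4),         # 4-byte member, padded up to a 4-byte boundary
    ("state", 4, 4),
]

if __name__ == "__main__":
    for name, offset in member_offsets(SAMPLE_LAYOUT).items():
        print(f"+0x{offset:03x} {name}")
```

The offsets are printed in the same style as the dt output in Listing 1, which makes it easy to compare a hand-computed layout against what the debugger reports.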

Let's use the latter strategy to find the m_State member. Looking at the class definition, we can see that the m_State member is in fact the very first member of the class. It then stands to reason that if we were to dump out the contents of the thread pointer, the very first field should be the state of the thread:


Interestingly enough, the first element (0x79f96af0) doesn't seem to resemble a thread's state. As a matter of fact, if we use the ln (list nearest symbols) command, we can see the following:

We are seeing the virtual function table pointer of the object. Although not terribly interesting from a debugging perspective, it can come in handy to convince ourselves that the pointer we are looking at is in fact a pointer to a valid thread object. We can safely ignore this pointer for our current purposes and move on to the next value, 0x00000220.

This value looks like it may represent a bitmask of sorts, but to interpret this bitmask in the context of a thread state, we must first enumerate the various bits that constitute a thread state. The Thread class contains an enumeration that represents a thread's state, called the ThreadState enumeration. This enumeration can yield important clues when debugging synchronization problems. Although the entire enumeration contains close to one hundred fields, some are more important than others when debugging synchronization issues. Table 1 shows the most interesting fields of the ThreadState enumeration.

Table 1: ThreadState Enumeration

Based on Table 1 and our previous state 0x00000220, we can infer the following:

  • The thread is a background thread (0x00000200).
  • The thread is in a state where it can enter a Join (0x00000020).
  • The thread is a newly initialized thread (0x00000000).
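The same decoding can be done mechanically. The following Python sketch lists only the two flag values named above; the labels are descriptive placeholders chosen for this example, not the identifiers used in the CLR or Rotor source, and the helper name decode_state is likewise an assumption.

```python
# Only the two flag values mentioned in the text are included. The labels
# are illustrative placeholders, not identifiers from the CLR source.
KNOWN_FLAGS = {
    0x00000020: "legal to join",
    0x00000200: "background thread",
}


def decode_state(value, flags=KNOWN_FLAGS):
    """Split a state bitmask into (labels of known set bits, leftover bits)."""
    names = []
    leftover = value
    for bit, name in sorted(flags.items()):
        if value & bit:
            names.append(name)
            leftover &= ~bit  # clear the bit we just accounted for
    return names, leftover


if __name__ == "__main__":
    names, leftover = decode_state(0x00000220)
    print(names, hex(leftover))
```

A nonzero leftover value means that the mask contains bits that are missing from the lookup table, which is a useful signal that the table should be extended before drawing conclusions.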


THREAD CLASS DISCLOSURE: Although it may be useful to see the "internals" of a thread, it is important to realize that there is a good reason why this information is internal and not exposed through the threads command. First, much of the information is an implementation detail, and Microsoft reserves the right to change it at any time; taking a dependency on these internal mechanisms is a dangerous prospect and should be avoided at all costs. Second, Rotor is a reference implementation, and there is no guarantee that its internals mirror the CLR source code in detail.


Now that we have discussed how the CLR represents a thread internally, it is time to take a look at some of the most common synchronization primitives that the CLR exposes as well as how they are represented in the CLR itself.

Events

The event is a kernel mode primitive accessible in user mode via an opaque handle. An event is a synchronization object that can take on one of two states: signaled or nonsignaled. When an event goes from the nonsignaled state to the signaled state (indicating that a particular event has occurred), a thread waiting on that event object is awakened and allowed to continue execution. Event objects are very commonly used to synchronize code flow execution between multiple threads. For example, the native Win32 API ReadFile can read data asynchronously when the caller passes in a pointer to an OVERLAPPED structure, which includes the handle of an event to signal when the operation completes. Figure 1 illustrates the flow of events.

Figure 1: Asynchronous API flow

ReadFile returns to the caller immediately and processes the read operation in the background. The caller is then free to do other work. When the caller is ready for the results of the read operation, it simply waits (using the WaitForSingleObject API) for the state of the event to become signaled. When the background read operation completes, the event is set to the signaled state, which wakes up the calling thread and allows execution to continue.

There are two forms of event objects: manual reset and auto reset. The key difference between the two is what happens when the event is signaled. In the case of a manual reset event, the event object remains in the signaled state until explicitly reset, thereby allowing any number of threads waiting for the event object to be released. In contrast, the auto reset event allows only one waiting thread to be released before being automatically reset to the nonsignaled state. If there are no threads waiting, the event remains in the signaled state until the first thread tries to wait for the event. In the .NET framework, the manual reset event is exposed in the System.Threading.ManualResetEvent class and the auto reset event is exposed in the System.Threading.AutoResetEvent class.
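The difference between manual reset and auto reset can be sketched outside .NET as well. The Python example below is an illustration under stated assumptions, not the CLR or Windows implementation: Python's threading.Event stays set until clear() is called, so it stands in for a manual reset event, while AutoResetEventSketch and released_waiters are hypothetical helpers written for this example.

```python
import threading


class AutoResetEventSketch:
    """Auto reset semantics: each set() lets exactly one waiter through."""

    def __init__(self):
        self._cond = threading.Condition()
        self._signaled = False

    def set(self):
        with self._cond:
            self._signaled = True
            self._cond.notify()  # wake at most one waiting thread

    def wait(self, timeout=None):
        with self._cond:
            if not self._cond.wait_for(lambda: self._signaled, timeout):
                return False  # timed out while nonsignaled
            self._signaled = False  # reset automatically for the next waiter
            return True


def released_waiters(event, waiters=3, timeout=0.5):
    """Start waiter threads, signal the event once, and count the releases."""
    released = []

    def worker(index):
        if event.wait(timeout):
            released.append(index)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(waiters)]
    for thread in threads:
        thread.start()
    event.set()  # a single signal
    for thread in threads:
        thread.join()
    return len(released)


if __name__ == "__main__":
    print("manual reset:", released_waiters(threading.Event()))
    print("auto reset:", released_waiters(AutoResetEventSketch()))
```

With a single signal, every waiter on the manual reset stand-in is released, whereas only one waiter on the auto reset sketch gets through and the others time out. Because the sketch keeps the signaled flag until a waiter consumes it, the result is the same even if set() runs before any thread has started waiting.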

To take a closer look at an instance of either of the two classes of events, we can use the do command as shown in the following:

Because the Event classes in the System.Threading namespace are simply wrappers over the underlying Windows kernel objects, the waitHandle member of the classes can be used to gain more insight into the underlying kernel mode object. We can use the handle debugger command with the waitHandle value:

Here, we can see that the waitHandle with value 204 corresponds to an auto reset event that is currently in a waiting state.

