Dr. Dobb's | Detecting Deadlocks in C++ Using a Locks Monitor

Detecting Deadlocks in C++ Using a Locks Monitor

In multithreaded programming, locking mechanisms are a must in order to keep data consistent. This article will discuss a technique for detecting and breaking deadlocks in multithreaded environment. It will also present a simple and powerful way to analyze why the deadlock occurred.

April 01, 2003
URL:http://www.drdobbs.com/detecting-deadlocks-in-c-using-a-locks-m/184416644

Detecting Deadlocks in C++ Using a Locks Monitor

A system where multiple threads coexist in the same address space is extremely vulnerable to data corruption and deadlock. While there is a trade-off between these two problems, deadlock remains the ultimate bug and is almost impossible to eradicate.

This article will discuss a technique for detecting and breaking deadlocks in a multithreaded environment. It will also present a simple and powerful way to analyze why the deadlock occurred.

A deadlock is a situation where each thread in a set of at least two threads is waiting for a resource (shared data is also considered a resource), which is locked by another thread in the set. The result is that each thread in the set is waiting indefinitely for the resources to be released. There are three major deadlock-handling strategies:

Ignoring — assuming deadlocks rarely occur and not taking any action to handle them.
Avoidance — avoiding states that may lead to a deadlock by keeping track of resource allocation.
Detection and breaking — detecting the deadlock and breaking the cycle, usually by killing one of the threads in the cycle.

Each strategy has its advantages and weaknesses, but none gives a perfect solution to the deadlock problem. This article will focus on the design and implementation of one common technique to detect deadlocks. The code was compiled using Visual C++ 6 and was tested under Windows 2000. Note that in order to simplify the code, I don't perform any error checking and all my WaitForObjects() wait infinite time.

The Algorithm

The basic idea is to track each lock request, successful lock, and release of lock that is made in our system. The application updates the locks monitor about each lock operation and the information about the locks is entered into a directed graph in the following ways:

When the thread tries to lock a resource, insert a node from the thread to the resource.
When the thread acquires the lock, reverse the direction of the node from the resource to the thread.
When the thread releases a resource, erase the node from the resource to the thread.

Figure 1 represents the following deadlock scenario (T-stands for Thread, R-Resource):

T1 locks R1

T2 locks R2

T1 tries to lock R2

T2 tries to lock R1

A cycle in the graph represents a deadlock. Note that a cycle can be closed only when a thread tries to lock a resource.

Along with finding a deadlock, by keeping track of the locks in the application, we can get some extra information; for example, if a thread tries to lock a resource while already owning the resource or if the thread tries to release a resource without owning it. These situations, although not necessarily presenting a problem, can be the result of an unexpected flow in the program and it can be helpful to be notified about them.

Locks Monitor Design

The process of tracking and analyzing the locks is done by what I call the "locks monitor," shown in Figure 2. The locks monitor is controlled by a manager, which creates the locks monitor, starts it, and ends it. The manager also makes sure that only one locks monitor exists and runs in the application (running more than one locks monitor in an application will give poor results).

The locks monitor interacts with the rest of the application via the messages queue. For each lock operation (try, own, or release), the application sends a message to the queue and the locks monitor retrieves the message and processes it. Each message contains three elements: the lock operation, the thread performing the lock operation, and the resource, which is the destination of the operation.

The locks monitor consists of three main components:

Messages processor — as its name implies, this part gets messages from the queue and processes them. This part is the main loop of the locks monitor and controls its flow.
Directed graph — holds the information about the locks in a graph as mentioned in the algorithm description.
Logger — a simple logger to notify about possible problems. When a deadlock occurs, the logger will dump some useful information about the state of the application that will help to reproduce the scenario that led to the deadlock.

Messages Queue

The most important part of the locks monitor is the messages queue.

It synchronizes the messages arriving from the sending threads along with the receiving thread (the locks monitor). If it is not handled carefully, the messages queue can create a bottleneck and slow down performance and, if the queue doesn't do the synchronization right, the results will be unpredictable. In order to assure high efficiency and good performance of this important component, I decided to use the Win32 messaging mechanism.

The locks monitor is created and terminated by the locks monitor manager. The manager is implemented in the class CLocksMonitorManager. It is implemented as a singleton, meaning there could be only one instance of it in the application. (See References for a more detailed explanation of singletons.) To get this instance, you should call this static method:

CLocksManager()::GetInstance()

The manager is the object that creates the thread in which the locks monitor runs. The manager has another static method, GetLMThreadId(), which holds the ID of the locks monitor thread. This method is used by the threads in the application when sending messages to the locks monitor thread.

The locks monitor thread is implemented in the LocksMonitorThreadProc() function (Listing 1) and is quite simple: All it does is sit in a loop and pull messages from the queue. For each message received, the thread calls the appropriate method in the class CLocksMonitor, which handles the message.

CLocksMonitor uses two major data members, which are responsible for most of the work. The first one is a member of type CLocksLogger, which is a simple logger, used to dump info into a file when problems (such as a deadlock) arise.

Note that the logger can produce a full stack trace for any given thread. It does it with the help of a data member of type CStackTrace. This class produces a stack trace using the StackWalk() function, which is part of the Win32 API. In order to get a stack trace when your application deadlocks, you must make sure that you have dbgHelp.dll on your computer and that you produced .pdb files for your application. If the .pdb files are missing, all you will be able to see is a bunch of addresses instead of a meaningful stack trace. A more detailed discussion about this class is out of the scope of this article. See References for more information.

The second member is of type CDirectGraph. It holds the direct graph using an STL map:

struct Edge
{
	   HANDLE m_hID;
	   bool   m_bThread;
}
typedef  map<HANDLE, Edge> EdgeList;
typedef  map<Edge, EdgeList> GraphMap;
.
.
.
GraphMap m_graphMap;

An Edge is a simple struct that contains the thread ID or a handle to a resource, and a Boolean member that tells if this edge represents a thread or a resource. Each entry in the GraphMap maps an edge to a list of edges. This edge has nodes that point to each edge in the list of edges. For example, if we have three edges: E1, E2, and E3, and the GraphMap maps edge E1 to the list of edges {E2, E3}, this represents the existance of the following nodes in the direct graph: node from E1 to E2 (E1-->E2) and a node from E1 to E3 (E1-->E3).

In the application part, I implemented the operation of putting the messages in the queue. For that purpose, I created some macros that will do the job automatically. The macros replace each WaitForObjects() function with MyWaitForObjects() and any ReleaseResource() with MyReleaseResource():

define WaitForSingleObject(hResource,
       dwMilliseconds) \
       MyWaitForSingleObject(hResource,
       dwMilliseconds);
define WaitForMultipleObjects(nCount, 
       pHandles, bWaitAll, dwMilliSec) \
       MyWaitForMultipleObjects(nCount, 
       pHandles, bWaitAll, dwMilliSec);
define ReleaseMutex(hMutex) \
       MyReleaseMutex(hMutex);

All MyFunctions() are actually the basic functions with the addition of posting the proper messages to the locks monitor. Note that in the message, I pass the thread ID rather than the thread handle. This is because using GetCurrentThread() will return only a pseudohandle, which is actually a constant (0xfffffffe). When needed, I can obtain the thread handle from the thread ID by calling OpenThread(). (The OpenThread() function is supported only under Windows 2000/Me/XP and is part of the Microsoft platform SDK. If you use NT or 98, or you don't have the appropriate platform SDK installed, you can build a map that maps thread ID to the thread handle instead of using OpenThread().)

Listing 2 shows the implementation of MyWaitForSingleObject() and MyReleaseMutex().

Integration

To integrate the locks monitor into your application, you need to perform just two steps:

Add to each of your files (or at least the ones that perform any lock operation) the file locksWrapperMacros.h. This is done in order to replace the regular WaitForObjects() with the MyWaitForObjects(), which updates the locks monitor.

In the entry point of your application, call CLockMonitorMgr::GetInstance()->Start() to start the locks monitor. At the exit point from your application call CLockMonitorMgr::GetInstance()->End() to terminate it.

That's all. Now your application is ready to work with the locks manager. Listing 3 is a simple example that demonstrates how to use the locks monitor. In the example, I created two threads (Thread_1, Thread_2) and two resources (g_hResource[0], g_hResource[1]), and produced the following scenario:

Thread_1 locks g_hResource[0] and try to lock g_hResource[1]
Thread_2 locks g_hResource[1]and try to lock g_hResource[0]

I added sleep in order to guarantee a deadlock. Note that the calls to WaitForSingleObject() and ReleaseMutex() are being replaced with MyWaitForSingleObject() and MyReleaseMutex(). This replacement is totally transparent to the programmer and is made by the macros defined in locksWrapperMacros.h. The output of the example is dumped into lockMonitor.log.

In the output (Listing 4) you see that a deadlock was detected, and a full stack trace was obtained for each thread in the cycle. Afterward, one thread from the cycle was killed in order to break the deadlock.

Known Bugs

The implementation presented here has one major drawback: It doesn't deal correctly with WaitForMultipleObjects() when the bWaitAll parameter is False. Think of the following situation (shown in Figure 3):

T1 locks R1, T2 Locks R2, and T3 Locks R3. Now T1 waits for R2 and T2 waits for R1 or R3

In my implementation, the locks monitor will detect a deadlock even though there is no deadlock: T3 will release R3, T2 will lock R3 and will stop trying to lock R1, and the cycle is broken.

It is possible to upgrade the locks monitor so it will handle this state correctly. The reason I didn't do it is because it will increase the complexity of the code and will make the code and the basic idea much harder to understand.

Another problem is the fact that the locks monitor will work only if you use a mutex to gain ownership over a resource. This is because of the different behavior of different synchronization objects.

If you wait on a mutex after you owned it, the second wait will return immediately and no harm is done. But, if for some reason you chose to use event instead of a mutex (not recommended), you will find that the second wait on the event causes a deadlock. Another difference is that with a mutex, a thread that didn't acquire the lock can't release it, which is not the case with events.

Again, although the locks monitor behavior can be enhanced to handle other synchronization objects, I didn't do it in order to keep things simple.

Conclusion

In this article, I presented an approach for detecting deadlocks. I also presented a way for the programmer to get useful information about the flow that led to the deadlock.

Obviously, there is no perfect solution for deadlock. Even when detected and broken, it is still hard to estimate how breaking the deadlock by killing a thread will affect the application. Still, by using the locks monitor you can minimize the damage and, based on the info that the locks monitor gives, you can get to the root of the problem that caused the deadlock.

References

C/C++ Users Journal. "A Per-Thread Singleton," May 2002, by Puneesh Chaudhry.

Windows Developer Magazine, August 2002 (Tech Tips). More info about producing a stack trace.

Microsoft Systems Journal. "Under the Hood," April 1997, by Matt Pietrek.

"Detecting and Breaking Deadlocks," by Ben Adida, http://web.mit.edu/6.033/1997/reports/r03-ben.html.

Tomer Abramson holds a BSc in Computer Science from Ben Gurion University. He has been programming for four years, mostly in C++, and currently works as a software engineer for cti2 in the telephony group.

Detecting Deadlocks in C++ Using a Locks Monitor

Figure 1 Diagram of a deadlock

Detecting Deadlocks in C++ Using a Locks Monitor

Figure 2 General design of the locks monitor

Detecting Deadlocks in C++ Using a Locks Monitor

Figure 3 False deadlock

Detecting Deadlocks in C++ Using a Locks Monitor

Listing 1 LocksMonitor class


#include <iostream.h>
#include "LocksMonitor.h"
#include "directedGraph.h"
#include "locksLogger.h"
#include <crtdbg.h>
#include <winbase.h>

// Messages proccesor
DWORD WINAPI LocksMonitorThreadProc(LPVOID param)
{
  CLocksMonitor locksMonitor;
  Edge threadEdge;
  Edge resourceEdge;
  MSG msg;
	
  threadEdge.m_bThread = true;
  resourceEdge.m_bThread = false;

  while (GetMessage(&msg, NULL, 0, WM_LM_INIT_MSG_QUEUE)) 
  {
    threadEdge.m_hID = (HANDLE)msg.wParam;
    resourceEdge.m_hID = (HANDLE)msg.lParam;
    switch (msg.message)
    {
      case(WM_LM_TRY_LOCK):
        locksMonitor.TryLock(threadEdge, resourceEdge);
	break;
      case(WM_LM_OWN_LOCK):
	locksMonitor.OwnLock(threadEdge, resourceEdge);
	break;
      case(WM_LM_RELEASE_LOCK):
	locksMonitor.ReleaseLock(threadEdge, resourceEdge);
	break;
      case(WM_LM_INIT_MSG_QUEUE):
	// dummy message that is sent in order to create 
	// a message queue for this thread
	break;
      default:
	// no operation
	break;
    }
  }

  return 0;
}

//-------------------------------------------------------------------------------------------
void CLocksMonitor::TryLock(const Edge& thread, const Edge& resource)
{
  if (m_directedGraph.InsertNode(thread, resource) == RC_NODE_ALREADY_EXIST)
  {
    m_logger.TryWhileOwn((DWORD)thread.m_hID, resource.m_hID);
       return;
  }
	
  if (m_directedGraph.IsCyclic(thread, resource, m_cycleList))
  {
    AnalizeDeadlock();
    BreakDeadlock();
    m_cycleList.clear();
		
  }
}

void CLocksMonitor::OwnLock(const Edge& thread, const Edge& resource)
{
  m_directedGraph.ReverseNode(thread, resource);
}

void CLocksMonitor::ReleaseLock(const Edge& thread, const Edge& resource)
{
  if (m_directedGraph.RemoveNode(resource, thread) == RC_NODE_NOT_EXIST)
    m_logger.ReleaseWithoutAquire((DWORD)thread.m_hID, resource.m_hID);
}

void CLocksMonitor::AnalizeDeadlock()
{
  m_logger.DumpText("*** Detect deadlock. Dump stack of all threads in deadlock cycle ***");
  for (list<Edge>::const_iterator listIter = m_cycleList.begin(); listIter != m_cycleList.end(); listIter++)
  {
    if ((*listIter).m_bThread)
      m_logger.DumpStack((DWORD)(*listIter).m_hID);
  }
}

void CLocksMonitor::BreakDeadlock()
{
  // you should think of some clever alghoritm to decide about the victim
  // thread to be killed. I'll select it randomaly...
  for (list<Edge>::const_iterator listIter = m_cycleList.begin(); listIter != m_cycleList.end(); listIter++)
  {
    if ((*listIter).m_bThread)
    {
      HANDLE hThread = OpenThread(THREAD_TERMINATE, FALSE, (DWORD)(*listIter).m_hID);
      TerminateThread(hThread, 0);
      m_directedGraph.RemoveEdge(*listIter);
      m_logger.NotifyKill((DWORD)(*listIter).m_hID);
      CloseHandle(hThread);
      break;
    }
  }
}

Detecting Deadlocks in C++ Using a Locks Monitor

Listing 2 MyWaitForSingleObject && MyReleaseMutex


#include "locksWrapper.h"

void MyWaitForSingleObject(HANDLE hResource, DWORD dwMilliseconds)
{
  PostThreadMessage(CLocksMonitorMgr::GetLMThreadId(), WM_LM_TRY_LOCK, (WPARAM)GetCurrentThreadId(), (LPARAM)hResource);
  WaitForSingleObject(hResource, dwMilliseconds);
  PostThreadMessage(CLocksMonitorMgr::GetLMThreadId(), WM_LM_OWN_LOCK, (WPARAM)GetCurrentThreadId(), (LPARAM)hResource);
}


void MyReleaseMutex(HANDLE hMutex)
{
  ReleaseMutex(hMutex);
  PostThreadMessage(CLocksMonitorMgr::GetLMThreadId(), WM_LM_RELEASE_LOCK, (WPARAM)GetCurrentThreadId(), (LPARAM)hMutex);
}

Detecting Deadlocks in C++ Using a Locks Monitor

Listing 3 Demonstrating the use of the locks monitor


#include "locksWrapperMacros.h"


// globals
HANDLE g_hResource[2] = { NULL, NULL };

void f2()
{
  WaitForSingleObject(g_hResource[1], INFINITE);
  Sleep(100);
  WaitForSingleObject(g_hResource[0], INFINITE);
  ReleaseMutex(g_hResource[1]);
  ReleaseMutex(g_hResource[0]);
}

void f1()
{
  WaitForSingleObject(g_hResource[0], INFINITE);
  Sleep(100);
  WaitForSingleObject(g_hResource[1], INFINITE);
  ReleaseMutex(g_hResource[0]);
  ReleaseMutex(g_hResource[1]);
}


DWORD WINAPI Thread_1(LPVOID pParam)
{
  f1();	
  return 0;
};

DWORD WINAPI Thread_2(LPVOID pParam)
{
  f2();
  return 0;
};

void main()
{
  HRESULT hr = S_FALSE;
  // get ptr to the only instance of CLocksMonitorMgr
  CLocksMonitorMgr* pMgr = CLocksMonitorMgr::GetInstance();  
  HANDLE hThread[2] = { NULL, NULL };
  DWORD  dwThreadId = 0;
  // start  locks monitor via its manager
  pMgr->Start();

  for (int i = 0; i < 2; i++)
    g_hResource[i] = CreateMutex(NULL, FALSE, NULL);

  hThread[0] = CreateThread(NULL, 0, Thread_1, NULL, 0, &dwThreadId);
  hThread[1] = CreateThread(NULL, 0, Thread_2, NULL, 0, &dwThreadId);  

  WaitForMultipleObjects(2,hThread,TRUE,INFINITE);

  // stop locks monitor via its manager
  pMgr->Stop();
}

Detecting Deadlocks in C++ Using a Locks Monitor

Listing 4 Output file lockMonitor log


*** Detect deadlock. Dump stack of all threads in deadlock cycle ***

start stack trace for thread 1732:

address 2012751848
function: NtWaitForSingleObject
module: ntdll.dll

address 2011708251
function: ExitVDM
module: KERNEL32.dll

address 4241383
function: void __cdecl f1(void)
file: d:\locksmonitor\main.cpp
line: 20 + 14 bytes
module: LocksMonitor.exe

address 4241501
function: unsigned long __stdcall Thread_1(void *)
file: d:\locksmonitor\main.cpp
line: 29 + 0 bytes
module: LocksMonitor.exe

address 2011784483
function: SetThreadAffinityMask
module: KERNEL32.dll
finish stack trace for thread 1732:
start stack trace for thread 2032:

address 2012751848
function: NtWaitForSingleObject
module: ntdll.dll

address 2011708251
function: ExitVDM
module: KERNEL32.dll

address 4241223
function: void __cdecl f2(void)
file: d:\locksmonitor\main.cpp
line: 11 + 14 bytes
module: LocksMonitor.exe

address 4241565
function: unsigned long __stdcall Thread_2(void *)
file: d:\locksmonitor\main.cpp
line: 35 + 0 bytes
module: LocksMonitor.exe

address 2011784483
function: SetThreadAffinityMask
module: KERNEL32.dll
finish stack trace for thread 2032:
killed thread 1732 In Order to break deadlock