Multithreading .NET Apps for Optimal Performance


September, 2005: Multithreading .NET Apps For Optimal Performance

Eric has developed everything from data reduction software for particle bombardment experiments to software for travel agencies. He can be contacted at ericterrell@comcast.net.


From the 1970s to 2004, microprocessor execution speed increased at an amazing rate, from kilohertz to megahertz to gigahertz. Since 2004, however, CPU speeds have been increasing at a much slower pace. The main speed bumps are power and heat. Faster clock rates require more power, extra power creates more heat, and keeping the CPU from melting requires increasingly elaborate and expensive cooling mechanisms. CPU manufacturers are shifting their focus from increasing clock rates to improving the performance of multithreaded code. For instance, Intel's HyperThreading technology lets an individual Pentium 4 CPU appear as two CPUs to Windows XP and other operating systems. Although a HyperThreaded CPU is still a single processor, HyperThreading can speed up multithreaded code by up to 25 percent. Intel, AMD, and other manufacturers are also shipping multicore CPUs, which are simply multiple CPUs on the same chip. Multithreading is often the only way to achieve maximum performance from parallel CPU architectures.

In this article, I show how to reliably multithread .NET applications. I include File Search (available electronically; see "Resource Center," page 3), which I wrote to examine .NET multithreading. File Search was developed and tested with Visual Studio 2003/.NET 1.1 and Visual C# Express 2005/.NET 2.0 beta 2.

.NET provides two main threading mechanisms: the Thread class and asynchronous methods. Listing One shows how to explicitly manage threads. In the Main method, two Thread objects are instantiated with ThreadStart delegates as constructor parameters. When their Start methods are called, the methods pointed to by the ThreadStart delegates start running. These methods, HelloWorld1 and HelloWorld2, must be void and take no parameters. Calling the Join methods causes the program to block until the methods finish. The program exits after both HelloWorld1 and HelloWorld2 have finished. If the Join statements were removed, the program would still wait for the threads to complete because thread1 and thread2 are foreground threads. If the threads were changed to be background threads (by setting the IsBackground property to true), and if the Join statements were removed, the program would exit before the threads finished running. Thread objects are easy to configure. For example, you can change a thread's priority by assigning a new value to its Priority property.
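The two properties mentioned above can be seen together in a short sketch. The class and method names (BackgroundThreadDemo, CreateWorker) are made up for illustration and are not part of File Search; the Join call is kept so that the background thread gets a chance to print before the process exits.

```csharp
using System;
using System.Threading;

class BackgroundThreadDemo
{
    // Runs on the worker thread and reports how that thread was configured.
    static void Work()
    {
        Thread current = Thread.CurrentThread;
        Console.WriteLine("Worker: IsBackground={0}, Priority={1}",
            current.IsBackground, current.Priority);
    }

    // Hypothetical helper: builds a thread and configures it before Start.
    public static Thread CreateWorker()
    {
        Thread worker = new Thread(new ThreadStart(Work));
        worker.IsBackground = true;                    // will not keep the process alive
        worker.Priority = ThreadPriority.BelowNormal;  // scheduling hint only
        return worker;
    }

    static void Main()
    {
        Thread worker = CreateWorker();
        worker.Start();
        // Without this Join, Main could return and end the process
        // before the background thread has printed anything.
        worker.Join();
        Console.WriteLine("Main finished");
    }
}
```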

Unlike Threads, asynchronous method calls (Listing Two) can have parameters. In the Main method workDelegate refers to a method (Work) that takes a string parameter. Each workDelegate.BeginInvoke call returns immediately. After each BeginInvoke call, the Work method is run on its own background thread. BeginInvoke returns an IAsyncResult object. The IAsyncResult.AsyncWaitHandle object can be used to block until the method returns. The WaitHandle.WaitAll method waits until both asynchronous method calls have finished. Asynchronous methods run on threads managed by the ThreadPool class. The ThreadPool class maintains a set of worker threads (25 per CPU, 50 per Hyperthreaded CPU), and creates and destroys threads as necessary. Asynchronous methods don't let you directly manipulate the threads running the methods, but an asynchronous method can manipulate its thread once it's running, by accessing Thread.CurrentThread.
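As a rough illustration of the last point, the sketch below lets a method running on a pool thread inspect its own thread through Thread.CurrentThread. To stay short it queues the work with ThreadPool.QueueUserWorkItem, which is a different route onto the pool than the BeginInvoke calls in Listing Two; the class name and the two public fields are assumptions made for the example.

```csharp
using System;
using System.Threading;

class PoolThreadDemo
{
    private static ManualResetEvent done = new ManualResetEvent(false);

    // Recorded by the worker so the caller can inspect them afterwards.
    public static bool RanOnPoolThread;
    public static bool RanOnBackgroundThread;

    // Matches the WaitCallback signature: one object parameter, no return value.
    static void Work(object state)
    {
        Thread current = Thread.CurrentThread;
        RanOnPoolThread = current.IsThreadPoolThread;
        RanOnBackgroundThread = current.IsBackground;
        Console.WriteLine("{0}: pool thread={1}, background={2}",
            state, RanOnPoolThread, RanOnBackgroundThread);
        done.Set();
    }

    public static void Run()
    {
        done.Reset();
        ThreadPool.QueueUserWorkItem(new WaitCallback(Work), "Hello World1");
        // Pool threads are background threads, so wait before returning.
        done.WaitOne();
    }

    static void Main()
    {
        Run();
        Console.WriteLine("Main finished");
    }
}
```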

I decided to use asynchronous methods in File Search rather than explicit Thread objects because asynchronous methods can be parameterized. File Search (Figure 1) uses asynchronous methods to simultaneously search multiple drives. For example, if the user searches the C: and F: drives in parallel, one asynchronous method searches the C:\ drive, another searches F:\.

To install File Search, navigate to the Setup folder and double-click filesearch.zip. If you use WinZip, you can automatically install by clicking the Install toolbar button. Otherwise, you may need to extract the .zip file to a hard disk folder and run setup.exe. To build File Search, extract the source code to a hard disk folder and load the .sln file into Visual Studio. After you've installed or built File Search, run it. Select the Settings! menu item. The number of CPUs on your system is displayed. HyperThreaded CPUs count as 2. Make sure that the Multithreaded search checkbox is checked, then press OK. Enter the file types to search, and optional search text. Select the Text or Regular Expression radio button. Then specify the drives to search by checking the drive letter checkboxes. Click the Browse button to specify individual folders and network drives. Then click the Search Now button. When the search begins, the Results tab automatically displays (Figure 2). Click a file in the ListView to display its contents and search for specific text. Notice the 4 and 5 on the right of the status bar. The number to the left, 4, is the number of completed asynchronous method calls. The number to the right, 5, is the total number of asynchronous method calls.

Asynchronous Method Call Lifecycle

When you click the Search Now button the Search.SearchAllFolders method is called (Listing Three). The actual searches are performed by the SearchFolders method. The SearchFoldersDelegate variable is used to call SearchFolders asynchronously. The AsyncCallback delegate points to the SearchCallback method. SearchCallback is automatically called each time an asynchronous method finishes running. The SearchFoldersDelegate.BeginInvoke calls return immediately. After the BeginInvoke calls have been made, the SearchFolders method is called from a pooled thread. If a pooled thread is available, it is used; otherwise, a new thread is created and added to the pool.

The call to CreateDriveHashtable creates a Hashtable with a key for each drive being searched. The value corresponding to each key is an ArrayList of the folders in the drive to search. Once the Hashtable has been created, it's used to call the asynchronous methods. For each key (drive) in the Hashtable, the SearchFolders method is called asynchronously with an array of folders to search. Because each drive is searched by only one asynchronous method, searching the drive will not slow down due to thrashing. The last parameter (drive) in the BeginInvoke call is automatically sent to the SearchCallback method after the method finishes. It is available via the IAsyncResult.AsyncState property.
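The source for CreateDriveHashtable is not reproduced in the listings, so the following is only a guess at its shape: a minimal sketch that groups folders by drive into a Hashtable of ArrayLists. The names DriveGrouping, DriveOf, and GroupByDrive are hypothetical, and the drive detection assumes drive-letter paths such as C:\Source.

```csharp
using System;
using System.Collections;
using System.IO;

class DriveGrouping
{
    // Hypothetical helper: returns a key such as "C:\" for a drive-letter path.
    public static string DriveOf(string folder)
    {
        if (folder.Length >= 2 && folder[1] == ':')
        {
            return folder.Substring(0, 2).ToUpperInvariant() + "\\";
        }
        // Anything else is left to Path.GetPathRoot, which on Windows
        // also understands UNC shares.
        return Path.GetPathRoot(folder);
    }

    // Key: drive. Value: ArrayList of the folders to search on that drive.
    public static Hashtable GroupByDrive(string[] allFolders)
    {
        Hashtable drives = new Hashtable();
        foreach (string folder in allFolders)
        {
            string drive = DriveOf(folder);
            ArrayList folders = (ArrayList) drives[drive];
            if (folders == null)
            {
                folders = new ArrayList();
                drives[drive] = folders;
            }
            folders.Add(folder);
        }
        return drives;
    }

    static void Main()
    {
        string[] all = { @"C:\Source", @"F:\Docs", @"c:\Projects" };
        Hashtable drives = GroupByDrive(all);
        foreach (string drive in drives.Keys)
        {
            ArrayList folders = (ArrayList) drives[drive];
            Console.WriteLine("{0} -> {1} folder(s)", drive, folders.Count);
        }
    }
}
```

With one entry per drive, the number of keys is also the number of asynchronous calls to make, which is what Listing Three stores in totalSearches and remainingSearches.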

SearchFolders calls FileUtils.GetFiles to navigate through folders and search for files. As GetFiles runs, it periodically checks the SearchInfo.Cancelled property. If users press the Stop Search button, the Cancelled property is set to True and GetFiles returns. The thread running GetFiles could be terminated by calling Thread.Abort, but I don't recommend it. Thread.Abort attempts to terminate a thread by throwing a ThreadAbortException. This is a dangerous thing to do without knowing exactly where the thread is in its processing. Additionally, calling Abort on a suspended thread deadlocks the thread and the application.
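The polling approach can be sketched in a few lines. This is not the File Search code: CancelDemo, Walk, and the foldersVisited counter are stand-ins, and the flag is declared volatile on the assumption that it is written by one thread and read by another without any other synchronization.

```csharp
using System;
using System.Threading;

class CancelDemo
{
    // volatile: the worker must see the value written by the other thread.
    private static volatile bool cancelled;
    private static int foldersVisited;

    // Stand-in for GetFiles: does a unit of work, then checks the flag.
    static void Walk()
    {
        while (!cancelled)
        {
            foldersVisited++;   // pretend one folder was searched
            Thread.Sleep(1);
        }
        // Returning normally lets the thread clean up in a known state.
    }

    public static int Run()
    {
        cancelled = false;
        foldersVisited = 0;
        Thread worker = new Thread(new ThreadStart(Walk));
        worker.Start();
        Thread.Sleep(50);       // let the search run for a moment
        cancelled = true;       // what a Stop Search button handler would do
        worker.Join();          // the worker exits at its next check
        return foldersVisited;
    }

    static void Main()
    {
        Console.WriteLine("Worker stopped after {0} folders", Run());
    }
}
```

The worker decides where it stops, so it never ends in the middle of an update, which is the risk with Thread.Abort.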

When SearchFolders returns, the SearchCallback method is automatically called (Listing Four). The SearchInfo.remainingSearches counter is decremented because one of the parallel searches has just finished. SearchCallback needs to update the status bar and the progress bar, but it must not directly call MainForm.UpdateStatusBar and MainForm.UpdateProgressBar. Manipulating GUI components derived from the Control class (forms, labels, status bars, progress bars, and the like) must only be done on the GUI thread that created the component. Because SearchCallback is called from a different worker thread, the MainForm methods are called indirectly by the Control.Invoke method (the Form class is derived from Control). Control.Invoke takes two parameters, a delegate to the method being called, and an object array containing the method's parameters. Control.Invoke calls the method on the thread that created the Control. Visual Studio 2003's debugger does not notify you when your code manipulates Controls in a worker thread unless you put Debug.Assert(!InvokeRequired); statements in GUI code that is callable by worker threads. However, Visual Studio 2005's debugger automatically reports these threading bugs.

Race Conditions and Synchronization

In multithreaded applications, variables can be accessed and updated simultaneously by multiple threads. This can cause race conditions (timing-dependent bugs) and data corruption issues that are notoriously difficult to debug. For example, consider the UnsafeCounter property:

private int unsafeCounter;
public int UnsafeCounter
{
  [MethodImpl(MethodImplOptions.Synchronized)]
  get { return unsafeCounter; }
  [MethodImpl(MethodImplOptions.Synchronized)]
  set { unsafeCounter = value; }
}

The Synchronized attribute ensures that the get and set methods are only called by one thread at a time. This prevents one thread from assigning a value at the same time that another thread is reading it. But this is not sufficient to avoid a race condition. When a thread executes the UnsafeCounter++ statement, these native CPU instructions are executed:

// UnsafeCounter++;
mov esi,ebx
mov ecx,ebx

// Call the get method
call dword ptr ds:[0BFF63D8h]
mov edi,eax

// Increment value
inc edi
mov edx,edi
mov ecx,esi

// Call the set method to store the result
call FFB91C23

Consider what happens if the UnsafeCounter has a value of 0 and two threads are about to increment the counter with the ++ operator. Here's one possible outcome:

  1. Thread 1 calls the get method, retrieves value (0).
  2. Thread 1 increments the value to 1.

The operating system switches to Thread 2.

  3. Thread 2 calls the get method, retrieves value (0).
  4. Thread 2 increments the value to 1.
  5. Thread 2 calls the set method to store result (1).

The operating system switches back to Thread 1.

  6. Thread 1 calls the set method to store result (1).

In this case, the counter started with a value of 0, was incremented twice, and ended up with an incorrect value of 1. And this is only one of many erroneous outcomes that could be caused by the race condition.

I wrote a ThreadSafeCounter class to prevent this race condition (Listing Five). The ThreadSafeCounter class uses the Interlocked class to increment and decrement counters atomically. It also uses the Synchronized attribute to ensure that the get and set methods are not called by multiple threads simultaneously. The ThreadSafeCounter is limited to 32-bit integer values. .NET 2.0 lets the Interlocked class manipulate 64-bit integers, but on 32-bit CPUs those operations are atomic only with respect to one another: an ordinary read or write of the same 64-bit variable can still race with them, so every access to the variable has to go through the Interlocked class. The Interlocked class is ideal for synchronizing access to integers. The lock statement is a more general-purpose synchronization mechanism:

lock ( {expression} )
{
{statements}
}

The lock statement waits until no other thread is holding a lock on the expression. Then the expression is locked and the statements are executed. After the statements are finished, the lock is released and another thread can acquire a lock on the expression. The expression must be a reference type. Typically code locks on this to lock an entire object. To acquire locks in a static method, use the typeof operator to lock on the method's class. See the lock statement in updateStaticMembers below:

public class LockDemo
{
  int w, x;
  static int y, z;
  public void updateMembers()
  {
    lock (this)
    {
      w = w * 2; x = w * 4;
    }
  }
  public static void updateStaticMembers()
  {
    lock (typeof(LockDemo))
    {
      y = 33; z = z * 2 + y;
    }
  }
}
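A commonly recommended variation, shown here as a sketch rather than as part of the original example, is to lock on private objects instead of this or typeof. Code outside the class can also lock on the instance or on its Type object, and that can lead to unexpected contention or deadlock. The field names and the starting values of 1 are assumptions chosen so the arithmetic is easy to follow.

```csharp
using System;

public class PrivateLockDemo
{
    // Only this class can see these objects, so only it can lock on them.
    private readonly object instanceLock = new object();
    private static readonly object staticLock = new object();

    private int w = 1, x;
    private static int y, z = 1;

    public void UpdateMembers()
    {
        lock (instanceLock)
        {
            w = w * 2; x = w * 4;
        }
    }

    public static void UpdateStaticMembers()
    {
        lock (staticLock)
        {
            y = 33; z = z * 2 + y;
        }
    }

    // Readers take the same lock as writers.
    public int X { get { lock (instanceLock) { return x; } } }
    public static int Z { get { lock (staticLock) { return z; } } }

    static void Main()
    {
        PrivateLockDemo demo = new PrivateLockDemo();
        demo.UpdateMembers();      // w = 2, x = 8
        UpdateStaticMembers();     // y = 33, z = 1 * 2 + 33 = 35
        Console.WriteLine("x={0}, z={1}", demo.X, Z);   // prints "x=8, z=35"
    }
}
```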

Applications must synchronize access to any variable that can potentially be accessed by multiple threads. The Interlocked class is a high-performance option for integer variables. The lock mechanism and Monitor class are slower than the Interlocked class, but they can synchronize access to any type, not just integers.
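To make the Interlocked option concrete, here is a minimal sketch in which several threads increment one shared integer. The thread count and iteration count are arbitrary. Because each increment is a single atomic operation, none of the updates is lost.

```csharp
using System;
using System.Threading;

class InterlockedDemo
{
    const int ThreadCount = 4;
    const int IncrementsPerThread = 100000;
    private static int counter;

    static void Work()
    {
        for (int i = 0; i < IncrementsPerThread; i++)
        {
            // Atomic read-modify-write; no lock statement required.
            Interlocked.Increment(ref counter);
        }
    }

    public static int Run()
    {
        counter = 0;
        Thread[] threads = new Thread[ThreadCount];
        for (int i = 0; i < ThreadCount; i++)
        {
            threads[i] = new Thread(new ThreadStart(Work));
            threads[i].Start();
        }
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return counter;
    }

    static void Main()
    {
        // Prints "Expected 400000, got 400000".
        Console.WriteLine("Expected {0}, got {1}",
            ThreadCount * IncrementsPerThread, Run());
    }
}
```

Replacing Interlocked.Increment(ref counter) with counter++ reintroduces the race described earlier, and the total may then come up short on some runs.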

Performance

One of the main reasons for developing a multithreaded application is to improve performance. But even multithreaded apps can have performance problems. For example, creating too many threads can slow down an application. Every thread uses memory to store its stack and other state information. And it takes time for the operating system to context switch from one thread to a different thread. More threads mean more memory consumption and more context switches. For some applications you may want to make the number of threads proportional to the number of available CPUs. The NUMBER_OF_PROCESSORS environment variable specifies the number of CPUs (HyperThreaded processors count as 2). You can access this variable programmatically; see SettingsForm.cs (available electronically) for the details.
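SettingsForm.cs is not reproduced here, so the following is only one plausible way to read the value. It is a sketch that assumes NUMBER_OF_PROCESSORS may be missing or malformed (it is a Windows variable) and falls back to Environment.ProcessorCount, a property available from .NET 2.0 onward. The class and method names are invented for the example.

```csharp
using System;

class ProcessorCountDemo
{
    public static int GetProcessorCount()
    {
        string value = Environment.GetEnvironmentVariable("NUMBER_OF_PROCESSORS");
        if (value != null)
        {
            try
            {
                int count = int.Parse(value);
                if (count > 0)
                {
                    return count;
                }
            }
            catch (FormatException)
            {
                // Not a number; fall through to the runtime's own count.
            }
            catch (OverflowException)
            {
                // Out of range; fall through as well.
            }
        }
        // Assumption: running on .NET 2.0 or later, where this property exists.
        return Environment.ProcessorCount;
    }

    static void Main()
    {
        int cpus = GetProcessorCount();
        Console.WriteLine("Logical CPUs: {0}", cpus);
        Console.WriteLine("Worker threads to start: {0}", cpus);
    }
}
```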

The Synchronized attribute is a convenient way to make an entire method mutually exclusive, but if only a subset of the method's code requires mutual exclusion, it's better to put only that subset in one or more lock statements. Make sure each lock is held only as long as necessary. You can use the Monitor.TryEnter method to acquire a lock only if it's available:

if (Monitor.TryEnter(this))
{
  try
  {
    {statements}
  }
  finally
  {
    Monitor.Exit(this);
  }
}

TryEnter returns false immediately, without blocking, if another thread holds the lock. When it returns true, the calling thread owns the lock and must release it with Monitor.Exit, which is why that call sits in a finally block. This mirrors what the lock statement does internally: lock calls Monitor.Enter before executing the statements in its body, and calls Monitor.Exit after the statements have been executed.

Another way to reduce lock contention is to use the [ThreadStatic] attribute. When the [ThreadStatic] attribute is placed above a static member variable, a separate instance of that variable is available to each thread. If synchronizing a member variable is slowing your application down, you may be able to mark it as [ThreadStatic] and remove the synchronization. For example, File Search's asynchronous method uses a Regex member variable for regular expression searching. The member variable, FileUtils.regularExpression, is marked as [ThreadStatic] so there's no need to synchronize it.
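A small sketch of the [ThreadStatic] behavior, using an integer instead of File Search's Regex so that the result is easy to check. The class and field names are illustrative. Each thread starts with its own zeroed copy of the field, so the two loops never interfere even though neither takes a lock.

```csharp
using System;
using System.Threading;

class ThreadStaticDemo
{
    // One copy per thread; every new thread sees the default value 0.
    [ThreadStatic]
    private static int perThreadCount;

    private readonly int increments;
    public int Observed;

    public ThreadStaticDemo(int increments)
    {
        this.increments = increments;
    }

    public void Run()
    {
        for (int i = 0; i < increments; i++)
        {
            perThreadCount++;   // no synchronization: the field is not shared
        }
        Observed = perThreadCount;
    }

    static void Main()
    {
        ThreadStaticDemo a = new ThreadStaticDemo(1000);
        ThreadStaticDemo b = new ThreadStaticDemo(2000);
        Thread t1 = new Thread(new ThreadStart(a.Run));
        Thread t2 = new Thread(new ThreadStart(b.Run));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        Console.WriteLine("a={0}, b={1}", a.Observed, b.Observed);   // prints "a=1000, b=2000"
    }
}
```

One caveat: a field initializer on a [ThreadStatic] field runs only once, on the thread that triggers the class's static initialization, so other threads should set up their copy on first use.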

Run perfmon to determine if your app is slowing down due to lock contention. Press the + toolbar button and select the .NET CLR LocksAndThreads performance object. Select the All Counters and Select Instances From List radio buttons. Highlight your program in the listbox and press Add and Close. Click on the View Report toolbar button and monitor the Contention Rate/sec and Total # of Contentions counters.

Searching files for text is I/O-bound rather than CPU-bound. Consequently, searching multiple drives with simultaneously executing asynchronous methods is a major performance enhancement. For example, when I searched two local drives and four network shares, the multithreaded search was about 3.7 times faster than the single-threaded search. This dramatic speedup is not surprising. While one thread is blocked waiting for file I/O, another thread can search for text. Multithreading a CPU-bound program typically results in a more modest speedup.

Debugging and Testing Multithreaded Apps

Debugging a multithreaded application full of race conditions is a nightmare. Here are some testing tips to make testing more effective and reduce the likelihood of race conditions. If you're developing a Windows Forms application, test it with the Visual Studio 2005 debugger. Unlike previous versions, the 2005 debugger automatically detects when a control is manipulated by a worker thread. Unit testing software like NUnit is extremely useful for multithreaded code. NUnit (Figure 3) is able to detect the multithreaded increment bug mentioned earlier. To see NUnit in action, download it from http://www.nunit.org/. Run the NUnit-Gui program. Select File/Open and open the FileSearchNUnitTests.nunit project file from the FileSearchNUnitTests project. Press Run and NUnit detects the race condition if it happens during the test run. NUnit test cases are synchronous. If you call asynchronous methods in NUnit, use WaitHandle.WaitAll to force the test case to wait for the asynchronous methods to complete. If your code uses Thread objects, force the test code to wait by calling Thread.Join. See NUnitTests.cs (available electronically) for sample multithreaded NUnit test cases.
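NUnitTests.cs is not shown in the listings, so here is a framework-free sketch of the waiting pattern described above. The method is shaped like a test case; with NUnit it would presumably carry a [Test] attribute and end in an assertion instead of returning a bool. The Job class and all names are invented for the example.

```csharp
using System;
using System.Threading;

class WaitAllTestSketch
{
    private static int counter;

    // One unit of pooled work plus the event that signals its completion.
    private class Job
    {
        public readonly ManualResetEvent Done = new ManualResetEvent(false);
        private readonly int increments;

        public Job(int increments)
        {
            this.increments = increments;
        }

        public void Run(object state)
        {
            for (int i = 0; i < increments; i++)
            {
                Interlocked.Increment(ref counter);
            }
            Done.Set();
        }
    }

    // Returns true if no increments were lost.
    public static bool CounterSurvivesParallelIncrements()
    {
        counter = 0;
        Job[] jobs = { new Job(50000), new Job(50000) };
        WaitHandle[] handles = new WaitHandle[jobs.Length];
        for (int i = 0; i < jobs.Length; i++)
        {
            handles[i] = jobs[i].Done;
            ThreadPool.QueueUserWorkItem(new WaitCallback(jobs[i].Run));
        }
        // Block the synchronous test until every pooled job has finished.
        // WaitAll with several handles is not supported on an STA thread,
        // so Main below is deliberately not marked [STAThread].
        WaitHandle.WaitAll(handles);
        return counter == 100000;
    }

    static void Main()
    {
        Console.WriteLine(CounterSurvivesParallelIncrements() ? "PASS" : "FAIL");
    }
}
```

Without the WaitAll call the check could run while the jobs are still incrementing, and the test would fail or pass depending on timing rather than on the code under test.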

It's worthwhile to test on a variety of machines with different performance characteristics. Some race conditions only show up on true multiprocessor machines. If your customers will be running your software on multicore or SMP machines, your test matrix needs to include multiprocessor machines.

Conclusion

Moore's Law hasn't been repealed. CPU manufacturers are still able to double the transistor count on a given area of silicon every 18 months or so. What has changed is the rate at which the performance of single-threaded applications can be improved. For at least the next few years, single-threaded applications will not speed up dramatically as customers upgrade their machines. But multithreaded applications will speed up if they take advantage of the parallel processing capabilities of newer HyperThreaded and multicore CPUs. Perhaps in the future, compilers will automatically parallelize .NET code and distribute processing across multiple CPUs. But for now, multithreading is the way to harness the parallelism of today's computers. Asynchronous method calls are a convenient way to call parameterized methods on multiple threads. Just be sure to synchronize access to variables. Debug your Windows Forms code with Visual Studio 2005 and you'll automatically find Windows Forms threading bugs. Finally, put NUnit to work testing your multithreaded code. Detecting any bugs, especially threading issues, during unit testing is much less expensive than finding them later in the development process or at a customer's site.

DDJ



Listing One

using System;
using System.Threading;

class ThreadTest
{
  const int iterations = 100;
  public void HelloWorld1()
  {
    for (int i = 1; i <= iterations; i++)
    {
      Console.WriteLine("Hello World1 {0}", i);
    }
    Console.WriteLine("\nHello World1 finished\n");
  }
  public void HelloWorld2()
  {
    for (int i = 1; i <= iterations; i++)
    {
      Console.WriteLine("Hello World2 {0}", i);
    }
    Console.WriteLine("\nHello World2 finished\n");
  }
  [STAThread]
  static void Main(string[] args)
  {
    ThreadTest threadTest = new ThreadTest();
    // Create the threads. The ThreadStart delegate must
    // refer to a void method with no parameters.
    Thread thread1 = 
        new Thread(new ThreadStart(threadTest.HelloWorld1));
    Thread thread2 = 
        new Thread(new ThreadStart(threadTest.HelloWorld2));
    // Start the threads.
    thread1.Start();
    thread2.Start();
    // Wait for threads to complete. Doesn't matter which thread finishes first.
    thread1.Join();
    thread2.Join();
    Console.WriteLine("Main finished");
  }
}


Listing Two
using System;
using System.Threading;

class AsynchMethodTest
{
  const int iterations = 100;
  private delegate void WorkDelegate(string message);
  private void Work(string message)
  {
    for (int i = 1; i <= iterations; i++)
    {
      Console.WriteLine(message + " " + i);
    }
    Console.WriteLine("\n" + message + " finished\n");
  }
  static void Main(string[] args)
  {
    AsynchMethodTest test = new AsynchMethodTest();
    WorkDelegate workDelegate = new WorkDelegate(test.Work);
    WaitHandle[] waitHandles = new WaitHandle[2];
    waitHandles[0] = workDelegate.BeginInvoke("Hello World1", null,
                               null).AsyncWaitHandle;
    waitHandles[1] = workDelegate.BeginInvoke("Hello World2", null,
                               null).AsyncWaitHandle;
    // Asynchronous methods are run on background threads by default.
    // Wait for them to complete before letting the main thread exit.
    WaitHandle.WaitAll(waitHandles);

    Console.WriteLine("Main finished");
  }
}


Listing Three
public static void SearchAllFolders(string[] allFolders, 
       string searchPattern, string containingText, 
       bool regularExpression, bool caseSensitive)
{
  SearchInfo.StartTime = DateTime.Now;
  SearchFoldersDelegate searchFoldersDelegate = 
    new SearchFoldersDelegate(SearchFolders);
  AsyncCallback asyncCallback = new AsyncCallback(SearchCallback);
  SearchInfo.Cancelled  = false;
  SearchInfo.InProgress = true;
  // If user requested a multithreaded search...
  if (SerializeConfiguration.Settings.MultithreadedSearch)
  {
    // Don't want to thrash hard drives and optical drives. 
    // Ensure that each drive is only accessed by one thread.
    Hashtable drives = CreateDriveHashtable(allFolders);
    // Keep track of how many searches will be done.
    SearchInfo.totalSearches.Set(drives.Keys.Count);
    SearchInfo.remainingSearches.Set(drives.Keys.Count);
    // For each drive being searched...
    foreach (string drive in drives.Keys)
    {
      // Get the folders on the drive to be searched.
      ArrayList foldersArrayList = (ArrayList) drives[drive];
      // Convert ArrayList of folders to a string array.
      string[] folders = (string[]) foldersArrayList.ToArray(typeof(string));
      // Call the asynchronous method with all the search parameters.
      searchFoldersDelegate.BeginInvoke(folders, searchPattern,
       containingText, regularExpression,caseSensitive, asyncCallback, drive);
    }
  }
  else // If user requested a single-threaded search...
  {
    SearchInfo.totalSearches.Set(1);
    SearchInfo.remainingSearches.Set(1);
    searchFoldersDelegate.BeginInvoke(allFolders, searchPattern,
      containingText, regularExpression, caseSensitive, 
      asyncCallback, "all folders");
  }
  // It's OK to directly manipulate the status bar because
  // this method was called from the GUI thread.
  Globals.mainForm.UpdateStatusBar("Searching...");
}


Listing Four
delegate void UpdateStatusBarDelegate(String Text);
 ...
delegate void UpdateProgressBarDelegate();
private static void SearchCallback(IAsyncResult asynchResult)
{
  // One parallel search just completed, so decrement
  // the number of remaining searches.
  int remainingSearches = SearchInfo.remainingSearches.Decrement();
  UpdateStatusBarDelegate USBD = 
    new UpdateStatusBarDelegate(Globals.mainForm.UpdateStatusBar);
  // If the search was not cancelled...
  if (!SearchInfo.Cancelled)
  {
    string message = string.Format("Finished searching {0}", 
                    asynchResult.AsyncState);
    // Update the status bar with a progress message.
    Globals.mainForm.Invoke(USBD, new object[] { message } );
  }
  // If the last asynchronous method finished searching...
  if (remainingSearches == 0)
  {
    TimeSpan elapsedTime = DateTime.Now - SearchInfo.StartTime;
    // Update status bar to display total search time.
    Globals.mainForm.Invoke(USBD, 
      new object[] { SearchInfo.Cancelled ? "Search cancelled" : 
      "Search completed." + " Elapsed time: " + elapsedTime.ToString() });
    SearchInfo.InProgress = false;
  }
  // Update the progress bar.
  UpdateProgressBarDelegate UPD = new
    UpdateProgressBarDelegate(Globals.mainForm.UpdateProgressBar);
  // You don't need to pass an object array to Invoke when the
  // method (UpdateProgressBar) does not have any parameters.
  Globals.mainForm.Invoke(UPD);
}


Listing Five
using System.Runtime.CompilerServices;
using System.Threading;

// A ThreadSafeCounter contains an integer value that can be read, written, 
// incremented and decremented from multiple threads without data corruption 
// issues caused by race conditions.
public sealed class ThreadSafeCounter
{
  private int intValue;
  public ThreadSafeCounter()
  {
    intValue = 0;
  }
  public ThreadSafeCounter(int intValue)
  {
    this.intValue = intValue;
  }
  [MethodImpl(MethodImplOptions.Synchronized)]
  public int Get()
  {
    return intValue;
  }
  [MethodImpl(MethodImplOptions.Synchronized)]
  public int Set(int newValue)
  {
    int result = intValue;
    intValue = newValue;
    return result;
  }
  public int Increment()
  {
    return Interlocked.Increment(ref intValue);
  }
  public int Decrement()
  {
    return Interlocked.Decrement(ref intValue);
  }
}