C++/CLI Threading: Part I

By Rex Jaeschke, October 01, 2005

C++/CLI supports the ability to create multiple threads of execution within a single program. Rex covers the creation and synchronization of threads in C++/CLI.

C++/CLI supports the ability to create multiple threads of execution within a single program. This month, we'll see how threads are created and synchronized. Next month, we'll see how shared variables can be guarded against compromise during concurrent operations.

Introduction

A thread is an individual stream of execution as seen by the processor, and each thread has its own register and stack context. On a single processor, only one thread executes at a time. The execution of a thread is interrupted when it needs resources that are not available, when it is waiting for an operation such as I/O to complete, or when it uses up its processor time slice. When the processor changes from executing one thread to another, this is called "context switching." By executing another thread when one thread becomes blocked, the system reduces processor idle time. This is called "multitasking."

When a program is executed, the system is told where on disk to get instructions and static data. A set of virtual-memory locations, collectively called an address space, is allocated to that program, as are various system resources. This runtime context is called a "process." However, before a process can do any work, it must have at least one thread. When each process is created, it is automatically given one thread, called the "primary thread." However, this thread has no more capability than other threads created for that process; it just happens to be the first thread created for that process. The number of threads in a process can vary at runtime under program control. Any thread can create other threads; however, a creating thread does not in any sense own the threads it creates—all threads in a process belong to the process as a whole.

The work done by a process can be broken into subtasks with each being executed by a different thread. This is called "multithreading." Each thread in a process shares the same address space and process resources. When the last remaining thread in a process terminates, the parent process terminates.

Why have more than one thread in a process? If a process has only one thread, it executes serially. When the thread is blocked, the system is idle if no other process has an active thread waiting. This may be unavoidable if the subtasks of the process must be performed serially; however, this is not the case with many processes. Consider a process that has multiple options. A user selects some option, which results in lots of computations using data in memory or a file and the generation of a report. By spawning off a new thread to perform this work, a process can continue accepting new requests for work without waiting for the previous option to complete. Moreover, by specifying thread priorities, a process can allow less-critical threads to run only when more-critical threads are blocked.

Once a thread has been dispatched, another thread can be used to service keyboard or mouse input. For example, the user might decide that a previous request is not the way to go after all, and wish to abort the first thread. This can be done by selecting the appropriate option on a pull-down menu and having one thread stop the other.

Another example involves a print spooler. Its job is to keep a printer as busy as possible and to service print requests from users. The users would be very unhappy if the spooler waited until a job had completed printing before it started accepting new requests. Of course, it could periodically stop printing to see if any new requests were pending (this is called "polling"), but that wastes time if there are no requests. In addition, if the time interval between polls is too long, there is a delay in servicing requests. If it is too short, the thread spends too much time polling. Why not have the spooler have two threads—one to send work to the printer, the other to deal with requests from users? Each runs independent of the other, and when a thread runs out of work, it either terminates itself or goes into an efficient state of hibernation.

When dealing with concurrently executing threads, we must understand two important aspects: atomicity and reentrancy. An atomic variable or object is one that can be accessed as a whole, even in the presence of asynchronous operations that access the same variable or object. For example, if one thread is updating an atomic variable or object while another thread reads its contents, the logical integrity of those contents cannot be compromised—the read will get either the old or the new value, never part of each. Normally, the only things that can be accessed atomically are those having types supported atomically in hardware, such as bytes and words. Most of the fundamental types in C++/CLI are guaranteed to be atomic. (Others might also be atomic for a given implementation, but that's not guaranteed.) Clearly, a Point object implemented as an x- and y-coordinate pair is not atomic, and a writer of a Point's value could be interrupted by a reader to that Point, resulting in the reader getting the new x and old y, or vice versa. Similarly, arrays cannot be accessed atomically. Because most objects cannot be accessed atomically, we must use some form of synchronization to ensure that only one thread at a time can operate on certain objects. For this reason, C++/CLI assigns each object, array, and class a synchronization lock.
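To make the Point hazard concrete, here is a minimal sketch in portable standard C++ rather than C++/CLI. It assumes std::mutex as a stand-in for the per-object lock, and the names GuardedPoint and countMismatches are invented for this illustration. Because the writer and the reader both hold the lock while touching the coordinate pair, the reader should never see a new x paired with an old y.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical two-field type: the pair (x, y) cannot be written in one atomic step.
class GuardedPoint {
    int x = 0;
    int y = 0;
    mutable std::mutex guard;   // stands in for the object's synchronization lock
public:
    void move(int newX, int newY) {
        std::lock_guard<std::mutex> hold(guard);   // released when hold goes out of scope
        x = newX;
        y = newY;
    }
    std::pair<int, int> get() const {
        std::lock_guard<std::mutex> hold(guard);
        return std::make_pair(x, y);
    }
};

// Runs one writer and one reader against the same point and
// returns how many mismatched (torn) pairs the reader observed.
int countMismatches(int iterations) {
    assert(iterations > 0);
    GuardedPoint p;
    int mismatches = 0;   // written only by the reader thread, read after join
    std::thread writer([&p, iterations] {
        for (int i = 1; i <= iterations; ++i) p.move(i, i);
    });
    std::thread reader([&p, &mismatches, iterations] {
        for (int i = 0; i < iterations; ++i) {
            std::pair<int, int> seen = p.get();
            if (seen.first != seen.second) ++mismatches;
        }
    });
    writer.join();
    reader.join();
    return mismatches;
}
```

A call such as countMismatches(100000) should return 0. Without the two lock_guard lines, the accesses would be an unsynchronized race, which is exactly the situation described above.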

A reentrant function is one that can safely be executed in parallel by multiple threads of execution. When a thread begins executing a function, all data allocated in that function comes from either the stack or the heap. In any event, it's unique to that invocation. If another thread begins executing that same function while the first thread is still working there, each thread's data will be kept separate. However, if that function accesses variables or files that are shared between threads, it must use some form of synchronization.
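The distinction can be sketched in portable standard C++ (not C++/CLI). All names here (sumTo, grandTotal, addToGrandTotal, runWorkers) are hypothetical. The first function touches only its parameter and locals, so any number of threads can run it at once. The second updates a shared variable and therefore takes a lock, but only around the shared update.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Reentrant: uses only its parameter and local variables,
// so every invocation has its own private data.
long long sumTo(int n) {
    assert(n >= 0);
    long long total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;
}

// Shared between threads, so access must be synchronized.
long long grandTotal = 0;
std::mutex grandTotalGuard;

// Safe to call from several threads only because of the lock.
void addToGrandTotal(int n) {
    long long partial = sumTo(n);                        // no lock needed here
    std::lock_guard<std::mutex> hold(grandTotalGuard);   // lock only the shared update
    grandTotal += partial;
}

// Starts the requested number of workers, waits for them, and returns the total.
long long runWorkers(int workers, int n) {
    {
        std::lock_guard<std::mutex> hold(grandTotalGuard);
        grandTotal = 0;
    }
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) pool.emplace_back(addToGrandTotal, n);
    for (std::thread& t : pool) t.join();
    std::lock_guard<std::mutex> hold(grandTotalGuard);
    return grandTotal;
}
```

With four workers each summing 1 through 100, runWorkers(4, 100) yields 20200 regardless of how the threads are scheduled.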

Creating Threads

In Listing One, the primary thread creates two other threads, and the three threads run in parallel without synchronization. No data is shared between the threads and the process terminates when the last thread terminates.

Listing One

using namespace System;
using namespace System::Threading;

public ref class ThreadX
{
  int loopStart;
  int loopEnd;
  int dispFrequency;
public:
  ThreadX(int startValue, int endValue, int frequency)
  {
    loopStart = startValue;
    loopEnd = endValue;
    dispFrequency = frequency;
  }

/*1*/ void ThreadEntryPoint()
  {
/*2*/   String^ threadName = Thread::CurrentThread->Name;
    
    for (int i = loopStart; i <= loopEnd; ++i)
    {
      if (i % dispFrequency == 0)
      {
        Console::WriteLine("{0}: i = {1,10}", threadName, i);
      }
    }
    Console::WriteLine("{0} thread terminating", threadName);
  }
};

int main()
{
/*3a*/  ThreadX^ o1 = gcnew ThreadX(0, 1000000, 200000);
/*3b*/  Thread^ t1 = gcnew Thread(gcnew ThreadStart(o1, &ThreadX::ThreadEntryPoint));
/*3c*/  t1->Name = "t1";

/*4a*/  ThreadX^ o2 = gcnew ThreadX(-1000000, 0, 200000);
/*4b*/  Thread^ t2 = gcnew Thread(gcnew ThreadStart(o2, &ThreadX::ThreadEntryPoint));
/*4c*/  t2->Name = "t2";

/*5*/ t1->Start();
/*6*/ t2->Start();
  Console::WriteLine("Primary thread terminating");
}

Let's begin by looking at the first executable statement in the program in case 3a. Here we create an object having the user-defined type ThreadX. That class has a constructor, an instance function, and three fields. We call the constructor passing it a start and end count and a display frequency, which it stores for later use in controlling a loop.

In case 3b, we create an object of the library type Thread, which is from the namespace System::Threading. A new thread is created using such an object; however, before a thread can do useful work, it must know where to start execution. We indicate this by passing to Thread's constructor a delegate of type System::Threading::ThreadStart, which supports any function taking no arguments and returning no value. (Being a delegate, it could encapsulate multiple functions; however, in our examples, we'll specify only one.) In this case, we identify that the thread is to begin by executing instance function ThreadEntryPoint on object o1. Once started, this thread will execute until this function terminates. Finally, in case 3c, an arbitrary name is given to this thread by setting its Name property.

In cases 4a, 4b, and 4c, we do the same thing for a second thread, giving it a different set of loop control data and a different name.

At this stage, two thread objects have been constructed but no new threads have yet been created; these threads are inactive. To make a thread active, we must call Thread's function Start, as shown in cases 5 and 6. This function starts a new executing thread by calling its entry-point function. (Calling Start on a thread that is already active results in an exception of type ThreadStateException.) The two new threads each display their names and then proceed to loop and display their progress periodically. Because each of these threads is executing its own instance function, each has its own set of instance data members.

All three threads write to standard output and, as we can see from Figure 1, the output from the threads in one execution is intertwined. (Of course, the output might be ordered differently on subsequent executions.) We see that the primary thread terminated before either of the other two started running. This demonstrates that although the primary thread was the parent of the other threads, the lifetimes of all three threads are unrelated. Although the entry-point function used in this example is trivial, that function can call any other function to which it has access.

Figure 1: Intertwined output of three threads.

Primary thread terminating
t1: i =          0
t1: i =     200000
t1: i =     400000
t1: i =     600000
t2: i =   -1000000
t2: i =    -800000
t2: i =    -600000
t2: i =    -400000
t2: i =    -200000
t2: i =          0
t2 thread terminating
t1: i =     800000
t1: i =    1000000
t1 thread terminating

If we want different threads to start execution with different entry-point functions, we simply define those functions in the same or different classes (or as nonmember functions) as we see fit.
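As a rough analogue in portable standard C++ (std::thread in place of System::Threading::Thread), the sketch below starts one thread in a member function and another in a nonmember function. The names Counter, fillSquares, and runBoth are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Entry point 1: a member function that works on its own object's data.
struct Counter {
    int total = 0;
    void countUp(int n) {
        for (int i = 0; i < n; ++i) ++total;
    }
};

// Entry point 2: a nonmember function that fills a caller-supplied vector.
void fillSquares(std::vector<int>& out, int n) {
    assert(n > 0);
    for (int i = 0; i < n; ++i) out.push_back(i * i);
}

// Each thread begins in a different function and touches different data,
// so no synchronization is needed beyond waiting for both to finish.
int runBoth() {
    Counter c;
    std::vector<int> squares;
    std::thread t1(&Counter::countUp, &c, 1000);          // member function on object c
    std::thread t2(fillSquares, std::ref(squares), 5);    // nonmember function
    t1.join();
    t2.join();
    return c.total + squares.back();   // 1000 + 4*4
}
```

Here runBoth() returns 1016 once both threads have been joined.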

Synchronized Statements

The main program in Listing Two has two threads accessing the same Point. One of them continually sets the Point's x- and y-coordinates to some new values while the other retrieves these values and displays them. Even though both threads start executing the same entry-point function, by passing a value to their constructors, we can make each thread behave differently.

Listing Two

using namespace System;
using namespace System::Threading;

public ref class Point
{
  int x;
  int y;
public:

// define read-write instance properties X and Y

  property int X
  {
    int get() { return x; }
    void set(int val) { x = val; }
  }

  property int Y
  {
    int get() { return y; }
    void set(int val) { y = val; }
  }

  // ...

  // Parameters renamed: "xor" is an alternative token for ^ in standard C++.
  void Move(int newX, int newY) 
  {
/*1a*/    Monitor::Enter(this);
    X = newX;
    Y = newY;
/*1b*/    Monitor::Exit(this);
  } 

  virtual bool Equals(Object^ obj) override
  {

    // ...

    if (GetType() == obj->GetType())
    {
      int xCopy1, xCopy2, yCopy1, yCopy2;
      Point^ p = static_cast<Point^>(obj);

/*2a*/      Monitor::Enter(this);
      xCopy1 = X;
      xCopy2 = p->X;
      yCopy1 = Y;
      yCopy2 = p->Y;
/*2b*/      Monitor::Exit(this);

      return (xCopy1 == xCopy2) && (yCopy1 == yCopy2);
    }

    return false;
  }

  virtual int GetHashCode() override
  {
    int xCopy;
    int yCopy;

/*3a*/    Monitor::Enter(this);
    xCopy = X;
    yCopy = Y;
/*3b*/    Monitor::Exit(this);
    return xCopy ^ (yCopy << 1);
  }

  virtual String^ ToString() override
  {
    int xCopy;
    int yCopy;

/*4a*/    Monitor::Enter(this);
    xCopy = X;
    yCopy = Y;
/*4b*/    Monitor::Exit(this);

    return String::Concat("(", xCopy, ",", yCopy, ")");
  }
};

public ref class ThreadY
{
  Point^ pnt;
  bool mover;
public:
  ThreadY(bool isMover, Point^ p)
  {
    mover = isMover;
    pnt = p;
  }

  void StartUp()
  {
    if (mover)
    {
      for (int i = 1; i <= 10000000; ++i)
      {
/*1*/       pnt->Move(i, i);
      }
    }
    else
    {
      for (int i = 1; i <= 10; ++i)
      {
/*2*/       Console::WriteLine(pnt); // calls ToString
        Thread::Sleep(10);
      }
    }
  }
};

int main()
{
  Point^ p = gcnew Point;

/*1*/ ThreadY^ o1 = gcnew ThreadY(true, p);
/*2*/ Thread^ t1 = gcnew Thread(gcnew ThreadStart(o1, &ThreadY::StartUp));

/*3*/ ThreadY^ o2 = gcnew ThreadY(false, p);
/*4*/ Thread^ t2 = gcnew Thread(gcnew ThreadStart(o2, &ThreadY::StartUp));

  t1->Start();
  t2->Start();

  Thread::Sleep(100);
/*5*/ Console::WriteLine("x: {0}", p->X);
/*6*/ Console::WriteLine("y: {0}", p->Y);

/*7*/ t1->Join();
  t2->Join();
}

The purpose of the call to Sleep for 100 milliseconds is to allow the two threads to start executing before we attempt to access p's x- and y-coordinates. That is, we want the primary thread to compete for exclusive access to p's coordinates with those two threads.

A call to Thread::Join suspends the calling thread until the thread on which Join is called terminates.

Consider the type ThreadY in Listing Two. The potential for conflict arises from the fact that one thread can be calling Move in case 1 while the other is (implicitly) calling ToString in case 2. Since both functions access the same Point, without synchronization, Move might update the x-coordinate, but before it can update the corresponding y-coordinate, ToString runs and displays a mismatched coordinate pair. In such a case, the output produced might be as shown in Figure 2(a). However, when the appropriate statements are synchronized, the coordinate pairs displayed by ToString always match. The output from one synchronized execution is shown in Figure 2(b). In the type Point in Listing Two, we can see how these (and other) functions' access to the x- and y-coordinates is synchronized.

Figure 2: (a) Thread output producing a mismatched coordinate pair; (b) matched coordinate pair from a synchronized execution.

(a)
(1878406,1878406)
(2110533,2110533)
(2439367,2439367)
(2790112,2790112)
x: 3137912
y: 3137911 // y is different from x
(3137912,3137911) // y is different from x
(3466456,3466456)
(3798720,3798720)
(5571903,5571902) // y is different from x
(5785646,5785646)
(5785646,5785646)

(b)
(333731,333731)
(397574,397574)
(509857,509857)
(967553,967553)
x: 853896
y: 967553 // y is still different from x
(1619521,1619521)
(1720752,1720752)
(1833313,1833313)
(2973291,2973291)
(3083198,3083198)
(3640996,3640996)

A set of statements can be marked as wanting exclusive access to some resource by including them in what we shall refer to as a "lock block": the statements are delimited by calls to the functions Enter and Exit of the class System::Threading::Monitor, as shown in cases 1a and 1b, 2a and 2b, 3a and 3b, and 4a and 4b.

Since Move and ToString are instance functions, when they are called on the same Point, they share a common lock for that Point. To get exclusive access to an object's lock, we pass a handle to that object to Enter. Then if Move is called to operate on the same Point as ToString, Move is blocked until ToString is completed, and vice versa. As a result, the functions spend time waiting on each other, whereas without synchronization, they both run as fast as possible.

Once a lock block acquires an object's lock, no lock block in another thread that synchronizes on that same object can execute until the lock is released. Of course, an instance function in that class that uses no lock pays no mind to what any of its synchronized siblings are doing, so we must be careful to use locks as appropriate. (Note that the X and Y accessors are not synchronized.) Lock blocks in instance functions that are operating on different objects do not wait on each other.

Ordinarily, a lock is released when Exit is called. (We'll discuss later what happens if an exception is thrown from inside the lock block.) Therefore, the lock is in place while code within a lock block calls any and all other functions. It is the programmer's responsibility to avoid a deadlock—the situation when thread A is waiting on thread B, and vice versa.
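One common way to end up with thread A waiting on thread B and vice versa is for two threads to acquire the same pair of locks in opposite orders. The sketch below is portable standard C++ rather than C++/CLI. The names LockedArray, copyArrays, and runOpposingCopies are hypothetical, and std::lock is a standard-library facility that acquires several mutexes without deadlocking, not something provided by Monitor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical container that carries its own lock.
struct LockedArray {
    std::vector<int> values;
    std::mutex guard;
};

// Copies as many elements as fit, holding both locks for the duration.
void copyArrays(LockedArray& from, LockedArray& to) {
    assert(&from != &to);              // locking the same mutex twice is not allowed
    std::lock(from.guard, to.guard);   // acquires both locks without risk of deadlock
    std::lock_guard<std::mutex> holdFrom(from.guard, std::adopt_lock);
    std::lock_guard<std::mutex> holdTo(to.guard, std::adopt_lock);
    std::size_t n = std::min(from.values.size(), to.values.size());
    std::copy_n(from.values.begin(), n, to.values.begin());
}

// Two threads copy in opposite directions; with naive nested locking
// (first argument, then second) this pattern could deadlock.
bool runOpposingCopies() {
    LockedArray a;
    LockedArray b;
    a.values.assign(4, 1);
    b.values.assign(4, 2);
    std::thread t1([&a, &b] { for (int i = 0; i < 1000; ++i) copyArrays(a, b); });
    std::thread t2([&a, &b] { for (int i = 0; i < 1000; ++i) copyArrays(b, a); });
    t1.join();
    t2.join();
    return a.values == b.values;   // after any copy, both arrays hold the same values
}
```

Where no such facility is available, the usual alternative is to agree on a fixed order in which locks are always acquired.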

Consider a function that contains 25 statements, only three consecutive ones of which really need synchronization. If we enclose all 25 of them in one big lock block, we'll be locking out resources longer than we really need to. As we can see in the aforementioned lock blocks, each lock is held for the minimum possible time.

Look at the struct ArrayManip in Listing Three. When the lock block begins execution in case 2, the lock referenced by array is engaged, thereby blocking all other code that also needs to synchronize on that array, such as case 3, when both functions are called to operate on the same array.

Listing Three

using namespace System;
using namespace System::Threading;

public ref struct ArrayManip
{
  static int TotalValues(array<int>^ array)
  {
/*1*/   int sum = 0;
/*2*/   Monitor::Enter(array);
    {
      for (int i = 0; i < array->Length; ++i)
      {
        sum += array[i];
      }
    }
    Monitor::Exit(array);
    return sum;
  }

  static void SetAllValues(array<int>^ array, int newValue)
  {
/*3*/   Monitor::Enter(array);
    {
      for (int i = 0; i < array->Length; ++i)
      {
        array[i] = newValue;
      }
    }
    Monitor::Exit(array);
  }

  static void CopyArrays(array<int>^ array1, array<int>^ array2)
  {
/*4*/   Monitor::Enter(array1);
    {
/*5*/     Monitor::Enter(array2);
      {
        Array::Copy(array1, array2, 
          array1->Length < array2->Length ? array1->Length
          : array2->Length);
      }
      Monitor::Exit(array2);
    }
    Monitor::Exit(array1);
  }
};

A lock block can contain another lock block for the same object because it already has a lock on that object. In this case, the lock count is simply increased; it must decrease to zero before that object can be operated on by another synchronized statement in another thread. A lock block can also contain a lock block for a different object, in which case, it will be blocked until that second object becomes available. Function CopyArrays contains an example.
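The lock-count behavior for nested lock blocks on the same object can be sketched in portable standard C++, under the assumption that std::recursive_mutex plays the role of the object's lock. The class Tally and the function nestedLockDemo are invented for this illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical class whose functions all synchronize on the same lock.
class Tally {
    std::recursive_mutex guard;   // may be locked again by the thread that already owns it
    int count = 0;
public:
    void addOne() {
        std::lock_guard<std::recursive_mutex> hold(guard);   // lock count goes up by one
        ++count;
    }                                                        // and back down here
    void addMany(int n) {
        assert(n >= 0);
        std::lock_guard<std::recursive_mutex> hold(guard);   // outer lock block
        for (int i = 0; i < n; ++i) addOne();                // inner lock block, same object
    }
    int value() {
        std::lock_guard<std::recursive_mutex> hold(guard);
        return count;
    }
};

// Single-threaded driver: shows that re-acquiring an owned lock does not block.
int nestedLockDemo() {
    Tally t;
    t.addMany(5);
    return t.value();
}
```

nestedLockDemo() returns 5. Other threads could acquire the lock only after addMany has returned and the count has dropped back to zero.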

The obvious object to lock on is the instance on which the enclosing function was called. However, we can invent lock objects and synchronize on them without actually having those objects contain any information. For example, see Listing Four. Class C has a lock object called Lock that contains no data and is never used in any context except a lock block. Functions F3 and F4 each contain a set of statements, one of which must be blocked while the other runs, and vice versa.

Listing Four

using namespace System::Threading;

public ref class C
{
/*1*/ static Object^ Lock = gcnew Object;

public:
  static void F1()
  {
/*2*/   Monitor::Enter(C::typeid);
/*3*/   try {
      // perform some operation(s)
    }
    finally {
      Monitor::Exit(C::typeid);
    }
  }

  static void F2()
  {
    Monitor::Enter(C::typeid);
    // ...
    Monitor::Exit(C::typeid);
  }

  static void F3()
  {
/*4*/   Monitor::Enter(Lock);
    // ...
    Monitor::Exit(Lock);
  }

  static void F4()
  {
    Monitor::Enter(Lock);
    // ...
    Monitor::Exit(Lock);
  }
};

If a class function (rather than an instance function) needs synchronizing, the lock object is obtained by using the typeid operator, as shown in case 2. There is one lock object for each CLI type (as well as one for each instance of that type). A lock on a class means that only one class function's lock block for that class can execute at a time.

Note the try/finally in case 3. If execution of the lock block completes normally, the previous examples of calling Monitor::Exit will work correctly. However, if an exception is thrown inside the lock block, the call to Exit never happens because the flow of control is interrupted, and the lock stays held. As a result, if there is any chance that an exception could be thrown from within a lock block (either directly, or indirectly from any function that the block calls), we should use a try/finally construct, as shown. That way, Exit is called on both normal and abnormal termination of the lock block.
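Standard C++ has no finally clause, so the same guarantee is usually obtained with a scoped guard object whose destructor releases the lock. The following is a minimal sketch in portable standard C++, not C++/CLI. The names sharedGuard, sharedValue, update, and recoverAfterThrow are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex sharedGuard;
int sharedValue = 0;

// The lock_guard destructor runs on both normal return and exception,
// which is the effect the try/finally construct achieves with Exit.
void update(int v) {
    std::lock_guard<std::mutex> hold(sharedGuard);
    if (v < 0) throw std::invalid_argument("negative value");   // lock is still released
    sharedValue = v;
}

// If the failed call had left the lock held, the second call would never return.
int recoverAfterThrow() {
    bool threw = false;
    try {
        update(-1);
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
    update(7);
    return sharedValue;
}
```

recoverAfterThrow() returns 7, which shows that the lock was released even though the first call to update ended with an exception.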

Exercises

To reinforce the material we've covered, perform the following activities:

1. In your implementation's documentation, carefully read the description of the class Thread.
2. Modify the example in Listing Two such that it contains three classes: Point, ManipulateThread, and a main application, where ManipulateThread has two entry-point functions, StartUpMover and StartUpDisplay.
3. Write a program that has the primary thread create one secondary thread. Every second, the secondary thread gets the current date and time (using type System::DateTime) and displays it on the console. Have the primary thread sleep for some time and, when it wakes up, have it terminate the secondary thread by setting a shared variable (that should be declared volatile) that the secondary thread checks regularly to see if it should shut itself down.

Rex Jaeschke is an independent consultant, author, and seminar leader. He serves as editor of the Standards for C++/CLI, CLI, and C#. Rex can be reached at [email protected].
