Channels ▼
RSS

C/C++

Examining uC++

Source Code Accompanies This Article. Download It Now.


February, 2006: Examining C++

Peter is an associate professor in the School of Computer Science at the University of Waterloo. He can be contacted at pabuhr@uwaterloo.ca. Richard is a research assistant in the School of Computer Science at the University of Waterloo, working in the Programming Languages Group. He can be contacted at rcbilson@ plg.uwaterloo.ca.


Concurrency is the most complex form of programming, with certain kinds of real-time programming being the most complex forms of concurrency. Often the mechanisms for writing concurrent programs only exacerbate the complexity because they are too low level and/or independent from the language. Currently, concurrency is being thrust upon all programmers indirectly through the push for higher performance in hardware. To maintain Moore's Law, it is becoming necessary to add parallelism at a number of hardware levels—instruction pipeline, multithreading, multicore processors, shared-memory multiprocessors, and distributed clusters. Some successful attempts have been made to implicitly discover concurrency in a sequential program; for instance, by parallelizing loops and access to data structures. While this approach is appealing because of the simple sequential programming model and ability to parallelize legacy code, there is a limit to how much parallelism can be found, and current techniques only work on certain kinds of programs. Therefore, explicit concurrent mechanisms are necessary to achieve maximum concurrency potential. Luckily, approaches to this are complementary, and can appear together in a single programming language.

C++ Concurrency

Given the need for explicit concurrency, modern programming languages such as Beta, Ada, Modula-3, Java, and C#, among others, provide some direct support for concurrency. Surprisingly, however, C++ has no concurrency support. During C++'s 20-year history, many different concurrency approaches for C++ have been suggested and implemented with only varying degrees of adoption. As a result, there is no de facto standard dominating concurrent programming in C++. (In C, there are two dominant, but incompatible, concurrency libraries—Win32 and pthreads.) In this article, we argue that C++'s lack of concurrency is significantly limiting the language's future. This deficiency has also been recognized by the C++ Standards committee, which is currently examining concurrency extensions. We also outline how high-level object-oriented concurrency can be added to C++ through a freely available concurrent dialect of C++ called "C++" (http://plg.uwaterloo.ca/ usystem/uC++.html).

Concurrent Design Principles

There are a number of major design principles for adding concurrency to object-oriented languages, such as:

  • Object-oriented design is built on the notion of the class. Hence, concurrency should be built on the class notion, allowing it to leverage other class-based language features.
  • All concurrent systems must provide three fundamental properties: thread, a mechanism to sequentially execute statements, independently of (and possibly concurrently with) other threads; execution context, the state needed to permit independent execution, including a separate stack; and mutual exclusion/synchronization (MES), mechanisms to exclusively access a resource and provide necessary timing relationships among threads. These properties cannot be expressed in an architecture-independent way through existing language constructs. (Even algorithms for MES, such as Dekker's algorithm, do not always work without a sufficient memory model.) Therefore, any concurrency system must provide abstractions to implement these properties.
  • Because MES causes the most errors for programmers and the greatest difficulty for safe code optimizations, it should be implicit through concurrent language constructs.
  • If the routine call is the basis for normal object communication, it should also be used for concurrency. Mixing mechanisms, such as routine call with message-passing/channels, is confusing and error prone, and may lose important capabilities such as static type checking.

Joining the fundamental concurrency properties with the class model is best done by associating thread and execution-context with the class, and MES with member routines. This coupling and the interactions among the concurrency properties generate the programming abstractions in Table 1.

  • Case 1 in Table 1 is a standard C++ object. Its member routines do not provide MES, and the caller's thread and stack are used to perform execution.
  • Case 2 has all the properties of case 1 but only one thread at a time can be executing among the member routines with the MES property, called a "mutex member." Within a mutex member, synchronization with other tasks can be performed. This abstraction is a monitor, which is well understood and appears in many concurrent languages (Java, for instance).
  • Case 3 is an object that has its own execution context but no MES or thread; the execution context is associated with a distinguished member in the object. This abstraction is a coroutine, which goes back to the roots of C++ in Simula.
  • Case 4 is like case 3 but deals with concurrent access by adding MES. This abstraction is a coroutine monitor.
  • Cases 5 and 6 are a thread without a stack, which is meaningless because a thread must have a stack to execute.
  • Case 7 is an object that has its own thread and execution context but no MES. This case is questionable because explicit locking is now required to handle calls from other threads, which violates design principle 3.
  • Case 8 is like case 7 but deals with concurrent access by adding MES. This abstraction is a task, which is an active object and appears in many concurrent languages (Ada). Note, the abstractions are derived from fundamental properties and not ad hoc decisions by a language designer, and each has a particular set of problems it can solve well. Simulating one abstraction with the others often results in awkward solutions that are inefficient; therefore, each has a place in a programming language.

C++: Concurrency in C++

C++ was designed using these concurrency design principles and engineered to provide high-level, integrated, lightweight, object-oriented concurrency for C++. By being high level, you can code in a race-free style, which eliminates the need for a complex memory model. By being integrated into the C++ language, the compiler can understand precisely when it can safely perform optimizations. Currently, C++ is a translator that converts to C++, but its design ultimately assumes it is part of C++.

Figure 1 shows the syntax for adding the programming abstractions in Table 1 to C++. There are two new type constructors _Coroutine and _Task, extensions of class, implicitly associating the execution context and thread properties to objects. There are two new type qualifiers, _Mutex and _Nomutex, for qualifying member routines needing the mutual exclusion property and which contain synchronization. There are implicitly inherited members providing context-switch/synchronization, suspend(), resume(), wait(), signal(), signalBlock(), and one new statement—_Accept. Each of these new constructs is explained through examples.

Coroutine

A coroutine is not a concurrent abstraction, but it appears directly from the combination of fundamental concurrency properties and supports direct implementation of finite-state machines (FSM). In C++, the execution context (stack) for a coroutine object is associated with its distinguished member main; see Listing One.

A coroutine type implicitly inherits member routines resume and suspend, which provide control flow among coroutines. Like a class, a coroutine's public members define its interface but also provide the interaction with the coroutine's main; multiple public member routines allow complex, type-safe communication. The resume routine is called from the public members and the suspend routine is called directly or indirectly from the coroutine's main. The first call to resume starts main, which executes on its own stack. Subsequent resumes restart at the last suspend from main. Routine suspend restarts the last resume executed by a public member. A coroutine object becomes a coroutine when main starts (first resume); the coroutine becomes an object again when main ends.

Listing Two is a simple FSM for recognizing phone numbers of the form: (555)opt 123-4567. Characters of the phone number are passed one at a time to the next member, which returns the current status of the parse. Note how the coroutine main retains its execution location and restarts there when it is resumed; for example, when parsing groups of digits, the coroutine suspends in the middle of a for loop and restarts within the particular loop when resumed. The killer application for a coroutine is device drivers, which cause 70-85 percent of failures in Windows/Linux. Many device drivers are FSMs parsing a protocol; for instance:

...STX...message...ESC ETX...message... ETX 2-byte crc...

Here, a network message begins with the control character STX and ends with an ETX, followed by a 2-byte cyclic redundancy check. Control characters can appear in the message if preceded by an ESC. An Ethernet driver is just a complex version of this simple protocol, and the FSM for the Ethernet protocol can be directly coded as a coroutine. Because FSMs can be complex and occur frequently in important domains, direct support of the coroutine is crucial, independent of concurrency.

Monitor

A monitor is a concurrency abstraction that encapsulates a shared resource with implicit mutual exclusion and provides for complex synchronization among tasks using the resource; see Listing Three.

Any member routine can be qualified with the MES qualifiers, _Mutex/_Nomutex, indicating the presence or absence of MES, respectively. Only one thread at a time can be executing among the mutex routines of a monitor object; other threads calling mutex routines of the same object implicitly block. Recursive entry is allowed for the thread currently using the monitor; that is, it may call other mutex members. The MES qualifiers can also qualify a class, which defines the default qualification for the public member routines. Hence, the presence of a single mutex member in a class makes it a monitor. Member variables cannot be MES qualified. The destructor of a monitor is always _Mutex, because the thread terminating a monitor object must wait if another thread is executing in it.

The mutex property ensures exclusive access to the monitor's data by multiple threads. For simple cases, such as an atomic counter, exclusive access is sufficient and the order of access is unimportant. For complex cases, the order of access can be crucial for correctness; for example, one task may need to communicate information to another task and wait for a reply, or a resource may have strict ordering rules with respect to thread usage. Ordering is controlled by threads synchronizing among themselves within a monitor using condition variables and operations wait(), signal(), signalBlock(), or an _Accept statement on mutex members.

A condition variable is a place where a task within the monitor can wait for an event to occur by another task using the monitor. Figure 2(a) illustrates the internal parts of a monitor object for synchronization with condition variables. Calling threads wait until no mutex member is active. A condition variable (for example, c) is a queue of waiting threads. The thread active in the monitor object waits on queue c by executing c.wait() (dotted line), which either implicitly restarts the last signalled thread, or if no signalled threads, releases the monitor lock so a new thread may enter. The active thread may execute c.signal() (dashed line) to restart a waiting thread at the front of a condition queue; the signalled thread can only restart after the signaler thread blocks or exits due to the mutual exclusion property, which is accomplished by having the signalled thread wait temporarily on the hidden urgent condition. Alternatively, the active thread may execute c.signalBlock() (solid line), which makes the active thread wait on the urgent queue and immediately starts the signalled thread at the front of the queue. Using these mechanisms, order of access within the monitor can be precisely controlled.

Tasks within the monitor can wait for an event to occur by a calling task using an accept statement. Figure 2(b) illustrates the internal parts of a monitor object for synchronization with an _Accept statement. _Accept selects which mutex member call to execute next (like Ada's select). _Accept(m1) unblocks the thread on the front of the mutex queue after the accepter is implicitly blocked (like signalBlock). If there is no calling task, the accepter waits (on the hidden urgent queue) until a call to the specified member occurs. When the member call ends, the accepter implicitly restarts after the _Accept statement. _Accept can appear at any level of routine nesting in the monitor. The _Accept statement can check multiple mutex members for calls:

_Accept( m1, m2,...);

The call on the first nonempty mutex queue is selected (so the order of member names is important); if no calls are pending, the accepter waits until a call occurs. Finally, each selected member can be separated and supplied with a guard:

_When( conditional-expression )
_Accept( m1 ) statement
else _When( conditional-expression )
_Accept( m2 ) statement
...
else
statement

The guard must be true before a mutex queue is considered; if there is a terminating else, the accepter does not block, rather it polls for callers. The statement after an _Accept is executed by the accepter after the mutex call, allowing it to perform different actions depending on which call occurred.

Monitor Examples

Listing Four(a) shows the classic dating- service problem implemented with condition variables, where two kinds of tasks exchange information based on some criteria. In this case, there are girl and boy tasks exchanging phone numbers if they have matching compatibility codes (values 0-19). A girl task checks if a boy with the same code is waiting. If not, she waits; otherwise, she copies her phone number to the shared variable GirlPhoneNo, and does a signalBlock to immediately restart the waiting boy while she waits on the urgent queue. The waiting boy restarts, copies his phone number to the shared variable BoyPhoneNo, and returns with the girl's phone number. The waiting girl is then implicitly restarted from the urgent queue after the boy returns, and she now returns with the boy's phone number. Listing Four(b) shows the classic read/write problem implemented with _Accept, where multiple reader tasks can simultaneously read a resource, but writer tasks must be serialized to write the resource. Tasks access the resource like this:

ReadersWriter rw;
reader task writer task
rw.StartRead(); rw.StartWrite();
// read resource // write resource
rw.EndRead(); rw.EndWrite();

The variables rcnt and wcnt count the number of simultaneous reader or writer tasks using the resource. EndRead/EndWrite decrement the appropriate counter when a task finishes using the resource. StartRead checks if a writer is using the resource, and if so, accepts EndWrite, causing the reader task to wait on the urgent queue and preventing calls to any other mutex member. When the current writer finishes writing, it calls EndWrite; then the waiting reader implicitly restarts in StartRead and increments rcnt. StartWrite begins with the same check for a writer and the same actions as for a reader if a writer is using the resource. Alternatively, if there are rcnt readers using the resource, the writer loops and performs rcnt accepts of EndRead, one for each of the completing reader tasks. After the last reader finishes reading and completes its call to EndRead, the waiting writer implicitly restarts and increments wcnt. Because the accept statement strictly controls entry into the monitor, new (calling) tasks may not enter out of order.

Coroutine Monitor

The properties of a coroutine and monitor can be combined to generate a concurrency abstraction for resource sharing and synchronization along with retaining data and execution state; see Listing Five.

A coroutine monitor is ideal for an FSM used by multiple threads, such as a shared formatter printer. The printer is the shared resource called by multiple threads to print each thread's data, and the printer can be a complex FSM organizing the data into rows and columns with appropriate markings and headings. Combining these fundamental properties into a single construct simplifies the job of developing the solution for a complex problem.

Task

The properties of a coroutine monitor can be combined with a thread to generate a concurrency abstraction for an active object that dynamically manages a resource; see Listing Six, for example.

Active objects appear in many concurrent languages. The use of both wait/signal on condition variables and accepting mutex members occurs frequently in a task (less so in a monitor). Finally, because the destructor is a mutex member, it can be accepted to determine when to terminate a task or monitor.

Listing Seven(a) shows a basic worker task, Adder, generating the subtotals for each row of a global matrix by summing the elements of a particular row. (Global variables are used to simplify the example.) In C++, the member routine uMain::main serves as the program's main (starting) routine. This routine reads the matrix and starts a block that creates an array of Adder tasks, one for each row of the matrix. Each task's main starts implicitly after its constructor completes—no explicit start is needed. Similarly, no explicit join is needed because the block containing the array of tasks cannot end until all the tasks in the array terminate, otherwise, the storage for the tasks could be deallocated while threads are executing. After all tasks in the block terminate, the block allocating the array of tasks ends, and the subtotals generated by each worker task can be safely summed to obtain the total. The constructor for each Adder task selects a specific row to sum by incrementing a shared variable; no mutual exclusion is necessary for the selection as each task of the array is created serially. The main member of each Adder task adds the matrix elements for its particular row in its corresponding subtotal location.

Listing Seven(b) shows a classic administrator server, where clients call different members for service. The server may provide multiple interface members for different kinds of clients and client requests. A client's call may be serviced immediately or delayed using condition variables. The server's main loops accept client calls using the _Accept statement; an accepted call may complete immediately or require subsequent servicing and signalling of the delayed client. Finally, the server's destructor is also being accepted to know when the server is being deleted by another thread.

Miscellaneous C++ Features

C++ has a number of other features that integrate concurrency with C++:

  • Both termination and resumption exception handling are supported, as well as the ability to raise exceptions among coroutines and tasks; see Listing Eight(a). For resumption, the stack is not unwound, and control returns after the _Resume when the handler completes. The _At clause provides nonlocal delivery of an exception to another coroutine or task. Nonlocal delivery of exceptions is controlled by _Enable and _Disable statements; see Listing Eight(b). Specifying no exceptions enables/disables all exceptions.
  • The execution environment can be structured into multiple clusters of tasks and processors. Each cluster has a scheduler to control selection of its tasks to run on its processors; tasks and processors can migrate among clusters.
  • C++ streams and UNIX files/sockets are augmented to be thread safe, object-oriented, and nonblocking; for example, safe stream I/O is performed like this:
  • isacquire( cin ) >> . . .;
    osacquire( cout ) << ...<< endl;

  • The declaration at the start of the I/O expression provides necessary locking of the specified stream for the duration of the expression. There are three classes for accessing sockets: uSocketServer, uSocketAccept, and uSocketClient, which hide most of the socket complexity and support connectionless and connected protocols with timeout capabilities.
  • Basic real-time programming is available through three extensions of the task:
  • _RealTimeTask R {...};
    _PeriodicTask P {...};
    _SporadicTask S {...};

  • Fixed and dynamic priority schedulers are provided for use with clusters, including a transitive priority-inheritance protocol. The _Accept statement is extended to handle timeouts:
  • _Accept( M1, M2 ) {...}
    else _Accept ( M3 ) {...}
    else _Timeout( 1 ) {...} // restart after
    // 1 second if
    // no call

  • There is a debug mode for testing with many assertions and runtime checks, and C++ also generates reasonable error messages. C++ compiles on GCC 3.2 or greater and Intel icc 8.1/9.0; for Linux Intel x86/Itanium and AMD 32/64; Solaris 8/9/10 SPARC; and IRIX 6.x MIPS.

Conclusion

Providing concurrency via low-level libraries such as pthreads makes no sense for C++. This approach is error prone and does not integrate with existing C++ mechanisms. Medium-level approaches that attempt to leverage existing language features with a concurrency library also fall short, as programmers still struggle with multiple coding conventions and limitations of use, and some primitive concurrency properties are still hidden from the compiler. To truly help you and the compiler, concurrent programming requires high-level concurrency models and constructs. The three fundamental properties in concurrency—thread, execution context, and mutual-exclusion/ synchronization—can be integrated directly into C++'s core programming notion—the class—and subsequently work with other C++ mechanisms. This approach retains the language's object-oriented programming model and provides multiple concurrency approaches and models, while requiring only a few new keywords and mechanisms in C++. C++ is a full implementation of these ideas, providing a system that lets you tackle complex concurrent projects.

DDJ



Listing One

_Coroutine C {
  void main() { // distinguished member / executes on coroutine's stack
    ...suspend()... // restart last resume
  }
public:
  void m1(...) {... resume();...} // restart last suspend
  void m2(...) {... resume();...} // restart last suspend
};
Back to article


Listing Two
_Coroutine Phone {
public:
  enum status { MORE, GOOD, BAD };
private:
  char ch;
  status stat;
  void main() {
    int i;
    stat = MORE; // continue passing characters
    if ( ch == '(') { // optional area code ?
      for ( i = 0; i < 3; i += 1 ) {
        suspend();
        if ( ! isdigit(ch) ) { stat = BAD; return; }
      }
        suspend();
        if ( ch != ')') { stat = BAD; return; }
          suspend();
      }
      for ( i = 0; i < 3; i += 1 ) { // region code ?
         if ( ! isdigit(ch) ) { stat = BAD; return; }
           suspend();
      }
      if ( ch != '-') { stat = BAD; return; } // separator ?
      for ( i = 0; i < 4; i += 1 ) { // local code ?
        suspend();
        if ( ! isdigit(ch) ) { stat = BAD; return; }
      }
      stat = GOOD;
  }
public:
  status next( char c ) { // pass one character at a time to FSM
    ch = c;
    resume(); // activate coroutine
    return stat;
  }
};
Back to article


Listing Three
_Mutex class M { // default MES for public member routines
  // SHARED DATA ACCESSED BY MULTIPLE THREADS
  uCondition c1, c2[10], *c3 = new uCondition; //different condition variables
  // default for private/protected is no MES
  void m1(...) {... /* MES statements */...} // no MES
  _Mutex void m2(...) {... /* MES statements */...}; // MES
public:
  void m3(...) {.../* MES statements */...} // MES
  _Nomutex void m4(...) {...} // no MES  
  ... // destructor is ALWAYS mutex
};
Back to article


Listing Four
(a)
_Mutex class DatingService {
    uCondition Girls[20], Boys[20];
    int GirlPhoneNo, BoyPhoneNo;
  public:
    int Girl( int PhoneNo, int code ) {
        if ( Boys[code].empty() ) {
             Girls[code].wait();
             GirlPhoneNo = PhoneNo;
    } else {
             GirlPhoneNo = PhoneNo;
             Boys[code].signalBlock();
    }
    return BoyPhoneNo;
  }
  int Boy( int PhoneNo, int code ) {
    if ( Girls[code].empty() ) {
         Boys[code].wait();
         BoyPhoneNo = PhoneNo;
    } else {
         BoyPhoneNo = PhoneNo;
         Girls[code].signalBlock();
    }
    return GirlPhoneNo;
  }
};

(b)
_Mutex class ReadersWriter {
    int rcnt, wcnt;
public:
    void ReadersWriter() {
        rcnt = wcnt = 0;
    }
    void EndRead() {
        rcnt -= 1;
    }
    void EndWrite() {
        wcnt -= 1;
    }
    void StartRead() {
        if ( wcnt == 1 ) _Accept( EndWrite );
        rcnt += 1;
    }
    void StartWrite() {
        if ( wcnt == 1 ) _Accept( EndWrite );
        else while ( rcnt > 0 )
            _Accept( EndRead );
        wcnt += 1;
    }
};
Back to article


Listing Five
_Mutex _Coroutine CM { // default MES for public member routines
   uCondition c1,c2[10],*c3 = new uCondition; // different condition variables
   void m1(...) {.../* MES statements */ ...} // no MES
   _Mutex void m2(...) { .../* MES statements */ ...}; // MES
   void main() {...} // distinguished member / has its own stack
public:
   void m3(...) { ...resume()/* MES statements */ ...} // MES
   _Nomutex void m4(...) { ...resume(); ...} // no MES
   ...// destructor is ALWAYS mutex
};
Back to article


Listing Six
_Task T { // default MES for public member routines
   uCondition c1,c2[10],*c3 = new uCondition; // different condition variables
   void m1( ... ) {.../* MES statements */ ... } // no MES
   _Mutex void m2(...) {.../* MES statements */...}; // MES
   void main() {...} // distinguished member/has own stack/thread starts here
public:
   void m3(...) {.../* MES statements */ ...} // MES
   _Nomutex void m4(...) {...} // no MES
   ...// destructor is ALWAYS mutex
};
Back to article


Listing Seven
(a)
const int rows = 10, cols = 10;
int M[rows][cols], ST[rows];

_Task Adder { // add specific row
    static int row; // sequential access
    int myrow, c;
    void main() {
        ST[myrow] = 0; // subtotal location
        for ( c = 0; c < cols; c += 1 )
             ST[myrow] += M[myrow][c];
    }
  public:
    Adder() { myrow = row++; } // choose row
};
int Adder::row = 0;
void uMain::main() {
    // read matrix
    {
        Adder adders[rows]; // create threads
    } // wait for threads to terminate
    int total = 0; // sum subtotals
    for ( int r = 0; r < rows; r += 1 )
         total += ST[r];
    cout << total << endl;
}

(b)
_Task Server {
    uCondition delay;
    void main() {
        for ( ;; ) { // for each client request
            _Accept( ~Server ) { // terminate ?
              break;
            // service each kind of client request
            } else _Accept( workReq1 ) {
              ...
              delay.signalBlock(); // restart client
              ...
            } else _Accept( workReq2 ) {
              ...
            }
        }
        // shut down
  }
public:
  void workReq1( Req1_t req ) {
       ...delay.wait();...// service not immediate?
       // otherwise service request
    }
    void workReq2( Req2_t req ) { ... }
    ...
};
Back to article


Listing Eight
(a)
_Throw [ throwable-exception [ _At coroutine/task-id ] ] ; // termination
_Resume [ resumable-exception [ _At coroutine/task-id ] ] ; // resumption

(b)
_Enable <E1><E2>... {
    // exceptions E1, E2, ... delivered
}
_Disable <E1><E2>... {
    // exceptions E1, E2, ... not delivered
}
Back to article


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video