Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Threading and Parallel Programming Constructs


Flow Control-based Concepts

In the parallel computing domain, some restraining mechanisms allow synchronization among multiple attributes or actions in a system. These are mainly applicable for shared-memory multiprocessor or multi-core environments. The following section covers only two of these concepts, fence and barrier.

Fence

The fence mechanism is implemented using instructions and in fact, most of the languages and systems refer to this mechanism as a fence instruction. On a shared memory multiprocessor or multi-core environment, a fence instruction ensures consistent memory operations. At execution time, the fence instruction guarantees completeness of all pre-fence memory operations and halts all post-fence memory operations until the completion of fence instruction cycles. This fence mechanism ensures proper memory mapping from software to hardware memory models, as shown in Figure 15. The semantics of the fence instruction depend on the architecture. The software memory model implicitly supports fence instructions. Using fence instructions explicitly could be error-prone and it is better to rely on compiler technologies. Due to the performance penalty of fence instructions, the number of memory fences needs to be optimized.

Figure 15: Fence Mechanism

Barrier

The barrier mechanism is a synchronization method by which threads in the same set keep collaborating with respect to a logical computational point in the control flow of operations. Through this method, a thread from an operational set has to wait for all other threads in that set to complete in order to be able to proceed to the next execution step. This method guarantees that no threads proceed beyond an execution logical point until all threads have arrived at that logical point. Barrier synchronization is one of the common operations for shared memory multiprocessor and multi-core environments. Due to the aspect of waiting for a barrier control point in the execution flow, the barrier synchronization wait function for ith thread can be represented as

Wbarrier i = f Tbarrier i, Rthread i where Wbarrier is the wait time for a thread, Tbarrier is the number of threads has arrived, and Rthread is the arrival rate of threads.

For performance consideration and to keep the wait time within a reasonable timing window before hitting a performance penalty, special consideration must be given to the granularity of tasks. Otherwise, the implementation might suffer significantly.

Implementation-dependent Threading Features

The functionalities and features of threads in different environments are very similar; however the semantics could be different. That is why the conceptual representations of threads in Windows and Linux remain the same, even though the way some concepts are implemented could be different. Also, with the threading APIs of Win32, Win64, and POSIX threads (Pthreads), the semantics are different as well. Windows threading APIs are implemented and maintained by Microsoft and work on Windows only, whereas the implementation of Pthreads APIs allows developers to implement threads on multiple platforms. The IEEE only defined the Pthreads APIs and let the implementation be done by OS developers. Due to the implementation issues of Pthreads, not all features exist in Pthreads APIs. Developers use Pthreads as a wrapper of their own thread implementations. There exists a native Linux Pthreads library similar to Windows native threads, known as Native POSIX Thread Library (NPTL).

Consider the different mechanisms used to signal threads in Windows and in POSIX threads. Windows uses an event model to signal one or more threads that an event has occurred. However, no counterpart to Windows events is implemented in POSIX threads. Instead, condition variables are used for this purpose.

These differences are not necessarily limited to cross-library boundaries. There may be variations within a single library as well. For example, in the Windows Win32 API, Microsoft has implemented two versions of a mutex. The first version, simply referred to as a mutex, provides one method for providing synchronized access to a critical section of the code. The other mechanism, referred to as a CriticalSection, essentially does the same thing, with a completely different API. What's the difference?

The conventional mutex object in Windows is a kernel mechanism. As a result, it requires a user-mode to kernel-mode transition to work. This is an expensive operation, but provides a more powerful synchronization mechanism that can be used across process boundaries. However, in many applications, synchronization is only needed within a single process. Therefore, the ability for a mutex to work across process boundaries is unnecessary, and leads to wasted overhead. To remove overhead associated with the standard mutex Microsoft implemented the CriticalSection, which provides a user-level locking mechanism. This eliminates any need to make a system call, resulting in much faster locking.

Even though different threading libraries will have different ways of implementing the features described in this article, the key to being able to successfully develop multi-threaded applications is to understand the fundamental concepts behind these features.

Key Points

This article presented parallel programming constructs. To become proficient in threading techniques and face fewer issues during design and development of a threaded solution, an understanding of the theory behind different threading techniques is helpful. Here are some of the points you should remember:

  • For synchronization, an understanding of the atomic actions of operations will help avoid deadlock and eliminate race conditions.
  • Use a proper synchronization construct-based framework for threaded applications.
  • Use higher-level synchronization constructs over primitive types.
  • An application cannot contain any possibility of a deadlock scenario.
  • Threads can perform message passing using three different approaches: intra-process, inter-process, and process-process.
  • Understand the way threading features of third-party libraries are implemented. Different implementations may cause applications to fail in unexpected ways.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.