Channels ▼

Clay Breshears

Dr. Dobb's Bloggers

Insert Your Own Clever Bubblesort Pun

March 30, 2012

Okay, here we go with my data decomposition parallel code for Bubblesort. With applications dealing with large data sets that involve sorting (or parts of larger applications needing to sort large amounts of data), it seems only natural to consider a data parallel solution. It got longer than I had originally thought, but I hope you find it worthwhile.

Before we dive into the gory details, let's do a quick review. Under a data decomposition scheme, we need to divide up the large collection of data and assign a portion to a thread. Each thread runs the computation on the assigned piece and coordinates with other threads as needed. For Bubblesort, I divide up the data array into non-overlapping partitions and assign a thread to each partition. Threads use the Bubblesort algorithm on their assigned partition. The coordination needed between threads falls between adjacent partitions. Each thread will need to "peek" over the partition boundary to compare (and possibly swap) the last item in the assigned partition with the first item in the succeeding partition. This must be done in a safe manner. After each thread performs a complete pass on the assigned partition values, threads will determine if there was at least one exchange by any thread during the just completed pass. If no thread did an exchange, the sorted order has been achieved; otherwise another pass is initiated.

Due to the length and the format of this forum, I'll break my BubbleSort() function into a few logical pieces and describe each one separately. I think that will be easier to follow the flow than to require you to keep paging back and forth to examine a monolithic code sequence. The code blocks described will be broken down into 5 "steps":

  1. Variable declarations, shared and local
  2. Test the first and second elements in partition
  3. One full pass through the data in partition
  4. Test the overlapping elements at end of partition and start of next partition
  5. Update and check exchanges made from any previous compares

As with the previous Bubblesort code examples, I've used POSIX threads for my implementation. First up, the declarations for variables, both shared by all threads and local to individual threads.

// Shared declarations
pthread_mutex_t BLock[NUM_LOCKS];
pthread_mutex_t m_Done;
pth_barrier_t bBarrier;
int Done[2] = {0,0};
int *A, N;

void *BubbleSort(void *pArg)
{
  int i, j, temp;
  int k = 0;
  int exch;
  int myid = *((int*)pArg);
  int start = ((float)N / (float)NUM_THREADS) * myid;
  int end = ((float)N / (float)NUM_THREADS) * (myid+1);
  if (myid == NUM_THREADS-1) end = N;

The shared variable set contains the BLock[] array of mutexes to be used by threads to look into an adjacent partition or to prevent a thread from an adjacent partition from swapping out a value that could be read by another thread. Prevalent wisdom in using locks is that a lock must protect some data, not the execution of code. (We put mutexes around code that updates or reads those data items that must be protected from concurrent multiple thread accesses.) In this case, I will use a mutex to protect/allow access to the first elements in each assigned partition of the array. More details on how that is to be done in the description of upcoming code segments.

Added synchronization variables that weren't in the task decomposition variation are another single mutex, m_Done, to protect updates to the Done array elements and a barrier construct unimaginatively named bBarrier. This barrier is the answer to the thought problem posed at the end of the previous article as we shall see in the final code segment description. Finally, the array, A, to be sorted as well as the number of things in the array, N, rounds out the shared variables.

The local variables are what you would expect. There are some array index variables (i, j), a temp storage location for the swap of values found to be out of sequence, and an identification number for the thread, myid, that was sent as the single parameter to the thread function BubbleSort(). A counter, k, will be used to reference the elements of Data, an integer. exch keeps track of whether or not the thread made an exchange of at least one pair of adjacent items, and the two remaining integers, start and end, denote the indices of the array partition that will be sorted by the thread. I employ the trick of using floating-point arithmetic and integer truncation to get a balanced distribution of partition sizes across all threads, especially if the size of the data set isn't evenly divisible by the number of threads.

Now, with the preliminaries over, let's look at some executable code. Surrounding the bulk of the function will be an infinite while loop that is terminated in the last segment below. Each iteration of this loop will have threads execute a Bubblesort pass over the data within the assigned partition.

  while (1) {
    exch = 0;
    i = start;
    if (myid > 0) {
      pthread_mutex_lock(&BLock[myid-1]);
        if (A[i] > A[i+1]) {
          temp = A[i]; A[i] = A[i+1]; A[i+1] = temp;
          exch = 1;
        }
      pthread_mutex_unlock(&BLock[myid-1]);
    }
    else {
      if (A[i] > A[i+1]) {
        temp = A[i]; A[i] = A[i+1]; A[i+1] = temp;
        exch = 1;
      }
    }

At the outset of each while-loop iteration, I reset the exch flag since no exchanges have been made yet. The index variable i is set to the first index of the thread's partition, start. This is more of a personal style thing. I could have just as easily used start in the code, but the use of i makes it a little easier to decipher what's going on with the compare and exchange.

The rest of the code in this segment compares the first and second elements within the partition. For the first thread, myid == 0, since there is no other thread responsible for lower indexed elements, there will be no chance of a data race on this first element from another thread. Thus, the else-clause simply compares the first and second partition elements and swaps as needed.

For all other threads, there is a partition of lower-indexed items and a thread passing through that partition. Because I know that there is going to be a "peek" across the partition boundaries by such a thread, I must protect the access to the first element, A[start], in the partition. From the way the partition boundaries are computed, I know that the partitions are numbered with the same ID number as the assigned thread. With that in mind, I use the mutex from the Block array that is indexed by the ID number of the thread that I know will be peeking over the boundary, which is one less than the partition thread's ID number (myid - 1). When we get to the fourth code segment, it should be clear how this scheme protects data across the other end of the partition. Consequently, I know that there will be no race condition on the first element of any partition.

The compare and swap are standard stuff with an addition. If a compare and swap are executed, the exch flag is set. This serves to force at least one more pass over the data.

    for (j = start+1; j < end-1; j++) {
      if (A[j] > A[j+1]) {
        temp = A[j]; A[j] = A[j+1]; A[j+1] = temp;
        exch = 1;
      }
    }

After the first compare (and exchange, if needed) is done, threads are free to run a Bubblesort pass over all other items within the partition. As with that first item, if a thread does execute an exchange, then the exch flag is set.

There is one more comparison to be done by almost every thread: Compare the last element of the partition, A[end-1], to the first element in the succeeding partition, A[end], if it exists.

    i = end-1;
    if (myid != NUM_THREADS-1) {
      pthread_mutex_lock(&BLock[myid]);
      if (A[i] > A[i+1]) {
        temp = A[i]; A[i] = A[i+1]; A[i+1] = temp;
        exch = 1;
      }
      pthread_mutex_unlock(&BLock[myid]);
    }

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video