How Not to Pop Your Bubblesort with Threads
In my previous post, I reminded you of the Bubblesort algorithm. I also attempted to give some idea about how this O(n2) serial algorithm might be able to run in parallel. To do that I employed some Shakespearean characters, a shelf of books, and a submarine. This post will just be a straight examination of the parallel code based on those ideas, with only one allusion to submarines and Scottish gentlemen. As before, let's assume we are sorting keys from low to high.
The code below is an implementation of the BubbleSort() function using POSIX threads. Notice that the simple serial code from the previous post (12 lines) has bloated to almost three times the size (34 lines), and that doesn't include all the support code outside of this function needed to prepare the new shared variables and to create threads to execute the function.
void *BubbleSort(void *pArg)
{
int i, j, k, releasePoint, temp, rpInc;
BOOL exch;
int *A = (int*)pArg;
rpInc = N / NUM_LOCKS;
rpInc++; // number of data items in each zone
while (!Done) {
k = 0;
exch = FALSE;
pthread_mutex_lock(&BLock[k]);
i = iCounter--;
releasePoint = rpInc;
if (i <= 0) {
Done = TRUE;
pthread_mutex_unlock(&BLock[k]);
break;
}
for (j = 0; j < i; j++) {
if (A[j] > A[j+1]) {
temp = A[j]; A[j] = A[j+1]; A[j+1] = temp;
exch = TRUE;
}
if (j == releasePoint) {
pthread_mutex_unlock(&BLock[k++]);
pthread_mutex_lock(&BLock[k]);
releasePoint += rpInc;
}
}
pthread_mutex_unlock(&BLock[k]);
if (!exch) Done = TRUE;
}
return 0;
}
Any variable that is not explicitly declared in the code above will be a shared variable. Since we can only send a single parameter to the threaded function BubbleSort(), I've only sent a pointer to the array (of integers) to be sorted. The number of elements in the array, N, is assumed to be one of those shared variables. An array of mutex objects, BLocks[], for the critical regions and the number of those objects in the array, NUM_LOCKS, are also shared. The variable Done is used by threads to determine if another pass on the data is needed or if the sorting task has been completed. Finally, iCounter keeps track of the maximum number of elements that must be examined in the next pass of a thread through the array.
In this parallel version of Bubblesort, the while loop keeps threads executing passes through the array as long as there is still a chance some data remains out of place; i.e., no thread has done a complete pass without exchanging two adjacent elements.
For each pass through the data, the maximum number of unsorted elements (iCounter) is accessed (protected by the mutex object BLock[0]) and the value is stored locally in i. After the local assignment, the counter is decremented. The value stored in i will be used by a thread to limit the number of array elements to be compared within a pass. This decrementing counter keeps down the number of useless comparisons within each pass since we know that element finally located in the [i]th index position will be that element's final sorted position.
If the number of potential unsorted items is less than or equal to zero, the sorting is done and the Done flag is set. When other threads complete their current pass, the while conditional test will allow them to terminate.
The j-loop passes through the array, element by element, compares the values of A[j] with A[j+1] and swaps them if they are out of order. To ensure that we avoid any race conditions, we need to partition the array into zones, with each zone protected by one of the BLock[] mutex objects. The final index value for the current zone is held in releasePoint. If the thread has reached the end of the current data zone, it releases the lock on that zone and attempts to acquire the lock for the next zone. If the preceding thread has not vacated this next zone, the thread waits for the release of the lock before proceeding. After gaining access to the next zone, the index value for the final element of the new zone is computed by incrementing releasePoint by rpInc (the number of elements in each zone).
Once the end of the array has been reached, the thread releases the last mutex object used. If no exchanges were made during the just completed pass, the thread sets the Done flag; otherwise, it will prepare to run another iteration of the while-loop.

