Bubble, BubbleSort, and Trouble

February 07, 2012

What's the worst that could happen in this example? During a pass through the books, Banquo may need to make a swap for almost every comparison, or he might pause every so often to warm his hands against the cold Scottish mist. If Duncan walks faster or can swap books faster than Banquo (or maybe doesn't find any books that need to be swapped), then he could eventually catch up to Banquo's position. In such a situation, both Banquo and Duncan would compare the same two books and both decide that the books need to swap positions. If Banquo makes the swap and then Duncan makes the swap (having not noticed that Banquo had already made the exchange), the second swap undoes the first and the books are left in their original, unsorted order. Then, since both have found an exchange, they will each make another full pass through the books.

Two extra runs through the data? Is that the worst? Unfortunately, Banquo and Duncan could meet again at the same adjacent books and achieve the same result. This pattern can repeat indefinitely. What I've described here is an example of a livelock situation. Our protagonists are doing some work (computation) by sweeping through the books looking for out-of-order texts, but are unable to proceed to completion due to the actions of some other character (the duplicate swap of books from the same two shelf slots).
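The cancelling swap is easy to reproduce without any real threads. The following is a minimal, deterministic sketch of the scenario above, not code from this series: the function names `race_round` and `livelock_rounds` and the two-book shelf are hypothetical, and the "race" is simulated by letting both sorters compare before either one swaps.

```python
def race_round(shelf, i):
    """Both sorters compare shelf[i] and shelf[i + 1] before either swaps."""
    banquo_swaps = shelf[i] > shelf[i + 1]  # Banquo's comparison
    duncan_swaps = shelf[i] > shelf[i + 1]  # Duncan sees the same, soon stale, order
    if banquo_swaps:
        shelf[i], shelf[i + 1] = shelf[i + 1], shelf[i]
    if duncan_swaps:  # Duncan acts on his earlier comparison and undoes the swap
        shelf[i], shelf[i + 1] = shelf[i + 1], shelf[i]
    return banquo_swaps or duncan_swaps  # "an exchange was found, sweep again"


def livelock_rounds(shelf, max_rounds):
    """Repeat the unlucky meeting; max_rounds exists only so the demo halts."""
    rounds = 0
    while rounds < max_rounds and race_round(shelf, 0):
        rounds += 1
    return rounds


shelf = ["Macbeth", "Hamlet"]
print(livelock_rounds(shelf, 5), shelf)  # every round does work, the shelf never changes
```

Without the artificial cap the loop would never end: each round reports an exchange, so each round triggers another sweep, yet the shelf is in the same order after every one.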

If we had some mechanism to ensure that Duncan and Banquo would never catch up to each other, we'd have a way to execute Bubblesort in parallel. As an illustration, imagine the shelf of books in a long corridor on a submarine. (Ha ha! I was able to work in the nautical theme.) There are hatches every few yards down the corridor and we impose the rule that, while sorting, whenever Duncan reaches the hatch into the next chamber holding books, he must first peek into the next area. If Banquo is still sorting books in that compartment, Duncan must wait for Banquo to complete his sorting mission in that zone and exit out the hatch at the other end. Obviously, Banquo (as well as Malcolm, Lennox, Ross, Fleance, and Lady Macduff if we add characters to the book-sorting project) must follow the same rule and enter only an empty chamber.

Getting back to programming with threads, one easy way to maximize the number of threads that can do useful work in Bubblesort is to divide the data into a number of non-overlapping zones at least equal to the number of threads. A thread is not allowed to enter a zone until the preceding thread has completed its computations within that zone. These zones are then critical regions of data, and I can use an appropriate synchronization mechanism for critical regions to control access to the data zones. If desired, the size of the zones can be dynamically shrunk as the number of unsorted elements decreases in each pass.
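The zone rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the solution promised for the next post: the name `parallel_bubblesort`, its parameters, the equal-sized zones, and the use of one `threading.Lock` per zone are all hypothetical choices. One further assumption is worth calling out: a compare-exchange at the last slot of a zone also touches the first slot of the next zone, so the sketch takes the next zone's lock before that comparison and releases the old zone's lock afterward. On a standard CPython build the global interpreter lock keeps these threads from running truly in parallel, so the sketch shows the synchronization pattern rather than any speedup.

```python
import threading


def parallel_bubblesort(data, num_threads=4, num_zones=None):
    """Sort data in place; each thread runs whole passes, gated by zone locks."""
    n = len(data)
    if n < 2:
        return data
    if num_zones is None:
        num_zones = num_threads            # at least as many zones as threads
    num_zones = max(1, min(num_zones, n))
    zone_size = -(-n // num_zones)         # ceiling division
    locks = [threading.Lock() for _ in range(num_zones)]
    done = threading.Event()               # set once a pass finds nothing to swap
    remaining = [n - 1]                    # comparisons in the next pass; guarded by locks[0]

    def worker():
        while True:
            locks[0].acquire()             # passes begin in the order they take zone 0
            if done.is_set() or remaining[0] < 1:
                locks[0].release()
                return
            last = remaining[0]
            remaining[0] -= 1              # the following pass stops one slot earlier
            zone = 0
            exchanged = False
            for j in range(last):
                next_zone = (j + 1) // zone_size
                if next_zone != zone:      # data[j + 1] lives behind the next hatch
                    locks[next_zone].acquire()
                if data[j] > data[j + 1]:
                    data[j], data[j + 1] = data[j + 1], data[j]
                    exchanged = True
                if next_zone != zone:      # leave the old zone only after the exchange
                    locks[zone].release()
                    zone = next_zone
            locks[zone].release()
            if not exchanged:
                done.set()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return data


print(parallel_bubblesort([5, 2, 9, 1, 7, 3, 8], num_threads=3))
```

Because every thread takes the locks in increasing zone order and holds at most two at a time, no thread ever waits for a zone behind it, which is what keeps one sorter from overtaking another. This sketch keeps the zone size fixed; shrinking the zones as the unsorted range shrinks is left out.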

How much additional overhead does this modification add to the Bubblesort algorithm? There would need to be a check after each compare-exchange to determine when the end of a zone had been reached. At the end of each zone, a thread would exit the critical region of the current zone (yield the lock) and attempt to enter the succeeding zone. The more zones we install for scalability, the more overhead of exiting and entering critical regions there will be.
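To put rough numbers on that trade-off, here is a small counting sketch. It assumes equal-sized zones of `ceil(n / num_zones)` elements and one end-of-zone check per compare-exchange, as described above; the function name `pass_overhead` is hypothetical, and the counts say nothing about how expensive each lock operation is on a given platform.

```python
def pass_overhead(n, num_zones):
    """Bookkeeping for one full pass over n elements split into num_zones zones."""
    zone_size = -(-n // num_zones)       # ceiling division
    compares = n - 1                     # compare-exchanges in a full pass
    boundary_checks = compares           # one end-of-zone check per compare-exchange
    handoffs = (n - 1) // zone_size      # exits from one zone into the next
    return compares, boundary_checks, handoffs


for zones in (1, 4, 16, 64):
    print(zones, pass_overhead(1024, zones))
```

For 1,024 elements the number of boundary checks stays at 1,023 per pass however the data is divided, while the number of exit-and-enter handoffs grows from 0 with one zone to 63 with 64 zones.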

Since I've rambled on and on about waves and Shakespearean characters and shelves of books on submarines, I've run out of room. I'll present my version of the code based on the ideas above in the next post. You might take the time between now and then to see if you can develop a solution.
