*Victor Duvanenko is a long-time contributor to Dr. Dobb's. He can be contacted at [email protected]*

So far, sequential and parallel implementations of In-Place MSD N-bit-Radix Sort algorithm have been presented in Part 3 and Part 9. Both of these algorithms are linear-time** O( dn )** with competitive worst-case performance compared to STL sort and radix sorts in Intel's IPP library. In the first part of this article, the concepts behind In-place MSD Radix Sort will be illustrated with a concrete example. The sequential implementation will then be revisited, simplified, and explained in detail. Plain function and function template implementations will be analyzed. In the next installment, the parallel implementation will build on this foundation.

### Concepts

The main idea of Radix Sorting is to sort numbers not by comparing them to each other, but by using the digits that make up the numbers. Most-significant-digit (MSD) algorithm starts with the MSD and progresses toward the least-significant-digit (LSD) of each number, while the least-significant-digit (LSD) algorithm starts with the LSD and progresses toward the MSD. MSD algorithm uses divide-and-conquer recursive technique and can be in-place or stable, implemented in Part 3 and Part 4.

Figure 1 shows graphically the steps taken in the MSD In-Place Radix Sort, where the original source array is split into **R** bins, where **R** is the radix, based on the most-significant-digit of each array element. Then each bin is split into sub-bins, based on the next-significant-digit, and so on. The number of recursion levels is equal to the number of digits that make up each array element, plus the root of the tree (which is the original source array). For instance, if each array element consisted of 64-bit numbers and we used 8-bit digits (256-Radix), then eight recursion levels would be necessary to process all 64-bits, eight bits at each recursion level. The run time order of MSD Radix Sort is the number of recursion levels, with each level processing the entire source array **- O( dn )**, where **d** is the number of digits.

To illustrate the MSD algorithm let's take a small concrete example of an array that contains several 8-bit numbers, and use 4-bit digits (16-radix). This is convenient since each digit is a single hex letter, and every number is made of only two digits.

( FF 00 0F 50 31 19 11 E7 F3 30 )

The MSD of each number is **( F 0 0 5 3 1 1 E F 3 )**. Based on this MSD, the array is split into bins (up to 16 of them) — **( 00 0F ) ( 19 11 ) ( 31 30 ) ( 50 )( E7 )( FF F3 )**. If a resulting bin has more than one element, it is recursively sorted based on the next digit. Figure 2 shows a concrete example as a recursion tree.

Note that bin **0** is created in front of bin **1**, followed by bin **2**, bin **3**, and so on, until bin **E**, and bin **F**. By placing all of the elements within bin **0** in front of elements of bin **1**, radix sort accomplishes its job.

Another way of visualizing this process is to view the source array as a single bin that gets split into **R** bins at the first level, and then each of those bins gets split into **R** bins recursively, and so on.

Figure 3 shows the concrete example, which starts out as a single bin, shown in the top row as a single bin. This bin gets split into sub-bins **0**, **1**, **3**, **5**, **E** and F, based on the most-significant-digit of each array element (highlighted in bold in the figure), shown in the second row. Elements in the second row are sorted, if we look only at the most-significant-digits of each element (the digits in bold in the figure), and ignore all other digits. Each sub-bin that has more than one element is split/sorted further based on the second most-significant-digit, shown in the bottom row. Elements in the third row are sorted, if we look only at the top two most-significant-digits. Since the elements contain only two digits, the resulting array is sorted.

Note that Radix Sort algorithm does not perform comparison between array elements. Instead, it extracts digits from each element and moves elements into separate bins based on these digits. Bins are arranged in order, with smaller ones in front of the larger ones. This divide-and-conquer method is continues recursively until all digits have been used for sorting. The divide step works efficiently by splitting the input into radix number of bins at each level of the recursion tree.

Figure 4 demonstrates the steps during the first level of recursion. At the top the original source array of ten elements is shown. On the first step, the first element is examined (**FF**). Its MSD is extracted, which is **F** (shown in bold in the figure), indicating that the element belongs in bin **F**. This element is swapped with the first element of bin **F**. Thus, in step 1, element **FF** and **F3** are swapped. Element **F3** is now the first element of the array and element **FF** is in bin **F**. The empty slots in step 1 and subsequent steps indicate the array elements that are in their original positions and have not been touched.

On the second step, element **F3** is processed, extracting its MSD, which is also **F**. This element is swapped with the next element of bin **F**, which has a value of **30**. Thus, element **30** is now the first element of the array and element **F3** is in bin **F**. On the third, fourth and fifth steps, the algorithm continues to figure out which bin to move the current element to, and swaps with the element at that location.

On the sixth step, the current element is **0F** needs to move to bin **0**, which is the bin that it's currently in. Thus, element **0F** does not need to move. Instead, the algorithm advances to the second array element and processes from that location. On step seven, element **00** is already in bin **0**. At this point, bin **0** is full, and the first element of bin **1** is skipped over, making the forth array element the one being processed (value **50**). Element **50** is moved to bin **5** by swapping it with element **11**, on step 8. On step 9, element **11** is processed and kept in place since it belongs in bin **1**, which is where it already is. Since bin **1** is now full, the search for the next element to process skips over bin **3** and bin **5**, ending in bin **E** Bin **E** has one element that belongs in bin **E**, and thus stays where it already is. The search for the next element to process ends up going past bin **F** and beyond the end of the array, at which point the algorithm is done processing the first level of recursion.

Note that after the first level of recursion array elements are in sorted order if we only look at the most-significant-digit of each element. The next level of recursion will take each bin and sort recursively within the bin based on the next most-significant-digit. This process continues via divide-and-conquer until sub-bins have one or fewer elements or all digits have been used for sorting — i.e. the least-significant-digit has been used for sorting.

On every step of the algorithm, an element is placed into the bin where it belongs. In other words, input array with ten elements takes ten steps to sort based on MSD. This is a key behavior that is part of the linear-time performance of MSD Radix Sort.