Channels ▼
RSS

Parallel

Parallel In-Place Radix Sort Simplified


Plain Function

Listing 2 shows the plain function implementation. It was inspired by the algorithms and coding style of Professor Sedgewick's "Algorithms in C++" book series. This implementation has only a few differences with the function template version. Recursive calls have been moved inside the recursive function, and are no longer in the setup function. Two of the three template parameters have been converted to constants, and the third has become an added function parameter. From users point of view the function template and plain functions have an identical interface (for sorting arrays of "unsigned long" 32-bit elements). However, the function template will also work for other native unsigned numeric data types, such as 8, 16, 32, 64, and so on. To support other data types the plain function would need to be replicated for each data type.

// Copyright(c), Victor J. Duvanenko, 2010
// Plain function In-place MSD Radix Sort implementation (simplified).
const unsigned long PowerOfTwoRadix       = 256;	// constants in C++ are of static linkage - i.e. seen only by this source file (nice in this case!)
const unsigned long Log2ofPowerOfTwoRadix =   8;

inline void _RadixSort_Unsigned_PowerOf2Radix_Simple( unsigned long* a, long last, unsigned long bitMask, unsigned long shiftRightAmount, long Threshold )
{
	if ( last < Threshold )   { insertionSortSimilarToSTLnoSelfAssignment( a, last + 1 );   return; }

	unsigned long count[ PowerOfTwoRadix ];
	for( unsigned long i = 0; i < PowerOfTwoRadix; i++ )     count[ i ] = 0;
	for ( long _current = 0; _current <= last; _current++ )  count[ (unsigned long)(( a[ _current ] & bitMask ) >> shiftRightAmount ) ]++;	// Scan the array and count the number of times each value appears

	long startOfBin[ PowerOfTwoRadix + 1 ], endOfBin[ PowerOfTwoRadix ], nextBin = 1;
	startOfBin[ 0 ] = endOfBin[ 0 ] = 0;    startOfBin[ PowerOfTwoRadix ] = -1;			// sentinal
	for( unsigned long i = 1; i < PowerOfTwoRadix; i++ )
		startOfBin[ i ] = endOfBin[ i ] = startOfBin[ i - 1 ] + count[ i - 1 ];

	for ( long _current = 0; _current <= last; )
	{
		unsigned long digit;
		unsigned long current_element = a[ _current ];	// get the compiler to recognize that a register can be used for the loop instead of a[_current] memory location
		while( endOfBin[ digit = (unsigned long)(( current_element & bitMask ) >> shiftRightAmount )] != _current )  _swap( current_element, a[ endOfBin[ digit ]++ ] );
		a[ _current ] = current_element;

		endOfBin[ digit ]++;
		while( endOfBin[ nextBin - 1 ] == startOfBin[ nextBin ] )  nextBin++;	// skip over empty and full bins, when the end of the current bin reaches the start of the next bin
		_current = endOfBin[ nextBin - 1 ];
	}
	bitMask >>= Log2ofPowerOfTwoRadix;
	if ( bitMask != 0 )						// end recursion when all the bits have been processes
	{
		if ( shiftRightAmount >= Log2ofPowerOfTwoRadix )	shiftRightAmount -= Log2ofPowerOfTwoRadix;
		else												shiftRightAmount  = 0;

		for( unsigned long i = 0; i < PowerOfTwoRadix; i++ )
			_RadixSort_Unsigned_PowerOf2Radix_Simple( &a[ startOfBin[ i ]], endOfBin[ i ] - startOfBin[ i ] - 1, bitMask, shiftRightAmount, Threshold );
	}
}
inline void RadixSortInPlace_PowerOf2Radix_Simple( unsigned long* a, unsigned long a_size )
{
	unsigned long shiftRightAmount = sizeof( unsigned long ) * 8 - Log2ofPowerOfTwoRadix;						// Create bit-mask and shift right amount
	unsigned long bitMask = (unsigned long)( ((unsigned long)( PowerOfTwoRadix - 1 )) << shiftRightAmount );	// bitMask controls/selects how many and which bits we process at a time
	const long Threshold  = 25;
	_RadixSort_Unsigned_PowerOf2Radix_Simple( a, a_size - 1, bitMask, shiftRightAmount, Threshold );
}

Listing 2

Using constants was necessary, since local arrays (count[], startOfBin[], and endOfBin[]) are allocated based on the value of PowerOfTwoRadix. Some C++ compilers do not support allocation of local arrays (on the stack) based on an array size value passed as an argument to a function, as shown below:


void foo( const unsigned long array_size )
{
	unsigned long my_array[ array_size ];	// Compile ERROR!
	...
}

Some versions of gcc support this coding style natively. Also, an alloca() function could be used instead of constants, as using constants is constraining (and in the case of radix, needs to be varied).

References

Algorithm Performance

Parallel In-Place N-bit-Radix Sort

Quality of Random Numbers versus Performance

In-Place Hybrid Binary-Radix Sort

Optimizing Performance of Sorting Algorithms (Insertion and Selection Sort)

In-Place Hybrid N-bit-Radix Sort

Counting Sort Performance

Parallel Counting Sort

Parallel Counting Sort, Part 2

Stable Hybrid MSD N-bit-Radix Sort


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video