Channels ▼
RSS

Parallel

Algorithm Improvement through Performance Measurement: Part 4


Improved Template Function

So far, a simple template function has been used, which abstracted/generalized the data type only. Listing 3 abstracts the Radix, log2( Radix ), and Threshold, as well as the data type.


// Copyright(c), Victor J. Duvanenko, 2009

// Listing 3
template< class _Type >
inline unsigned long extractDigit( _Type a, _Type bitMask, unsigned long shiftRightAmount )
{
	unsigned long digit = (unsigned long)(( a & bitMask ) >> shiftRightAmount );
	// extract the digit we are sorting based on return digit;
}
template< unsigned long PowerOfTwoRadix, unsigned long Log2ofPowerOfTwoRadix, long Threshold, class _Type >
inline void _RadixSort_StableUnsigned_PowerOf2Radix_2( _Type* a, _Type* b, long last, _Type bitMask, unsigned long shiftRightAmount, bool inputArrayIsDestination )
{
	const unsigned long numberOfBins = PowerOfTwoRadix;
	unsigned long count[ numberOfBins ];

	for( unsigned long i = 0; i < numberOfBins; i++ )
		count[ i ] = 0;
	for ( long _current = 0; _current <= last; _current++ )
  	// Scan the array and count the number of times each value appears
		count[ extractDigit( a[ _current ], bitMask, shiftRightAmount ) ]++;

	long startOfBin[ numberOfBins ], endOfBin[ numberOfBins ];
	startOfBin[ 0 ] = endOfBin[ 0 ] = 0;
	for( unsigned long i = 1; i < numberOfBins; i++ )
		startOfBin[ i ] = endOfBin[ i ] = startOfBin[ i - 1 ] + count[ i - 1 ];
	for ( long _current = 0; _current <= last; _current++ )
	b[ endOfBin[ extractDigit( a[ _current ], bitMask, shiftRightAmount ) ]++ ] = a[ _current ];

	bitMask >>= Log2ofPowerOfTwoRadix;
	if ( bitMask != 0 )    // end recursion when all the bits have been processes
	{
	if ( shiftRightAmount >= Log2ofPowerOfTwoRadix ) 
                                               shiftRightAmount -= Log2ofPowerOfTwoRadix;
		else												shiftRightAmount  = 0;
		inputArrayIsDestination = !inputArrayIsDestination;
		for( unsigned long i = 0; i < numberOfBins; i++ )
		{
			long numOfElements = endOfBin[ i ] - startOfBin[ i ];
			if ( numOfElements >= Threshold )
				_RadixSort_StableUnsigned_PowerOf2Radix_2<
                                                             PowerOfTwoRadix, Log2ofPowerOfTwoRadix, 
                                                               Threshold >( &b[ startOfBin[ i ]], &a[ startOfBin[ i ]],
                                                                   numOfElements - 1, bitMask, 
                                                                        shiftRightAmount, inputArrayIsDestination );
			else {
		insertionSortSimilarToSTLnoSelfAssignment( &b[ startOfBin[ i ]], numOfElements );
				if ( inputArrayIsDestination )
					for ( long j = startOfBin[ i ]; j < endOfBin[ i ]; j++ )
                                                                               // copy from external array back into the input array
						a[ j ] = b[ j ];
			}
		}
	}
	else {	// Done with recursion copy all of the bins
		if ( !inputArrayIsDestination )
			for ( long _current = 0; _current <= last; _current++ )
                                                   // copy from external array back into the input array
				a[ _current ] = b[ _current ];
	}
}
template< class _Type >
inline void RadixSortMSDStablePowerOf2Radix_unsigned( _Type* a, _Type* b, unsigned long a_size )
{
	const unsigned long Threshold =  100;  // Threshold of when to switch to using Insertion Sort
	const unsigned long PowerOfTwoRadix       =   256;
	const unsigned long Log2ofPowerOfTwoRadix =     8;
	
	unsigned long shiftRightAmount = sizeof( _Type ) * 8 - Log2ofPowerOfTwoRadix;
	_Type bitMask = (_Type)( ((_Type)( PowerOfTwoRadix - 1 )) << shiftRightAmount );	// bitMask controls/selects how many and which bits we process at a time

	if ( a_size >= Threshold ) _RadixSort_StableUnsigned_PowerOf2Radix_2<
                          PowerOfTwoRadix, Log2ofPowerOfTwoRadix, 
                                Threshold >( a, b, a_size - 1, bitMask, shiftRightAmount, false );
	else						insertionSortSimilarToSTLnoSelfAssignment( a, a_size );
}

Listing 3

C++ provides a powerful template function capability, which allow not only function arguments to be passed to a template function (on the stack), but also generic values and data types [4]. The function arguments that are inside the parentheses () will be pushed on the stack, but generic parameters inside the pair of <> are not pushed on the stack. Sadly, the function arguments are allowed to have default values, where as generic parameters are not [4].

These generic values are not passed on the stack and thus don’t slow recursive function down. They can also be "true constants" to allow arrays to be allocated locally, and not slow things down by dynamic allocations. For example, the generic value PowerOfTwoRadix is used to allocate the count[] array of the Radix size locally (on the stack). Log2ofPowerOfTwoRadix and Threshold are used as "true constants" that are not passed on the stack. Note that _Type generic parameter is last, which allows it to not be specified, as is seen in the recursive function call to _RadixSort_StableUnsigned_PowerOf2Radix_2<>().

This implementation can also handle an arbitrary power-of-two radix -- i.e. any number of bits.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video