Algorithm Improvement through Performance Measurement: Part 3

Signed

Signed integers use 2's complement format shown below, with the binary representation on the left and its decimal value on the right:

01111111b 127
01111110b 126
01111101b 125
….
00000001b 1
00000000b 0
11111111b -1
11111110b -2
11111101b -3

10000011b -125
10000010b -126
10000001b -127
10000000b -128

Note that the most-significant-bit (MSB) serves as the sign bit. However, its meaning is opposite of what is needed for sorting (see Algorithm Improvement through Performance Measurement: Part 2_ and this bit either needs to be inverted or processed using opposite sense. Listing 4 shows the implementation for signed number, from 8-bit to 64-bit.

```
// Listing 4
// Treats the first digit specially, but the rest of the digits as unsigned.
template< class _Type, unsigned long PowerOfTwoRadix, unsigned long Log2ofPowerOfTwoRadix, long Threshold >
inline void _RadixSort_Signed_PowerOf2Radix_1( _Type* a, long last, _Type bitMask, unsigned long shiftRightAmount )
{
const unsigned long  numberOfBins = PowerOfTwoRadix;
unsigned long count[ numberOfBins ];
for( unsigned long i = 0; i < numberOfBins; i++ )
count[ i ] = 0;
for ( long _current = 0; _current <= last; _current++ )  // Scan the array and count the number of times each value appears
{
unsigned long digit = (unsigned long)logicalRightShift_ru( (_Type)( a[ _current ] & bitMask ),shiftRightAmount );   // extract the digit we are sorting based on
digit ^= ( PowerOfTwoRadix >> 1 );
count[ digit ]++;
}
long startOfBin[ numberOfBins ], endOfBin[ numberOfBins ], nextBin;
startOfBin[ 0 ] = endOfBin[ 0 ] = nextBin = 0;
for( unsigned long i = 1; i < numberOfBins; i++ )
startOfBin[ i ] = endOfBin[ i ] = startOfBin[ i - 1 ] + count[ i - 1 ];
for ( long _current = 0; _current <= last; )
{
unsigned long digit;
_Type tmp = a[ _current ];  // get the compiler to recognize that a register can be used for the loop instead of a[_first] memory location
while ( true )
{
digit = (unsigned long)logicalRightShift_ru( (_Type)( tmp & bitMask ), shiftRightAmount );
// extract the digit we are sorting based on
digit ^= ( PowerOfTwoRadix >> 1 );
if ( endOfBin[ digit ] == _current )
break;
_swap( tmp, a[ endOfBin[ digit ]] );
endOfBin[ digit ]++;
}
a[ _current ] = tmp;
endOfBin[ digit ]++;  // leave the element at its location and grow the bin
_current++;              // advance the current pointer to the next element
while( _current >= startOfBin[ nextBin ] && nextBin < numberOfBins )
nextBin++;
while( endOfBin[ nextBin - 1 ] == startOfBin[ nextBin ] && nextBin < numberOfBins )
nextBin++;
if ( _current < endOfBin[ nextBin - 1 ] )
_current = endOfBin[ nextBin - 1 ];
}
if ( bitMask != 0 )  // end recursion when all the bits have been processes
{
if ( shiftRightAmount >= Log2ofPowerOfTwoRadix )	shiftRightAmount -= Log2ofPowerOfTwoRadix;
else	shiftRightAmount  = 0;
for( unsigned long i = 0; i < numberOfBins; i++ )
{
long numberOfElements = endOfBin[ i ] - startOfBin[ i ];
if ( numberOfElements >= Threshold )  // currentOffset actually points to one beyond the bin
else if ( numberOfElements >= 2 )
insertionSortSimilarToSTLnoSelfAssignment( &a[ startOfBin[ i ]], numberOfElements );
}
}
}

```
Listing4

Two special steps are taken to treat signed numbers differently than unsigned, as shown here:

```
digit = (unsigned long)logicalRightShift_ru( (_Type)( tmp & bitMask ), shiftRightAmount );
digit ^= ( PowerOfTwoRadix >> 1 );

```

The first step performs a right shift of a signed number as if it was unsigned. The second step inverts the most-significant-bit. The rest of the code is identical to the initial iteration of unsigned implementation, except for another logical right shift of the bitMask. The rest of the recursive iterations are performed using the unsigned algorithm.

In Tables 9 through 12, measurements show a performance degradation of a few percent due to the need to invert the MSB of the first digit. This overhead gets smaller as the element size increases. The rest of the performance characteristics match those of the unsigned implementations. Note that Intel IPP does not implement signed 8-bit or 64-bit integer versions of sorting.

Table 9: Random 8-bit Signed Elements

Table 10: Random 16-bit Signed Elements

Table 11: Random 32-bit Signed Elements

Table 12: Random 64-bit Signed Elements

More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.