# Parallel In-Place Merge

### In-Place

The main idea for in-place merge, illustrated in Figure 2, is based on a modification to the not-in-place merge. Instead of moving a single element X at location q1 to location q3 at each recursive step, all elements between q1 and q2 are moved using In-Place Block Swap (Block Exchange). Let's go through this in detail.

Figure 2 An in-place merge.

The top row of Figure 2 shows the input array of elements and three indexes: l, m, and r. The left segment, which is between indexes l and m, inclusive, is pre-sorted. The right segment, which is between indexes m+1 and r, inclusive, is also pre-sorted. Merge algorithm uses these two segments to produce a single sorted result. In-place merge places the result back in the input array between indexes l and r (the bottom row of Figure 1), while using a small constant amount of extra memory. Note that the two input segments touch, whereas the not-in-place merge algorithm had array elements in between the two input segments.

The first step of this divide-and-conquer algorithm is to take the larger of the two input segments, divide it in half to determine the index q1 and its value X. Then a binary search is used within the other segment to determine the index q2, at which all elements to the left are smaller than X, and all elements to the right are larger or equal to X. Thus, each of the two segments have been split into two halves, one less than X and the other greater than or equal to X. These two steps are the same as in not-in-place merge algorithm. The algorithm preserves stability because only the elements that are strictly less than X, but were to the right of X are moved to the left of X.

The next step is to in-place swap the segment q1 to m, inclusive, and the segment m+1 to q2 inclusive, as shown by the diagonal arrows in Figure 2. This swap would be simple if the segments were of equal size, but this occurs rarely. In-Place Block Swap comes to the rescue, because it's capable of swapping two segments of unequal size, which are touching, in-place. Three algorithms are described in In-Place Block Swap, and the one with the highest performance along with the best parallel scaling needs to be chosen.

The algorithm proceeds recursively by processing (l, q1-1, q3-1) segment and (q3+1, q2-1, r) segments of the bottom row of Figure 1. Note that the array element X at q3 is in its final location and needs no further processing.

### Sequential

Listing One shows a sequential divide-and-conquer In-Place Merge algorithm implementation. Lengths of the left and the right input array segments are computed first, determining which segment is larger. When the left (length1) segment is larger, the index of the mid-element is computed (q1). Binary search is used to split the smaller segment (m+1 to r, with length2) using the value of q1, into elements that are strictly smaller than the element at q1 (X in Figure 1). Being strictly less than X at q1, preserves stability. This binary search produces q2, which is the index of the split of the smaller segment (left segment, length2).

```template< class _Type >
inline int my_binary_search( _Type value, const _Type* a, int left, int right )
{
long low  = left;
long high = __max( left, right + 1 );
while( low < high )
{
long mid = ( low + high ) / 2;
if ( value <= a[ mid ] )	high = mid;
else						low  = mid + 1;	// because we compared to a[mid] and the value was larger than a[mid].
// Thus, the next array element to the right from mid is the next possible
// candidate for low, and a[mid] can not possibly be that candidate.
}
return high;
}
// Swaps two sequential sub-arrays ranges a[ l .. m ] and a[ m + 1 .. r ]
template< class _Type >
inline void block_exchange_mirror( _Type* a, int l, int m, int r )
{
mirror_ptr( a, l,     m );
mirror_ptr( a, m + 1, r );
mirror_ptr( a, l,     r );
}
// Merge two ranges of source array T[ l .. m, m+1 .. r ] in-place.
// Based on not-in-place algorithm in 3rd ed. of "Introduction to Algorithms" p. 798-802
template< class _Type >
inline void merge_in_place_L1( _Type* t, int l, int m, int r )
{
int length1 = m - l + 1;
int length2 = r - m;
if ( length1 >= length2 )
{
if ( length2 <= 0 )	return;							// if the smaller segment has zero elements, then nothing to merge.
int q1 = ( l + m ) / 2;								// q1 is mid-point of the larger segment. length1 >= length2 > 0
int q2 = my_binary_search( t[ q1 ], t, m + 1, r );	// q2 is q1 partitioning element within the smaller sub-array (and q2 itself is part of the sub-array that does not move)
int q3 = q1 + ( q2 - m - 1 );
block_exchange_mirror( t, q1, m, q2 - 1 );
merge_in_place_L1( t, l,      q1 - 1, q3 - 1 );		// note that q3 is now in its final place and no longer participates in further processing
merge_in_place_L1( t, q3 + 1, q2 - 1, r      );
}
else {	// length1 < length2
if ( length1 <= 0 )	return;							// if the smaller segment has zero elements, then nothing to merge
int q1 = ( m + 1 + r ) / 2;							// q1 is mid-point of the larger segment.  length2 > length1 > 0
int q2 = my_binary_search( t[ q1 ], t, l, m );		// q2 is q1 partitioning element within the smaller sub-array (and q2 itself is part of the sub-array that does not move)
int q3 = q2 + ( q1 - m - 1 );
block_exchange_mirror( t, q2, m, q1 );
merge_in_place_L1( t, l,      q2 - 1, q3 - 1 );		// note that q3 is now in its final place and no longer participates in further processing
merge_in_place_L1( t, q3 + 1, q1,     r      );
}
}
```

Listing One

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.

# First C Compiler Now on Github

The earliest known C compiler by the legendary Dennis Ritchie has been published on the repository.

# HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

HTML5 Mobile Development: Seven Good Ideas (and Three Bad Ones)

# Building Bare Metal ARM Systems with GNU

All you need to know to get up and running... and programming on ARM

# Amazon's Vogels Challenges IT: Rethink App Dev

Amazon Web Services CTO says promised land of cloud computing requires a new generation of applications that follow different principles.

# How to Select a PaaS Partner

Eventually, the vast majority of Web applications will run on a platform-as-a-service, or PaaS, vendor's infrastructure. To help sort out the options, we sent out a matrix with more than 70 decision points to a variety of PaaS providers.

More "Best of the Web" >>