The Ordersort Algorithm

Ordersort is a list-sorting algorithm that's simple to implement in C++, and that sorts lists faster than the Quicksort and Bubblesort algorithms.


September 01, 2005
URL:http://www.drdobbs.com/the-ordersort-algorithm/184402000

September, 2005: The Ordersort Algorithm

Matthew Aitkenhead is a postdoctoral researcher at the Macaulay Institute, in Aberdeen, Scotland, working on remote sensing interpretation and environmental modelling. Mark Richards is a Ph.D. student at the Department of Plant & Soil Science at the University of Aberdeen, Scotland. They can be contacted at [email protected].


Sorting lists of values or character strings is an important aspect of many computing problems. Many list-sorting algorithms exist, varying widely in speed, ease of implementation, and utility. Several reviews, including Shene's [1], provide a comparison of list-sorting algorithms, and several authors emphasize that the relative ability of algorithms depends upon the degree of sorting that already exists within the list they are applied to. The number of different algorithms and their relative merits can lead to confusion about which is most applicable in a given situation. Knuth [2] provides a compilation of known list-sorting algorithms and places their development in a historical perspective, while Niemann [3] gives a list of common algorithms alongside source code, enabling the implementation of these algorithms. An additional useful source of information is Martin [4], who presents a detailed survey of list-sorting algorithms over a period of 20 years, giving details of 37 sorting algorithms in all.

However, not all of the list-sorting algorithms in use are as historic. Nyberg et al. [5] demonstrated a novel record-breaking list-sorting algorithm called "Alphasort," while discussions of improvement and fine-tuning of various algorithms to produce ever-better performance is ongoing in many Internet fora. Bentley and Sedgewick [6] demonstrated a searching and sorting algorithm using ideas dating back to the 1960s, but that had never been implemented together in one algorithm.

The bench testing of list-sorting algorithms provides an important indicator of performance. The design of these bench-testing experiments has important effects on the measurements taken, as described in Alviar [7]. Various benchmark indicators exist for list-sorting algorithms, with the most common being the time taken to sort lists of specific size. A popular alternative is to measure how much sorting can be carried out in a set time period (for instance, Puschner and Burns [8]).

In this article, we present a novel list-sorting algorithm named "Ordersort" that is simple to implement and that sorts lists at a rate superior to both the Quicksort and Bubblesort algorithms for lists of less than a certain length, as shown by testing on lists of different lengths and with different degrees of randomization. We implement Ordersort in C++ and explain how the algorithm operates.

Methods and Results

The Ordersort algorithm operates in two stages:

  1. Determining the order in which the entries should be arranged.
  2. Sorting the list according to this order.

The order in which entries are to be arranged is determined by locating, for a particular entry, the entries that come immediately before and immediately after it. The algorithm moves sequentially from one entry to the next in the unsorted list, locating the previous and next entries in the sorted list, and labelling each with the address of the entry that follows it. The location of the previous entry is determined by a two-tier looping process that first loops through the list in steps of size N1/2, where N is the list length, attempting to find an entry that is only slightly lower in the list order than the entry of interest, and then loops forward in single steps from this point to find the entry immediately below the entry of interest. The pseudocode in Figure 1 details the Ordersort algorithm's processes.

All processes were carried out using programs written in C++, running on a Windows XP/Pentium 4 machine with a 2.59-GHz processor and 512 MB of RAM. Figure 2 presents the sort times using this system for three list-sorting algorithm implementations, including one O(N2) algorithm (Bubblesort), one O(NlogN) algorithm (Quicksort), and the Ordersort algorithm, with randomized integer lists of different lengths. The values in the lists used lie between 0 and the list length.

In Figure 3, sorting times for the same three algorithms at different list lengths are given, with the lists being partially sorted. In this case, partially sorting of the lists is carried out by allowing the ith entry to occupy the range [i-z, i+z], where z is equal to some factor w multiplied by the length of the list. For Figure 3, the value of w is set to 0.1.

Figure 4 shows the sorting times for the same three algorithms, using lists that are very sorted (the value of the aforementioned w is set to 0.001). Comparison of Figures 2, 3, and 4 shows that the three list-sorting algorithms all show improvements in list-sorting speed with more sorted lists, with the Bubblesort algorithm showing the most dramatic improvement. The Ordersort algorithm is faster than Quicksort up to a list length of approximately 3000 for the unsorted and partially sorted lists, and for very sorted lists, Bubblesort is superior to the other two up to list lengths of approximately 25,000.

Examination of the results of list sorting and of the methodology demonstrates that the algorithm provides a stable sort method (that is, it preserves the current sort order for equal keys). Examination of the method itself indicates that the algorithm provides a sorting time of O(N3/2) for the worst-case scenario, placing it between the categories occupied by algorithms such as Bubblesort and those such as Quicksort. In practice, for a completely randomized list, the effective sort time is approximately O(N1.41). Listing 1 is the C++ source-code listing for the Ordersort algorithm.

Discussion

We've presented and implemented a compact, simple, and rapid method of sorting lists, which compares favorably with an implementations of common O(NlogN) and O(N2) (Quicksort and Bubblesort) sorting algorithms using random and partially sorted lists. This method requires two arrays, each the size of the list being sorted to operate, but apart from this memory requirement, does not place any special demands computationally. It operates at comparable speeds to the Quicksort algorithm when applied to sorted and unsorted lists below a certain list-length threshold. The Ordersort algorithm has the capability of being applied to lists of strings as well as real number and integer lists, and can be used in lists where there are records with the same values whose order must be preserved. The method is not groundbreaking in its originality, and is similar to the insertion sort method with the additional use of a linked-list implemented in two arrays. However, the combination of the linked lists and the search technique used to find the insertion point for the next item does provide an original method.

Acknowledgments

Thanks to Dr. Jim McLeod for his advice and assistance during the writing of this article.

References

  1. [1] Shene, C.-K. "A Comparative Study of Linked List-sorting Algorithms," http:// portal.acm.org/portal.cfm.
  2. [2] Knuth, D.E. The Art of Computer Programming, Vol. 3: Sorting and Searching, Second Edition, Addison-Wesley, 1998.
  3. [3] Niemann, T. "Sorting and Searching Algorithms: A Cookbook," http://ciips.ee.uwa .edu.au/~morris/Year2/PLDS210/niemann/ s_man.htm.
  4. [4] Martin, W.A. "Sorting." ACM Computing Surveys, Volume 3 (4), 1971.
  5. [5] Nyberg, C., T. Barclay, Z. Cvetanovic, J. Gray, and D. Lomet. "AlphaSort: A Cache-sensitive Parallel External Sort," The International Journal on Very Large Data Bases, Vol 4(4), 1995.
  6. [6] Bentley, J.L. and R. Sedgewick. "Fast Algorithms for Sorting and Searching Strings," In Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, 1997.
  7. [7] Alviar, S.M. "Design, Measurement and Analysis of Performance Experiment on Selected Sorting Algorithms," Proceedings of Computer Measurement Group (CMG) CMG XV, International Conference, 1984.
  8. [8] Puschner, P. and A. Burns. "Time-constrained Sorting: A Comparison of Different Sorting Algorithms," Proceedings of the 11th Euromicro International Conference on Real-Time Systems, 1999. q

September, 2005: The Ordersort Algorithm

Figure 1: Pseudocode.

Take unsorted list A of length N
Loop through A from start with step size 1
    Identify if value X is lowest or highest so far
    If not lowest or highest entry so far then
        Loop through A from start with step size N1/2
          Identify if value Y is highest value still lower than X
        End loop
        Loop through A from Y with step size 1
            Find address in list B of higher value Z
            Identify if Z is lower than X
            Set Z to highest value found lower than X
        End loop
        Set address in of Z in list B to point to X
    End if
Finish loop
Loop through list A
    Order A into C using values in B
Finish loop

September, 2005: The Ordersort Algorithm

Figure 2: Sort times using different algorithms and list lengths for random lists.

September, 2005: The Ordersort Algorithm

Figure 3: Sort times using different algorithms and list lengths for partially sorted lists.

September, 2005: The Ordersort Algorithm

Figure 4: Sort times using different algorithms and list lengths for very sorted lists.

September, 2005: The Ordersort Algorithm

Listing 1

#include "stdafx.h"
#include "stdlib.h"
#include "time.h"
#include "math.h"

int main(int argc, char* argv[])
{
  // Variable declarations
  const int ListLength = 100;
  int a[ListLength], b[ListLength], c[ListLength];
  int x, y, HighestSeen, LowestSeen, EntryValue, EntryStep;
  int HighestBelow, HighestBelowPos, CurrentPosition, Buffer1;
  int Buffer2, HighestEntry, LowestEntry;
  bool IsLower, IsHigher, IsNotGreater;

  // The list would be read or otherwise accessed at this point

  // Sort list
  HighestEntry = 0;
  LowestEntry = ListLength + 1;
  HighestSeen = 0;
  LowestSeen = 0;

  for (x=0; x < ListLength; x++)
  {
    EntryValue = a[x];
    IsLower = false;
    IsHigher = false;
    
    // Is it the lowest value so far?
    if (EntryValue < LowestEntry)
    {
      IsLower = true;
      b[x] = LowestSeen;
      LowestEntry = EntryValue;
      LowestSeen = x;
    }
    // Is it the highest value so far?
    if (EntryValue >= HighestEntry)
    {
      IsHigher = true;
      b[HighestSeen] = x;
      b[x] = 0;
      HighestEntry = EntryValue;
      HighestSeen = x;
    }
    // If not lowest or highest, find the next-lowest value
    if (!IsLower && !IsHigher)
    {
      // Jump through list to find any lower value
      EntryStep = 1 + (int)sqrt(x);
      HighestBelow = 0;
      for (y=0; y < x; y += EntryStep)
      {
        if (a[y] > HighestBelow && a[y] < EntryValue)
        {
          HighestBelow = a[y];
          HighestBelowPos = y;
        }
      }
      // If lower value not found, use lowest value
      if (HighestBelow == 0)
      {
        HighestBelow = LowestEntry;
        HighestBelowPos = LowestSeen;
      }
      // Loop through list to find highest lower value
      IsNotGreater = true;
      CurrentPosition = HighestBelowPos;
      while(IsNotGreater)
      {
        Buffer1 = CurrentPosition;
        CurrentPosition = b[CurrentPosition];
        if (a[CurrentPosition] > EntryValue)
        {
          IsNotGreater = false;
          Buffer2 = CurrentPosition;
        }
      }
      // Change list
      b[Buffer1] = x;
      b[x] = Buffer2;
  }
  }
  // Carry out sorting
  CurrentPosition = LowestSeen;
  for (x=0; x < ListLength; x++)
  {
    c[x] = a[CurrentPosition];
    CurrentPosition = b[CurrentPosition];
  }
  return 0;
}

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.