Channels ▼
RSS

JVM Languages

Sorting in Java & C#

Source Code Accompanies This Article. Download It Now.


January, 2006: Sorting in Java & C#

David teaches and writes about Java and C#, and can be contacted at dph@apk.net.


Sorting lists of built-in types or primitives in Java or C# is straightforward. Both languages offer sorting routines that work on in-memory lists of comparable items—comparable because each item in the list implements a "compare" interface expected by the list's sorting routine. In Java, each element to be sorted implements the java.lang.Comparable interface, and in C#, each element implements the System.Collections.IComparer interface. These interfaces allow types to define a natural ordering of instances of their class.

In Java, the java.util.Collections class can be used to sort lists, and in C#, the System.Collections.ArrayList class also offers a sort routine. These do their sorting by using the comparison routine defined in the java.lang.Comparable interface or the System.Collections.IComparer interface. Numerous built-in language types already implement this interface, so sorting types such as strings, numbers, and dates takes no extra effort beyond loading up the list and calling the sort method.

However, we often need to sort lists of complex domain-specific types such as Employee or Animal—not just lists of built-in strings. In this article, I use reflection to sort a list of complex types through a fully implemented example using Animals, leaving you with the chance to try it out by completing a partial implementation of Employees. Neither Animal nor Employee implements a comparator interface that would make them fit the sorting routines of either language. Users will want to sort a list of domain-specific objects on domain-specific fields. They'll want to sort a list of Employees by first name, or by last name, or by department, or by hire date, by contractor status, by manager, or by salary. Without trying very hard, that's seven possible ways to sort a list of Employees.

Both languages do offer a means for sorting user-supplied complex types by providing an additional sort method that takes both the list and a separate comparator that knows how to compare the elements in the list. Though the comparator is a separate type from the elements in the list, it is coded to know how to compare the elements in the list in some specific way. This separate, standalone comparator works the same way in both languages—it knows how to compare two like objects such that they can be ordered like this:

if (object1 < object2) return -1;
// a value less than 0
else if (object1 > object2) return 1;
// a value greater than 0
else return 0; // the objects sort equally

Obviously, the standalone comparator must know what field of the complex object is plucked from the list to sort on—Employee.salary or Employee.hireDate, and so on. For this to sort lists of Employees seven different ways, there would need to be seven java.util.Comparator implementations (or seven .NET System.Collections.IComparer implementations). The code would then need to pass to the sort method the list of Employees and the appropriate Comparator/IComparer implementation for the chosen field.

A time-tested paradigm to solve this kind of problem is to build an EmployeeComparator factory that knows how to supply all seven Comparators. You obtain the specific comparator for the chosen field by passing a key (such as "firstname") into the factory, and you get back a comparator implementation that knows how to compare Employee objects specifically by first name. This technique works well, and it is possible to lazily build comparator implementations only once and only as needed, store them in a map, and access them quickly by key from the map. Listing One is an incomplete implementation of the Java Employee class, and Listing Two is a partially completed comparator factory for it. (The complete code for sorting Animals with nested Classification references in both C# and Java is available electronically, see "Resource Center," page 4.)

The problem with an approach like this is that you have to code each key-comparator requirement into the factory, as indicated by the case blocks in the sample factory class. This makes it difficult to extend, and cumbersome for types such as Employee, which have more than a mere handful of sortable fields. As your application requirements grow to include associating a Project with each Employee, a new field called currentProject gets added to the Employee class. You then have the ability to display each employee's current project when you show a list of Employees. Because users will see this and want to sort by currentProject, you must go back to the factory and build another case block. Using reflection, you can have one reusable comparator, and therefore, avoid this maintenance.

Reflecting

Both Java and C# let you examine the details of each type that has been loaded into the runtime environment during code execution. The first time a type is loaded into the runtime environment of Java, for example, the environment essentially creates and stores a template of type Class that describes everything about the newly loaded type. When an instance of a particular type is asked for by the application, the template is used to render that instance, identifying all the fields, methods, properties, and constructors. Using reflection, all of these can be made available to applications at runtime.

In Java, the template object is the java.lang.Class object, and in C#, it is the System.Type object. Both can be used programmatically with reflection to look up methods by name and invoke them on specific objects. I access the appropriate field of the complex type using this ability of reflection to work by name. The C# Animal class in Listing Three is an easy domain-specific type to start with; Listing Four is its Java counterpart. They are nearly identical, except that the C# Animal defines age to be a true C# Property of the Animal, available by a "get" property accessor, while all the members of the Java Animal class are accessed by a method. There is a minor difference in the C# class involved if the reflected item is a property versus a method, and the code accounts for both.

Class Animal aggregates just three fields: a string animalType with values such as Monkey, Dog, or Bird; an int for the Animal's age; and another complex type called Classification that basically is used to house values such as mammal, reptile, amphibian, and so on. The Animal constructor wants all three values as arguments. When sorting Animals, the string animalType field and int age field clearly would be built-in types that already implement Comparator/IComparer. The Classification reference of an Animal by default is not sortable, but you can make it sortable, as will be shown later. The three steps involved in using reflection to access the animalType field of an Animal instance by method name are:

  1. Get the template for the given class. In C#, obtaining the System.Type template looks like this:
  2. Animal parakeet = 
       new Animal("Parakeet", 2, 
        new Classification("bird")); 
    Type t = parakeet.GetType();
    

  3. Ask the System.Type reference to vend the template for a specific method by name. This returns a reference to a System.Reflection.MethodInfo object that cannot be invoked by itself—it is not associated with the original parakeet object or with any Animal instance. It is just a template that knows what the properties of the named method are: public, returns a string, takes no arguments, and so on. Again, the C# example looks like:
    MethodInfo method = 
       t.GetMethod("GetAnimalType");
    

    System.Reflection.MethodInfo inherits from System.Reflection.MethodBase, which in turn, inherits from System.Reflection.MemberInfo—the base type for reflective behavior in .NET. In Java, the method reference is a java.lang.reflect.Method object that inherits from java.lang.reflect.AccessibleObject—the base class for reflecting Field, Method, and Constructor objects. By the way, Java throws a NoSuchMethodException if it can't find the named method, while C# returns a null MethodInfo reference if it can't find the method. This difference in the way the languages treat reflection has a small design impact on the standalone compare routine. The Java code can forego testing for null, but instead wraps the entire compare routine in a try-catch block.

  4. When it comes time to use this method template to invoke the method, the application needs to tell it which Animal instance it should be invoked on, and supply the method with the values of any arguments it requires. For this example, I assume the methods are simple getters or property accessors, and take no arguments. After obtaining the method template for the public string getAnimalType() method of the Animal class, you can invoke the method on the parakeet object via reflection in C# like this:
    object result = null;
    if(method != null)
    {
       // no args for getter method
       object[] params = null; 
       result = 
          method.Invoke(parakeet, params);
    }<br>
    

    Because this is the C# implementation, you have to first check for a null method, which is the only way you will know if the GetMethod request worked. Now you have the value of the animalType field (that you want to sort on) in the result that was returned from the invocation of the method.

Sorting with Reflection

Using the reflection abilities of both languages, you can bypass the factory altogether and reuse a comparator to sort anything that implements the interface required for the comparison. This is accomplished by providing a single, reusable, standalone System.Collections.IComparer (or java.lang.Comparable) implementation for all sorting of complex types. At runtime, the reusable comparator uses "reflection-by-name" to access from the complex type a specific field to be sorted on. The comparator then tests whether or not the reflected field itself implements the System.Collections.IComparer interface and therefore can be used to do comparison itself. In effect, it says: Give me by reflection the animal's animalType, and if the reflected animalType string implements IComparer, then let it handle the sorting.

For the remainder of this article, I concentrate on the C# example. The techniques and outcomes are similar for both languages. The starting point for sorting lists of complex types is with the sort method of the Compare.Comparator class:

public static void sort(ArrayList list,
string methodName, string dir)

When you invoke this method, behind the scenes it uses the standalone, reusable IComparer, so you don't need to be aware of how to bring the comparator into play, only of what arguments to pass to the sort method. The sort method sorts the list (of Animals) in argument one by comparing the results of reflection used to invoke the method named in argument two (for example, "GetAnimalType" ). Finally, it renders the results in ascending or descending order according to the direction supplied in the last argument. Invoke the sort routine as in CompareTest.cs or CompareTest (both available electronically), a snippet that does the sort on animalType follows. Recall, the Animal constructor wants an animalType, an age, and a Classification:

ArrayList aList = new ArrayList();
aList.Add(new Animal
("Cat", 8, new Classification("mammal")));
aList.Add(new Animal
("Stork", 5, new Classification("bird")));
...
Comparator.sort
(aList, "GetAnimalType", "asc");

The Comparator.sort routine is short:

  1. Acquiring the static-level lock on the Comparator's System.Type object to enforce thread safety. There is only going to be one of these locks because there is only one instance of the System.Type template per class loaded into the runtime environment. This ensures that only one thread can gain access to the sorting code, which is important because, as a static method, numerous threads could be vying for it and you would want to be sure that each sort invocation is complete before another invocation has a chance to access the comparator Singleton.
  2. Setting the method name in the Singleton that reflection should use.
  3. Setting the sort direction in the Singleton.
  4. Invoking Sort on the list and passing in the Singleton as the IComparer to use.
  5. lock(typeof(Compare.Comparator))
    {
    Compare.Comparator.
        comparatorInstance.MethodName = 
             methodName;
    Compare.Comparator.
        comparatorInstance.Direction = 
            dir; 
                list.Sort(Compare.Comparator.
                           comparatorInstance);
    }>
    

    The Singleton Compare.Comparator.comparatorInstance is the reusable, standalone comparer, and it is an instance of the ListComparator : IComparer class. As such, it must define public int Compare(object obj1, object obj2). This method, of course, is where the guts of the reusable comparison are. This method follows the steps already outlined for using reflection. It attempts to obtain the System.Reflection.MethodInfo object named by the methodName parameter. If it can't get this, it thinks the methodName names a C# property instead, and tries to obtain the System.Reflection.PropertyInfo object by the same methodName.

    With either a MethodInfo or a PropertyInfo in hand, it uses reflection to invoke the named getter, thereby accessing the fields of the two Animal instances from the list that need to be compared. The results of the invocation-by-reflection are stored in object references called result1 and result2, respectively. If the method named was, for example, "GetAnimalType" then result1 and result2 hold strings such as "Monkey" or "Alligator". After doing the reflection, the code tests to see if it can forward the actual comparison on to result1 and result2 by using the as operator to cast both results to IComparable.

This is the beauty of object-oriented code at work. Any type that implements IComparable can be compared using CompareTo. Because most built-in types in C# implement the IComparable interface, it is possible to cast to IComparable and let the built-in type handle the comparison. If the IComparable comp1 = result1 as IComparable; cast succeeds for both result1 and result2, then you can return icomp1.CompareTo(icomp2). If you are not comparing built-in types, but instead comparing another complex type such as Classification, the trick is just to make sure it implements the IComparable interface, and it will fit right into this sort method. If the attempt to cast to IComparable fails for either result1 or result2, you fall back on the trick of converting result1 and result2 to type string, and force a string comparison by returning result1.ToString().CompareTo(result2.ToString()). If the return from the reflected method is a null in either result1 or result2, then the code sorts the null to the front of the list without doing any actual comparison.

The very minor downside to this is that the code is plying the wily skills of reflection—a knock against which is that it might impact performance if the list to be sorted is lengthy—although I have not noted much penalty in performance as I've been using this technique on lists with thousands of items. But the great benefit is that there exists a single Comparator for every complex type that needs to be sorted. You won't need another comparator, nor need to maintain another comparator factory.

DDJ



Listing One

package dph;

/** Incomplete Employee implementation. Readers can build out this class, and 
possibly add a reference to an instance of type Project to match what 
the Animal class does with type Classification. */

public class Employee
{
     // Member Data
     private String firstname;
     private String lastname;
     private String department;
     private java.util.Date hiredate;
     private Boolean contractorStatus;
     private String manager;
     private Float salary;
     
     // Constructor
     public Employee(String fn, String ln)
     {
          this.firstname = fn;
          this.lastname = ln;
     }
     // Public accessor methods
     public String getFirstName() { return this.firstname; }
     public String getLastName() { return this.lastname; }
     // Public setter methods
     public void setDepartment(String d) { this.department = d; }
     /* Project reference that can be used to test sorting of Employee
     // based on another complex type.  Reader needs to define the
     // Project class so it implements java.lang.Comparable to compare
     // based on some member of the Project type.
     private Project curProject;
     public Project getCurrentProject() { return this.curProject; 
}
     public void setCurrentProject(Project p) { this.curProject = p; }
     */
}
Back to article


Listing Two
package dph;
/** Incomplete factory implementation for accessing Comparators
    that know how to compare Employees based on specific properties. */

public class EmployeeComparatorFactory
{
     // Class-level member for determining if types to be compared
     // actually are type Employee.  Used in convertToEmployee method.
     private static Class employeeClass;
     static
     {
          try
          {
               employeeClass = Class.forName("dph.Employee");
          }
          catch(Exception ex)
          {
               ex.printStackTrace();
          }
     }
     // Store the Comparators in this map
     private static java.util.Map<String,java.util.Comparator> map = 
                   new java.util.HashMap<String,java.util.Comparator>();
     // Help ensure singleton pattern--only want one factory to exist.
     private EmployeeComparatorFactory() {}
     // Public class-level interface. Pass all exceptions along for
     // simplicity's sake. Attempt to acquire already-existing Comparator
     // by key from the map, and build one for the map if it doesn't exist.
     // This is the only public interface necessary.
     public static java.util.Comparator 
                   getComparator(String type) throws Exception
     {
          java.util.Comparator comp = map.get(type);
          if(comp == null)
          {
               comp = makeComparator(type);
               map.put(type,comp);
          }
          return comp;
     }
     // Handy for converting Strings to type that can be used in a switch
     enum ComparisonType
     { 
          FIRSTNAME, LASTNAME, DEPARTMENT, HIREDATE, 
          CONTRACTORSTATUS, MANAGER, SALARY
     };
     // The maintenance work goes in here, when need to add new types to
     // be compared on. Convert type to an enum value, switch on enum value
     // and build the appropriate Comparator to be returned from this method.
     // The FactoryComparator class is nested in this class, defined below.
     private static java.util.Comparator 
                          makeComparator(String type) throws Exception
     {
          java.util.Comparator comp = null;
          switch(ComparisonType.valueOf(type.toUpperCase()))
          {
          case FIRSTNAME:
               comp = new FactoryComparator()
               {
                    public int compare(Object obj1, Object obj2)
                    {
                         Employee e1 = convertToEmployee(obj1);
                         Employee e2 = convertToEmployee(obj2);
                         return e1.getFirstName().compareTo(e2.getFirstName());
                    }
               };
               break;
          case LASTNAME:
               //...and so on and so on...
               break;
          }
          return comp;
     }
     // Private helper method to test on type before casting
     private static Employee convertToEmployee(Object obj)
     {
          if(obj.getClass() != EmployeeComparatorFactory.employeeClass)
          {
             throw new RuntimeException(obj + 
                                 " can't be compared as type Employee");
          }
          return (dph.Employee)obj;
     }
     // Private helper abstract implementation of Comparator so don't need
     // to define public boolean equals(Object other) in every case.  This is
     // abstract because it doesn't implement the compareTo method yet.
     private static abstract class 
                       FactoryComparator implements java.util.Comparator
     {
          // Required by java.util.Comparator interface
          public boolean equals(Object other)
          {
               return this.getClass().getName().equals(other.getClass().getName());
          }    
     }
}
Back to article


Listing Three
using System;
namespace Compare
{
     /// <summary>
     /// Summary description for Animal.
     /// </summary>
     public class Animal
     {
          private string _type;
          private int _age;
          private Classification _class;
          public Animal(string theType, int theAge, Classification theClass)
          {
               this._type = theType;
               this._age = theAge;
               this._class = theClass;
          }
          public override string ToString()
          {
               return this._type + ", " + this._age + " " + this._class;
          }
          // methods
          public string GetAnimalType() { return this._type; }
          public Classification GetClassification() { return this._class; }
          // property
          public int Age 
          { 
               get { return this._age; } 
          }
     }
}
Back to article


Listing Four
package dph;
/** A (slightly) complex domain-specific type that aggregates the 
    Classification type. */

public class Animal
{
     // Member Data
     private String type;
     private Integer age;
     private Classification _class;
     // Constructor--uses 1.5 autoboxing to convert int to Integer.
     // Requires all three data members to be initialized.
     public Animal(String _type, int age, Classification _class)
     {
          this.type = _type;
          this.age = age;
          this._class = _class;
     }
     // Public accessors for member data
     public Integer getAge() { return this.age; }
     public String getAnimalType() { return this.type; }
     public Classification getClassification() { return this._class; }
     public String toString() { return type + ", " + age + " " + _class; }
}
Back to article


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video