Identity and Equality in .NET

Source Code Accompanies This Article. Download It Now.


June, 2004: Identity & Equality In .NET

Efficiently implementing Equality methods in C#

Matthew is a software development consultant specializing in performance and robustness, and the author of the forthcoming Imperfect C++ (Addison-Wesley, 2004). He can be contacted via http://stlsoft.org/.


In .NET terminology, value types are those that are derived from System.ValueType and have the defining characteristics of being small, usually allocated on the thread's stack, not garbage collected, and passed by value to functions. Whether a type is a value type or reference type (those manipulable via references), it may be appropriate to compare it by value. In such types, the Object-derived virtual method Equals() will be overridden and implemented to provide a meaningful equality comparison for the referenced objects. The canonical form of Equals() for equivalence is usually expressed as in Listing One.

A more succinct, but mostly equivalent, form can be achieved by using the C# as operator, which tests an object instance against a type and returns a reference of the given type if the instance is not null and is of said type; otherwise, it returns null. It is only mostly equivalent because the previous implementation compares only instances whose types are exactly the same. The as form compares the two instances if they are the same type or if obj is of a type derived from SomeType (Listing Two). Though this is usually correct semantically, there may be circumstances in which it is not appropriate.
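The difference between the two type checks only shows up when a derived type is involved. The following is a minimal sketch of that case, not code from the article's download: ExactPoint, AsPoint, and their derived classes are hypothetical names invented for illustration.

```csharp
using System;

// Hypothetical types for illustration; not part of the article's source code.
public class ExactPoint
{
    private readonly int m_x;
    public ExactPoint(int x) { m_x = x; }

    // Listing One style: the runtime types must match exactly.
    public override bool Equals(object obj)
    {
        if (obj == null || GetType() != obj.GetType())
        {
            return false;
        }
        return m_x == ((ExactPoint)obj).m_x;
    }
    public override int GetHashCode() { return m_x; }
}

public class DerivedExactPoint : ExactPoint
{
    public DerivedExactPoint(int x) : base(x) { }
}

public class AsPoint
{
    private readonly int m_x;
    public AsPoint(int x) { m_x = x; }

    // Listing Two style: AsPoint or anything derived from it is accepted.
    public override bool Equals(object obj)
    {
        AsPoint rhs = obj as AsPoint;
        if (object.ReferenceEquals(rhs, null))
        {
            return false;
        }
        return m_x == rhs.m_x;
    }
    public override int GetHashCode() { return m_x; }
}

public class DerivedAsPoint : AsPoint
{
    public DerivedAsPoint(int x) : base(x) { }
}

public static class TypeCheckDemo
{
    public static void Main()
    {
        // Exact type check: base and derived instances never compare equal.
        Console.WriteLine(new ExactPoint(1).Equals(new DerivedExactPoint(1))); // False
        // as-based check: the derived instance is treated as an AsPoint.
        Console.WriteLine(new AsPoint(1).Equals(new DerivedAsPoint(1)));       // True
    }
}
```

Declaring the class sealed, as Listing Two does, sidesteps the question, since no derived type can exist.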

When overriding the Equals() method of types for which you wish to provide equality comparison as a replacement for the default identity comparison, C# provides the additional facility to overload operators == and !=. Unfortunately, it is all too common to see (even in textbooks) implementations such as Listing Three, which is quite wrong. If o1 is null, then a NullReferenceException is thrown as the nonstatic Equals() method is called on the null reference. Ouch! This bug can exist for a long time within seemingly stable components, just waiting for a time when it will be precipitated, and you'll have to ship a new version of your components and start messing around with the configuration and the GAC to make your clients think they're dealing with your old version. (Fingers crossed that you've not introduced some new bugs in the changes you'll have made to other parts of your assembly!) As Listing Four shows, the correct version couldn't be simpler.
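To see the failure for yourself, a small sketch such as the following should do. Naive is a hypothetical class written in the style of Listing Three, and ThrowsOnNullLeft is a helper invented here purely to observe the exception.

```csharp
using System;

// Hypothetical class for illustration only.
public class Naive
{
    private readonly int m_value;
    public Naive(int value) { m_value = value; }

    public override bool Equals(object obj)
    {
        Naive rhs = obj as Naive;
        return !object.ReferenceEquals(rhs, null) && m_value == rhs.m_value;
    }
    public override int GetHashCode() { return m_value; }

    // Listing Three style: o1 is dereferenced without being checked.
    public static bool operator ==(Naive o1, Naive o2) { return o1.Equals(o2); }
    public static bool operator !=(Naive o1, Naive o2) { return !o1.Equals(o2); }
}

public static class NullLeftOperandDemo
{
    // Returns true if comparing with a null left operand throws.
    public static bool ThrowsOnNullLeft()
    {
        Naive missing = null;
        try
        {
            // The comparison result is discarded; only the exception matters.
            (missing == new Naive(1)).ToString();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnNullLeft()); // True
    }
}
```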

The implementation of Object's class/static Equals() method tests the object references against null and calls the override of Equals() on the left parameter if both parameters are nonnull, otherwise evaluating the equivalence of the null parameter(s). However, it seems potentially inefficient to call a function that will then call, virtually, another (your class's instance/nonstatic Equals() method). Hence, you may wish to write your own inline code, but beware: You can write it mindful of the potential for a null reference and still be wrong. It's not unusual to see code like Listing Five. Unfortunately, this is also wrong. When o1 and o2 are both null, they are evaluated as not equal! It should look like Listing Six. However, this is still wrong, because there's another nasty problem in this one: It recurses forever. Since o1 is of type SomeType, the test against null results in operator == or operator != being called, ad infinitum. The instance references need to be cast to Object before testing against null, as in Listing Seven. Often the best thing to do is to stick with Object.Equals(), except where efficiency is very important.
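The null cases are easy to get wrong, so it is worth exercising all of them. The sketch below applies the cast-to-Object form to a hypothetical Money class (the name and fields are assumptions made for this example) and prints the result for each combination of null and nonnull operands.

```csharp
using System;

// Hypothetical class for illustration only.
public sealed class Money
{
    private readonly string m_currency;
    private readonly int m_amount;

    public Money(string currency, int amount)
    {
        m_currency = currency;
        m_amount = amount;
    }

    public override bool Equals(object obj)
    {
        Money rhs = obj as Money;
        // Cast to object so the test is an identity comparison, not operator ==.
        if ((object)rhs == null)
        {
            return false;
        }
        return m_amount == rhs.m_amount && m_currency == rhs.m_currency;
    }

    public override int GetHashCode()
    {
        return m_amount ^ (m_currency == null ? 0 : m_currency.GetHashCode());
    }

    // Casting to object avoids recursing into these operators.
    public static bool operator ==(Money o1, Money o2)
    {
        return (object)o1 == null ? (object)o2 == null : o1.Equals(o2);
    }

    public static bool operator !=(Money o1, Money o2)
    {
        return (object)o1 == null ? (object)o2 != null : !o1.Equals(o2);
    }
}

public static class NullCasesDemo
{
    public static void Main()
    {
        Money none1 = null;
        Money none2 = null;
        Money five = new Money("USD", 5);
        Money alsoFive = new Money("USD", 5);

        Console.WriteLine(none1 == none2);   // True
        Console.WriteLine(none1 != none2);   // False
        Console.WriteLine(none1 == five);    // False
        Console.WriteLine(five == none1);    // False
        Console.WriteLine(five == alsoFive); // True
        Console.WriteLine(five != alsoFive); // False
    }
}
```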

Efficiency

You may be asking whether there are performance differences between the various implementation options presented; indeed, there are. For the test program in Listing Eight, I wrote six variants of the class—Normal, Inline, Reorder, WithAsRefEquals, WithAsCastObject, and Combined. The implementations for all six, along with the test program and supporting binary components used to derive the results described here, are available electronically; see "Resource Center," page 3.

The implementation of the Normal class's Equals() method is according to Listing One, and its == and != operators according to Listing Four. The Inline class differs only by having its operators defined as in Listing Seven. Note from Listing One that the member variables of the two comparand instances are compared in the order m_string, m_int, as they appear in the class definition. You'd instinctively assume that comparing an integer would be quicker than comparing a string, so the Reorder type has the same implementation as Normal with the exception that the member variable comparison order is reversed.

The WithAsRefEquals type corresponds to that of Listing Two. The WithAsCastObject is identical to the WithAsRefEquals type except that the conditional if(Object.ReferenceEquals(rhs, null)) is rewritten as if((Object)rhs == null) to ascertain whether an inline identity comparison is faster than calling ReferenceEquals(). The final type, Combined, is a combination of all the fastest elements from the other enhancements over the original Normal type.

Table 1 shows the results of the timings for the six variants. The timings were derived from a single session in which one execution of the test program generates five timings for each of the variants, of which the average is presented in the table. The program was compiled using both .NET SDK 1.0 and 1.1, and executed on a single-processor 512-MB, 2-GHz PC running Windows XP. The timings were measured with a performance counter component freely available from http://synsoft.org/dotnet.html.
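If you want to repeat this kind of measurement without the SynSoft component, something along the following lines may serve. It is only a sketch under stated assumptions: it substitutes System.Diagnostics.Stopwatch for the performance counter used in the article, uses a much smaller element count, and TestType here is a hypothetical stand-in rather than any of the six variants.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for the article's test class.
public sealed class TestType
{
    private readonly string m_string;
    private readonly int m_int;

    public TestType(string s, int i) { m_string = s; m_int = i; }

    public override bool Equals(object obj)
    {
        TestType rhs = obj as TestType;
        // Integer first, on the assumption that it is the cheaper comparison.
        return (object)rhs != null && m_int == rhs.m_int && m_string == rhs.m_string;
    }

    public override int GetHashCode()
    {
        return m_int ^ (m_string == null ? 0 : m_string.GetHashCode());
    }

    public static bool operator ==(TestType o1, TestType o2)
    {
        return (object)o1 == null ? (object)o2 == null : o1.Equals(o2);
    }

    public static bool operator !=(TestType o1, TestType o2)
    {
        return !(o1 == o2);
    }
}

public static class TimingSketch
{
    // Compares every pair (i, j) with i < j and counts the equal ones.
    public static int CountEqualPairs(TestType[] objects)
    {
        int k = 0;
        for (int i = 0; i < objects.Length; ++i)
        {
            for (int j = i + 1; j < objects.Length; ++j)
            {
                if (objects[i] == objects[j])
                {
                    ++k;
                }
            }
        }
        return k;
    }

    public static void Main()
    {
        const int Elements = 2000, Mod = 512, Runs = 5;

        TestType[] objects = new TestType[Elements];
        for (int i = 0; i < Elements; ++i)
        {
            objects[i] = new TestType("String #" + (i % Mod), 1);
        }

        long totalMs = 0;
        int pairs = 0;
        for (int run = 0; run < Runs; ++run)
        {
            Stopwatch watch = Stopwatch.StartNew();
            pairs = CountEqualPairs(objects);
            watch.Stop();
            totalMs += watch.ElapsedMilliseconds;
        }

        // Report the mean of the runs, as the article does for its timings.
        Console.WriteLine("equal pairs: {0}, mean ms: {1}", pairs, totalMs / (double)Runs);
    }
}
```

Absolute numbers from a sketch like this will not be comparable to those in the article; only the relative cost of different Equals() implementations on the same machine is meaningful.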

As you can see, each of the given steps affords a significant performance advantage over the canonical version. The single biggest factor is the use of the as keyword, which removes approximately 60 percent of the cost of the original version. Overall, nearly 80 percent of the cost of the original version can be removed by using the as operator, inlining the null-checking in the == and != operators, and reordering the member tests. When object equality testing is being conducted on a frequent basis, such as in indexed/hashed containers, there are serious performance gains to be had!

One last thing: Don't forget to implement the GetHashCode() method correctly (the C# compiler warns you if you override Equals() without also overriding it), or your types will do strange things in associative containers. Instances that compare equal must return the same hash code.
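As a minimal sketch of what "correctly" means here, the hypothetical Key class below derives its hash code from the same two fields that Equals() compares, so that equal instances land in the same bucket. The class name and the 17/31 multipliers are illustrative choices, not something taken from the article, and the generic Dictionary is used only for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration only.
public sealed class Key
{
    private readonly string m_string;
    private readonly int m_int;

    public Key(string s, int i) { m_string = s; m_int = i; }

    public override bool Equals(object obj)
    {
        Key rhs = obj as Key;
        return (object)rhs != null && m_int == rhs.m_int && m_string == rhs.m_string;
    }

    // Built from exactly the fields Equals() compares.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + m_int;
            hash = hash * 31 + (m_string == null ? 0 : m_string.GetHashCode());
            return hash;
        }
    }
}

public static class HashDemo
{
    public static void Main()
    {
        Dictionary<Key, string> map = new Dictionary<Key, string>();
        map[new Key("String #1", 1)] = "found";

        // A distinct but equal instance locates the same entry.
        Console.WriteLine(map.ContainsKey(new Key("String #1", 1))); // True
        Console.WriteLine(map.ContainsKey(new Key("String #2", 1))); // False
    }
}
```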

DDJ



Listing One

class SomeType
{
  private string m_string;
  private int    m_int;
  public override bool Equals(Object obj)
  {
    // Check against null
    if(obj == null)
    {
      return false;
    }
    else
    {
      // Check against different type
      if(this.GetType() != obj.GetType())
      {
        return false;
      }
      else
      {
        // Safely convert to "SomeType"
        SomeType  rhs = (SomeType)obj;

        // Compare values
        return (m_string == rhs.m_string && m_int == rhs.m_int);
      }
    }
  }
  ...
}


Listing Two
public sealed class SomeType
{
  ...
  public override bool Equals(Object obj)
  {
    // Check against null
    if(obj == null)
    {
      return false;
    }
    else
    {
      // Get if is a "SomeType"
      SomeType  rhs = obj as SomeType;
      if(Object.ReferenceEquals(rhs, null))
      {
        return false;
      }
      else
      {
        // Compare values if given instance of SomeType
        return (m_string == rhs.m_string && m_int == rhs.m_int);
      }
    }
  }
}


Listing Three
class SomeType
{
  ...
  public static bool operator ==(SomeType o1, SomeType o2)
  {
    return o1.Equals(o2);
  }
  public static bool operator !=(SomeType o1, SomeType o2)
  {
    return !o1.Equals(o2);
  }
  ...
}


Listing Four
class SomeType
{
  ...
  public static bool operator ==(SomeType o1, SomeType o2)
  {
    return Object.Equals(o1, o2);
  }
  public static bool operator !=(SomeType o1, SomeType o2)
  {
    return !Object.Equals(o1, o2);
  }
  ...
}


Listing Five
class SomeType
{
  ...
  public static bool operator ==(SomeType o1, SomeType o2)
  {
    return o1 == null ? false : o1.Equals(o2);
  }
  public static bool operator !=(SomeType o1, SomeType o2)
  {
    return o1 == null ? true : !o1.Equals(o2);
  }
  ...
}


Listing Six
class SomeType
{
  ...
  public static bool operator ==(SomeType o1, SomeType o2)
  {
    return o1 == null ? o2 == null : o1.Equals(o2);
  }
  public static bool operator !=(SomeType o1, SomeType o2)
  {
    return o1 == null ? o2 != null : !o1.Equals(o2);
  }
  ...
}


Listing Seven
class SomeType
{
  ...
  public static bool operator ==(SomeType o1, SomeType o2)
  {
    // Cast to Object so the null tests are identity comparisons
    return (Object)o1 == null ? (Object)o2 == null : o1.Equals(o2);
  }
  public static bool operator !=(SomeType o1, SomeType o2)
  {
    return (Object)o1 == null ? (Object)o2 != null : !o1.Equals(o2);
  }
  ...
}


Listing Eight
using PerfCntrType = 
                 SynSoft.Performance.PerformanceCounter;

. . .

PerfCntrType  counter = new PerfCntrType();
const int     CELEMENTS = 100000;
TestType[]    objects   = new TestType[CELEMENTS];

for(i = 0; i < CELEMENTS; ++i)
{
  const int mod = 512;

  objects[i] = new TestType("String #" + (i % mod), args);
}

counter.Start();
for(i = 0, k = 0; i < CELEMENTS; ++i)
{
  for(j = i + 1; j < CELEMENTS; ++j)
  {
    if(objects[i] == objects[j])
    {
      ++k;
    }
  }
}
counter.Stop();

