Dr. Dobb's Bloggers

The Difference between Equality and Identity Operators (in D)

April 28, 2009

My latest pet project (since late last year) is a .NET back-end for Walter Bright's D programming language compiler. It is a great way to learn the inner workings of .NET and the CLR (and to evangelize them), and I get to understand D's subtleties like never before.

D provides two ways of comparing objects: the equality operator and identity operator (actually we should say operators – plural, since each comes with its negated counterpart). Testing objects for equality is done with the familiar (at least to C/C++ and C# programmers) “==” and “!=” operators.  Testing for identity is the realm of the “is” and “!is” operators.

As you may suspect already, equality and identity are not the same.

To better understand the difference, let’s consider an example:

import std.stdio;

class Pet
{
    string name;

    public override bool opEquals(Object other)
    {
        // The cast yields null when other is not a Pet.
        auto pet = cast(Pet) other;
        return pet !is null && name == pet.name;
    }
}

void main()
{
    Pet dog = new Pet();
    Pet cat = new Pet();

    writeln(dog is cat);
    writeln(dog == cat);
}

Puzzlingly enough, this code prints:

false

true

Naturally, a dog is not a cat and there is no surprise that the result of the “dog is cat” expression is false. But why is the dog equal to a cat (at least according to our code)? The short answer: identity tests if the object references are the same; equality results in a call to operator opEquals. In the code above, the name (the string member datum of the Pet class) is null in both the cat and the dog instances and opEquals correctly sees them as equal.

As in C# (and unlike C++), D class objects are really references under the hood. Occasionally, D programmers may need to compare such references (very much like they would compare pointers in C, without comparing the entities that are pointed to). In the “dog is cat” expression above, dog and cat refer to different class objects, so they are not identical. That explains the need for the “is” operator (aka the identity operator). In Java, identity versus equality is an either-or deal, as suggested by the online docs at http://java.sun.com/docs/books/tutorial/java/IandI/objectclass.html: "The equals() method provided in the Object class uses the identity operator (==) to determine whether two objects are equal [...] To test whether two objects are equal in the sense of equivalency (containing the same information), you must override the equals() method."
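To make the reference semantics concrete, here is a minimal sketch. The constructor and the names rex, sameRex and otherRex are my own additions for illustration; they are not part of the original example.

```d
import std.stdio;

class Pet
{
    string name;

    this(string name) { this.name = name; }

    override bool opEquals(Object other)
    {
        auto pet = cast(Pet) other;   // null if other is not a Pet
        return pet !is null && name == pet.name;
    }
}

void main()
{
    Pet rex = new Pet("Rex");
    Pet sameRex = rex;              // a second reference to the same object
    Pet otherRex = new Pet("Rex");  // a distinct object with the same name

    writeln(rex is sameRex);    // true: both refer to one object
    writeln(rex is otherRex);   // false: two separate objects
    writeln(rex == otherRex);   // true: opEquals compares the names
}
```

Assigning one class variable to another copies the reference, not the object, which is why only the first identity test succeeds.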

The == operator is simply shorthand for invoking opEquals; the line writefln(dog == cat), although not textually identical to writefln(dog.opEquals(cat)), generates the same syntax tree. D source code looks more intuitive and natural when “==” is used instead of the verbose opEquals function call.

 

Please note that because all classes in D inherit from a root Object class, which provides a base implementation of opEquals, every class will always have opEquals defined (whether it does what the programmer wants or should be overridden is a different question).
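As a small illustration of the inherited behavior, consider a class that does not define opEquals at all. The Tag class below is hypothetical, and the sketch assumes the base implementation in Object simply compares references, which is how the runtime versions I know of behave.

```d
import std.stdio;

class Tag
{
    int id;

    this(int id) { this.id = id; }
    // No opEquals here: the one inherited from Object is used.
}

void main()
{
    auto a = new Tag(7);
    auto b = new Tag(7);

    writeln(a == b);   // false: same contents, but different objects
    writeln(a == a);   // true: same object
}
```

In other words, until opEquals is overridden, equality for class objects degenerates into identity.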

 

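In addition to class objects (i.e. instances of classes) D has other data type families: scalars (such as integers and floating point numbers), structs and arrays. For integral scalars, equality and identity are the same, as demonstrated by the program below. (Floating point values are a special case: the current language specification defines their identity as the bits being identical, so NaN and negative zero can give different answers under == and is.)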

 

import std.stdio;

void main()
{
    int i = 2;
    int j = 1;

    writeln(i is ++j);     // prints true
    writeln(i == j);       // also prints true
}
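One caveat about scalars is worth a sketch. It assumes the wording of the current language specification, under which identity for floating point values means bit-for-bit equality; the variable names are mine.

```d
import std.stdio;

void main()
{
    int i = 2;
    int j = 2;
    writeln(i is j);            // true
    writeln(i == j);            // true

    double nan = double.nan;
    writeln(nan == nan);        // false: NaN never compares equal
    writeln(nan is nan);        // true: the bit patterns are the same

    double zero = 0.0;
    double negZero = -0.0;
    writeln(zero == negZero);   // true: numerically equal
    writeln(zero is negZero);   // false: the sign bits differ
}
```

For integers the two operators agree; for floating point values they agree except in corner cases such as these.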

 

For struct objects, identity is defined as the bits in the struct being identical. The same as equality then (you may conclude); well, not quite. Structs in D have value type semantics. If in the example above we re-declare Pet as a struct rather than a class, like this:

 

struct Pet
{
    string name;
}

 

The equality and identity tests will yield the same result (that is, the dog and cat will bizarrely appear as both equal and identical, because their respective name fields are null). The compiler generates code to compare the bits in the structs: the null name fields compare equal bit by bit.
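Here is the struct version as a complete program, a sketch rather than anything definitive. The copy variable and the "Rex" name are additions of mine to show the value semantics.

```d
import std.stdio;

struct Pet
{
    string name;
}

void main()
{
    Pet dog;
    Pet cat;

    writeln(dog is cat);    // true: both structs hold a null name
    writeln(dog == cat);    // true

    dog.name = "Rex";
    writeln(dog is cat);    // false: the name fields now differ
    writeln(dog == cat);    // false

    Pet copy = dog;         // structs are copied by value
    writeln(copy is dog);   // true: the copy has identical contents
}
```

Note the contrast with classes: two separately declared struct variables can be identical, because identity looks at the contents rather than at a reference.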

 

Structs, however, are more interesting than that: they have value type semantics, like scalars, but are a bit like classes too, in the sense that some operators can be redefined by the programmer. Note that we did not say “overridden”, to avoid confusion with the polymorphic behavior of classes. Some textbooks on object-oriented languages call this “operator overloading”.

 

Classes have a slot in their virtual tables for opEquals: when the programmer writes her own, the base class operator is overridden. Structs do not have virtual tables, and hence no polymorphic behavior is possible. But programmers can redefine the behavior of certain operators, such as opEquals. The compiler will statically determine if such an operator should be called in lieu of the default behavior.

 

For example, we can decide (in a counter-Orwellian spirit) that no pets are equal, ever:

 

struct Pet
{
    string name;

    public bool opEquals(Pet other)
    {
        return false;
    }
}

 

Note that the argument type is Pet (and not Object, as in the case of the overridden class operator); there is no override attribute, either.

 

The code

writeln(dog is cat);
writeln(dog == cat);

 

now prints:

true

false

 

This is because the identity operation for structs always results in a bit-by-bit comparison, and cannot be overridden, or redefined, by the user; the equality operation has been redefined to always return false. And by the way, the identity operator cannot be overridden for classes either: for class instances, the identity operator always compares the two object references.
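Put together as a runnable program, the experiment might look like the sketch below. I have const-qualified opEquals, which is an assumption on my part so that the code also builds with recent compilers; the behavior being demonstrated is the same.

```d
import std.stdio;

struct Pet
{
    string name;

    // Const-qualified variant of the opEquals shown above (an assumption
    // made for recent compilers); it still declares all pets unequal.
    bool opEquals(const Pet other) const
    {
        return false;
    }
}

void main()
{
    Pet dog;
    Pet cat;

    writeln(dog is cat);   // true: identity still compares the bits
    writeln(dog == cat);   // false: our opEquals is called
    writeln(dog == dog);   // false: even a pet is not equal to itself
}
```

The last line shows that the compiler does not short-circuit struct equality through identity: the user-defined operator always has the final word.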

 

Now that we have seen how identity and equality operators work for structs, let's see how they apply to arrays. Let's assume we have two arrays of Pet objects:

 

 

Pet[] myPets;
Pet[] herPets;

// … populate the arrays…

if (myPets == herPets) { // calls opEquals for each pair of elements
    // …
}
if (myPets is herPets) { // tests if both arrays refer to the same elements
    // …
}

 

We can say that testing arrays for equality is a deep operation (opEquals is called on each pair of array elements), whereas an identity test is a shallow one: it simply checks that 1) the number of elements in the two arrays is the same, and 2) both arrays refer to the same elements in memory.
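The difference between the deep and the shallow test can be sketched as follows. The pet names, the constructor and the samePets variable are hypothetical, and Pet is the class version with a name-based opEquals.

```d
import std.stdio;

class Pet
{
    string name;

    this(string name) { this.name = name; }

    override bool opEquals(Object other)
    {
        auto pet = cast(Pet) other;   // null if other is not a Pet
        return pet !is null && name == pet.name;
    }
}

void main()
{
    Pet[] myPets = [new Pet("Rex"), new Pet("Tom")];
    Pet[] herPets = [new Pet("Rex"), new Pet("Tom")];
    Pet[] samePets = myPets;     // refers to the same elements as myPets

    writeln(myPets == herPets);         // true: elements are pairwise equal
    writeln(myPets is herPets);         // false: two separate arrays
    writeln(myPets is samePets);        // true: same memory, same length
    writeln(myPets[0 .. 1] is myPets);  // false: same start, different length
}
```

The last line illustrates condition 1) above: a slice that starts at the same element but has a different length is not identical to the original array.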

 

In conclusion, equality and identity operators work the same for scalars, but not for classes, structs, and arrays. The identity operator cannot be overloaded. In the cases where we need to know if two object references (or the references to objects in two arrays) are the same, the identity operator should be used. The identity operator is faster than the equality operator because it does not do a “deep” comparison of the two objects.

 

There is a difference between the signature of opEquals for class objects versus struct objects. For structs, we have the straightforward:

 

public bool opEquals(StructName other);

 

For class objects we have two choices: declare opEquals as above:

public bool opEquals(ClassName other);

 

or  (the more verbose):

public override bool opEquals(Object other);

 

The difference is in how the == operator is invoked. If it needs to be used in a polymorphic fashion (say you have an array of Objects) then the signature must be:

public override bool opEquals(Object);

For use cases where the final type of the class objects being compared is known, the first form may be used. The first form hides the base class' opEquals rather than overriding it. There's no need to worry about incorrect uses though: the compiler will catch them for you and either issue a compile-time error (if the "-w" flag is given on the command line) or insert code that detects the error at runtime (and throws an error):


version (D_NET)
{
    import System;
    alias Console.WriteLine print;
}
else
{
    import std.stdio;
    alias writeln print;
}

class Test
{
    int i;

    public bool opEquals(Test t)
    {
        print("opEquals");
        return i == t.i;
    }

    this(int j)
    {
        i = j;
    }
}

void main()
{
    // This causes either a compile-time error or a runtime exception:
    // core.exception.HiddenFuncError: Hidden method called for test.Test
    /*
    Object t1 = new Test(1);
    Object t2 = new Test(1);
    */
    // This works (non-polymorphic behavior):
    Test t1 = new Test(1);
    Test t2 = new Test(1);

    try
    {
        print(t1 == t2);
    }
    catch (Exception e)
    {
        print(e);
    }
}
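For comparison, here is a sketch of the polymorphic form, using the override signature that takes an Object. The null check after the cast and the t3 variable are my additions.

```d
import std.stdio;

class Test
{
    int i;

    this(int j) { i = j; }

    // Overrides the slot inherited from Object, so the call is
    // dispatched correctly through Object references.
    override bool opEquals(Object other)
    {
        auto t = cast(Test) other;    // null if other is not a Test
        return t !is null && i == t.i;
    }
}

void main()
{
    Object t1 = new Test(1);
    Object t2 = new Test(1);
    Object t3 = new Test(2);

    writeln(t1 == t2);   // true: dispatched to Test.opEquals
    writeln(t1 == t3);   // false
}
```

With this signature the static type of the variables no longer matters, which is what you want when the objects live in a collection of Objects.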

 

One last note: unlike in C#, where operators == and != can be overloaded separately, in D you only need to worry about the == (opEquals) operator: when the compiler sees != it simply calls opEquals, then logically negates the result.
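A quick way to see the negation at work is to count the calls. The static calls counter below is purely illustrative, and opEquals is const-qualified on the assumption that a recent compiler is used.

```d
import std.stdio;

struct Pet
{
    string name;
    static int calls;   // hypothetical counter, for demonstration only

    bool opEquals(const Pet other) const
    {
        ++calls;
        return name == other.name;
    }
}

void main()
{
    auto dog = Pet("Rex");
    auto cat = Pet("Tom");

    writeln(dog != cat);   // true: the negated result of opEquals
    writeln(Pet.calls);    // 1: the != expression went through opEquals
}
```

There is no separate operator to write for !=, so the two can never disagree.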

 

Frankly, I never understood why C++ and C# allow the equals operator and its negated counterpart to be overloaded separately, only for textbooks to recommend that you should always overload them together: it seems like a lot of unnecessary work that the designers of D avoided altogether.

 

 

Thanks to Walter Bright, Bartosz Milewski and Andrei Alexandrescu for reviewing this article.
