Keyvan Nayyeri

God breathing through me

IComparer vs IEqualityComparer

Working with collection types is a common task in modern software development. As a widespread development platform, the .NET Framework comes with a rich set of collection types and features, ranging from traditional types such as arrays and ArrayList to modern generic types and the helpful extension methods offered for them in .NET 3.5.

As we move forward, we need operations such as sorting and searching more often than in the past, and accomplishing these tasks is the common scenario behind many of the extension methods in .NET 3.5.

The interesting point about these tasks is that their internal operation depends on two interfaces: IComparer and IEqualityComparer. In this post I want to give an introduction to these two interfaces and discuss their usage.

Background

First of all, let me give a short background about the samples that I’m going to present in this post. Here I define a very simple class called Person that represents a person with his name and blog URI.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace IComparerVsIEqualityComparer
{
    public class Person
    {
        #region Properties

        public string Name { get; set; }

        public Uri BlogUri { get; set; }

        #endregion

        #region Public Constructor

        public Person(string name, string blogUrl)
        {
            this.Name = name;
            this.BlogUri = new Uri(blogUrl);
        }

        #endregion
    }
}

Later in the post I will use collections of Person objects to perform operations such as sorting and selecting distinct items, and to show the role of the two interfaces mentioned above.

IComparer

The first interface to examine is the older one: IComparer has been part of the .NET Framework since the first version, 1.0. It is mainly responsible for comparing two objects of the same type, so it is commonly used in sort operations (and some other operations) in the .NET Framework.

IComparer comes in two forms in the .NET Framework: a traditional non-generic interface that works with object parameters, and a generic interface, IComparer<T>, which should be the preferred choice. In either case there is a single Compare method to implement, which takes two objects as its parameters and returns an integer. This integer represents the result of the comparison: it is negative if the first object is less than the second, zero if the two objects are equal, and positive if the first object is greater than the second.
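To make the sign convention concrete, here is a minimal sketch. LengthComparer is a hypothetical class invented for this illustration (it is not part of the .NET Framework or of the samples in this post). It implements both the generic and the non-generic interface and orders strings by their length; treating null as the smallest value is an assumption of the sketch.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical comparer, for illustration only: orders strings by their length.
public class LengthComparer : IComparer<string>, IComparer
{
    // Generic form: strongly typed parameters, no casting needed.
    public int Compare(string x, string y)
    {
        // Assumption for this sketch: null sorts before every non-null string.
        int xLength = (x == null) ? -1 : x.Length;
        int yLength = (y == null) ? -1 : y.Length;

        // Negative: x is less than y; zero: equal; positive: x is greater than y.
        return xLength.CompareTo(yLength);
    }

    // Non-generic form: receives objects and has to cast them itself.
    int IComparer.Compare(object x, object y)
    {
        return Compare((string)x, (string)y);
    }
}

public static class LengthComparerDemo
{
    public static void Main()
    {
        LengthComparer comparer = new LengthComparer();

        Console.WriteLine(comparer.Compare("ab", "abcd") < 0);  // True
        Console.WriteLine(comparer.Compare("abcd", "ab") > 0);  // True
        Console.WriteLine(comparer.Compare("ab", "xy") == 0);   // True

        // Generic collections use the generic interface...
        List<string> list = new List<string> { "ccc", "a", "bb" };
        list.Sort(comparer);
        Console.WriteLine(string.Join(", ", list.ToArray()));   // a, bb, ccc

        // ...while traditional collections use the non-generic one.
        ArrayList arrayList = new ArrayList { "ccc", "a", "bb" };
        arrayList.Sort(comparer);
        Console.WriteLine(arrayList[0]);                         // a
    }
}
```

The generic form is preferable mainly because the compiler checks the argument types; the non-generic form fails only at run time if it is handed something that is not a string.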

In the code below I implement the generic interface for the Person class with a simple comparison that uses the CompareTo method of the string type as a helper. This implementation compares two Person objects based on the string order of their Name properties.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace IComparerVsIEqualityComparer
{
    public class PersonComparer : IComparer<Person>
    {
        #region IComparer<Person> Members

        // Orders two Person objects by the string order of their Name properties.
        public int Compare(Person x, Person y)
        {
            // Identical names (including two null names) compare as equal.
            if (x.Name == y.Name)
                return 0;
            else
                return x.Name.CompareTo(y.Name);
        }

        #endregion
    }
}

Now I apply this implementation in a piece of code where I define a generic list of Person objects and call its Sort method, passing an instance of PersonComparer.

private static void IComparerSample()
{
    Console.Title = "IComparer vs IEqualityComparer";

    List<Person> persons = new List<Person>();
    persons.Add(new Person("Keyvan Nayyeri", "http://nayyeri.net"));
    persons.Add(new Person("Simone Chiaretta", "http://codeclimber.net.nz/"));
    persons.Add(new Person("Phil Haack", "http://haacked.com/"));
    persons.Add(new Person("Scott Hanselman", "http://hanselman.com/blog/"));

    PersonComparer personComparer = new PersonComparer();
    persons.Sort(personComparer);

    foreach (Person person in persons)
    {
        Console.WriteLine("Name: {0} - Blog: {1}",
            person.Name, person.BlogUri.ToString());
    }

    Console.ReadLine();
}

The output of the code is predictable: the Person objects are listed in alphabetical order by name, as shown in the following figure.

[Figure: Output of the IComparer sample]
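The same comparer is not tied to List<T>.Sort. As a rough sketch, the code below passes one comparer instance to the LINQ OrderBy operator and to List<T>.BinarySearch, both of which accept an IComparer<T>. The Person and PersonComparer classes are trimmed-down copies of the ones in this post so that the sketch compiles on its own, and the ComparerReuseDemo class name is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the post's Person class.
public class Person
{
    public string Name { get; set; }
    public Uri BlogUri { get; set; }

    public Person(string name, string blogUrl)
    {
        Name = name;
        BlogUri = new Uri(blogUrl);
    }
}

// Trimmed-down copy of the post's PersonComparer: orders by Name.
public class PersonComparer : IComparer<Person>
{
    public int Compare(Person x, Person y)
    {
        if (x.Name == y.Name)
            return 0;
        return x.Name.CompareTo(y.Name);
    }
}

public static class ComparerReuseDemo
{
    public static void Main()
    {
        List<Person> persons = new List<Person>
        {
            new Person("Keyvan Nayyeri", "http://nayyeri.net"),
            new Person("Simone Chiaretta", "http://codeclimber.net.nz/"),
            new Person("Phil Haack", "http://haacked.com/"),
            new Person("Scott Hanselman", "http://hanselman.com/blog/")
        };

        PersonComparer comparer = new PersonComparer();

        // OrderBy takes a key selector plus an IComparer for the key type;
        // here the element itself is the key. The source list is left untouched.
        List<Person> sorted = persons.OrderBy(p => p, comparer).ToList();

        // BinarySearch requires a list that is already sorted with the same
        // comparer. It returns a negative number when the item is not found.
        Person probe = new Person("Phil Haack", "http://haacked.com/");
        int index = sorted.BinarySearch(probe, comparer);

        Console.WriteLine(index >= 0
            ? "Found " + sorted[index].Name
            : "Not found");
    }
}
```

Note that the probe is a different instance from the one stored in the list; the search still succeeds because the comparer looks only at the Name property.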

IEqualityComparer

The second interface, IEqualityComparer, is not as common as the first one, but it has become more important since the birth of LINQ because it plays a key role in collection operations such as Distinct and Intersect. Like IComparer, IEqualityComparer has a traditional non-generic form as well as a generic one.

This interface has two methods to implement: an Equals method that takes two objects of the same type and returns a Boolean value specifying whether they are equal, and a GetHashCode method that takes an object and returns an integer hash code representing that object.
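The two methods have to agree with each other: whenever Equals returns true for two objects, GetHashCode must return the same value for both. The sketch below illustrates this with a hypothetical TrimmedComparer (a name invented for this example) that ignores surrounding whitespace, and with a second variant that returns a constant hash code. The constant version still gives correct results, but it typically forces far more Equals calls; the exact counts depend on the runtime's internal implementation, so the code prints them instead of assuming them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer, for illustration only: treats strings as equal
// when they match after surrounding whitespace is trimmed.
public class TrimmedComparer : IEqualityComparer<string>
{
    // Counts Equals calls so the effect of the hash code can be observed.
    public int EqualsCalls { get; private set; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(Normalize(x), Normalize(y), StringComparison.Ordinal);
    }

    // Derived from the same normalized value that Equals compares, so
    // objects that are equal always produce the same hash code.
    public virtual int GetHashCode(string obj)
    {
        string normalized = Normalize(obj);
        return (normalized == null) ? 0 : normalized.GetHashCode();
    }

    protected static string Normalize(string value)
    {
        return (value == null) ? null : value.Trim();
    }
}

// Still satisfies the contract (equal objects share a hash code), but every
// object lands in the same bucket, so Equals has to do all the work.
public class ConstantHashTrimmedComparer : TrimmedComparer
{
    public override int GetHashCode(string obj)
    {
        return 0;
    }
}

public static class EqualityContractDemo
{
    // Runs Distinct over 200 different strings and returns the Equals call count.
    public static int CountEqualsCalls(TrimmedComparer comparer)
    {
        IEnumerable<string> items = Enumerable.Range(0, 200).Select(i => "item" + i);
        int distinctCount = items.Distinct(comparer).Count();
        Console.WriteLine("{0}: {1} distinct items, {2} Equals calls",
            comparer.GetType().Name, distinctCount, comparer.EqualsCalls);
        return comparer.EqualsCalls;
    }

    public static void Main()
    {
        // Both strings are equal under the comparer, so the set keeps one.
        HashSet<string> set = new HashSet<string>(new TrimmedComparer());
        set.Add("Keyvan");
        set.Add("  Keyvan  ");
        Console.WriteLine(set.Count); // 1

        CountEqualsCalls(new TrimmedComparer());
        CountEqualsCalls(new ConstantHashTrimmedComparer());
    }
}
```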

In the code below I implement this interface for the Person class with a simple comparison of the name and blog URI of the objects.

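using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace IComparerVsIEqualityComparer
{
    public class PersonEqualityComparer : IEqualityComparer<Person>
    {
        #region IEqualityComparer<Person> Members

        // Two Person objects are equal when both their names and blog URIs match.
        public bool Equals(Person x, Person y)
        {
            return (x.Name == y.Name) && (x.BlogUri == y.BlogUri);
        }

        // Hash the same properties that Equals compares, so that equal objects
        // always get the same hash code. Note that base.GetHashCode() would
        // return the hash code of the comparer itself, not of obj.
        public int GetHashCode(Person obj)
        {
            int nameHash = (obj.Name == null) ? 0 : obj.Name.GetHashCode();
            int uriHash = (obj.BlogUri == null) ? 0 : obj.BlogUri.GetHashCode();
            return nameHash ^ uriHash;
        }

        #endregion
    }
}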

Applying this class in my sample, I can get the distinct collection of Person objects in the code below by passing an instance of this IEqualityComparer implementation to the Distinct method.

private static void IEqualityComparerSample()
{
    Console.Title = "IComparer vs IEqualityComparer";

    List<Person> persons = new List<Person>();
    persons.Add(new Person("Keyvan Nayyeri", "http://nayyeri.net"));
    persons.Add(new Person("Scott Hanselman", "http://hanselman.com/blog/"));
    persons.Add(new Person("Simone Chiaretta", "http://codeclimber.net.nz/"));
    persons.Add(new Person("Phil Haack", "http://haacked.com/"));
    persons.Add(new Person("Keyvan Nayyeri", "http://nayyeri.net"));
    persons.Add(new Person("Scott Hanselman", "http://hanselman.com/blog/"));

    PersonEqualityComparer personEqualityComparer = new PersonEqualityComparer();
    persons = persons.Distinct(personEqualityComparer).ToList<Person>();

    foreach (Person person in persons)
    {
        Console.WriteLine("Name: {0} - Blog: {1}",
            person.Name, person.BlogUri.ToString());
    }

    Console.ReadLine();
}

[Figure: Output of the IEqualityComparer sample]
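Distinct is not the only LINQ operator that accepts an IEqualityComparer. As a sketch, the code below passes a comparer to Intersect, Except and Contains as well. It assumes a Person class like the one in this post and a PersonEqualityComparer whose GetHashCode is computed from the same properties that Equals compares; the null checks and the SetOperationsDemo class are additions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the post's Person class.
public class Person
{
    public string Name { get; set; }
    public Uri BlogUri { get; set; }

    public Person(string name, string blogUrl)
    {
        Name = name;
        BlogUri = new Uri(blogUrl);
    }
}

// Equality and hash code are both based on Name and BlogUri.
public class PersonEqualityComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Name == y.Name && x.BlogUri == y.BlogUri;
    }

    public int GetHashCode(Person obj)
    {
        if (obj == null) return 0;
        int nameHash = (obj.Name == null) ? 0 : obj.Name.GetHashCode();
        int uriHash = (obj.BlogUri == null) ? 0 : obj.BlogUri.GetHashCode();
        return nameHash ^ uriHash;
    }
}

public static class SetOperationsDemo
{
    public static void Main()
    {
        List<Person> first = new List<Person>
        {
            new Person("Keyvan Nayyeri", "http://nayyeri.net"),
            new Person("Scott Hanselman", "http://hanselman.com/blog/"),
            new Person("Simone Chiaretta", "http://codeclimber.net.nz/")
        };

        // Different instances that carry the same data as some items above.
        List<Person> second = new List<Person>
        {
            new Person("Scott Hanselman", "http://hanselman.com/blog/"),
            new Person("Phil Haack", "http://haacked.com/"),
            new Person("Keyvan Nayyeri", "http://nayyeri.net")
        };

        PersonEqualityComparer comparer = new PersonEqualityComparer();

        // Items of the first list that also appear in the second one.
        Console.WriteLine(first.Intersect(second, comparer).Count()); // 2

        // Items of the first list that do not appear in the second one.
        Console.WriteLine(first.Except(second, comparer).Count());    // 1

        // Membership test with custom equality.
        Person simone = new Person("Simone Chiaretta", "http://codeclimber.net.nz/");
        Console.WriteLine(first.Contains(simone, comparer));          // True

        // Without a comparer, Person falls back to reference equality,
        // so no two separately created instances match.
        Console.WriteLine(first.Intersect(second).Count());           // 0
    }
}
```

These operators look items up by hash code before calling Equals, so a GetHashCode that disagrees with Equals can make them miss matches that Equals alone would accept.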

Conclusion

IComparer and IEqualityComparer play an important role in collection types to accomplish a variety of key operations. The main role of IComparer is to provide a means to compare two objects and specify which one is greater than the other, and the main role of IEqualityComparer is to compare two objects for equality, so IComparer has a more general role than IEqualityComparer. Both interfaces are an inherent part of working with collection types at a professional level.

Finally, a sample code package is available for download so that you can get your hands on the code samples provided in this post.

Comments



Matt Ellis
Oct 02, 2008 9:25 AM

Hi Keyvan, nice article, but I think you've got a bug. Your IEqualityComparer.GetHashCode is returning the hash code of the comparer object, rather than a hash code for the passed in object.

And yet it works.

Disappointingly, I can't get Microsoft's reference source symbol server to work, so I'm stuck with Reflector, but it looks like what's happening is that when you call Distinct, the list is enumerated and the values put into an (internal) Set object. This Set uses IEqualityComparer to ensure that only one instance of an item is added. If it were just relying on hash code, this should fail, since IEqComp is returning the same hash code. Fortunately, the set is internally using the hash code to decide which bucket to put the object in, and then uses Equals to make sure the object isn't already in the bucket. Since Equals is correct, all is good. However, performance is now going to be impacted because the set is now effectively an array instead of a hashtable.

I thought it was a bit odd that you had to provide an implementation of GetHashCode, but the docs explain why it makes sense:

Implementations are required to ensure that if the Equals method returns true for two objects x and y, then the value returned by the GetHashCode method for x must equal the value returned for y.

So, the correct implementation would generate a hash code based on the same parameters used in the Equals method, and we should hopefully get a working distinct command, with a better spread of values across the hash set's buckets.

Cheers!

Matt


KillerCoder
Nov 21, 2008 12:28 PM

Excellent!

Simple, clear and concise. Your explanation is exactly what I needed.

Thank you.


Gio
Feb 07, 2009 12:48 PM

Nice article. I'm facing a roadblock right now with Intersect and IEqualityComparer. It seems like the latter is not called in the Intersect. If I use Where() with the said comparer, the intended behavior is achieved. I'm just wondering if this could be a bug.


Gio
Feb 07, 2009 1:55 PM

I got it. The culprit is the GetHashCode(). I'm getting the hash code of the object instead of the property concerned in the IEqualityComparer.Equals() method.




Rodrigo
May 06, 2009 12:17 PM

Excellent! Thank You!


sp
Jul 01, 2009 3:21 PM

Thank you, wonderful example. Much better than MSDN and the books.


Punit
Mar 01, 2010 1:55 PM

Thanks! Simple and clear.
