Tuesday, June 9, 2009

Sorting Arrays and Collections (Part 1)

The key to sorting a list of objects is the ability to compare any two of those objects. No matter the sorting algorithm you choose, to sort a list of objects you must be able to compare a pair of those objects and then be able to decide how to proceed based on the result. The first choice for comparing objects in the .NET Framework is the IComparable interface.

The IComparable interface and its generic counterpart define a single CompareTo method. This method compares the current instance to another instance of the same type and returns an Int32 value indicating their relative order. A result less than zero indicates that the current instance comes before the other value, while a result greater than zero indicates that the other value comes first. A zero result indicates that the two values are equivalent.

All the primitive .NET data types, including String, implement the IComparable interface, allowing arrays and collections of such values to be sorted automatically, e.g.

C#

List<int> numbers = new List<int>();
 
numbers.Add(7);
numbers.Add(2);
numbers.Add(5);
numbers.Add(3);
 
Console.WriteLine("Before sorting:");
 
foreach (int number in numbers)
{
    Console.WriteLine(number);
}
 
numbers.Sort();
 
Console.WriteLine("After sorting:");
 
foreach (int number in numbers)
{
    Console.WriteLine(number);
}

VB

Dim numbers As New List(Of Integer)
 
numbers.Add(7)
numbers.Add(2)
numbers.Add(5)
numbers.Add(3)
 
Console.WriteLine("Before sorting:")
 
For Each number As Integer In numbers
    Console.WriteLine(number)
Next
 
numbers.Sort()
 
Console.WriteLine("After sorting:")
 
For Each number As Integer In numbers
    Console.WriteLine(number)
Next

Sorting arrays is basically the same except that the Array.Sort method is a static/Shared method:

C#

int[] numbers = { 7, 2, 5, 3 };
 
Console.WriteLine("Before sorting:");
 
foreach (int number in numbers)
{
    Console.WriteLine(number);
}
 
Array.Sort(numbers);
 
Console.WriteLine("After sorting:");
 
foreach (int number in numbers)
{
    Console.WriteLine(number);
}

VB

Dim numbers As Integer() = {7, 2, 5, 3}
 
Console.WriteLine("Before sorting:")
 
For Each number As Integer In numbers
    Console.WriteLine(number)
Next
 
Array.Sort(numbers)
 
Console.WriteLine("After sorting:")
 
For Each number As Integer In numbers
    Console.WriteLine(number)
Next

The same principle holds for String, DateTime and any other type that implements the IComparable interface. As such, we can define our own types that implement IComparable and then they will be able to be automatically sorted too. For instance, let’s say that we have a Person class and we want them to be inherently sortable by LastName and FirstName. Such a class could look like this:

C#

public class Person : IComparable, IComparable<Person>
{
    private string _lastName;
    private string _firstName;
 
    public string LastName
    {
        get
        {
            return this._lastName;
        }
        set
        {
            this._lastName = value;
        }
    }
 
    public string FirstName
    {
        get
        {
            return this._firstName;
        }
        set
        {
            this._firstName = value;
        }
    }
 
    public Person(string firstName, string lastName)
    {
        this._firstName = firstName;
        this._lastName = lastName;
    }
 
    public int CompareTo(object obj)
    {
        return this.CompareTo((Person)obj);
    }
 
    public int CompareTo(Person other)
    {
        // Compare by LastName by default.
        int result = this.LastName.CompareTo(other.LastName);
 
        // If LastNames are the same...
        if (result == 0)
        {
            // ...compare by FirstName.
            result = this.FirstName.CompareTo(this.FirstName);
        }
 
        return result;
    }
}

VB

Public Class Person
    Implements IComparable, IComparable(Of Person)
 
    Private _lastName As String
    Private _firstName As String
 
    Public Property FirstName() As String
        Get
            Return Me._firstName
        End Get
        Set(ByVal value As String)
            Me._firstName = value
        End Set
    End Property
 
    Public Property LastName() As String
        Get
            Return Me._lastName
        End Get
        Set(ByVal value As String)
            Me._lastName = value
        End Set
    End Property
 
    Public Sub New(ByVal firstName As String, _
                   ByVal lastName As String)
        Me._firstName = firstName
        Me._lastName = lastName
    End Sub
 
    Public Function CompareTo(ByVal obj As Object) As Integer _
    Implements IComparable.CompareTo
        Return Me.CompareTo(DirectCast(obj, Person))
    End Function
 
    Public Function CompareTo(ByVal other As Person) As Integer _
    Implements IComparable(Of Person).CompareTo
        'Compare by LastName by default.
        Dim result As Integer = Me.LastName.CompareTo(other.LastName)
 
        'If LastNames are the same...
        If result = 0 Then
            '...compare by FirstName.
            result = Me.FirstName.CompareTo(other.FirstName)
        End If
 
        Return result
    End Function
 
End Class

Note that the Person class implements both the standard and generic forms of the IComparable interface, which is a good convention to follow. The Person.CompareTo method makes use of the String.CompareTo method. Using the CompareTo methods of primitive types in your own CompareTo methods is also a good convention to follow, where it’s feasible.

As you can see, two Person objects will first be compared by their LastName properties and then, if those are equivalent, they will be compared by their FirstName properties. Note that I say “equivalent” rather than “equal”. We might reimplement our CompareTo method to ignore case, in which case strings that are not equal may still be equivalent.

We can now sort an array or collection of Person objects automatically like so:

C#

List<Person> people = new List<Person>();
 
people.Add(new Person("Mary", "Smith"));
people.Add(new Person("John", "Williams"));
people.Add(new Person("John", "Smith"));
people.Add(new Person("Andrew", "Baxter"));
 
Console.WriteLine("Before sorting:");
 
foreach (Person person in people)
{
    Console.WriteLine("{0}, {1}",
                      person.LastName,
                      person.FirstName);
}
 
people.Sort();
 
Console.WriteLine("After sorting:");
 
foreach (Person person in people)
{
    Console.WriteLine("{0}, {1}",
                      person.LastName,
                      person.FirstName);
}

VB

Dim people As New List(Of Person)
 
people.Add(New Person("Mary", "Smith"))
people.Add(New Person("John", "Williams"))
people.Add(New Person("John", "Smith"))
people.Add(New Person("Andrew", "Baxter"))
 
Console.WriteLine("Before sorting:")
 
For Each person As Person In people
    Console.WriteLine("{0}, {1}", _
                      person.LastName, _
                      person.FirstName)
Next
 
people.Sort()
 
Console.WriteLine("After sorting:")
 
For Each person As Person In people
    Console.WriteLine("{0}, {1}", _
                      person.LastName, _
                      person.FirstName)
Next

In the next instalment we’ll look at how to perform custom sorting.

Part 2 here
Part 3 here

2 comments:

Anonymous said...

Hey John,

Great to see you finally got a blog up and running, after all you have a very clear writing style.

Glad to see someone write about the important topics that are fundamental to being a good programmer rather than yet another bleeding edge concept blog.

DeanMc

LipSen said...

i readed, and know i knowledged it,
thanks a lot