Immutable types: understand their benefits and use them

There is a powerful and simple concept in programming that I think is really underused: immutability.

Basically, an object is immutable if its state doesn’t change once the object has been created. Consequently, a class is immutable if its instances are immutable.

There is one killer argument for using immutable objects: it dramatically simplifies concurrent programming. Think about it: why is writing proper multithreaded code a hard task? Because it is hard to synchronize threads' accesses to resources (objects or other OS things). Why is it hard to synchronize these accesses? Because it is hard to guarantee that there won’t be race conditions between the multiple write and read accesses done by multiple threads on multiple objects. What if there are no more write accesses? In other words, what if the state of the objects that threads are accessing doesn’t change? There is no more need for synchronization!
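To make this concrete, here is a minimal sketch. The Temperature class and the SharedReadDemo helper are made up for the illustration and are not part of any library. Several threads read the same immutable instance without taking any lock, and a "modification" simply produces a new object:

```csharp
using System;
using System.Threading;

// Hypothetical immutable type: both fields are readonly and set only in the constructor.
public sealed class Temperature {
    private readonly string m_City;
    private readonly double m_Celsius;

    public Temperature(string city, double celsius) {
        m_City = city;
        m_Celsius = celsius;
    }

    public string City { get { return m_City; } }
    public double Celsius { get { return m_Celsius; } }

    // "Modifying" returns a new instance; the original is left untouched.
    public Temperature WithCelsius(double celsius) {
        return new Temperature(m_City, celsius);
    }
}

public static class SharedReadDemo {
    // Reads the same instance from several threads without any lock and
    // returns how many threads observed the expected state.
    public static int CountConsistentReads(Temperature shared, int threadCount) {
        int consistent = 0;
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(delegate() {
                if (shared.City == "Paris" && shared.Celsius == 21.5) {
                    // The counter is the only mutable thing here, hence Interlocked.
                    Interlocked.Increment(ref consistent);
                }
            });
            threads[i].Start();
        }
        foreach (Thread t in threads) { t.Join(); }
        return consistent;
    }

    public static void Main() {
        Temperature paris = new Temperature("Paris", 21.5);
        Console.WriteLine(CountConsistentReads(paris, 8));  // 8
    }
}
```

Notice that the only synchronization left in this sketch protects the counter, which is the only piece of mutable state.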

Of course, I am simplifying here, so let’s dig a bit deeper.

A famous immutable class

There is one famous immutable class: System.String. When you think that you are modifying a string, you actually create a new string object. Often, we forget about it and we would like to write …

string str = "foofoo";

str.Replace("foo", "FOO");

…where we need to write instead:

str = str.Replace("foo", "FOO");

Of course, doing so comes at the cost of creating multiple string objects in memory when doing intensive string computation. In this case, you need to use the System.Text.StringBuilder class, which provides a safe way to work with a mutable string buffer.
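As a rough sketch of the difference (the StringBuildingDemo class and its method names are made up for this example), both methods below produce the same text, but the first one creates a new string object at each iteration while the second one appends to a single buffer:

```csharp
using System;
using System.Text;

public static class StringBuildingDemo {
    // Each += creates a new string object, since strings are immutable.
    public static string JoinWithConcatenation(int count) {
        string result = "";
        for (int i = 0; i < count; i++) {
            result += i.ToString() + ";";
        }
        return result;
    }

    // StringBuilder mutates an internal buffer; the final string is created by ToString().
    public static string JoinWithBuilder(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.Append(i).Append(';');
        }
        return sb.ToString();
    }

    public static void Main() {
        Console.WriteLine(JoinWithBuilder(3));                               // 0;1;2;
        Console.WriteLine(JoinWithConcatenation(3) == JoinWithBuilder(3));   // True
    }
}
```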

Actually, string objects are not that immutable: as far as I know, there are at least two ways to break string immutability, with pointers as shown by this code example, and with some advanced System.Reflection usage.

So why did the .NET engineers decide that strings should be immutable? Because programmers will never get a race condition caused by a corrupted string. Also because strings are well suited to being keys in hashtables (i.e. System.Collections.Generic.Dictionary<K,V>). Hashtables are an almost magic way to dramatically enhance the performance of your code. (I say magic because under the hood hashtables rely on properties of prime numbers, and prime numbers are magic!)

The objects on which the hash values are computed must be immutable to make sure that the hash values stay constant over time. Indeed, a hash value is computed from the state of the object (or possibly from a sub-state of the object, in which case only this sub-state must be immutable).
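Here is a small sketch of what goes wrong otherwise. MutableKey is a hypothetical type written for this example; its hash value depends on a field that can still be changed after the object has been used as a key:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash value depends on a mutable field.
public sealed class MutableKey {
    public int Id;  // public and writable: this is the problem

    public MutableKey(int id) { Id = id; }

    public override bool Equals(object obj) {
        MutableKey other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode() { return Id; }
}

public static class MutableKeyDemo {
    // Returns true if the key can still be found after its state was changed.
    public static bool IsFoundAfterMutation() {
        Dictionary<MutableKey, string> dico = new Dictionary<MutableKey, string>();
        MutableKey key = new MutableKey(1);
        dico.Add(key, "one");

        key.Id = 2;  // the entry was stored under the hash value computed for Id == 1

        return dico.ContainsKey(key);
    }

    public static void Main() {
        Console.WriteLine(IsFoundAfterMutation());  // False
    }
}
```

The entry is still in the dictionary, but it can no longer be reached through its key. With an immutable key this situation cannot happen.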

Another cool thing about string immutability is that even though System.String is a class, string objects are compared by value, like value types. This is possible because we can consider that the identity of an immutable object is its state. For example:

string str1 = "foofoo";
string strFoo = "foo";
string str2 = strFoo + strFoo;
// Even though str1 and str2 reference 2 different objects
// the following assertion is true.
// (Debug is declared in the System.Diagnostics namespace.)
Debug.Assert(str1 == str2);

Purity vs. Side-effect

So we now know at least 3 great benefits of immutable objects:

  • They simplify multithreaded programming.
  • They can be used as hashtable keys.
  • They simplify state comparison.

We can now be more general and say that the primary benefit of immutable types comes from the fact that they eliminate side-effects. I couldn’t say it better than Wes Dyer so I quote him:

We all know that generally it is not a good idea to use global variables.  This is basically the extreme of exposing side-effects (the global scope). Many of the programmers who don’t use global variables don’t realize that the same principles apply to fields, properties, parameters, and variables on a more limited scale: don’t mutate them unless you have a good reason.(…)

One way to increase the reliability of a unit is to eliminate the side-effects. This makes composing and integrating units together much easier and more robust.  Since they are side-effect free, they always work the same no matter the environment.  This is called referential transparency.
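To illustrate the quote, here is a small sketch (SideEffectDemo and its methods are invented for the occasion). The first method reads and changes a static field, so two identical calls give two different results. The second one depends only on its arguments, so it always works the same:

```csharp
using System;

public static class SideEffectDemo {
    private static int s_CallCount;  // shared mutable state

    // Impure: the result depends on, and changes, state outside the method.
    public static int NextPriceWithSideEffect(int price) {
        s_CallCount++;
        return price + s_CallCount;
    }

    // Pure: the result depends only on the arguments.
    public static int PriceWithTax(int price, int taxPercent) {
        return price + price * taxPercent / 100;
    }

    public static void Main() {
        Console.WriteLine(PriceWithTax(200, 10) == PriceWithTax(200, 10));                // True
        Console.WriteLine(NextPriceWithSideEffect(200) == NextPriceWithSideEffect(200));  // False
    }
}
```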

Immutable classes in C#

C# supports immutability thanks to two keywords: const and readonly. They are used by the C# compiler to ensure that the value of a field won’t be changed once it has been initialized. Why two keywords? Because a readonly field can still be assigned within the constructor(s) of its class, while a const field is a compile-time constant (implicitly static) whose value is fixed at its declaration. For example:

class Article {
   Article(string name, int price) {
      m_Name = name; // <- Compilation error
      m_Price = price;
   }
   const string m_Name = "Ballon";
   readonly int m_Price;
}

At this point you might wonder: what if my object references another object through a read-only field? Can the state of the referenced object change? The answer is yes; this is the classical shallow vs. deep distinction. Eric Lippert from the C# team has a nice post describing these different kinds of immutability. Actually, Eric did a great series of posts on how to code efficient immutable collections, which might sound paradoxical since collections are known as super-mutable objects. Check out his posts since November 2007! Eric also wrote:

Immutable data structures are the way of the future in C#.

So stay tuned!
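Coming back to the shallow vs. deep question, here is a minimal sketch of shallow immutability (the Basket class is a made-up example). The readonly keyword protects the reference held by the field, not the object it refers to:

```csharp
using System;
using System.Collections.Generic;

public sealed class Basket {
    // The reference cannot be reassigned after construction...
    private readonly List<string> m_Items;

    public Basket(List<string> items) { m_Items = items; }

    // ...but the list it refers to is still mutable: this compiles and
    // changes the observable state of the Basket.
    public void Add(string item) { m_Items.Add(item); }

    public int Count { get { return m_Items.Count; } }
}

public static class ShallowImmutabilityDemo {
    public static void Main() {
        List<string> items = new List<string>();
        Basket basket = new Basket(items);
        basket.Add("Ballon");
        items.Add("Book");                 // the caller still holds the same list
        Console.WriteLine(basket.Count);   // 2
    }
}
```

To get deep immutability, the referenced objects must be immutable themselves (or at least never exposed nor modified after construction).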

Immutable closures in C#2

The functional coding style is better suited to immutable state than the imperative one. Wesner Moise often praises on his blog the use of functional closures as a means to achieve immutability. For example, here is a simple program that defines immutable objects. These objects are affine transformers that take an integer x and compute a*x+b. Thus, the state of an immutable affine transformer object consists of the a and b parameters. Here is the code:

using System.Diagnostics;
class Program {
   delegate int DelegateType(int x);
   static DelegateType MakeAffine(int a, int b) {
      return delegate(int x) { return a * x + b; };
   }
   static void Main() {
      DelegateType affine1 = MakeAffine(2, 1);
      DelegateType affine2 = MakeAffine(3, 4);
      Debug.Assert(affine1(5) == 11);  // 2*5+1 == 11
      Debug.Assert(affine2(6) == 22);  // 3*6+4 == 22
   }
}

If you are not acquainted with closures, this code might surprise you. Behind your back, the C# compiler has created a class to represent these affine transformer objects! Since nothing modifies the captured a and b once an object is created, its instances behave as immutable objects. I wrote an article about closures in C# that explains all this.
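For the curious, here is a hand-written counterpart of what the closure gives us. This is only a sketch: the AffineTransformer name is made up, and the class really generated by the compiler differs in its details (you can check with a decompiler), but the idea is the same: the captured a and b become the state of an object.

```csharp
using System;

// Hand-written counterpart of the closure: a and b are stored
// in readonly fields and can only be set by the constructor.
public sealed class AffineTransformer {
    private readonly int m_A;
    private readonly int m_B;

    public AffineTransformer(int a, int b) { m_A = a; m_B = b; }

    public int Apply(int x) { return m_A * x + m_B; }
}

public static class AffineDemo {
    public delegate int DelegateType(int x);

    public static DelegateType MakeAffine(int a, int b) {
        // A delegate bound to an instance method plays the role of the anonymous method.
        return new DelegateType(new AffineTransformer(a, b).Apply);
    }

    public static void Main() {
        DelegateType affine1 = MakeAffine(2, 1);
        DelegateType affine2 = MakeAffine(3, 4);
        Console.WriteLine(affine1(5));  // 11
        Console.WriteLine(affine2(6));  // 22
    }
}
```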

Immutable anonymous types in C#3 and VB9

C#3 comes with the interesting anonymous types feature. Anonymous types built by the C#3 compiler are immutable: all fields are private and all properties are read-only (it is always instructive to check for yourself with Reflector).

var affine = new { A = 3, B = 4 };

affine.A = 3; // <- Compilation error

In this post, Tim Ng from the VB.NET team wrote:

The motivating factor for driving the immutable anonymous types was because the LINQ APIs used hash tables internally and returning projections of anonymous types that could be modified was a dangerous situation.
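Here is a small sketch of why this matters (AnonymousKeyDemo is a made-up name). C# anonymous types compute Equals() and GetHashCode() from their property values, and LINQ operators such as GroupBy() hash their keys internally. This is only safe because the key cannot be modified once it has been handed to the operator:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnonymousKeyDemo {
    // Groups integer pairs by an anonymous-type key and returns the number of groups.
    public static int CountDistinctPairs(int[][] pairs) {
        return pairs
            .GroupBy(p => new { A = p[0], B = p[1] })  // the key is hashed internally
            .Count();
    }

    public static void Main() {
        var k1 = new { A = 3, B = 4 };
        var k2 = new { A = 3, B = 4 };
        Console.WriteLine(k1.Equals(k2));                          // True
        Console.WriteLine(k1.GetHashCode() == k2.GetHashCode());   // True

        int[][] pairs = { new[] { 3, 4 }, new[] { 3, 4 }, new[] { 5, 6 } };
        Console.WriteLine(CountDistinctPairs(pairs));              // 2
    }
}
```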

Interestingly enough, the VB team decided that their anonymous types wouldn’t be immutable by default, which means that you can write things such as:

Dim affine = New With {.A = 3, .B = 4}

affine.A = 5

However, the VB team added the possibility to specify that a certain property would be read-only thanks to the Key keyword:

Dim affine = New With {Key .A = 3, .B = 4}

affine.A = 5    ' <- Compilation error

About the motivation behind having mutable anonymous types in VB, Tim Ng wrote:

…but for Visual Basic, we decided that because we have the ability to late bind on top of anonymous types, making them immutable is unexpected.

Immutable support in NDepend and CQL

As we saw, immutability is a feature that can be enforced at compile time. In other words, it can be enforced by static analysis tools. Thus, the Code Query Language (CQL) that comes with the static analysis tool NDepend has an IsImmutable condition that applies to types. To know which types of your code base are immutable, it is as easy as writing this CQL query:

SELECT TYPES WHERE IsImmutable

To constrain a particular type MyNamespace.Foo to be immutable, you can write this CQL constraint:

WARN IF Count != 1 IN SELECT TYPES WHERE IsImmutable AND FullNameIs "MyNamespace.Foo"

To constrain a range of types used by the class MyNamespace.Foo to be immutable:

WARN IF Count > 0 IN SELECT TYPES WHERE IsUsedBy "MyNamespace.Foo" AND !IsImmutable

To constrain a range of types declared in the namespace MyNamespace to be immutable:

WARN IF Count > 0 IN SELECT TYPES FROM NAMESPACES "MyNamespace" WHERE !IsImmutable

To constrain a range of types tagged with an attribute MyNamespace.MyImmutableAttribute to be immutable:

WARN IF Count > 0 IN SELECT TYPES WHERE IsDirectlyUsing "MyNamespace.MyImmutableAttribute" AND !IsImmutable

In the near future we will add the condition HasAttribute and we will be able to write more properly:

WARN IF Count > 0 IN SELECT TYPES WHERE HasAttribute "MyNamespace.MyImmutableAttribute" AND !IsImmutable

There is something about immutability that we didn’t mention: all primitive value types (int, double, decimal, byte, bool…) are immutable. More generally, it is recommended to make all structures immutable. It makes them suited for hash value computation and for equivalence comparison, and creating more structure instances won’t overwhelm the GC. To guarantee that all your structures are immutable, just write this constraint:

WARN IF Count > 0 IN SELECT TYPES WHERE IsStructure AND !IsImmutable
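As an illustration, here is what an immutable structure can look like. Money is a hypothetical type written for this example: its fields are readonly, and the operation that "modifies" it returns a new value instead.

```csharp
using System;

// Hypothetical immutable structure: fields are readonly and
// "modifying" operations return a new value.
public struct Money {
    private readonly decimal m_Amount;
    private readonly string m_Currency;

    public Money(decimal amount, string currency) {
        m_Amount = amount;
        m_Currency = currency;
    }

    public decimal Amount { get { return m_Amount; } }
    public string Currency { get { return m_Currency; } }

    public Money Add(decimal amount) {
        return new Money(m_Amount + amount, m_Currency);
    }
}

public static class MoneyDemo {
    public static void Main() {
        Money price = new Money(10m, "EUR");
        Money total = price.Add(5m);
        Console.WriteLine(price.Amount == 10m);                   // True: price is unchanged
        Console.WriteLine(total.Amount == 15m);                   // True
        Console.WriteLine(price.Equals(new Money(10m, "EUR")));   // True: compared by value
    }
}
```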

So, what’s behind the IsImmutable CQL condition? Here are the rules that NDepend uses to decide if a type is immutable or not:

  • A type with at least one non-private instance field is considered mutable (because such a field can potentially be modified from outside the type).
  • A type that has a method that is not a constructor and that assigns an instance field is considered mutable.
  • A class that derives directly or indirectly from a mutable type is considered mutable.
  • Enumerations, static types, types defined in tier assemblies and delegate classes are never considered immutable. Although these types might match the definition of immutability, considering them immutable would distract developers who care about their own immutable types.
  • In particular, a class that derives directly or indirectly from a class defined in a tier assembly, other than System.Object, is never considered immutable.
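To see how the first three rules read against actual code, here is a sketch with made-up types. The comments reflect a reading of the rules listed above, not the output of a tool run:

```csharp
using System;

// Satisfies the rules: private instance fields only,
// and no method other than the constructor assigns them.
public sealed class ImmutablePoint {
    private readonly int m_X;
    private readonly int m_Y;
    public ImmutablePoint(int x, int y) { m_X = x; m_Y = y; }
    public int X { get { return m_X; } }
    public int Y { get { return m_Y; } }
    public ImmutablePoint Translate(int dx, int dy) {
        return new ImmutablePoint(m_X + dx, m_Y + dy);
    }
}

// Breaks the first rule: a non-private instance field.
public class PointWithPublicField {
    public int X;
}

// Breaks the second rule: a non-constructor method assigns an instance field.
public class PointWithSetter {
    private int m_X;
    public int X { get { return m_X; } }
    public void MoveTo(int x) { m_X = x; }
}

// Breaks the third rule: derives from a mutable type.
public class DerivedPoint : PointWithSetter { }

public static class ImmutabilityRulesDemo {
    public static void Main() {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.Translate(3, 4);
        Console.WriteLine(p.X + "," + p.Y);  // 1,2
        Console.WriteLine(q.X + "," + q.Y);  // 4,6
    }
}
```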

Besides the IsImmutable condition on types, we also added two conditions on methods: ChangesObjectState and ChangesTypeState. As the names suggest, ChangesObjectState matches methods that assign an instance field of their class, and ChangesTypeState matches methods that assign a static field of their class.

The condition ChangesObjectState matches constructors. A static method or a class constructor can also be matched, for example, if it modifies the state of an instance passed by reference. Notice also that an instance method of a type T that modifies an instance of T that is not the one referenced by the this reference is also matched.

The condition ChangesTypeState matches class constructors, constructors, instance methods and static methods.
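Here is a sketch of the kinds of methods these two conditions are about. Counter is a made-up class, and the comments follow the descriptions above rather than an actual tool run:

```csharp
using System;

public class Counter {
    private int m_Count;           // instance state
    private static int s_Created;  // type (static) state

    public Counter() {
        m_Count = 0;   // a constructor assigning an instance field
        s_Created++;   // ...and a static field
    }

    // Assigns an instance field of its class: changes the object state.
    public void Increment() { m_Count++; }

    // Assigns a static field of its class: changes the type state.
    public static void ResetStatistics() { s_Created = 0; }

    // Static method that modifies the state of an instance it receives.
    public static void Reset(Counter counter) { counter.m_Count = 0; }

    // Read-only accessors: they assign nothing.
    public int Count { get { return m_Count; } }
    public static int Created { get { return s_Created; } }
}

public static class StateChangeDemo {
    public static void Main() {
        Counter c = new Counter();
        c.Increment();
        Console.WriteLine(c.Count);   // 1
        Counter.Reset(c);
        Console.WriteLine(c.Count);   // 0
    }
}
```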

These 2 conditions can be used to see at a glance which methods can change the state of your program, in other words, which methods are mutable or non-const, or better said, which methods provoke side-effects. And as Wes Dyer wrote: One way to increase the reliability of a unit is to eliminate the side-effects.

SELECT METHODS WHERE ChangesObjectState AND !IsConstructor AND !IsClassConstructor

SELECT METHODS WHERE ChangesTypeState AND !IsConstructor AND !IsClassConstructor

ChangesObjectState can also be used like the good old C++ keyword const, which makes sure that a method won’t corrupt the object state:

WARN IF Count != 1 IN SELECT METHODS WHERE !ChangesObjectState AND FullNameIs "MyNamespace.Foo.MyConstMethod()"

  • Federico Orlandini

    Good post.

  • http://javarevisited.blogspot.com Javin Paul

    good explanation ,keep posted.

  • http://javarevisited.blogspot.com Javin Paul

    I have compiled some reasons for String immutability,

    See here

    http://javarevisited.blogspot.com/2010/10/why-string-is-immutable-in-java.html

  • http://codebetter.com/members/ya3mro/default.aspx ya3mro

    thnx for this great article but i have a question about ur sentence :
    Hashtables are almost a magic way to enhance dramatically performance of your code. (I said magic because under the hood hashtables rely on prime numbers properties and prime numbers are magic!)

    could u tell me what’s magic in usgin prime numbers ?

  • Duncan Godwin

    I’m evaluating NDepend at the moment with an eye to identifying types that potentially don’t use locking correctly. The above snippets look like they could save me significant time – thanks.

    Do you have any plans on adding direct support to help with detecting concurrency issues?

  • http://www.myspace.com/yamum yamum

    Patrick clearly cant write a blog to save his own life.

  • http://www.NDepend.com Patrick Smacchia

    Of course states are changing, don’t take me wrong.

    The idea is that when a thread is changing a state it needs to create a new object that is not visible from other threads as in…

    string str1 = "foofoo";
    string str2 = str1.Replace("foo", "FOO");

    …the object referenced by the str2 reference is not visible from other threads and still you get your new “FOOFOO” state.

  • David Heffernan

    I’m doing concurrency because I want to calculate lots of things in parallel. I want them to change. Immutable types don’t help concurrency much. Sure you can prevent accidental changes to things that shouldn’t change. Of course that’s just as valuable when coding single threaded. I really don’t think you can have done much concurrent programming.

  • http://weblogs.asp.net/bleroy Bertrand Le Roy

    Luca Bolognese has a very good series of articles on immutable types:
    http://blogs.msdn.com/lucabol/archive/2007/12/03/creating-an-immutable-value-object-in-c-part-i-using-a-class.aspx

  • http://codebetter.com/blogs/peter.van.ooijen/ pvanooijen

    Perhaps nice to note in the context:
    Since version 3(.1?) Resharper will give you hints on fields which can be declared as readonly. With code optimization just an alt-enter away.

  • http://www.NDepend.com Patrick Smacchia

    TIA,

    As explained, immutable types have numerous benefits and come at the cost of more memory consumption.
    There is no rule if business objects, technical objects, or any kind of other concern objects should be immutable or not. You have to evaluate case by case if it is worth it or not. Often, having some code with a lot of concurrent access is a good indication that immutability might be a good choice.

  • http://www.e-Crescendo.com jdn

    I’m glad you posted this as I’ve been thinking about it a bit lately.

    Since I’m not clear on any number of things about the concept, let me ask a basic question:

    Most of my ‘types’ are domain objects, like Order, Customer, etc.

    Can I apply the concept of immutability to them? If so, how? If not, then what does the concept of immutability bring me?

    TIA.

  • http://www.bluespire.com/blogs Christopher Bennage

    Good stuff! Thanks!