There is a powerful and simple concept in programming that I think is really underused: immutability.
Basically, an object is immutable if
its state doesn’t change once the object has been created. Consequently, a class
is immutable if its instances are immutable.
There is one killer argument for using immutable objects: it dramatically simplifies concurrent programming. Think about it: why is writing correct multithreaded code such a hard task? Because it is hard to synchronize thread accesses to resources (objects or other OS resources). Why is it hard to synchronize these accesses? Because it is hard to guarantee that there won’t be race conditions between the multiple write and read accesses done by multiple threads on multiple objects. What if there are no more write accesses? In other words, what if the state of the objects that the threads are accessing doesn’t change? There is no more need for synchronization!
Of course, I am simplifying here, so let's dig a bit deeper.
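To make the argument concrete, here is a minimal sketch, assuming a modern .NET runtime where System.Threading.Tasks.Parallel is available. The Rate class and the SharedReadDemo helper are hypothetical names invented for this illustration. Many threads read the same immutable Rate instance without any lock; only the running total, which is mutable shared state, still needs an atomic update.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical immutable type: both fields are readonly and set only in the constructor.
sealed class Rate
{
    public readonly string Currency;
    public readonly int Cents;

    public Rate(string currency, int cents)
    {
        Currency = currency;
        Cents = cents;
    }
}

static class SharedReadDemo
{
    // Every iteration may run on a different thread and reads the same Rate.
    // No lock protects the reads: nothing can write to a Rate after construction.
    public static long TotalCents(Rate rate, int maxQuantity)
    {
        long total = 0;
        Parallel.For(1, maxQuantity + 1, quantity =>
        {
            // The running total is mutable shared state, so it needs an atomic add.
            Interlocked.Add(ref total, (long)rate.Cents * quantity);
        });
        return total;
    }

    static void Main()
    {
        var rate = new Rate("EUR", 250);
        Console.WriteLine(TotalCents(rate, 100)); // 250 * (1 + 2 + ... + 100)
    }
}
```

The point of the sketch is where the synchronization ends up: it is needed only around the one piece of state that still changes.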
A famous immutable class
There is one famous immutable class: System.String. When you think that you
are modifying a string, you actually create a new string object. Often, we
forget about it and we would like to write …
string str = "foofoo";
str.Replace("foo", "FOO");
…where we need to write instead:
str = str.Replace("foo", "FOO");
Of course, doing so comes at the cost of creating multiple string objects in memory during intensive string computation. In that case, you should use the System.Text.StringBuilder class, which provides a safe way to work with mutable strings.
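As an illustration of that trade-off, here is a small sketch comparing the two approaches. The StringBuildingDemo class and its method names are made up for the example. Both methods return the same text, but the first one allocates a new string on every iteration, while the second appends into a mutable buffer.

```csharp
using System;
using System.Text;

static class StringBuildingDemo
{
    // Each += creates a new string object, because the existing one cannot change.
    public static string JoinWithConcat(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i + ";";
        }
        return result;
    }

    // StringBuilder appends into an internal mutable buffer and only
    // produces the final string when ToString() is called.
    public static string JoinWithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i).Append(';');
        }
        return builder.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinWithConcat(3));
        Console.WriteLine(JoinWithBuilder(3));
    }
}
```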
Actually, string objects are not that immutable. As far as I know, there are at least 2 ways to break string immutability: with pointers, as shown by this code example, and with some advanced System.Reflection usage.
So why did the .NET engineers decide that strings should be immutable? Because programmers will never get a race condition caused by a corrupted string. Also because strings are well suited to being keys in hashtables (such as System.Collections.Generic.Dictionary<K,V>). Hashtables are an almost magic way to dramatically enhance the performance of your code. (I say magic because under the hood hashtables rely on properties of prime numbers, and prime numbers are magic!) The objects on which the hash values are computed must be immutable to make sure that the hash values stay constant over time. Indeed, the hash value is computed from the state of the object (or possibly from a sub-state of the object, in which case only this sub-state must be immutable).
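Here is a short sketch of what goes wrong when this requirement is violated. MutableKey is a hypothetical type written for this illustration, whose hash value is computed from a field that can change. Once the key has been mutated, the dictionary can no longer find the entry it still contains.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash value depends on a mutable field.
sealed class MutableKey
{
    public int Id;

    public MutableKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode() { return Id; }
}

static class MutableKeyDemo
{
    // Returns whether the dictionary still finds the key after its state changed.
    public static bool StillFoundAfterMutation()
    {
        var key = new MutableKey(1);
        var table = new Dictionary<MutableKey, string>();
        table[key] = "value";   // stored under the hash value computed from Id == 1

        key.Id = 2;             // the state changes, and the hash value with it

        return table.ContainsKey(key);
    }

    static void Main()
    {
        Console.WriteLine(StillFoundAfterMutation()); // False
    }
}
```

The entry is still in the dictionary (its Count is unchanged), but it is filed under the old hash value, so a lookup with the mutated key misses it.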
Another cool thing about string immutability is that even though System.String is a class, string objects are compared by value, like a value type. This is possible because we can consider that the identity of an immutable object is its state. For example:
string str1 = "foofoo";
string strFoo = "foo";
string str2 = strFoo + strFoo;
// Even though str1 and str2 reference 2 different objects,
// the following assertion is true.
Debug.Assert(str1 == str2);
Purity vs. Side-effect
So we now know at least 3 great benefits of immutable objects:
- They simplify multithreaded programming.
- They can be used as hashtable keys.
- They simplify state comparison.
We can now be more general and say that the primary benefit of immutable types comes from the fact that they eliminate side-effects. I couldn’t say it better than Wes Dyer, so I quote him:
We all know that generally it is not a good idea to use global variables. This is basically the extreme of exposing side-effects (the global scope). Many of the programmers who don't use global variables don't realize that the same principles apply to fields, properties, parameters, and variables on a more limited scale: don't mutate them unless you have a good reason. (…)
One way to increase the reliability of a unit is to eliminate the side-effects. This makes composing and integrating units together much easier and more robust. Since they are side-effect free, they always work the same no matter the environment. This is called referential transparency.
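The difference between a unit with side-effects and a side-effect free one can be sketched as follows. The Pricing class is a hypothetical example, not taken from Wes Dyer's post. ImpurePrice depends on hidden mutable state, so the same call can return different results over time; PurePrice depends only on its arguments, so a call can always be replaced by its result.

```csharp
using System;

static class Pricing
{
    // Hidden mutable state: any caller of SetDiscount changes what ImpurePrice returns.
    static int s_Discount = 0;

    public static void SetDiscount(int discount) { s_Discount = discount; }

    // Not referentially transparent: the result depends on when it is called.
    public static int ImpurePrice(int basePrice) { return basePrice - s_Discount; }

    // Referentially transparent: the result depends only on the arguments.
    public static int PurePrice(int basePrice, int discount) { return basePrice - discount; }

    static void Main()
    {
        Console.WriteLine(ImpurePrice(100));   // 100
        SetDiscount(10);
        Console.WriteLine(ImpurePrice(100));   // 90: same call, different result
        Console.WriteLine(PurePrice(100, 10)); // 90, whatever happened before
    }
}
```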
Immutable classes in C#
C# supports immutability thanks to 2 keywords: const and readonly. They are used by the C# compiler to ensure that the state of a field won’t be changed once an object is created. Why 2 keywords? Because a readonly field can be assigned within the constructor(s) of its class, while a const field cannot: it is a compile-time constant (and implicitly static) that must be initialized in its declaration. For example:
class Article
{
    Article(string name, int price) {
        m_Name = name; // <- Compilation error
        m_Price = price;
    }
    const string m_Name = "Ballon";
    readonly int m_Price;
}
At this point you might wonder: what if my object references another object through a read-only field? Can the state of the referenced object change? The answer is yes: this is the classic shallow vs. deep distinction. Eric Lippert from the C# team has a nice post describing these different kinds of immutability. Actually, Eric wrote a great series of posts on how to code efficient immutable collections, which might sound paradoxical since collections are known as super-mutable objects. Check out his posts since November 2007! Eric also wrote:
Immutable data structures are the way of the future in C#.
So stay tuned!
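The shallow vs. deep question discussed above can be illustrated with a small sketch. ShallowBasket and DeepBasket are hypothetical classes written for this example, and the defensive copy is only one possible way to get deeper immutability. In both classes the field is readonly, but only the second one is protected against changes made through the list passed to the constructor.

```csharp
using System;
using System.Collections.Generic;

// Shallow: readonly protects the reference, not the list it points to.
sealed class ShallowBasket
{
    private readonly List<string> m_Items;

    public ShallowBasket(List<string> items) { m_Items = items; }

    public int Count { get { return m_Items.Count; } }
}

// Deeper: the constructor takes a private copy that nobody else can reach.
// The elements are strings, which are themselves immutable.
sealed class DeepBasket
{
    private readonly string[] m_Items;

    public DeepBasket(IEnumerable<string> items)
    {
        m_Items = new List<string>(items).ToArray();
    }

    public int Count { get { return m_Items.Length; } }
}

static class ShallowVsDeepDemo
{
    static void Main()
    {
        var list = new List<string> { "apple" };
        var shallow = new ShallowBasket(list);
        var deep = new DeepBasket(list);

        list.Add("pear"); // mutate the list from outside both baskets

        Console.WriteLine(shallow.Count); // 2: the basket's state changed
        Console.WriteLine(deep.Count);    // 1: the basket kept its own copy
    }
}
```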
Immutable closures in C#2
The functional coding style is better suited to immutable state than the imperative one. Wesner Moise often praises on his blog the use of functional closures as a means to achieve immutability. For example, here is a simple program that defines immutable objects. These objects are affine transformers that take an integer x and compute a*x+b. Thus, the state of our immutable affine transformer object is made of the a and b parameters. Here is the code:
using System.Diagnostics;
class Program
{
    delegate int DelegateType(int x);
    static DelegateType MakeAffine(int a, int b) {
        return delegate(int x) { return a * x + b; };
    }
    static void Main() {
        DelegateType affine1 = MakeAffine(2, 1);
        DelegateType affine2 = MakeAffine(3, 4);
        Debug.Assert(affine1(5) == 11); // 2*5+1 == 11
        Debug.Assert(affine2(6) == 22); // 3*6+4 == 22
    }
}
If you are not acquainted with closures, this code might surprise you. Behind your back, the C# compiler has created an immutable class to represent these affine transformer objects! I wrote an article about closures in C# that explains all this.
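To give an idea of what happens behind the scenes, here is a hand-written class that plays roughly the same role as the compiler-generated one. This is only a sketch: AffineClosure and ClosureSketch are names I chose, the real generated class has a compiler-chosen name, and it stores the captured variables in plain public fields rather than readonly ones. It behaves as immutable simply because nothing ever writes to a and b after they are captured.

```csharp
using System;

// Hand-written stand-in for the class the compiler generates for the closure.
sealed class AffineClosure
{
    private readonly int m_A;   // captured parameter a
    private readonly int m_B;   // captured parameter b

    public AffineClosure(int a, int b)
    {
        m_A = a;
        m_B = b;
    }

    // Body of the anonymous method.
    public int Invoke(int x) { return m_A * x + m_B; }
}

static class ClosureSketch
{
    // Returns a delegate bound to the Invoke method of a new closure object.
    public static Func<int, int> MakeAffine(int a, int b)
    {
        return new AffineClosure(a, b).Invoke;
    }

    static void Main()
    {
        Func<int, int> affine1 = MakeAffine(2, 1);
        Func<int, int> affine2 = MakeAffine(3, 4);
        Console.WriteLine(affine1(5)); // 2*5+1 == 11
        Console.WriteLine(affine2(6)); // 3*6+4 == 22
    }
}
```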
Immutable anonymous types in C#3 and VB9
C#3 comes with the interesting anonymous types feature. Anonymous types built by the C#3 compiler are immutable: all fields are private and all properties are read-only (it is always instructive to check for yourself with Reflector).
var affine = new { A = 3, B = 4 };
affine.A = 3; // <- Compilation error
In this post, Tim Ng from the VB.NET team wrote:
The motivating factor for driving the immutable
anonymous types was because the LINQ APIs used hash tables internally and
returning projections of anonymous types that could be modified was a dangerous
situation.
Interestingly enough, the VB team
decided that their anonymous types wouldn’t be immutable by default, which
means that you can write such things:
Dim affine = New With {.A = 3, .B = 4}
affine.A = 5
However, the VB team added the
possibility to specify that a certain property would be read-only thanks to the
Key keyword:
Dim affine = New With {Key .A = 3, .B = 4}
affine.A = 5 ' <- Compilation error
About the motivation behind having mutable anonymous types in VB, Tim Ng wrote:
…but for Visual Basic, we decided that because
we have the ability to late bind on top of anonymous types, making them
immutable is unexpected.
Immutable support in NDepend and CQL
As we saw, immutability is a feature that can be enforced at compile-time. In other words, it can be enforced by static analysis tools. Thus, the Code Query Language (CQL) that comes with the static analysis tool NDepend has an IsImmutable condition that applies to types. Finding out which types of your code base are immutable is as easy as writing this CQL query:
SELECT TYPES WHERE IsImmutable
To constrain a particular type MyNamespace.Foo to be immutable, you can write this CQL constraint:
WARN IF Count != 1 IN
SELECT TYPES WHERE IsImmutable AND FullNameIs "MyNamespace.Foo"
To constrain a range of types used by the class MyNamespace.Foo to be immutable:
WARN IF Count > 0 IN
SELECT TYPES WHERE IsUsedBy "MyNamespace.Foo"
AND !IsImmutable
To constrain a range of types declared in the namespace MyNamespace to be immutable:
WARN IF Count > 0 IN
SELECT TYPES FROM NAMESPACES "MyNamespace" WHERE !IsImmutable
To constrain a range of types tagged with an attribute MyNamespace.MyImmutableAttribute to be immutable:
WARN IF Count > 0 IN
SELECT TYPES WHERE IsDirectlyUsing "MyNamespace.MyImmutableAttribute"
AND !IsImmutable
In the near future we will add the condition HasAttribute and we will be able to write this more properly:
WARN IF Count > 0 IN
SELECT TYPES WHERE HasAttribute "MyNamespace.MyImmutableAttribute"
AND !IsImmutable
There is something about immutability that we didn’t mention: all primitive value types (int, double, decimal, byte, bool…) are immutable. More generally, it is recommended to make all structures immutable. It makes them suited for hash value computation and for equivalence comparison, and creating more structure instances won’t overwhelm the GC. To guarantee that all your structures are immutable, just write this constraint:
WARN IF Count > 0 IN
SELECT TYPES WHERE IsStructure AND !IsImmutable
So, what’s behind the IsImmutable CQL condition? Here are the rules that NDepend uses to decide whether a type is immutable or not:
- A type with at least one non-private instance field is considered as mutable (because such a field can possibly be modified outside of the type).
- A type that has a method that is not a constructor and that assigns an instance field is considered as mutable.
- A class that derives directly or indirectly from a mutable type is considered as mutable.
- Enumerations, static types, types defined in tier assemblies and delegate classes are never considered as immutable. Although these types might match the definition of immutability, considering them as immutable would distract developers who care about their own immutable types.
- In particular, a class that derives directly or indirectly from a class defined in a tier assembly, other than System.Object, is never considered as immutable.
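To get a feel for the first rule in the list above, here is a rough approximation written with System.Reflection. It is not how NDepend implements IsImmutable, and it covers the first rule only (the second one would require analyzing method bodies). ExposedPoint, HiddenPoint and FirstRuleSketch are hypothetical names used for the illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical type with a non-private instance field: mutable under the first rule.
sealed class ExposedPoint
{
    public int X;
}

// Hypothetical type whose only instance field is private.
sealed class HiddenPoint
{
    private readonly int m_X;

    public HiddenPoint(int x) { m_X = x; }

    public int X { get { return m_X; } }
}

static class FirstRuleSketch
{
    // True if the type declares at least one instance field that is not private.
    public static bool HasNonPrivateInstanceField(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public
                                 | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
        return type.GetFields(flags).Any(field => !field.IsPrivate);
    }

    static void Main()
    {
        Console.WriteLine(HasNonPrivateInstanceField(typeof(ExposedPoint))); // True
        Console.WriteLine(HasNonPrivateInstanceField(typeof(HiddenPoint)));  // False
    }
}
```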
Besides the IsImmutable condition on types, we also added 2 conditions on methods: ChangesObjectState and ChangesTypeState. As the names suggest, ChangesObjectState matches methods that assign an instance field of their class and ChangesTypeState matches methods that assign a static field of their class.
The condition ChangesObjectState matches constructors. A static method or a class constructor can also be matched, for example if it modifies the state of an instance passed by reference. Notice also that an instance method of a type T that modifies an instance of T other than the one referenced by the this reference is also matched.
The condition ChangesTypeState matches class constructors, constructors, instance methods and static methods.
These 2 conditions can be used to see at a glance which methods can change the state of your program, in other words, which methods are mutating or non-const, or, better said, which methods cause side-effects. And as Wes Dyer wrote: One way to increase the reliability of a unit is to eliminate the side-effects.
SELECT METHODS WHERE ChangesObjectState AND !IsConstructor AND !IsClassConstructor
SELECT METHODS WHERE ChangesTypeState AND !IsConstructor AND !IsClassConstructor
ChangesObjectState can also be used like the good old C++ keyword const, which makes sure that a method won’t corrupt the object state:
WARN IF Count != 1 IN SELECT METHODS
WHERE !ChangesObjectState AND FullNameIs "MyNamespace.Foo.MyConstMethod()"
Posted Sun, Jan 13 2008 12:58 PM by Patrick Smacchia