In a previous post Lesson
learned from a real-world focus on performance,
I explained how we increased the NDepend
static analysis process performances with a 4 factor. The post quantified
performance gain, and interestingly enough 25% came from micro-optimizations.
Basically the lesson learned was:
- Micro-optimizations
are worth it while premature optimizations are the root of evil (as Donald Knuth said a long
time ago). The difference between micro-optimizations and premature
optimizations is that the first ones are driven by measurement while the second
ones are driven by guesses, intuitions and hunches.
Today, I would like to enumerate
some micro-optimization tips we found effective. Please, constantly keep in mind
that these tips are not advices on how you should develop all your code.
- Do your code your way,
- then measure performance,
- then apply
micro-optimization tips where performance problems are detected.
Also as explained in the
previous post, the bulk of performance gain will come anyway from better algorithms.
Micro-optimizations should come after having rationalizing your algorithms. Keep the bang for buck tenet in mind: Invest first your time and energy in things that will give you the best value.
Prefer for loops over foreach loops
I wrote a controversial
post,
claiming that using for loop instead of foreach loops has been proven to be effective for us. Try for your code and see if it also works for you.
Prefer array over List<T> if possible
In the loop post, I also explained
that using arrays instead of List<T>
is also a good way to enhance performance. This is not a surprise considering
that on one hand the CLR and the IL are both optimized to work with arrays, and
on the other hands, List<T>
implementation relies on arrays. It is interesting to decompile List<T> and see that it comes with
a _items : T[] field assigned each
time that the capacity of the list is changing. Of course arrays length remains
constant and this is why List<T>
is sometime more suited. We even noticed that when doing intensive loops on a list, it might be worth getting a temporary array from the List<T> through the method List<T>.ToArray(), and then perform your loops on the array.
Refactor your algorithms to rely on hash table
constant time search
The absolutely killer
performance collections are hash tables, implemented through HashSet<T>
and Dictionary<K,V>. Here also I wrote a post
on comparing different hash table implementations performance.
Obviously, hash
table should be used only when there is the need to look out for an
object
amongst many other objects. But with a bit of astute, it seems to me
that most
of algorithms can be rewritten in order to rely on the hash table magic
which is: checking if a set contains a particular object, in a constant time (i.e independently from the size of the set) and perform
a search operation in a set in a constant time. More explanation here.
Maybe the hash table
trick is more an algorithm rewriting tips than a micro-optimization tips, but I
really do think it is essential to spread the word about it.
Use CLR primitive types instead of typed data
The set of primitive type
is pretty limited: integers types (int, short, ulong, byte…), float numbers types
(double, float, decimal), boolean, string and char (maybe we could add also enumerations). But think about it: no
matter how complex an object grape is, it is an aggregation of data stored with
primitive types instances + maybe some unmanaged handles.
The fact is that both the
CLR/JIT compiler and the IL language are super optimized to cop with primitive
types. So it is performance wise to use a primitive types instead of a
typed data. For example, in .NET paths are represented with simple string. This
is something that really bothers me and for NDepend we created our own
library to represent and manipulate path: NDepend.Helpers.FilePathDirectory.
With this library paths are typed and frankly when doing intensive path
operations this is a blessing. But if in the future we figure out that some specific
algorithms relying on typed paths are costly, we won’t hesitate to temporarily represent
paths with raw strings.
Use POCO instead of aggregated object
POCO stands for Plain Old
CLR Object.
A POCO is an object that doesn’t reference nor require complex infrastructure
to be used. A complex infrastructure can be a database connection or a file
handle. IMHO (and only In Humble My Opinion) a complex infrastructure can be any object that is not a CLR
primitive type. A POCO is then an object that doesn’t use anything else that
just CLR primitive types. As mentioned above, primitive types are optimized and
having POCO is consequently better for performance. Another good advocate for POCO is that such objects increase also
design and maintenance. Indeed, POCO avoid creating overly complex custom
classes interactions.
Notes: There are debates in comments of this post about my quick definition of POCO. I won't play with words and, please, refer to the POCO wikipedia definition, which is not very different from what I wrote anyway.
Bufferize properties getter result in local
variable
Here is an example:
void MethodNotOptimized(Order order) {
string firstName = order.Client.PersonName.FirstName;
string lastName = order.Client.PersonName.LastName;
}
void MethodOptimized(Order order) {
PersonName personName = order.Client.PersonName;
string firstName = personName.FirstName;
string lastName = personName.LastName;
}
No matter how smart is the
JIT compiler, I doubt it does the analysis to see that the object returned by
some property getters (here order.Client.PersonName) is constant.
Of course here the performance gain is very small, it is just saving 2 getter/method
calls. But browse your code and you’ll see that this tips can be applied also in
many costly scenario, especially those involving loops, meaning you can save N getter/methods calls.
This example shows well
that unfortunately, micro-optimization often degrade the code elegance.
void MethodNotOptimized(Order order) {
foreach(Product product in order.Products) {
product.SetClientName(
order.Client.PersonName.FirstName,
order.Client.PersonName.LastName);
}
}
void MethodOptimized(Order order) {
PersonName personName = order.Client.PersonName;
string firstName = personName.FirstName;
string lastName = personName.LastName;
IList<Product> products = order.Products;
int productsCount = products.Count;
for(int i=0; i< productsCount; i++) {
Product product = products [ i ] ;
product.SetClientName(firstName, lastName);
}
}
Non-private fields can make sense
Indeed, and I know that
this one might shock you. But even if the JIT compiler might be smart enough to do it for you, it can
save a few CPU cycles to access directly a field instead of using a property. I
personally use this trick scarcely, when I measured that this will indeed
improve performance. And some algorithms like ranking computation spend 60% of the time reading and assigning fields.
If possible, use the readonly keyword if you have the chance
to have an immutable field. For other scenario, I still protect my field accesses
with a CQL rule
that warns if the field is not read or assigned in specific scenario. And harnessing
the possibility to declare CQL rule in source code is particularly interesting
here:
using NDepend.Framework.CQL;
class Foo {
[CQLConstraintAttribute(
@"// <Name>m_Field should only be witten by FooUser.MethodWriter().</Name>
WARN IF Count > 0 IN SELECT METHODS WHERE
IsDirectlyWritingField ""Foo.m_Field"" AND
!NameIs ""FooUser.MethodWriter(Foo)""
", Active = true, DisplaySelectionViewInReport = true)]
internal int m_Field;
}
class FooUser {
internal void MethodWriter(Foo foo) {
foo = 3;
}
}
Test tier API to get the best of it
When performance is an
issue, it can be worth to re-think about how your code is using tier API. I have
a real-world example in mind. Once I wrote code that first checked if a file
existed, and then read the file last write time. After testing, it appears to
be cheaper to use a FileInfo object
to perform these 2 operations. I suppose that the FileInfo class bufferises some information to avoid some sort of path
operation the second time.
using System.IO;
void MethodNotOptimized(string path) {
if(File.Exists(path)) {
DateTime lastWriteTime = File.GetLastWriteTime(path);
}
}
void MethodOptimized(string path) {
FileInfo fileInfo = new FileInfo(path);
if(fileInfo.Exists(path)) {
DateTime lastWriteTime = fileInfo.LastWriteTime;
}
}
Know which branches of your code are the most used
And this last example illustrates
an important thing: know which branches of your code is the most used. In this previous
example, I knew that most of the time the file was supposed to exist. Thus I
didn’t even have to test if creating a FileInfo
object and then check for the file existence was more costly than just calling File.Exists().
Posted
Sun, Apr 19 2009 2:28 PM
by
Patrick Smacchia