An easy and efficient way to improve .NET code performance

 

Currently, I am getting serious about measuring and improving the performance of .NET code. I’ll post more about this topic in the coming weeks. Today I would like to present an effective optimization I found on something we all use massively: loops on collections. I am talking here about the cost of the loop itself, i.e. the for and foreach instructions, not the cost of the code executed in the loop. I basically found that:

 

  • for loops on List<T> are a bit more than 2 times cheaper than foreach loops on List<T>.
  • Looping on an array is around 2 times cheaper than looping on a List<T>.
  • As a consequence, looping on an array using for is 5 times cheaper than looping on a List<T> using foreach (which, I believe, is what we all do).

 

The foreach syntactic sugar is a blessing, and for many complex loops it is not worth transforming foreach into for. But for certain kinds of loops, such as short loops made of one or two instructions nested inside other loops, this 5x factor can become a huge bonus. And the good news is that the code of NDepend has plenty of these nested short loops, because of algorithms such as Ranking, Level and dependencies transitive closure computation.

 

Here is the code used to benchmark all this:
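
(For reference, here is a minimal sketch of the kind of benchmark described, reusing the method names that appear in the results; the collection size and exact structure are assumptions, not the original code.)

using System;
using System.Collections.Generic;
using System.Diagnostics;

class LoopBenchmark
{
    const int Size = 10000000;   // assumption: the original collection size is not known
    static int s_sink;           // prevents the loop bodies from being optimized away

    static void Main()
    {
        var list = new List<int>(Size);
        for (int i = 0; i < Size; i++) list.Add(i);
        int[] array = list.ToArray();

        Measure("IterateForeachOnList", () => IterateForeachOnList(list));
        Measure("IterateForOnListWithCountOptimization", () => IterateForOnListWithCountOptimization(list));
        Measure("IterateForeachOnArray", () => IterateForeachOnArray(array));
        Measure("IterateForOnArrayWithCountOptimization", () => IterateForOnArrayWithCountOptimization(array));
    }

    static void IterateForeachOnList(List<int> list)
    {
        foreach (int i in list) s_sink = i;
    }

    static void IterateForOnListWithCountOptimization(List<int> list)
    {
        int count = list.Count;                  // Count hoisted out of the loop
        for (int i = 0; i < count; i++) s_sink = list[i];
    }

    static void IterateForeachOnArray(int[] array)
    {
        foreach (int i in array) s_sink = i;
    }

    static void IterateForOnArrayWithCountOptimization(int[] array)
    {
        int length = array.Length;               // Length hoisted out of the loop
        for (int i = 0; i < length; i++) s_sink = array[i];
    }

    static void Measure(string name, Action action)
    {
        var watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        Console.WriteLine("{0}: {1}", name, watch.ElapsedTicks);
    }
}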

 

 

 
Here are the results obtained with the System.Diagnostics.Stopwatch class:

 


 

The fact that for is more efficient than foreach comes from the fact that foreach uses an enumerator object behind the scenes.
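
To make this concrete, here is roughly what a foreach over a List<T> expands to, compared with the equivalent for loop (simplified sketch):

using System.Collections.Generic;

class ForeachVersusFor
{
    static int s_sink;

    static void IterateForeachOnList(List<int> list)
    {
        // "foreach (int value in list) s_sink = value;" roughly expands to:
        List<int>.Enumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())      // one MoveNext() call per item...
            {
                s_sink = enumerator.Current;   // ...plus one Current access per item
            }
        }
        finally
        {
            enumerator.Dispose();
        }
    }

    static void IterateForOnList(List<int> list)
    {
        // The for version simply indexes the list.
        int count = list.Count;
        for (int i = 0; i < count; i++)
        {
            s_sink = list[i];
        }
    }
}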

 

The fact that iterating on an array is cheaper than iterating on a List<T> comes from the fact that the IL has instructions dedicated to array iteration. Here is the Reflector view of the methods:
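
In substance, the array loop compiles down to dedicated IL opcodes (ldlen for the length, ldelem.i4 for int element loads), while the List<T> loop goes through the Count and indexer property getters (callvirt). A C# sketch of the two shapes, with the relevant IL noted in comments:

using System.Collections.Generic;

static class ArrayVsListIteration
{
    // Array iteration: the loop bound is array.Length (IL: ldlen) and each
    // element access is a dedicated IL instruction (ldelem.i4 for int), which
    // also lets the JIT remove the bounds check inside the loop.
    static int SumArray(int[] array)
    {
        int sum = 0;
        for (int i = 0; i < array.Length; i++)
            sum += array[i];
        return sum;
    }

    // List<T> iteration: Count and the indexer are property calls at the IL
    // level (callvirt get_Count and callvirt get_Item), which the JIT may or
    // may not inline.
    static int SumList(List<int> list)
    {
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
            sum += list[i];
        return sum;
    }
}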

 

 

 

In the console output, we can see that foreach on an array is a tiny bit more costly than for on an array. Interestingly enough, I just checked that the C# compiler doesn’t use an enumerator object behind the scenes for foreach on an array.
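
Indeed, the compiler lowers a foreach over an array into an index-based loop, roughly like this (sketch):

static void IterateForeachOnArray(int[] array)
{
    // "foreach (int value in array) { /* body */ }" is lowered by the compiler
    // into an index-based loop, with no enumerator object involved:
    int[] temp = array;                    // the array reference is read once
    for (int i = 0; i < temp.Length; i++)
    {
        int value = temp[i];
        // ... loop body ...
    }
}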

 

One more interesting finding: the profiler dotTrace v3.1 gives some unrealistic results for this benchmark. IterateForeachOnList is deemed more than 40 times more costly than IterateForOnArray (instead of 5 times). I tested both the RecordThreadTime and RecordWallTime modes of the profiler:

 

 

 

Thus I also ran the benchmark with ANTS Profiler v4.1 and got more realistic results, though still not perfect (IterateForeachOnList is now deemed 8 times more costly than IterateForOnArray, instead of 5 times).

 

 

 

You can download the code of the benchmark here. All tests were done in release compilation mode.

  • Yogesh

    BTW: I forgot to mention that I took the third iteration’s time so that none of the time goes in initialisation or overheads.

  • Yogesh

    Dear Patrick,

    This article is more of a generalised one. The results are not always true. Here is some code which uses a View Model for a WPF tree traversal, and the results are there for you to see:
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int groupCount = commandTree.Count;
    for (int i = 0; i < groupCount; i++)
    {
        var subGroupList = commandTree[i].Children;
        int subGroupCount = subGroupList.Count;

        for (int j = 0; j < subGroupCount; j++)
        {
            var itemList = subGroupList[j].Children;
            int itemCount = itemList.Count;

            for (int k = 0; k < itemCount; k++)
            {
                action(itemList[k]);
            }
        }
    }
    sw.Stop();
    Trace.WriteLine("Part 0 EvaluteCommandTree: " + sw.ElapsedTicks);

    sw.Reset();
    sw.Start();
    foreach (CommandViewModel group in commandTree)
    {
        foreach (CommandViewModel subGroup in group.Children)
        {
            foreach (CommandViewModel item in subGroup.Children)
            {
                action(item);
            }
        }
    }
    sw.Stop();
    Trace.WriteLine("Part 1 EvaluteCommandTree: " + sw.ElapsedTicks);

    The result is:
    Part 0 EvaluteCommandTree: 30392
    Part 1 EvaluteCommandTree: 943

    At least it shows that complex for loops can be slower than complex foreach loops!!

  • http://blogs.msdn.com/lucian Lucian Wischik

    I couldn’t replicate your results. For me, “for” on a list and “foreach” on a list had pretty much identical performance. (on some runs one was better, on some runs the other was better). With this code:

    public class Program
    {
        static int store;
        static void Main(string[] args)
        {
            var max = 100000000;
            System.Console.WriteLine("constructing");
            var xs = new System.Collections.Generic.List<int>(max);
            for (int i = 0; i < max; i++) xs.Add(i);
            System.Console.WriteLine("testing");
            for (int iloop = 1; iloop <= 2; iloop++)
            {
                System.Console.WriteLine("for");
                var length = xs.Count;
                var s = new System.Diagnostics.Stopwatch();
                s.Start();
                for (int i = 0; i < length; i++) store = xs[i];
                s.Stop();
                System.Console.WriteLine("for: {0}ms", s.ElapsedMilliseconds);
                s.Reset(); s.Start();
                foreach (int i in xs) store = i;
                s.Stop();
                System.Console.WriteLine("foreach: {0}ms", s.ElapsedMilliseconds);
            }
        }
    }

  • http://www.NDepend.com Patrick Smacchia

    Greg, looking at IL is not my approach; my approach is to measure performance with Stopwatch and also with .NET profilers.

    I look at IL just to give a coherent explanation of the results exposed, not vice versa. I agree with you: just looking at IL without measuring performance doesn’t make sense.

  • Greg

    Patrick for discussion:

    http://codebetter.com/blogs/gregyoung/archive/2006/06/11/146343.aspx

    Your idea of looking at IL is fundamentally flawed as the CLR has an optimizing JIT as well.

    Cheers,

    Greg

  • http://www.NDepend.com Patrick Smacchia

    Here are the results, Paulo. On List<T>, using List<T>.ForEach() has the same cost as foreach, while on array it has a much higher cost. In this example, the cost of the extra delegate invocation seems to be the same as the cost of foreach.

    IterateForeachOnList: 5196157
    IterateForOnListWithoutCountOptimization: 2239475
    IterateForOnListWithCountOptimization: 1838549
    IterateListForeachWithDelegate: 5109180

    IterateForeachOnArray: 1011873
    IterateForOnArrayWithoutCountOptimization: 1027431
    IterateForOnArrayWithCountOptimization: 1020517
    IterateArrayForeachWithDelegate: 4689408
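
    For reference, the two delegate-based variants look roughly like this (sketch): List<T>.ForEach and Array.ForEach both take an Action<T>, so the extra cost is one delegate invocation per item.

    static int s_sink;

    static void IterateListForeachWithDelegate(List<int> list)
    {
        list.ForEach(i => s_sink = i);          // one delegate invocation per item
    }

    static void IterateArrayForeachWithDelegate(int[] array)
    {
        Array.ForEach(array, i => s_sink = i);  // one delegate invocation per item
    }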

  • http://PauloMorgado.NET/ Paulo Morgado

    Hi Patrick,

    Have you tried to benchmark Array.ForEach and List.ForEach?

    I would expect the use of array.Length in the for loop to be more performant than the use of a variable. Was this a debug build?

    Rico Mariani’s (http://blogs.msdn.com/ricom/) advice is always to measure. Looks like that’s what you did and what everyone should do. Your analysis shows that always using foreach is not always good but should not be inferred that using foreach is always bad.

  • http://www.NDepend.com Patrick Smacchia

    > This has been discussed to death throughout the history of .NET.

    wekempf, I didn’t find any detailed analysis with a solid benchmark on the web, so from my point of view it hasn’t been discussed at all. One needs to test and measure before rambling on a subject; this is the only way scientists can work.

    Applying this trick to some parts of my code improved performance by more than 50%, so I thought that readers could benefit from it.

    If I had only listened to mainstream advice such as ‘don’t switch to for’, I would have missed this huge performance improvement.

  • Vincent Jacquet

    IEnumerable has its virtue: isolation from the type of collection and its count of items, possible execution in parallel (the next value begin prepared on a thread while the current one is handled on another), possible lower memory footprint (once items are enumerated, they can be recycled).

    But of course, when dealing with optimization, arrays are often a better choice. Not only their “enumeration” using a for loop is faster, but they are also easier to use in “divide for conquer” types of parallel algorithm.

    Copying data from a List into an T[] can also provide some isolation from the data structure in the UI thread, while processing them in several background threads.
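
    For example, something along these lines (sketch; Item, items and Process are placeholders):

    Item[] snapshot = items.ToArray();      // snapshot taken once on the UI thread

    int half = snapshot.Length / 2;
    var t1 = new Thread(() => Process(snapshot, 0, half));
    var t2 = new Thread(() => Process(snapshot, half, snapshot.Length));
    t1.Start(); t2.Start();
    t1.Join(); t2.Join();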

  • http://wekempf.spaces.live.com wekempf

    This has been discussed to death throughout the history of .NET.

    On an Array, for vs. foreach is a wash. On other types, it’s basically anyone’s guess, but you can expect them to be no better than the same operation on an Array.

    However, as Eric Gunnerson said on this topic (http://blogs.msdn.com/ericgu/archive/2004/04/18/115566.aspx), changing to a for loop is generally a premature optimization. If you’ve got specific code that’s proven to be a bottleneck, and the switch works for you, then great! But this isn’t a general optimization, and it shouldn’t just be applied to code under the assumption that it’s universally a good idea. There are benefits to foreach even in the cases where it’s less performant, so if it’s not proven to be a hot spot in your code, do NOT simply switch to using for.

  • Vincent Jacquet

    @Sergey: It looks like the “int j = i;” inside the foreach loop is removed in release mode. This could explain why it is a little bit faster than expected.

  • http://shishkin.org Sergey Shishkin

    After modifying your tests to eliminate JIT compilation (pre-running all the test methods before the measurements) and compiling them in “Release”, I got different results. Foreach over List<T> is still slower than everything else, but not as dramatically:

    IterateForeachOnList: 2398126
    IterateForOnListWithoutCountOptimization: 1317010
    IterateForOnListWithCountOptimization: 1336468
    IterateForeachOnArray: 1039196
    IterateForOnArrayWithoutCountOptimization: 1033318
    IterateForOnArrayWithCountOptimization: 879804
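
    The warm-up looks roughly like this (sketch, method names as in the benchmark):

    // Call each test once, unmeasured, so everything is JIT-compiled up front.
    IterateForeachOnList(list);
    IterateForOnListWithCountOptimization(list);
    IterateForeachOnArray(array);
    IterateForOnArrayWithCountOptimization(array);

    // Then run the measured passes.
    var watch = Stopwatch.StartNew();
    IterateForeachOnList(list);
    watch.Stop();
    Console.WriteLine("IterateForeachOnList: {0}", watch.ElapsedTicks);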

  • http://www.NDepend.com Patrick Smacchia

    What a nice idea, to use a tool like PostSharp to tweak the ‘foreach IL’ and transform it into ‘for IL’. This way we could get the best of both worlds: the syntax plus the performance.

  • Kamran Sahhid

    Very Nice Article

  • will

    I haven’t used PostSharp yet but maybe it could be used to swap out loop code without mangling code readability?

  • aaronjensen

    ah yea, you’re right. my head was in a distant land.

  • http://www.cuttingedge.it/blogs/steven/pivot/entry.php?id=33 Steven

    aaronjensen,

    There is no boxing/unboxing happening here.

  • http://www.NDepend.com Patrick Smacchia

    Actually there is no boxing/unboxing here. Arrays have been ‘generic’ since .NET v1.0 and don’t lead to boxing/unboxing. List<T> doesn’t lead to any boxing/unboxing either; this is one of the big benefits of the generics implementation in .NET (and a major advantage over Java).

    I did the test with string instead of int and the result is even more interesting:

    IterateForeachOnList: 947 956
    IterateForOnListWithoutCountOptimization: 529 602
    IterateForOnListWithCountOptimization: 486 664
    IterateForeachOnArray: 228 194
    IterateForOnArrayWithoutCountOptimization: 178 033
    IterateForOnArrayWithCountOptimization: 178 902

  • aaronjensen

    Have you tried this with a non-primitive type? I wonder if unboxing the int skews the results.