LINQ Performance: Some Case Studies

One essential requirement while developing Code Query and Rule through LINQ (CQLinq) has been performance, both performance of query compilation and performance of query execution. The reason is simple: we want hundreds of CQLinq rules to be verified right inside Visual Studio in a few seconds, to let the developer be quickly informed of problems. Hence, we’ve targeted and achieved 100 queries compiled/executed per second against a real-world large code base.

To achieve this goal, a large research effort has been made. In the product itself, a query is compiled and then executed live at edition time. In the screenshot below, we can see the query execution duration measurer at the top right (here 4 millisecond).

This instant measurer has been proven very useful to optimize CQLinq and the 200 default rules. Hopefully we’ve learnt a lot about LINQ performances. Several re-usable patterns are listed in the documentation CQLinq Performance and I’d like here to digress a bit deeper on these findings.

HashSet<T>, Dictionary<K,V>, Lookup<K,V> my dear friends

Often in my past posts, I claimed that hash tables (HashSet<T>, Dictionary<K,V>, Lookup<K,V) are one of the best tool to optimize performances. Of course they played a major role to optimize many code queries. Last year I wrote about LINQ Intersect() 2.7x faster with HashSet and in the v4 RTM, the concrete conclusion of this study is the class NDepend.Helpers.ExtensionMethodsSet of the NDepend.API. For example without the single call to ToHashSet() the query below would be 200 times slower! This is because the call to one overload of Intersect() is very fast, because it takes account of the fact that one of the collections to intersect is a HashSet<T> with O(1) search:

It is disappointing that these simple, common yet efficient optimizations are not proposed by default by the .NET Fx.

Below is the rule Do not hide base class methods source code. It took us significant effort to make it fast enough for our requirement. Here again the key to high performance has been a hash table, more specifically a lookup table. Methods are indexed by their names (including their signature) in the lookup table to let us retrieve quickly the set of methods potentially hidden by a method:

Another case where HashSet<T> was the key to performance, is the rule Avoid naming types and namespaces with the same identifier. Here we want to match types whose name is also a namespace simple name (for example the simple name of the namespace System.Collections.Generic is Generic) . The fastest approach is to first create a hashset containing all namespaces simple name, and then for each type, check if some namespace share the same name:

Tricky Performance Enhancement

Some rules were pretty tricky to write to achieve acceptable performance. One of them is Methods should be declared static if possible. This rule is especially computation intensive since for each instance method we need to check if it is not using any other instance method or field of its parent type, or any of its base class. Here two optimisations were harnessed:

  • First, we filter types that cannot have instance method to turn into static method (interface, static class, delegates…). Then we filter methods of selected types (static, abstract, virtual, access the this reference). Filtering first objects with trivial/fast-to-check criterias is the rule of thumb to optimize LINQ queries.
  • Second, we call the extension method FirstOrDefault() to evaluate as fast as possible if the instance method is calling another instance member. This is a good example where FirstOrDefault() can replace and be faster than a call to Count() or Any(). Many approach were tried here, like first getting all base classes of the parent type and reuse this set accross all members. But the query execution measurer tells us without any doubt that this approach is the fastest one.

LINQ Expression and Performances

Before compiling a CQLinq query with the C# compiler, we’ve included a smart preprocessor that solves in a variety of cases, the classic dilemma readability vs. performance. For example, if you wish to use a regular expression in your query, it is tedious to first create and compile your regular expression object, and then use it. It would be more readable to create, compile and use the regular expression in the same code spot, but it would be terrific for performance if this code spot is in a loop.

This is why when calling methods like NameLike() like for example, in the rule below, Methods name should begin with an Upper character, the CQLinq preprocessor transforms the LINQ expression to:

  1. relieve the user from creating and compiling the regular expressions.
  2. make sure that the regular expression is created and compiled just once.

This particular LINQ expression magic trick has been used extensively to make many rules both more readable and more performant. This is why we’ve created the dedicated namespace NDepend.Reserved.CQLinq that contains these methods whose calls are transformed by the CQLinq preprocessor. Obviously these methods shouldn’t be used from a program that relies on NDepend.API since in this situation, there is no CQLinq preprocessor.

A Word on Time-Out

On a related topic, we’ve also implemented a tunable time-out set to 2 seconds by default for CQLinq query execution. Internally, the time-out manager code is based on this great piece of OSS code that is concise, elegant, efficient and AFAIK bug-free.

Conclusion

Clearly pretty much any realistic performance goal is reachable with a sufficient amount of effort. One more time, with a lot of effort we’ve verified this tenet.

We can observe the same gain on the Visual Studio side. VS2010 was so slower than VS2008 that personally, I never really upgraded to VS2010. But having used VS2012 Beta and RC for months, I can attest that the performance boost is huge. I can’t wait to upgrade all our dev from VS2008 to VS2012 RTM :)

This entry was posted in CQLinq, LINQ, Performance. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • Patrick Smacchia

    Yes we did and using array is always faster than using anything else, including yield return. Arrays come with their own IL and processor instruction and array is the mother of all other collections implementation. 

    Especially in the method ToEnumerable() it is faster to return new[] { element };than toyield return elem;

  • Robert Slaney

    Just out of interest, have you tried the same benchmarking using yield return instead of building internal lists/arrays like the WithNameIn method does.