Rambling on Cyclomatic Complexity


After the number of Lines of Code, Cyclomatic Complexity (CC) is certainly the second most widely used code metric. Here is its definition from Wikipedia:

Cyclomatic complexity (or conditional complexity) is a software metric (measurement). It was developed by Thomas J. McCabe Sr. in 1976 and is used to measure the complexity of a program. It directly measures the number of linearly independent paths through a program's source code.

Despite its formal definition (number of linearly independent paths), CC is a practical code metric. Indeed, there is an intuitive way to compute it: the CC value of a method is 1 + {the number of conditional statements contained in the method body}. A conditional statement is something very intuitive; in C#, a conditional statement is a statement in the group…

if | while | for | foreach | case | default | continue | goto | && | || | catch | ternary operator ?: | ??

… while the following statements are not counted for CC:

else | do | switch | try | using | throw | finally | return | object creation | method call | field access

These two lists come from the CC definition in the NDepend documentation. The implication is that a C# method with a high CC is certainly complex and hard to understand. How high should the CC be before we consider a method complex? Here is a method with a CC of 12; clearly it is not an easy method:

/// <summary>
/// Returns the relative path from a base directory to another
/// directory or file.
/// </summary>
public static string RelativePath(string from, string to) {
   if (from == null)
      throw new ArgumentNullException(from);
   if (to == null)
      throw new ArgumentNullException(to);
   if (!Path.IsPathRooted(to))
      return to;
   if (Path.GetPathRoot(from) != Path.GetPathRoot(to))
      return null;

   string[] _from = from.Split(PathUtils.DirectorySeparatorChar,
      PathUtils.AltDirectorySeparatorChar);
   string[] _to = to.Split(PathUtils.DirectorySeparatorChar,
      PathUtils.AltDirectorySeparatorChar);

   StringBuilder sb = new StringBuilder(Math.Max(from.Length, to.Length));

   int last_common, min = Math.Min(_from.Length, _to.Length);
   for (last_common = 0; last_common < min; ++last_common) {
      if (!_from[last_common].Equals(_to[last_common]))
         break;
   }

   if (last_common < _from.Length)
      sb.Append("..");
   for (int i = last_common + 1; i < _from.Length; ++i) {
      sb.Append(PathUtils.DirectorySeparatorChar).Append("..");
   }

   if (sb.Length > 0)
      sb.Append(PathUtils.DirectorySeparatorChar);
   if (last_common < _to.Length)
      sb.Append(_to[last_common]);
   for (int i = last_common + 1; i < _to.Length; ++i) {
      sb.Append(PathUtils.DirectorySeparatorChar).Append(_to[i]);
   }
   return sb.ToString();
}

 

The lower the CC, the better! IMHO a reasonable and practical CC upper threshold is in the range of 8 to 15.

But if your goal is to achieve quality through code metrics, CC is only one of the many metrics you should care about. A method can be complex even if it has a low CC. This is why, by default, NDepend comes with a CQL rule that mixes several common code metrics:

// <Name>Quick summary of methods to refactor</Name>
WARN IF Count > 0 IN SELECT METHODS WHERE
                                    // Metrics' definitions
  ( NbLinesOfCode > 30 OR           // http://www.ndepend.com/Metrics.aspx#NbLinesOfCode
    NbILInstructions > 200 OR       // http://www.ndepend.com/Metrics.aspx#NbILInstructions
    CyclomaticComplexity > 15 OR    // http://www.ndepend.com/Metrics.aspx#CC
    ILCyclomaticComplexity > 20 OR  // http://www.ndepend.com/Metrics.aspx#ILCC
    ILNestingDepth > 4 OR           // http://www.ndepend.com/Metrics.aspx#ILNestingDepth
    NbParameters > 5 OR             // http://www.ndepend.com/Metrics.aspx#NbParameters
    NbVariables > 8 OR              // http://www.ndepend.com/Metrics.aspx#NbVariables
    NbOverloads > 6 )               // http://www.ndepend.com/Metrics.aspx#NbOverloads

 

 

A fundamental way to assess CC in the .NET world

Formally, the way to compute CC depends on the underlying language. Practically, when it comes to conditional statements, the C#, VB.NET, Java and C++ syntaxes are pretty similar, with just a few peculiarities for each language. But still, CC is a metric defined on source code.

At the early stage of NDepend, the tool wasn't able to parse source code. The only data analyzed by NDepend was IL code. Fortunately, NDepend can now parse IL code + data in PDB files + C# source code + test-coverage files, as explained here.

However, at that time, for a quality-oriented tool, CC was a must-have metric. Thus we invented a way to assess CC from the IL code itself. Here is the definition of the ILCyclomaticComplexity metric as stated in the NDepend documentation:

ILCyclomaticComplexity is computed from IL as 1 + {the number of different offsets targeted by a jump/branch IL instruction}. Experience shows that the NDepend ILCC is a bit larger than the CC computed in C# or VB.NET. Indeed, a C# 'if' expression yields one IL jump. A C# 'for' loop yields two different offsets targeted by a branch IL instruction, while a C# 'foreach' loop yields three.
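To make that definition concrete, here is a sketch (my example, not from the NDepend documentation; the IL described in the comment is typical non-optimized compiler output, so treat the exact opcodes as an assumption):

```csharp
using System;

static class IlccDemo {
    // A 'for' loop typically compiles to an unconditional 'br' jumping
    // forward to the loop condition, plus a conditional branch (e.g. 'blt')
    // jumping back to the loop body: two distinct branch-target offsets,
    // hence ILCC = 1 + 2 = 3 for this method per the definition above,
    // while the source-level CC is only 2 (1 + one 'for').
    public static int Sum(int n) {
        int total = 0;
        for (int i = 0; i < n; ++i)
            total += i;
        return total;
    }
}
```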

Both CC and ILCC code metrics can be used to detect complex methods in a program.

 

However, the ILCC is more fundamental in the sense that it is language independent and can be computed to analyze quality from the assemblies themselves. For example, ILCC can detect complex methods contained in third-party (tier) code such as the .NET Framework, even when we don't have the source code:

 

CC and testing

An interesting fact about CC is that it helps to evaluate the number of unit tests needed to cover a method. The idea is that each logical path through a method requires its own unit test. Practically, there is a problem: .NET coverage tools all rely on PDB sequence points when it comes to coverage. As shown in the screenshot below, a shortcoming of the sequence point artifact (initially designed for the debugging experience) is that it doesn't distinguish between several conditions nested in an 'if' clause:
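The screenshot itself is not reproduced here, but from the surrounding text the code it showed was along these lines (my reconstruction, with a hypothetical method name):

```csharp
using System;

static class SeqPointDemo {
    // The single 'if' line holding three nested conditions maps to one PDB
    // sequence point: one call with a non-empty string that doesn't contain
    // 'a' executes every statement and marks the whole method as covered.
    public static int CountNonA(string s) {
        int count = 0;
        if (s != null && s.Length > 0 && !s.Contains("a"))
            count = s.Length;
        return count;
    }
}
```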

 

 

In this example you'll get 100% coverage if s is a non-empty string that doesn't contain the character 'a'. You don't need to test the case where s is null, the case where s is empty, or the case where s is not empty but contains the character 'a'. This is a weakness of the coverage metric, and this is why, personally, I often prefer to refactor my code this way:
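Again the screenshot is missing, but the refactoring it illustrated looks like this (my reconstruction; each condition now sits in its own statement and thus gets its own sequence points):

```csharp
using System;

static class SeqPointDemoRefactored {
    // Each 'if' and each assignment/return is now a separate sequence point,
    // so a coverage tool only reports 100% when tests exercise every branch:
    // null, empty, contains-'a', and the happy path.
    public static int CountNonA(string s) {
        int count = 0;
        if (s != null)
            if (s.Length > 0)
                if (!s.Contains("a"))
                    count = s.Length;
        return count;
    }
}
```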

 

Now I have 9 sequence points to cover instead of one, and I am forced to write more unit tests to achieve 100% coverage. In a previous post, A simple trick to code better and to increase testability, I rambled on this particular point and explained how to use the CC metric in combination with the ILNestingDepth metric to detect such cases.

Also, NCover provides a second metric named branch coverage that is more precise than classical coverage: the branch coverage computation takes into account nested conditions not covered by tests. NDepend is able to gather branch coverage values from an NCover execution, and you can write CQL rules based on branch coverage like this one:

// <Name>Namespace XXX.YYY 100% branch covered</Name>
WARN IF Count > 0 IN SELECT METHODS
  FROM NAMESPACES "XXX.YYY"
  WHERE PercentageBranchCoverage < 100


This entry was posted in branch coverage, CC, Code metrics, conditional complexity, conditional statement, coverage, Cyclomatic complexity, linearly independent paths, Lines of Code, LoC, McCabe, measurement, software metric, testability, threshold.
  • http://weblogs.asp.net/fmarguerie Fabrice

    Quick side note: ‘throw new ArgumentNullException(from)’ should be ‘throw new ArgumentNullException(“from”)’

  • http://codebetter.com/members/Patrick-Smacchia/default.aspx Patrick Smacchia

    Jim, there is a trade-off between being draconian on quality and being realistic.

    If your team can reach a CC threshold of 5 on your whole code base, then sure, it is much better than abiding by a CC threshold of 10.

  • Jim Cooper

    You’re a bit generous with your maximum CC rating. Who writes 15 unit tests for one method?

    I think 5 is a better target (and apparently so does Steve McConnell, but not the SEI, which sets a limit of 10) for a method to be considered good code.

  • http://codebetter.com/members/Patrick-Smacchia/default.aspx Patrick Smacchia

    >code metrics are just a lame method for estimating code complexity

    Itay, code metrics are very efficient at pinpointing bad quality. However, respecting metrics' thresholds won't necessarily lead you to quality code, because quality is a multi-dimensional measure, as I explained here:

    http://codebetter.com/blogs/patricksmacchia/archive/2009/06/28/fighting-fabricated-complexity.aspx

    Nevertheless, respecting metrics' thresholds is an essential step toward quality. You can't have quality code when you have long and complex methods with a CC higher than 15 (except if they are generated code, but that is another story I will ramble on later).

  • http://javadots.blogspot.com Itay Maman

    Code metrics are sometimes useful, but they are too often just a lame method for estimating code complexity, and this is not one of the cases where the "this is the best we have" argument holds. Here's why:

    Suppose we work in a language where curly braces induce a closure. Here are two equivalent fragments of code:

    // Fragment 1
    class SomeClass {
      void f(int n) {
        if(n > 0)
          this.x += n;
      }
    }

    // Fragment 2
    class SomeClass {
      void check(bool b, closure c) { if(b) c.invoke() }
      void f(int n) {
        check(n > 0, {
          this.x += n;
        })
      }
    }

    From a cyclomatic-complexity standpoint, the f() method in the 2nd fragment is better than the one in the first, which doesn't make much sense.