Why is it useful to count the number of Lines Of Code (LOC)?

My previous post explained how to count the LOC of your .NET application, and Tim B and Keith answered that LOC should never be counted because it is not useful. I don’t agree at all.

Developers often get stressed when we talk about LOC because some companies unfortunately use LOC as a yardstick to assess developer productivity. LOC has nothing to do with productivity: you can create 1000 LOC in a day by creating a Form with many controls, and it can also take 2 days to correct a hard bug that will be fixed by changing a single line.

First, LOC is useful to compute the automated test coverage ratio, since coverage is necessarily expressed as a fraction of LOC.
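To make the point concrete, here is a minimal sketch of that ratio in Python. The function name and the figures are hypothetical, chosen only for illustration; a real coverage tool would supply the covered and total counts.

```python
def coverage_ratio(covered_loc: int, total_loc: int) -> float:
    """Return the fraction of lines of code exercised by automated tests."""
    if total_loc <= 0:
        raise ValueError("total_loc must be positive")
    if not 0 <= covered_loc <= total_loc:
        raise ValueError("covered_loc must be between 0 and total_loc")
    return covered_loc / total_loc


# Hypothetical figures: 12,000 logical LOC, 9,000 of them hit by tests.
ratio = coverage_ratio(9_000, 12_000)
print(f"{ratio:.0%} of the code is covered")
```

Without the total LOC in the denominator, the number of covered lines alone says nothing about how much of the application remains untested.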

Second, LOC, and especially logical LOC, is useful to estimate software. Estimation in software is a difficult task. In his excellent book Software Estimation: Demystifying the Black Art, Steve McConnell explains well that LOC is the most efficient way to compare applications that have been developed within the same context. You can then use this comparison to plan new development, refactoring, migration, etc. For example, it is interesting to know that the code base of Windows Vista (around 70M LOC) is more than 100 times bigger than the .NET framework codebase (around 500K LOC).

Estimating software is a complex subject because the cost is not proportional to LOC: the bigger a codebase is, the more it costs to add a new feature.
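One common way to picture this non-proportionality is a power-law effort model, where effort grows as a * KLOC ** b with b slightly above 1. The sketch below is only an illustration under that assumption: the function names are hypothetical and the default coefficients are illustrative values in the style of Boehm's basic COCOMO model, not numbers from this post or calibrated on any real project.

```python
def estimated_effort(kloc: float, a: float = 2.4, b: float = 1.05) -> float:
    """Effort in person-months, modeled as a * KLOC ** b.

    The defaults for a and b are illustrative assumptions; an exponent
    b > 1 is what makes the cost grow faster than the code size.
    """
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    return a * kloc ** b


def effort_per_kloc(kloc: float) -> float:
    """Average effort needed for each thousand lines at a given size."""
    return estimated_effort(kloc) / kloc


# Compare the average cost per KLOC for codebases of growing size.
for size in (10, 100, 1000):
    print(f"{size:>5} KLOC: {effort_per_kloc(size):.2f} person-months per KLOC")
```

With any exponent above 1, the cost per KLOC rises as the codebase grows, which is why two applications can only be compared on LOC when they were developed in the same context and the model is calibrated on that context.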

  • Me, myself and I

    @keith: not so. Each metric is suitable for measuring something different. For apps of the same type, roughly of the same architectural quality, for the same business domain, LOCs are a sensible metric for estimating effort. Whereas CC is a measure of quality.

  • http://www.NDepend.com Patrick Smacchia

    Fortunately the tool NDepend, the product I’m responsible for, supports CC computed from C# code, CC computed from IL code, and logical LOC.
    http://www.ndepend.com/Metrics.aspx#CC
    It also supports 60 other metrics.

  • http://www.afex2win.com keith

    if i HAVE to use a metric for my code, i’d rather use CC ( http://en.wikipedia.org/wiki/Cyclomatic_complexity ) than LOC. as you said: a form with a bunch of code is big, but a small recursive algorithm is harder to debug. which one has a bigger impact on software quality and/or cohesion, etc? functions with too large a CC probably need to be refactored, regardless of their LOC.

    but i’d rather just measure my progress with a working copy of the software itself.

  • http://www.NDepend.com Patrick Smacchia

    > no two applications (at least within the same company) are developed in the same context. They are written for different purposes and often by different developers, especially on large projects.

    From my experience, and according to the studies presented in the book ‘Software Estimation: Demystifying the Black Art’, the same context is maintained across a large company. There is the same HR department, so developers tend to have the same education. There is the same demand for quality and correctness across the applications. If turnover is not high, the managers and technical leads end up converging toward the same goals.

    Of course, inside an organization such as NASA, there can be a significant difference in correctness demands between the code onboard the shuttle and the code of their website :o)

  • http://bigtunatim.wordpress.com Tim B

    First, thanks for following up Patrick. I’m used to being a blog lurker so it’s a nice change to actively discuss a topic. With that said, your point is certainly not lost on me but it’s been my experience that

    A) if a metric is available, it will be used for planning and estimating regardless of its suitability for those purposes, and

    B) no two applications (at least within the same company) are developed in the same context. They are written for different purposes and often by different developers, especially on large projects. I’ve read very little about project estimation so I may be misunderstanding this point.

    If LOC analysis works well for a situation then by all means it should be used. It just happens that I haven’t yet been in a situation where counting LOC was appropriate. I’ve also just recently parachuted out of a 6-months-and-counting death march so I’m a little jaded about productivity issues ;-).