KLOCs and Golf Scores

I’m always amused when I hear “our system has one million lines of code” bandied about as a point of pride. All that says to me is that most likely you’ve inherited a big ball of mud because when a single system gets that big, well, chances are it’s a Byzantine nightmare.

Starting from green fields, and with new approaches, I’d incline toward keeping your KLOCs low. That is, it’s great to see the first few thousand lines of tested code come on board quickly, but when a system tips 5-10,000? To me that’s a good time to start looking for ways to partition into separate modules, assemblies, packages whatnot. Mind you, I’m not saying that lines of code should be the signal; pain in finding stuff and in maintaining it should drive your decision, and I think that pain coincides with an increase in lines of code.

Martin Fowler makes a couple of nice points about the relative meaninglessness of KLOCs as a measure of productivity:

  1. A high KLOC count says nothing about the quality of the design.
  2. A high KLOC count says nothing about business value delivered.

So are fewer lines of code better?

One of the many appealing things about Ruby culture is that the LOC measure is a sort of golf score: lower is better. It’s a bragging right for the Rubyist that they can express things in not only a more readable form but in a smaller space.

Granted, an important consideration for the Rails developer is that they’ve got a framework that’s amplifying their efforts and keeping their application code brief. As of this post, Rails has just south of 100,000 lines of code, a good chunk of which are tests.

Sidebar: Now let’s take a look at something from the .NET world. Let’s take NHibernate. You know it, you use it, you love it, right? Well, there are over 400,000 lines of code! Still interested in writing your own, full-featured ORM? So with Rails you get an ORM and the View/Controller framework (and a bunch of other stuff) for 1/4 the price. Wow. To be fair, language choice probably has a lot to do with total lines of code, and comparing NHibernate with ActiveRecord is really an apple-orange situation.

What should be important to all of us, as application developers, is choosing the right stuff from the community or vendors (if you have to) that can cut down on the code we have to write to get a job done. When your codebase gets unwieldy, evolve your design by looking for pieces you can replace with community code or by introducing partitions in the form of subsystems, modules, packages, etc.


12 Responses to KLOCs and Golf Scores

  1. Dave Laribee says:

    Great comment.

    All very good arguments for the essential meaninglessness of LOC, I guess, no? I mean one day/project it’s meaningful, the next day/project it’s not… and the language factor… and the test code factor… and the Not Invented Here factor… and the…

    Another thing you can say is that systems with more executable specifications (and/or tests) will have more lines of code but (usually) be better designed and (usually) be more maintainable than systems w/o accompanying verification/specification code.

    Any way you cut it, seems to me that it’s a moving target this LOC metric, so let’s just forget about it.

    One thing I tried to convey but wasn’t very successful at is that big systems could benefit from “chunking up” or partitioning. This leads us to mashups or SOA (I prefer the former as a term) or to taking OSS/COTS into our solution code in a wrap/adapt manner.

    > Ruby – a disgusting little language with a culture that also sucks!

    Heh. What? Don’t like hobbits?

    Really though I put that Ruby/NHibernate thing in as a kind of throwaway. I use and love NHibernate and I imagine it scales much better than ActiveRecord, etc. I’m just stirring the pot in an innocent what comes in my head is what gets blogged kinda way :)

  2. mawi says:

    Well, LOC as an indicator in general is an approximation, but attractive since it is easy to get an exact measure. The issue of LOC as productivity is an old one, but I think most agree that it is suboptimal.
    (“relative meaninglessness of KLOCs as a measure of productivity:”)

    Humans have narrow limits to handling information, and thus size becomes an issue of complexity. In my view, abstraction deals with this by grouping, etc. At the same time, creating that abstraction will add more lines of code. So, I would think that the opposite is true – all other things being equal.

    Note the condition that I added. Comparing apple LOCs to oranges LOCs isn’t really helpful. Nevertheless, “less is more” may not be true for LOC.

    An example: Let’s say we have two systems built using the same technologies (language, say your favorite, etc.) and providing the same services. Both are basically well developed, having eliminated duplication. (This incidentally sets both ahead of most existing software today.) The first is not very well structured, however, perhaps leaning more towards procedural practices but, more importantly, not focused on readability and maintainability. The second project is so focused. I would think that the second project would have a higher LOC.

    Another concrete example: Long one-liners. 1 LOC. Usually not readable. Making a long one-liner readable usually makes me extract several local variables and maybe even a method. That will add several LOC, but increase readability and thus maintainability.
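
    A minimal sketch of that trade-off, in Python purely for illustration (the order data, the names, and the 20% tax rate are all made up for this example): the same calculation once as a single line and once with the intermediate steps extracted and named.

```python
# Hypothetical data: (customer, amount, paid) tuples.
orders = [("ann", 120.0, True), ("bob", 80.0, False), ("ann", 30.0, True)]

# 1 LOC, but the reader has to unpack everything at once.
one_liner_total = round(sum(a for c, a, p in orders if p and c == "ann") * (1 + 0.2), 2)


def paid_total_with_tax(orders, customer, tax_rate):
    """Same calculation with each step extracted into a named local."""
    paid_amounts = [
        amount
        for name, amount, paid in orders
        if paid and name == customer
    ]
    subtotal = sum(paid_amounts)
    total = subtotal * (1 + tax_rate)
    return round(total, 2)


readable_total = paid_total_with_tax(orders, "ann", 0.2)
```

    Both versions compute the same number; the second costs roughly ten more lines and is the one I would rather maintain.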

    So, in the former example, the first project is the ball of mud.

    Make sense?

    Or, on a less serious note; you write: “One of the many appealing things about Ruby culture is that the LOC measure is a sort of golf score: lower is better.”

    *Eww*, Perl or maybe IOCCC anyone? Ruby – a disgusting little language with a culture that also sucks! Muuhhahahahah! (this is great fodder to tease my mac-lugging colleagues with)


  3. Dave Laribee says:

    @Tomas – True, True

    @Javier – Less is almost always better. I’d definitely rather work on a suite of partitioned, discrete subsystems where the applications are integrations over, say, an ERP platform! Some people must like the challenge or just not consider this dimension or some combo of the two.

  4. Great point, less is better! You bring up something that developers sometimes miss: the more code you write, the more you maintain. Now, if you want to maintain code for the rest of your career, go for it. I sure don’t want to.

  5. @Dave: Absolutely, I agree. Just mentioning that the boundary is pretty important :)

  6. laribee says:

    @Jeremy – I love me some NHibernate, don’t get me wrong. I just think the readability and syntactic brevity of Ruby is pretty darned compelling. If you find the link to that project send it along, eh?

    @Eric – Good catch. So using ohloh puts C# code at around 148K lines. But still more than Rails! Arguably it’s a who cares situation as NHibernate performs great for us. The mini/frisbee golf thing is hilarious.

    @Tomas – Yes, I am assuming that. I guess people use LOC as a braggy metric sometimes which to me is more of a “gee, I’m sorry for you.” The boundary is the thing. I think there’s a lot more value in this day and age to taking stuff off the shelf and mixing it up. Integration. I wouldn’t consider NHibernate’s LOC count in our application’s LOC count; it’s in a library boundary. Similarly we partition our accounting features into a subsystem w/ an api. So we can segment those LOCs from any other application’s LOCs; it’s just a brick. We might agree there, not sure, more clarifying than anything.

    @Garry – That’s really funny, I’m going to steal that.

  7. Eric Nicholson says:

    That 400,000 lines of code for NHibernate sounds suspicious, especially when you consider that it’s supposedly 68% XML…

    Koders.com lists it at 78,000 lines of code http://www.koders.com/info.aspx?c=ProjectInfo&pid=GLQ6KNN3NM2H3QM2QL1DREMU8E

    But they unfortunately don’t have RoR for comparison.

    Another thing to consider is that C# has a lot of mostly empty lines (single braces, for instance). So yeah, it’s like comparing golf scores, but you might be looking at a PGA tournament, a family at a minigolf course, and some college kids playing frisbee-golf.
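
    To show how much the counting rule matters, here is a rough sketch in Python (the `count_lines` helper, the C# sample, and the choice of what counts as a "brace-only" line are my own assumptions, not how Koders or any other counting service works):

```python
# Lines consisting only of these tokens are treated as "mostly empty".
BRACE_ONLY = {"{", "}", "};"}


def count_lines(source: str) -> dict:
    """Count lines three ways: raw, non-blank, and excluding brace-only lines."""
    lines = source.splitlines()
    non_blank = [line for line in lines if line.strip()]
    substantive = [line for line in non_blank if line.strip() not in BRACE_ONLY]
    return {
        "raw": len(lines),
        "non_blank": len(non_blank),
        "substantive": len(substantive),
    }


# A small, made-up C# snippet used only as input text.
CSHARP_SAMPLE = """\
public class Greeter
{
    public string Greet(string name)
    {

        return "Hello, " + name;
    }
}
"""

counts = count_lines(CSHARP_SAMPLE)
```

    On this sample the raw count is 8 lines, but only 3 of them say anything, so two tools with different rules can honestly report very different numbers for the same code.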

  8. Garry Shutler says:

    Person1: “My last project had 1 million LoC!”
    Person2: “Then you learnt about ‘for’ statements”

  9. You bring up an interesting issue when you say that “…but when a system tips 5-10,000? To me that’s a good time to start looking for ways to partition into separate modules, assemblies, packages whatnot”.

    Aren’t you assuming there that whoever brought up their 1 million LOC system, they meant “Our unmaintainable, monolithic, unchangeable System has 1 million lines of code!”, versus saying “Our System has 1 million lines of well organized, properly modularized, maintainable code”?

    Granted, I don’t know many cases of the latter :), but just pointing out that a large LOC count for a *complex* application doesn’t need to necessarily mean it’s a big pile of mud. After all, complex applications will naturally gather a certain level of complexity (with a higher LOC being one side effect).

    IOW, a certain LOC number is meaningless unless it is attached to a boundary (system, application, module, assembly, whatever), and can be considered slightly useful (broadly speaking) only in relation to that boundary.

  10. keith says:

    i agree. i always tell my coworkers that code is a liability, not an asset. as opposed to features, which are assets. so the higher feature-to-code ratio we can get, the better.

    the more code you have, the higher chance that more of it isn’t going to work right. one of the reasons that refactoring and being DRY are important.

  11. Any recommendations for a good tool to count lines of code?

    And I agree, using a high KLOC count as a source of pride is like showing off how large a garage you need to hold all of your stuff. Unless you’ve managed to fill it with great, meaningful stuff, it’s pretty embarrassing.

  12. I have no doubt though that if we were crazy enough to rewrite NHibernate we could write it in much less code. All you need is that 20/20 hindsight from reading the existing code.

    I know there’s an OSS project going to write a Ruby version of Hibernate. That would be an interesting comparison.
