CodeBetter.Com

Dave Laribee

"Whoso would be a man must be a nonconformist." - Ralph Waldo Emerson

KLOCs and Golf Scores

I'm always amused when I hear "our system has one million lines of code" bandied about as a point of pride. All that says to me is that most likely you've inherited a big ball of mud because when a single system gets that big, well, chances are it's a Byzantine nightmare.

Starting from green fields, and with new approaches, I'd incline toward keeping your KLOCs low. That is, it's great to see the first few thousand lines of tested code come on board quickly, but when a system tips 5-10,000? To me that's a good time to start looking for ways to partition into separate modules, assemblies, packages whatnot. Mind you, I'm not saying that lines of code should be the signal; pain in finding stuff and in maintaining it should drive your decision, and I think that pain coincides with an increase in lines of code.

Martin Fowler makes a couple of nice points about the relative meaninglessness of KLOCs as a measure of productivity:

  1. A high KLOC count says nothing about the quality of the design.
  2. A high KLOC count says nothing about business value delivered.

So are fewer lines of code better?

One of the many appealing things about Ruby culture is that the LOC measure is a sort of golf score: lower is better. It's a bragging right for the Rubyist that they can express things in not only a more readable form but in a smaller space.

Granted, an important consideration for the Rails developer is that they've got a framework that's amplifying their efforts and keeping their application code brief. As of this post, Rails has just south of 100,000 lines of code, a good chunk of which are tests.

Sidebar: Now let's take a look at something from the .NET world. Let's take NHibernate. You know it, you use it, you love it, right? Well, there are over 400,000 lines of code! Still interested in writing your own full-featured ORM? So with Rails you get an ORM and the View/Controller framework (and a bunch of other stuff) for 1/4 the price. Wow. To be fair, language choice probably has a lot to do with total lines of code, and comparing NHibernate with ActiveRecord is really an apples-to-oranges situation.

What should be important to all of us, as application developers, is choosing the right stuff from the community (or from vendors, if you have to) that can cut down on the code we have to write to get a job done. When your codebase gets unwieldy, evolve your design by looking for pieces you can replace with community offerings or by introducing partitions in the form of subsystems, modules, packages, etc.



Comments

Jeremy D. Miller said:

I have no doubt though that if we were crazy enough to rewrite NHibernate we could write it in much less code.  All you need is that 20/20 hindsight from reading the existing code.

I know there's an OSS project going to write a Ruby version of Hibernate.  That would be an interesting comparison.

# September 19, 2007 12:53 PM

Eric Marthinsen said:

Any recommendations for a good tool to count lines of code?

And I agree, using a high KLOC count as a source of pride is like showing off how large a garage you need to hold all of your stuff. Unless you've managed to fill it with great, meaningful stuff, it's pretty embarrassing.

# September 19, 2007 12:54 PM

keith said:

i agree.  i always tell my coworkers that code is a liability, not an asset. as opposed to features, which are assets.  so the higher feature-to-code ratio we can get, the better.

the more code you have, the higher chance that more of it isn't going to work right.  one of the reasons that refactoring and being DRY are important.

# September 19, 2007 1:17 PM

Tomas Restrepo said:

You bring up an interesting issue when you say that "...but when a system tips 5-10,000? To me that's a good time to start looking for ways to partition into separate modules, assemblies, packages whatnot".

Aren't you assuming there that whoever brought up their 1 million LOC system meant "Our unmaintainable, monolithic, unchangeable System has 1 million lines of code!", versus saying "Our System has 1 million lines of well organized, properly modularized, maintainable code"?

Granted, I don't know many cases of the latter :), but I'm just pointing out that a large LOC count for a *complex* application doesn't necessarily mean it's a big pile of mud. After all, complex applications will naturally gather a certain level of complexity (with a higher LOC being one side effect).

IOW, a certain LOC number is meaningless unless it is attached to a boundary (system, application, module, assembly, whatever), and can be considered slightly useful (broadly speaking) only in relation to that boundary.
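
To make the boundary idea concrete, here is a minimal Python sketch. It assumes you already have per-file line counts from whatever counter you trust, and it treats the first path component as the boundary. The function name `loc_by_boundary`, the file paths, and the figures are all made up for illustration; none of them come from a real project.

```python
from collections import Counter
from pathlib import PurePosixPath


def loc_by_boundary(file_line_counts):
    """Group per-file line counts by the first path component.

    Assumption: the top-level directory is the boundary
    (subsystem, module, assembly, whatever fits your layout).
    """
    totals = Counter()
    for path, lines in file_line_counts.items():
        boundary = PurePosixPath(path).parts[0]
        totals[boundary] += lines
    return dict(totals)


# Hypothetical figures, invented purely for illustration.
FILES = {
    "accounting/ledger.cs": 1200,
    "accounting/api.cs": 300,
    "web/controllers.cs": 800,
    "lib/orm_glue.cs": 150,
}

report = loc_by_boundary(FILES)
for boundary, lines in sorted(report.items()):
    print(f"{boundary}: {lines}")
```

A single grand total would hide the fact that most of the code in this made-up layout sits behind one subsystem; the per-boundary numbers are the ones you can actually reason about.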

# September 19, 2007 1:18 PM

Garry Shutler said:

Person1: "My last project had 1 million LoC!"

Person2: "Then you learnt about 'for' statements"

# September 19, 2007 1:35 PM

Eric Nicholson said:

That 400,000 lines of code for NHibernate sounds suspicious, especially when you consider that it's supposedly 68% XML...

Koders.com lists it at 78,000 lines of code: www.koders.com/info.aspx

But they unfortunately don't have RoR for comparison.

Another thing to consider is that C# has a lot of mostly empty lines (single braces, for instance). So yeah, it's like comparing golf scores, but you might be looking at a PGA tournament, a family at a minigolf course, and some college kids playing frisbee golf.
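
As a rough illustration of how much the counting rule matters, the Python sketch below tallies one small C#-style snippet three ways. The `count_loc` function and the snippet are hypothetical, and the rule for a "substantive" line (anything left after stripping braces, parentheses, and semicolons) is an assumption of this sketch, not how Koders, ohloh, or any other tool counts. Comments are not handled at all.

```python
def count_loc(source: str) -> dict:
    """Tally the lines of `source` as raw, non-blank, and 'substantive'."""
    lines = source.splitlines()
    non_blank = [ln for ln in lines if ln.strip()]
    # Assumption: a line made up only of braces, parentheses, or
    # semicolons carries no logic of its own, so it is not substantive.
    substantive = [ln for ln in non_blank if ln.strip().strip("{}();")]
    return {
        "raw": len(lines),
        "non_blank": len(non_blank),
        "substantive": len(substantive),
    }


# Hypothetical C#-style method, written only to exercise the counter.
CSHARP_SNIPPET = """\
public int Total(int[] xs)
{
    int sum = 0;

    foreach (var x in xs)
    {
        sum += x;
    }

    return sum;
}
"""

counts = count_loc(CSHARP_SNIPPET)
print(counts)  # {'raw': 11, 'non_blank': 9, 'substantive': 5}
```

Same method, and the "score" ranges from 11 down to 5 depending on the rule, which is why two published figures for the same codebase can disagree so widely.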

# September 19, 2007 1:53 PM

Dave Laribee said:

@Jeremy - I love me some NHibernate, don't get me wrong. I just think the readability and syntactic brevity of Ruby are pretty darned compelling. If you find the link to that project, send it along, eh?

@Eric - Good catch. So using ohloh puts C# code at around 148K lines. But still more than Rails! Arguably it's a who cares situation as NHibernate performs great for us. The mini/frisbee golf thing is hilarious.

@Tomas - Yes, I am assuming that. I guess people use LOC as a braggy metric sometimes which to me is more of a "gee, I'm sorry for you." The boundary is the thing. I think there's a lot more value in this day and age to taking stuff off the shelf and mixing it up. Integration. I wouldn't consider NHibernate's LOC count in our application's LOC count; it's in a library boundary. Similarly we partition our accounting features into a subsystem w/ an api. So we can segment those LOCs from any other application's LOCs; it's just a brick. We might agree there, not sure, more clarifying than anything.

@Garry - That's really funny, I'm going to steal that.

# September 19, 2007 3:46 PM

Tomas Restrepo said:

@Dave: Absolutely, I agree. Just mentioning that the boundary is pretty important :)

# September 19, 2007 10:34 PM

Javier Lozano said:

Great point, less is better! You bring up something that developers sometimes miss: the more code you write, the more you maintain. Now, if you want to maintain code for the rest of your career, go for it. I sure don't want to.

# September 19, 2007 10:37 PM

Dave Laribee said:

@Tomas - True, True

@Javier - Less is almost always better. I'd definitely rather work on a suite of partitioned, discrete subsystems where the applications are integrations over, say, an ERP platform! Some people must like the challenge or just not consider this dimension or some combo of the two.

# September 19, 2007 11:36 PM

Ayende @ Rahien said:

Line count

# September 21, 2007 6:44 AM

mawi said:

Well, LOC as an indicator in general is an approximation, but attractive since it is easy to get an exact measure. The issue of LOC as productivity is an old one, but I think most agree that it is suboptimal.

("relative meaningless of KLOCs as a measure of productivity:")

Humans have narrow limits to handling information, and thus size becomes an issue of complexity. In my view, abstraction deals with this by grouping, etc. At the same time, creating that abstraction will add more lines of code. So, I would think that the opposite is true - all other things being equal.

Note the condition that I added. Comparing apple LOCs to oranges LOCs isn't really helpful. Nevertheless, "less is more" may not be true for LOC.

An example: Let's say we have two systems built using the same technologies (language, say your favorite, etc) and providing the same services. Both are basically well developed, having eliminated duplication. (This incidentally sets both ahead of most existing software today.) The first is not very well structured however, perhaps leaning more towards procedural practices but more importantly not focused on readability and maintainability. The second project is so focused, however. I would think that the second project would have a higher LOC.

Another concrete example: Long one-liners. 1 LOC. Usually not readable. Making a long one-liner readable usually makes me extract several local variables and maybe even a method. That will add several LOC, but increase readability and thus maintainability.
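
A small Python sketch of that trade-off, with the caveat that the order data and the names `line_total` and `shipped_total` are invented for illustration. Both versions compute the same number; the second just spends more lines saying what it is doing.

```python
# Hypothetical order data, invented purely for illustration.
orders = [
    {"customer": "acme", "qty": 3, "unit_price": 10.0, "shipped": True},
    {"customer": "acme", "qty": 1, "unit_price": 25.0, "shipped": False},
    {"customer": "zenith", "qty": 2, "unit_price": 40.0, "shipped": True},
]

# Version 1: one long line. 1 LOC, but the intent is buried.
total_one_liner = sum(o["qty"] * o["unit_price"] for o in orders if o["shipped"] and o["customer"] == "acme")


# Version 2: same behaviour, with an extracted function and local variables.
def line_total(order):
    """Price of a single order line."""
    return order["qty"] * order["unit_price"]


def shipped_total(all_orders, customer):
    """Sum the line totals of the given customer's shipped orders."""
    customer_orders = [o for o in all_orders if o["customer"] == customer]
    shipped = [o for o in customer_orders if o["shipped"]]
    return sum(line_total(o) for o in shipped)


total_extracted = shipped_total(orders, "acme")
print(total_one_liner, total_extracted)  # 30.0 30.0
```

The second version costs several more lines, yet it is the one most people would rather maintain.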

So, in the former example, the first project is the ball of mud.

Make sense?

PS

Or, on a less serious note; you write: "One of the many appealing things about Ruby culture is that the LOC measure is a sort of golf score: lower is better."

*Eww*, perl or maybe ioccc anyone? Ruby - a disgusting little language with a culture that also sucks! Muuhhahahahah! (this is great fodder to tease my mac-lugging colleagues with)

;-)

# September 25, 2007 6:20 PM

Dave Laribee said:

Great comment.

All very good arguments for the essential meaninglessness of LOC, I guess, no? I mean one day/project it's meaningful, the next day/project it's not... and the language factor... and the test code factor... and the Not Invented Here factor... and the...

Another thing you can say is that systems with more executable specifications (and/or tests) will have more lines of code but (usually) be better designed and (usually) be more maintainable than systems w/o accompanying verification/specification code.

Any way you cut it, seems to me that it's a moving target this LOC metric, so let's just forget about it.

One thing I tried to convey but wasn't very successful at is that big systems could benefit from "chunking up" or partitioning. This leads us to mashups or SOA (I prefer the former as a term) or to taking OSS/COTS into our solution code in a wrap/adapt manner.

> Ruby - a disgusting little language with a culture that also sucks!

Heh. What? Don't like hobbits?

Really though I put that Ruby/NHibernate thing in as a kind of throwaway. I use and love NHibernate and I imagine it scales much better than ActiveRecord, etc. I'm just stirring the pot in an innocent what comes in my head is what gets blogged kinda way :)

# September 25, 2007 8:08 PM
