CodeBetter.Com

Karl Seguin

.NET From Ottawa, Ontario - http://twitter.com/karlseguin/

don't use string.format ?!

In response to my post evangelizing string.format, Travis commented that my advice was wrong, and that it was "pathetic" to give priority to mere code cosmetics rather than performance. It's obvious that Travis feels strongly about the topic - as do I.

In some cases I think Travis is right. For example, if you're building a missile guidance system, or a video game, or maybe software that drives a heart pump. But for _any_ piece of software where you adopt a general-purpose managed language (like C#, VB.NET or Java), well, I think Travis is dead wrong.

I'll be the first to admit that I've never actually done performance testing on String.Format - but I have looked inside the class at how it works. True enough, there's a lot going on in there.

My point though is that in this day and age, the primary cost of developing software is maintenance. I've seen some studies pinning maintenance at the bulk of the total cost of ownership. In other words, the money you'll save by making code readable and maintainable is going to eclipse your hardware cost. But even that isn't really my main issue with what Travis said, because, in my humble opinion, Travis is guilty of massive premature micro-optimization. I'm pretty sure that even on outdated hardware (say a 5 year old computer), the difference in the majority of cases is going to be completely unnoticeable.

Of course, in cases where it is noticeable, you create nice code first, profile, and optimize. Travis seems to be suggesting you skip the first two steps. I'm sure someone has a clever saying about this. (Both Red-Gate and JetBrains offer good .NET profilers; I highly suggest one or the other.)

It’s one of those 80/20 things. Optimizing the first 80% tends to be relatively easy, but it quickly gets a lot harder to find anything else. Once you've identified and implemented the two or three big hits, things become a lot harder to tweak. If you're having performance issues, look at table indexes and caching opportunities. I've seen single indexes improve performance by orders of magnitude.

The fairly linear evolution of computer languages is directly related to hardware advances. C was possible as an abstraction above assembly because of them. Same goes for OO, garbage collectors, exceptions, web controls and advanced formatting.

 



Comments

Marcelo Ruiz said:

Can you paste the link to the Travis article please? Thanks!

Personally, I started using your approach some time ago and I have found the same benefits you described (in the use string.format article) and no performance issues at all, so I totally agree with you.

Thanks!

# October 5, 2006 6:15 AM

Raymond Lewallen said:

I've been guilty of micro-optimization in the past.  Do you know what I gained from it?  Absolutely nothing.  However, I did lose valuable time that could have been better spent refactoring design issues.  String.Format certainly improves readability, and the performance gain you receive by removing it is negligible.  Like you said, concentrate on the 20% of the application that is doing 80% of the work, which almost always involves a database with indexing and caching techniques to be tweaked.

# October 5, 2006 8:36 AM

Travis001 said:

Karl,

I obviously overstated my point. I completely agree with you about the optimization 80/20 rule. My point is that (to me, and maybe only to me) accepting something that we can all agree produces a 4-6x performance loss, just so the developer can have slightly "prettier" code (again, to me it is only slightly prettier), is crazy.

This is one reason why people now need machines with multicore, gigahertz processors to run even the simplest applications without unreasonable response times. Sure, we can decide to only apply this knowledge to missile systems or other high-requirement software, but my question is why? Why, when MS wrote this function, did they write it to be so slow? Because they appear to have an attitude of: if it is too slow, buy faster hardware.

I apologize for even bringing this up.

# October 5, 2006 8:39 AM

karl said:

Marcelo:

It's one of the last comments to my blog post here:

http://codebetter.com/blogs/karlseguin/archive/2006/04/10/142602.aspx

# October 5, 2006 8:42 AM

Aardvark said:

I've been using sprintf() and various string class Format() methods for years (I thought everyone did??). I often like to store these format strings out in whatever resource files and/or configuration files (DBs, .config files, etc...) the system I'm developing under supports. This way you can change/translate these strings without code changes. Of course you have to document what each numbered parameter is. This was hard with sprintf() (or those methods inspired by sprintf) since in the format string you had to use the input variables in the order they were passed in, often making it hard to radically reword the string. .NET made this much easier by allowing you to use any parameter in any order, as many times as you want, by using the index.
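A minimal sketch of the indexed-placeholder idea, written in Java because that is the language of the only full code sample in this thread. Java's `java.text.MessageFormat` is used here as a stand-in for .NET's `{0}`-style placeholders; the class name and both templates are hypothetical, and in practice the templates would come from a resource or config file rather than constants.

```java
import java.text.MessageFormat;

final class ReorderableMessages {
    // Hypothetical templates standing in for strings loaded from a resource file.
    static final String ORIGINAL = "{0} assigned the ticket to {1}.";
    // A reworded variant: arguments appear in a different order, and each is used twice.
    static final String REWORDED = "{1}, meet {0}. {0}, meet {1}.";

    // The calling code always passes arguments in the same order;
    // only the template decides where (and how often) each one appears.
    static String render(String template, Object... args) {
        return MessageFormat.format(template, args);
    }

    public static void main(String[] args) {
        System.out.println(render(ORIGINAL, "Karl", "Travis"));
        System.out.println(render(REWORDED, "Karl", "Travis"));
    }
}
```

Note that `MessageFormat` treats apostrophes as quoting characters, so templates containing them need care.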

Now we all are talking in generalities, but, in general, I've found most developers are really bad at understanding code performance. Often micro-optimizing (did you invent that term just now?) code that didn't need it and missing huge issues that, on the surface, seem innocent enough.

IMOHO, code maintainability, reliability, and readability are often just as important as performance. It is not just about nice looking code! Developer cycles are expensive, CPU cycles are not. CPUs keep getting faster and cheaper, good developers are not. As long as your code is fast enough to meet (and maybe beat by a small margin) your requirements then I don’t see the point of not using techniques that make better *code*.

I'm a huge fan of code profiling. It was a huge pain in the ass in unmanaged C++, but worth it (the tools were often buggy). I only assume it is much better in higher level languages (I haven't really used it seriously in .NET).

# October 5, 2006 8:43 AM

DaveThieben said:

I think a lot of you are maybe taking Karl a little too literally.  Use common sense.  If you have a tight loop running 5 million iterations, don't use string.format!  Optimize it.  On the other hand, if you have a string with 6 parameters and it gets run only a handful of times, it will be 10x more readable by using string.format and the performance is a nonissue.  
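As a rough illustration of that split, here is a Java sketch (the method names and message text are made up for the example): a formatted message for the rarely executed, many-parameter case, and a plain `StringBuilder` for the hot loop, where re-parsing a format string on every iteration would be wasted work.

```java
import java.util.Locale;

final class LoopFormatting {
    // Rarely called, several parameters: readability wins, so use a format string.
    static String describe(String user, int attempts, String host) {
        return String.format(Locale.ROOT, "%s failed %d logins on %s", user, attempts, host);
    }

    // Hot path: one StringBuilder appended to in place, no format-string parsing.
    static String joinIds(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("karl", 3, "web01"));
        System.out.println(joinIds(5));
    }
}
```

Whether the loop version is actually worth it is still a question for a profiler, not for intuition.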

I don't think you can ever make a blanket statement about how to code, because inevitably there will be a counter-example.

# October 5, 2006 9:11 AM

karl said:

Travis:

I think it's good that you brought it up. If nothing else, it highlights the need for developers not to blindly use code. Joel calls this "The Law of Leaky Abstractions". Any good developer should be diligent and strive for a deeper understanding than what seems obvious at first. You should *know* that string.Format might perform slowly (relatively speaking anyways) and you should have some idea as to *why*, and THEN you can make the educated decision of whether to use it or not.

I strongly believe that. Tools like Reflector for .NET (free) are real enablers when it comes to this type of self-education.

# October 5, 2006 9:28 AM

karl said:

Thanks for sharing your thoughts, Aardvark - completely agree. If anything, developer cycles are becoming MORE expensive again (at least in some circles, and compared to very recent years).

# October 5, 2006 9:32 AM

Aardvark said:

On the surface I don’t see why string concatenation is *that* much faster than string.format. I wonder why that is? I also wonder if string.Format starts beating the performance of the “+” operator on string building with lots of variables, since fewer temporary strings should have to be created and destroyed. I know I’ve seen instances with MFC’s string class’s format where this was true.

Sort of related: Raymond Chen showed some interesting string performance issues (mostly w/ STL & Unicode issues) and optimizations in his series on a Chinese to English translator app he developed. (google for The Old New Thing blog -- and subscribe -- it’s a favorite). I bring that up since today he commented again on it, although the string issues were posted a while ago…

# October 6, 2006 9:54 AM

Aardvark said:

By the way, does anyone else hate coding analogies to “missile guidance systems” or “satellites” or whatever thing where “Instruction at 0x98643967 cannot read from 0x0000000” == 2 Billion $ lost or lives lost?

Do we REALLY want every application to be held to those standards?  How much would desktop software cost?? Think how this would stifle innovation?

I also wonder how great some of this code really is?

# October 6, 2006 10:03 AM

karl said:

Aardvark:

My boss used to work on missile guidance systems...hence why I threw it in there :)

# October 6, 2006 10:53 AM

Aardvark said:

File this under "beating a dead horse":

OK, I broke out DevPartner Studio's profiler (aka TrueTime). After a great call to their support fixing a licensing issue (how it broke, I dunno)...

I wrote an app that used "+" and "format()".

First I built a string with 3 ints separated by 10 space strings.

Plus = .03ms

Format = 3.42ms

Wow format sucks, right? Well not really...

If I then call format a second time, same formatting, but 3 different int variables. Format = .5ms! If I passed the same 3 int variables in I got .04ms - not sure why that is?

Next I built strings with 23 variables (mixes of ints, longs, doubles, and strings). Values separated by "\r\n".

Plus = 2ms

Format = .53ms

I think it's safe to say the first time you call format is a killer - I'm sure some .NET guru could explain why... after that, however, things aren't so bad - sometimes better. Now this is a very synthetic test so YMMV.
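For anyone who wants to repeat this kind of first-call versus later-call comparison without a commercial profiler, here is a minimal Java sketch using `System.nanoTime()`. It is not the test described above: the class and method names are invented, the separator is assumed to be ten literal spaces, and single-shot timings like these are noisy, so treat the numbers as a hint rather than a measurement.

```java
import java.util.Locale;

final class FirstCallTiming {
    // Assumed separator: ten spaces between each value.
    static final String GAP = "          ";

    static String viaPlus(int a, int b, int c) {
        return a + GAP + b + GAP + c;
    }

    static String viaFormat(int a, int b, int c) {
        return String.format(Locale.ROOT, "%d" + GAP + "%d" + GAP + "%d", a, b, c);
    }

    // Times a single execution of the given action, in nanoseconds.
    static long nanosFor(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // The first format call also pays for class loading and one-time setup.
        long firstFormat = nanosFor(() -> viaFormat(1, 2, 3));
        long secondFormat = nanosFor(() -> viaFormat(4, 5, 6));
        long plus = nanosFor(() -> viaPlus(7, 8, 9));
        System.out.printf("first format:  %d ns%n", firstFormat);
        System.out.printf("second format: %d ns%n", secondFormat);
        System.out.printf("plus:          %d ns%n", plus);
    }
}
```

Timing the second call separately is the point of the exercise: it keeps one-time startup cost from being blamed on the formatting itself.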

Another lunch break down the tubes...

# October 6, 2006 12:00 PM

Sebastián said:

I generally will go with String.format...but testing this in Java gives me a x50 difference between that ugly static method and the pretty String.format:

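import java.util.Date;

public class Prueba {

  public static void main(String[] args) {
    String a = "9";
    // Instantiate Date once up front so loading the class is not charged to the first measurement.
    new Date();
    // Timestamp before String.format (millisecond resolution).
    System.out.println(new Date().getTime());
    System.out.println(String.format("%012d", Integer.valueOf(a)));
    // Timestamp between the two approaches.
    System.out.println(new Date().getTime());
    System.out.println(completezero(a));
    // Timestamp after the hand-written method.
    System.out.println(new Date().getTime());
  }

  // Left-pads the integer part of cad with zeros to a width of 12.
  public static String completezero(String cad) {
    String str_complete = "";
    // Drop anything after a decimal point.
    String[] temp = cad.split("\\.");
    cad = temp[0];
    if (cad.length() <= 12) {
      int count_zero = 12 - cad.length();
      String str_zero = "";
      for (int i = 0; i < count_zero; i++) {
        str_zero += "0";
      }
      str_complete = str_zero + cad;
    } else {
      // Already 12 characters or longer: return unchanged.
      str_complete = cad;
    }
    return str_complete;
  }
}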

# September 19, 2007 5:34 PM

Colin said:

Also string concatenation like this:

var s = "hello";

s += " you";

Actually stores two strings in memory:

"hello" and "hello you"

Using a StringBuilder or string.Format only has one string in memory
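A small Java sketch of what this looks like in practice (Java strings are immutable in the same way; the class and method names are invented for the example). It shows that `+=` leaves the original string untouched and produces a new one, while a `StringBuilder` appends into its own buffer. One caveat to the claim above: even with a builder, the literals and the final result still exist as separate objects, so "only one string" is best read as "no intermediate copy per step".

```java
final class ConcatCopies {
    // Shows that += creates a new String and leaves the old one unchanged.
    static String viaConcat() {
        String s = "hello";
        String original = s; // second reference to the same "hello" object
        s += " you";         // s now points at a newly created "hello you"
        return original + "|" + s;
    }

    // Appends into the builder's internal buffer instead of creating a new String per step.
    static String viaBuilder() {
        StringBuilder sb = new StringBuilder("hello");
        sb.append(" you");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaConcat());
        System.out.println(viaBuilder());
    }
}
```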

# July 10, 2008 5:16 AM
