CodeBetter.Com
CodeBetter.Com
RSS 2.0 via Feedburner
           Do you Twitter? Follow us @CodeBetter

Karl Seguin

developer @ Fuel Industries ottawa, ontario

Fast string to integer conversion

I'm dealing with string-based inputs that actually represent integers. The system is under very tight performance constraints - tens of thousands of users on it at the same time with frequent server interaction (i.e., they aren't spending 10 minutes reading a page).

The hubris starts to kick in and I tell myself "I can write this faster!". I open up Reflector to scope out my competition and see that Int32.Parse looks pretty heavy..I mean, just look at ParseNumber.ParseNumber.

The first thing I do is write a simple load tester. It looks something like:

string[] testString = {... 1000 randomly generated string values representing VALID positive integers ...}
DateTime startTime = DateTime.Now;       
for (int j = 0; j < 100; ++j)
{
    for (int i = 0; i < testString.Length; ++i)
    {               
      Int32.Parse(testString[ i ]);
    }
}   
Console.WriteLine(DateTime.Now.Subtract(startTime).TotalMilliseconds);

 
That's 100 000 iterations. It takes....33 milliseconds. Let's be honest, my performance requirements are tight, but not nearly that tight. I can probably tweak 1 index and get 50x the performance boost from it.

But...curiosity has the best of me. So I see what I can do. Here's my first attempt:

internal static int AsciiToInteger(string stringToConvert)
{
   int zeroValue = '0';
   char[] characters = stringToConvert.ToCharArray();        
   int value = 0;
   for (int i = 0; i < characters.Length; ++i)
   {
      char c = characters[ i ];
      if (c < '0' || c > '9')
      {
         throw new FormatException("Input string was not in the correct format");
      }
      value = 10 * value + (c - zeroValue);
   }
   return value;
}

It runs in 11 milliseconds...not bad.  Using unsafe code I manage to cut that down to 5milliseconds:

public unsafe static int MyParse2(string stringToConvert)
{
    int value = 0;       
    int length = stringToConvert.Length;
    fixed(char* characters = stringToConvert)
    {
        for (int i = 0; i < length; ++i)
        {                   
            value = 10 * value + (characters[ i ] - 48);
        }   
    }
    return value;
}

I'll revisit the issue again once I profile the app and see if there's any benefits to the overall system by moving to something other than Int32.Parse, but thought I'd share my experience.

Oh, and incase anyone's wondering, I did write some unit tests for it since I do realize that my version is far less robust:

[Test, ExpectedException(typeof(FormatException))]
public void AtoIDoesntAcceptNegatives()
{
   AsciiToInteger("-32235");
}
[Test, ExpectedException(typeof(FormatException))]
public void AtoIDoesntAcceptNonNumericCharacters()
{
   AsciiToInteger("3d2a35");
}
[Test, ExpectedException(typeof(NullReferenceException))]
public void AtoIThrowsExceptionWhenNull()
{
   AsciiToInteger(null);
}
[Test]
public void AtoIReturnsZeroForEmptyString()
{
   Assert.AreEqual(0, AsciiToInteger(""));
}     
[Test]
public void AtoIReturnsProperInteger()
{        
   Assert.AreEqual(Int32.MaxValue, AsciiToInteger(Int32.MaxValue.ToString()));
   Assert.AreEqual(0, AsciiToInteger("0"));
   Assert.AreEqual(43, AsciiToInteger("43"));
   Assert.AreEqual(54354, AsciiToInteger("054354"));
}


Published Jan 15 2007, 01:03 PM by karl
Filed under:

Comments

Damien Guard said:

Sounds like premature optimization to me.

If Int32.Parse isn't fast enough for you then maybe .NET isn't what you should be using.

[)amien

# January 15, 2007 1:37 PM

karl said:

Damien...I'm thinking you didn't make it through more than a quarter of what I wrote.

Of course it's a premature optimization. I didn't know that Int32.Parse would handle 100 000 strings in 33milliseconds. You don't know if something is a premature optimization before knowing it's actual performance. That's the first thing I did...tested it's actual performance AND concluded that it was more than good enough.

The rest was just curiosity...

# January 15, 2007 1:42 PM

Eric Wise said:

FYI- Using Int32.TryParse greatly exceeds any performance you might get using Int32.Parse with a Try Catch block and is the preferred method for attempted conversions from entry forms.

# January 15, 2007 5:06 PM

Haacked said:

Actually, this is a great demonstration of perf optimization. You took a measurement, you realized the implementation is fast enough, and you stopped there.

Sounds to me the rest of the post was just to satisfy your curiosity. A couple of points though.

1. What percentage of bad inputs do you expect over all inputs. If it's large (because of user error), then I'd make the perf test take that into consideration. That might show that Int.Parse does not perform as well as you'd like. As ewise points out, you could test TryParse in that situation.

2. What if a user types in "0x0f". That's an int! ;)

# January 15, 2007 8:12 PM

karl said:

Thanks haacked..I thought so too!!

I didn't go into any details on the project, but the % bad input should be very close to 0%, which is why I figured I could do it better. All our inputs are coming from another program, which we are also writing. The functionality is only used to persist internal state, so it's never user-input.

If someone hacked our encryption layer and started dumping in fake data, then we'd have a problem..and bound checking isn't my worry in that case.

# January 15, 2007 8:18 PM

Baz Web Development: AJAX, Joomla, CSS said:

First off, let me apologize for the many weeks of no posts. I&#8217;ll give you some more information later, but basically I got a new job (yaaaaay) and I had to move about 150 miles to search for apartments and all the stuff that goes wit...

# January 18, 2007 7:40 PM

Dennis' Blog said:

Development Karl Seguin does some testing on his blog about the fastest way to convert a string to an

# January 30, 2007 8:14 AM

Leave a Comment

(required)  
(optional)
(required)  

Enter the numbers above:
Add
Check out Devlicio.us!