Fast string to integer conversion

I'm dealing with string-based inputs that actually represent integers. The system is under very tight performance constraints – tens of thousands of users on it at the same time with frequent server interaction (i.e., they aren't spending 10 minutes reading a page).

The hubris starts to kick in and I tell myself "I can write this faster!". I open up Reflector to scope out my competition and see that Int32.Parse looks pretty heavy..I mean, just look at ParseNumber.ParseNumber.

The first thing I do is write a simple load tester. It looks something like:

string[] testString = {… 1000 randomly generated string values representing VALID positive integers …}
DateTime startTime = DateTime.Now;       
for (int j = 0; j < 100; ++j)
{
    for (int i = 0; i < testString.Length; ++i)
    {               
      Int32.Parse(testString[ i ]);
    }
}   
Console.WriteLine(DateTime.Now.Subtract(startTime).TotalMilliseconds);

 
That's 100 000 iterations. It takes….33 milliseconds. Let's be honest, my performance requirements are tight, but not nearly that tight. I can probably tweak 1 index and get 50x the performance boost from it.

But…curiosity has the best of me. So I see what I can do. Here's my first attempt:

internal static int AsciiToInteger(string stringToConvert)
{
   int zeroValue = '0';
   char[] characters = stringToConvert.ToCharArray();        
   int value = 0;
   for (int i = 0; i < characters.Length; ++i)
   {
      char c = characters[ i ];
      if (c < '0' || c > '9')
      {
         throw new FormatException("Input string was not in the correct format");
      }
      value = 10 * value + (c – zeroValue);
   }
   return value;
}

It runs in 11 milliseconds…not bad.  Using unsafe code I manage to cut that down to 5milliseconds:

public unsafe static int MyParse2(string stringToConvert)
{
    int value = 0;       
    int length = stringToConvert.Length;
    fixed(char* characters = stringToConvert)
    {
        for (int i = 0; i < length; ++i)
        {                   
            value = 10 * value + (characters[ i ] – 48);
        }   
    }
    return value;
}

I'll revisit the issue again once I profile the app and see if there's any benefits to the overall system by moving to something other than Int32.Parse, but thought I'd share my experience.

Oh, and incase anyone's wondering, I did write some unit tests for it since I do realize that my version is far less robust:

[Test, ExpectedException(typeof(FormatException))]
public void AtoIDoesntAcceptNegatives()
{
   AsciiToInteger("-32235");
}
[Test, ExpectedException(typeof(FormatException))]
public void AtoIDoesntAcceptNonNumericCharacters()
{
   AsciiToInteger("3d2a35");
}
[Test, ExpectedException(typeof(NullReferenceException))]
public void AtoIThrowsExceptionWhenNull()
{
   AsciiToInteger(null);
}
[Test]
public void AtoIReturnsZeroForEmptyString()
{
   Assert.AreEqual(0, AsciiToInteger(""));
}     
[Test]
public void AtoIReturnsProperInteger()
{        
   Assert.AreEqual(Int32.MaxValue, AsciiToInteger(Int32.MaxValue.ToString()));
   Assert.AreEqual(0, AsciiToInteger("0"));
   Assert.AreEqual(43, AsciiToInteger("43"));
   Assert.AreEqual(54354, AsciiToInteger("054354"));
}

This entry was posted in Short Outbursts. Bookmark the permalink. Follow any comments here with the RSS feed for this post.

5 Responses to Fast string to integer conversion

  1. karl says:

    Thanks haacked..I thought so too!!

    I didn’t go into any details on the project, but the % bad input should be very close to 0%, which is why I figured I could do it better. All our inputs are coming from another program, which we are also writing. The functionality is only used to persist internal state, so it’s never user-input.

    If someone hacked our encryption layer and started dumping in fake data, then we’d have a problem..and bound checking isn’t my worry in that case.

  2. Haacked says:

    Actually, this is a great demonstration of perf optimization. You took a measurement, you realized the implementation is fast enough, and you stopped there.

    Sounds to me the rest of the post was just to satisfy your curiosity. A couple of points though.

    1. What percentage of bad inputs do you expect over all inputs. If it’s large (because of user error), then I’d make the perf test take that into consideration. That might show that Int.Parse does not perform as well as you’d like. As ewise points out, you could test TryParse in that situation.

    2. What if a user types in “0x0f”. That’s an int! ;)

  3. ewise says:

    FYI- Using Int32.TryParse greatly exceeds any performance you might get using Int32.Parse with a Try Catch block and is the preferred method for attempted conversions from entry forms.

  4. karl says:

    Damien…I’m thinking you didn’t make it through more than a quarter of what I wrote.

    Of course it’s a premature optimization. I didn’t know that Int32.Parse would handle 100 000 strings in 33milliseconds. You don’t know if something is a premature optimization before knowing it’s actual performance. That’s the first thing I did…tested it’s actual performance AND concluded that it was more than good enough.

    The rest was just curiosity…

  5. Damien Guard says:

    Sounds like premature optimization to me.

    If Int32.Parse isn’t fast enough for you then maybe .NET isn’t what you should be using.

    [)amien