We are currently in the middle of an intensive session of fixing pesky minor bugs. Crashes are logged to a Gmail account that serves as our production bug-tracking server. For each logged crash, we get a stack trace and plenty of information about the machine environment.
One recurrent problem with exception stack traces in .NET is that you need to ship the PDB files along with the assemblies installed in production. Without PDB files, you still get the stack trace of the methods called, but you lose the exact source file line where the exception was thrown. The problem is that PDB files usually weigh between one and two times as much as their target assemblies. Hence, releasing PDBs is usually not an option, because it consumes so much more memory (both hard-drive space at install time and process memory at run time).
The exact source file line where the exception was thrown is not just convenient information, it is essential. The typical bad situation is a method that throws a NullReferenceException but contains several references that can potentially be null: you are stuck, because you do not know where exactly the exception came from. With the exact source file line, it becomes easy to identify the faulty null reference.
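To make the ambiguity concrete, here is a small illustrative sketch. The types `Order`, `Customer`, `Address` and the method `ShippingLabel.Describe` are hypothetical names invented for this example; they are not part of our code base.

```csharp
using System;

// Hypothetical domain types, used only to illustrate the problem.
public class Customer { public string Name { get; set; } }
public class Address { public string City { get; set; } }
public class Order
{
    public Customer Customer { get; set; }
    public Address ShippingAddress { get; set; }
}

public static class ShippingLabel
{
    public static string Describe(Order order)
    {
        // Possible null references on this line: order, order.Customer, Name.
        string customerName = order.Customer.Name.Trim();
        // Possible null references on this line: order.ShippingAddress, City.
        string city = order.ShippingAddress.City.ToUpper();
        return customerName + ", " + city;
    }
}
```

Without a PDB, every one of these failures is reported as a NullReferenceException in `ShippingLabel.Describe` and nothing more. A line number (or an IL offset) at least tells you which statement failed.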
To fix this situation, Mike Stall, the MSFT debugging guru, wrote a blog post about Converting a managed PDB into a XML file. This post shows how to use some tooling so that you can both avoid releasing the PDB files and still retrieve the exact source code line from a production exception stack trace. The problem is that real-world application assemblies are often obfuscated. The correspondence between the released obfuscated assemblies and their original PDBs is then completely broken.
Yesterday, I realized that it should somehow be possible to retrieve the IL offset, within the obfuscated method body, of the instruction that provokes the production exception. After googling a bit, it seems that stack traces sometimes contain the IL offset at the end of each frame, after a plus ‘+’ character. However, I haven’t been able to determine what must be done to get this extra information.
Fortunately, as always, more googling led to exactly what I was looking for: in step 3 of the blog post Getting file and line numbers without deploying the PDB files, Tim Stall (another Stall debug guru?) shows how to retrieve the full stack trace and the IL offsets programmatically. Using this info, I’ve been able to completely rebuild the stack trace, including IL offsets formatted the same way Reflector does (hexadecimal with 4 digits, prefixed with L_).
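As a rough idea of what this looks like, here is a minimal sketch built on the standard `System.Diagnostics.StackTrace` and `StackFrame.GetILOffset()` APIs. The class name `ILOffsetStackTrace`, the output layout, the lowercase hex digits and the `L_????` placeholder for unknown offsets are assumptions made for this illustration, not the exact code we ship.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Text;

// Illustrative helper (hypothetical name): one line per frame, "Type.Method L_xxxx".
public static class ILOffsetStackTrace
{
    // Reflector-style label: "L_" followed by the offset as 4 hex digits.
    public static string FormatILOffset(int ilOffset)
    {
        return "L_" + ilOffset.ToString("x4");
    }

    public static string Build(Exception exception)
    {
        // 'false': do not request file/line info, which would need the PDB.
        var stackTrace = new StackTrace(exception, false);
        var sb = new StringBuilder();
        for (int i = 0; i < stackTrace.FrameCount; i++)
        {
            StackFrame frame = stackTrace.GetFrame(i);
            MethodBase method = frame.GetMethod();
            if (method == null) { continue; }

            string typeName = method.DeclaringType != null
                ? method.DeclaringType.FullName
                : "<unknown type>";

            // GetILOffset() returns StackFrame.OFFSET_UNKNOWN when unavailable.
            int ilOffset = frame.GetILOffset();
            string offsetText = ilOffset == StackFrame.OFFSET_UNKNOWN
                ? "L_????"
                : FormatILOffset(ilOffset);

            sb.Append(typeName).Append('.').Append(method.Name)
              .Append(' ').Append(offsetText).Append('\n');
        }
        return sb.ToString();
    }
}
```

Typical usage is to call `ILOffsetStackTrace.Build(ex)` inside the top-level `catch` block and attach the result to the crash report. Because the text is assembled from `MethodBase` names rather than from `Exception.StackTrace`, it still makes sense on obfuscated assemblies: the obfuscated method name plus the offset is enough to locate the instruction in a disassembler.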
Now we get clean stack traces with IL offsets. It is pretty straightforward to infer a source code line from an IL offset. You just have to know that if the Nth IL instruction throws the exception, the offset that gets logged is that of the (N-1)th IL instruction.
Find below our source code and the associated tests. This code implements another requirement: we remove all localization info from the stack trace. This way we are able to build a hash code from a stack trace. Such a hash code is pretty useful for grouping similar crash logs, independently of the localization settings of the underlying Windows machines.
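Independently of the actual StackTraceHelper code, the hashing idea can be sketched as follows. The assumptions here are mine: the frames are given as already culture-neutral strings (for instance `Type.Method L_001a`, built from reflection data rather than from the localized `Exception.StackTrace` text), and SHA-256 is used as the hash because `string.GetHashCode()` is not guaranteed to be stable across processes or runtime versions. `StackTraceHash` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Illustrative sketch: a stable fingerprint for a list of culture-neutral frames.
public static class StackTraceHash
{
    public static string Compute(IEnumerable<string> frames)
    {
        // A fixed separator keeps the canonical form identical on every machine.
        string canonical = string.Join("\n", frames);

        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(canonical));
            var hex = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }
}
```

Two crash reports with the same frames and offsets then get the same fingerprint, whatever language Windows is running in, so they can be grouped with a simple equality test on the hash string.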
…and the test code that covers 100% of StackTraceHelper: