Howdy and Viewing Unmanaged Code in VS.NET

 


An obligatory bit about me (skip this paragraph to get to real information) … I am an independent consultant from Atlanta Georgia. I live here with my girlfriend Laura and our two dogs Machiavelli (a Neapolitan Mastiff) and Margeaux (our soon to be robotic Labrador). Our previous home was in New Orleans which is still our favorite travel destination. I like to post about a wide variety of items (DDD, Agile Development, Performance, Mono / Cross Platform Development, AOP, ice cream) I do my best to keep them in categories but I will not promise that they are always well categorized.


One of the things I like to post about (performance) involves some ramp up on the part of the reader. I figured I would start off by walking people through some basic unmanaged debugging using Visual Studio .NET so that it is a bit easier to follow later examples, and it allows me to just link to it in the future :D. I personally also use windbg (even though it has an interface which is an acquired taste) but VS.NET is quickly becoming a fully featured debugging tool and is much more accessible for simple optimization questions.


When we are dealing with code with a need to be highly performant looking at IL is often not good enough as the JIT optimizer will optimize our code much further than we could ever hope to on our own without things like unsafe code which also makes our code unverifiable. If you use a tool such as reflector or ildasm you can quickly see that the code generated in release and debug mode are almost identical. This brings to mind the question of why release mode is so much faster; it is the JIT optimizations that are applied. We cannot see these JIT optimizations by looking at the code in a managed way so we will look at the native code generated from our code.


I want to make it clear that I am not suggesting people do this all of the time. I cannot stress enough to be very wary of Premature Optimization. Make your code work first; your aim should be clarity not speed as it is not worth the effort … profile your code once done and re-visit sections that need to be sped up. You will generally see that <10% of your code is using up 70+% of your execution time, go back and optimize that 10%. Your focus should also generally be on measured speed (not just looking at generated code). Also keep in mind that the choice of data structures within your application is far more important than lower level optimizations.


That said knowing what happens under the covers is also fun! Let’s get on to setting up Visual Studio and walk through a simple example. To start with we need some code to expriment with.






        static void Main(string[] args) {


            for (int i = 0; i < 10; i++) {


                Console.WriteLine(“Hello World!”);


            }


        }


 


Now we need to get unmanaged debugging setup in Visual Studio. In order to do this we go to the properties of our project then to the debug tab. There is a checkbox there to “Enable unmanaged code debugging”, select it.


It is important to note that this option only applies to the current configuration that we are running in. We should therefore set it for both of our configurations. There is some additional work that we need to in order to deal with release unmanaged debugging but we will get to that later.


Insert a break point on the beginning of the for loop and go to the debug menu and select start (or press f5 J)


You will hit a breakpoint as usual… your screen should look something like the display in Figure 2 <insert picture>. If you do not have a call stack window, bring it up by going to the debug menu -> windows -> call stack (or by hitting ctrl + d then c)


Once we have our call stack up we can right click on our function and select “go to disassembly” by doing this we can get the following code listing for our code.






        static void Main(string[] args) {


00000000  push        ebp 


00000001  mov         ebp,esp


00000003  push        edi 


00000004  push        esi 


00000005  push        ebx 


00000006  sub         esp,38h


00000009  xor         eax,eax


0000000b  mov         dword ptr [ebp-10h],eax


0000000e  xor         eax,eax


00000010  mov         dword ptr [ebp-1Ch],eax


00000013  mov         dword ptr [ebp-3Ch],ecx


00000016  cmp         dword ptr ds:[00912DC8h],0


0000001d  je          00000024


0000001f  call        792B228E


00000024  xor         esi,esi


00000026  xor         edi,edi


00000028  nop             


            for (int i = 0; i < 10; i++) {


00000029  xor         esi,esi


0000002b  nop             


0000002c  jmp         0000003D


0000002e  nop             


                Console.WriteLine(“Hello World!”);


0000002f  mov         ecx,dword ptr ds:[022B303Ch]


00000035  call        785D9074


0000003a  nop                 


            }


0000003b  nop             


            for (int i = 0; i < 10; i++) {


0000003c  inc         esi 


0000003d  cmp         esi,0Ah


00000040  setl        al  


00000043  movzx       eax,al


00000046  mov         edi,eax


00000048  test        edi,edi


0000004a  jne         0000002E


        }


0000004c  nop             


0000004d  lea         esp,[ebp-0Ch]


00000050  pop         ebx 


00000051  pop         esi 


00000052  pop         edi 


00000053  pop      


 


What we are viewing is the actual native code being generated by the JIT mapped up to our code that we wrote.  We can quite easily see how our for loop works at the native level, if you have never seen one before this probably looks quite odd but it is in fact rather normal. Let’s take a close look at what is going on here.


00000029  xor         esi,esi


0000002b  nop             


0000002c  jmp         0000003D


 


This bit of code initializes our counter in ESI, which is an index register that can be used for indexing arrays. You will notice here that an age old assembly “trick” is being used to set our counter to 0, instead of loading the value 0 into the register we XOR it against itself (anything XOR’ed to itself is 0). The next line Nop meaning (no operation) does just that. The code then tells it to immediately jump to 3D ..  Sometimes jumps like this make us go through our code in a non-top down fashion (just like gotos in higher level languages such as C, VB, or C#). If we go to the other half of the for loop we can continue analyzing our code.


0000003c  inc         esi 



The first instruction in the tail bit of code (3C) increments our counter which is held in the ESI register (You can view the velues in your registers using the “registers” window in the debug windows or by hitting ctrl+d then r). In our first iteration however we will skip this line, the jump to 3D as we do not want to increment our index on the first iteration.


0000003d  cmp         esi,0Ah


00000040  setl        al  


00000043  movzx       eax,al


00000046  mov         edi,eax


00000048  test        edi,edi


0000004a  jne         0000002E



3D (the place the previous code jumped to and the immediately following lines) represent the actual comparison back to our stop value 10 (0A) and if we have not reached that value the last line will jump to 4A (which is the inner body of our loop).


0000002f  mov         ecx,dword ptr ds:[022B303Ch]


00000035  call        785D9074


 


The first line here will load our constant string from memory into the ECX register (which is a common purpose register), you will notice that ECX is always used for passing the first parameter to a method (in the case of instance methods ECX always contains “this”), followed by EDX for the second parameter then a series of “push” statements to store others on the machine’s stack.


The next line actually calls the function. I will at a later point go through how to look this function up but right now we have no issues figuring it out as we can see from our source that it is Console.WriteLine. The code will then fall back through to our increment discussed earlier and back through our verification code (which will then jump back into our inner loop code).


Needless to say, we can see significant waste happening in our trivial example. A perfect example of such waste can be seen here


00000009  xor         eax,eax


0000000b  mov         dword ptr [ebp-10h],eax


0000000e  xor         eax,eax


00000010  mov         dword ptr [ebp-1Ch],eax


 


Where we set EAX to 0 twice in a row… This is due to the fact that we are currently running in debug mode and not allowing optimizations to occur. In other words this code has been JIT’ed without allowing the JIT to do some intelligent things to make our code go faster.


There are a few issues with viewing optimized code. The first issue is the fact that the debugger by default will disable JIT optimizations (believe me I myself have gone through many a long hour late at night not realizing I was looking at un-optimized code). The second issue has to deal with “Just My Code” and the optimized code is therefore skipped. I had originally found the method of getting around this in Vance Morrison’s Post on the subject. (Thanks Vance I had been bothered by this for a while and ended up just looking at raw assembler without source using the other method listed).


To get around these issues we should follow the instructions given


1)      Go to Tools -> Options -> Debugging -> General 


2)      Make certain that box labeled  ‘Suppress JIT optimization on module load’ is UNCHECKED.


3)      Also make certain the box labled ‘Enable Just My Code’ is also UNCHECKED. 


Vance also suggests going to the advanced build screen to insure that your release dll is built with pdb only, but this defaults on my machine. We would then simply run our code the same way we ran it before.


Another way to view our code with JIT is to build in release mode, (add a Console.Read() at the end) . Start the executable without the debugger, then attach the visual studio debugger (select break all from the debug menu or hit ctrl+alt+break to break).  


Using either method we can now view our JIT optimized code. Our code should come out something like the following






        static void Main(string[] args) {


            for (int i = 0; i < 10; i++) {


00000000  push        esi 


00000001  xor         esi,esi


                Console.WriteLine(“Hello World!”);


00000003  cmp         dword ptr ds:[02271084h],0


0000000a  jne         00000016


0000000c  mov         ecx,1


00000011  call        786FC654


00000016  mov         ecx,dword ptr ds:[02271084h]


0000001c  mov         edx,dword ptr ds:[0227307Ch]


00000022  mov         eax,dword ptr [ecx]


00000024  call        dword ptr [eax+000000D8h]


            for (int i = 0; i < 10; i++) {


0000002a  add         esi,1


0000002d  cmp         esi,0Ah


00000030  jl          00000003


00000032  pop         esi 


            }


        }


00000033  ret  


 


Wow this is a lot less code than what we had the first time through. JIT optimizations have really done a great job on this code. This difference is why it is so important to look at the actual disassembly at times as opposed to the IL as the JIT will often times recognize patterns in the IL that it can optimize.


An astute reader will notice that there is in fact more code in the inner part of our loop. This can be scary at first, but in this case the optimizer has inlined the Console.WriteLine method for us, saving us quite a bit of code. I will get more into inlining in a later post but it is a huge optimization that occurs.


We have at this point made our way through to looking at non-optimized and optimized native code in the debugger.  I think this is a good place to start. Over the next few posts we will build off of this knowledge to look more in depth at some common optimizations that occur with the JIT, we will also get into some other tools and look how they can assist in our quest for better code. Hope to see you there.


 


 


Some Resources:


http://en.wikipedia.org/wiki/X86
http://www.codeguru.com/csharp/.net/net_general/il/article.php/c4635/ IL tutorial
http://burks.brighton.ac.uk/burks/language/asm/asmtut/asm1.htm ASM tutorial

This entry was posted in Under The Covers. Bookmark the permalink. Follow any comments here with the RSS feed for this post.

4 Responses to Howdy and Viewing Unmanaged Code in VS.NET

  1. SamGentile says:

    Excellent start – you’re making me proud -)

  2. Greg says:

    Glad to be here Brendan. Nice one on the lagniappe, I had considerred it as well but there goes that idea :)

    My gf is out of town this weekend = hopefully extra time to write ..

    Greg

  3. btompkins says:

    Fascinating stuff and what a great way to start your tenure here at CodeBetter! Glad to have you aboard.

    BTW, I’m a sortof NO native (Dad’s whole family lives there) … I always thought of naming my blog “Lagniappe” :)

    -Brendan

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>