CodeBetter.Com

Patrick Smacchia [MVP C#]


Dissecting a non-deterministic Windows Forms v2 bug

 

We are glad to have just released NDepend v2.4 with the thoroughly revamped UI that I talked about a few weeks ago in this blog entry. For those of you who found the NDepend tool for .NET developers too hard to get started with, we hope that our work on usability will help.

 

 

Bug description

 

As always, while doing our final manual tests before releasing, we found a weird bug. We had added Visual Studio-style docking panels to the VisualNDepend UI. After playing around by moving, collapsing and auto-hiding all the docking panels, the following exception suddenly popped up while hovering with the mouse over one of our DataGridView controls:

 

************** Exception Text **************

System.ObjectDisposedException: Cannot access a disposed object.

Object name: 'FloatForm'.

   at System.Windows.Forms.Control.CreateHandle()

   at System.Windows.Forms.Form.CreateHandle()

   at System.Windows.Forms.Control.get_Handle()

   at System.Windows.Forms.ToolTip.get_CreateParams()

   at System.Windows.Forms.ToolTip.CreateHandle()

   at System.Windows.Forms.ToolTip.Hide(IWin32Window win)

   at System.Windows.Forms.ToolStrip.UpdateToolTip(ToolStripItem item)

   at System.Windows.Forms.ToolStripItem.OnMouseHover(EventArgs e)

   at System.Windows.Forms.ToolStripItem.FireEventInteractive(EventArgs e, ToolStripItemEventType met)

   at System.Windows.Forms.ToolStripItem.FireEvent(EventArgs e, ToolStripItemEventType met)

   at System.Windows.Forms.MouseHoverTimer.OnTick(Object sender, EventArgs e)

   at System.Windows.Forms.Timer.OnTick(EventArgs e)

   at System.Windows.Forms.Timer.TimerNativeWindow.WndProc(Message& m)

   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)

 

 

When you see such a stack trace with none of your own methods in it, you immediately realize that your evening at work will be longer than expected (and it was!). We first suspected the DXperience framework from DevExpress, on which we rely for docking panels, and raised the issue on their forum. Fortunately, they already knew about this problem and immediately answered that it is a Windows Forms bug.

 

 

Reproducing the bug

 

The DevExpress support kindly provided a small C# project that reproduces the problem (downloadable from here). To reproduce the bug with this project:


  1. Start the application.
  2. Press the "Click" button (do not hover over the toolstripbutton).
  3. Now hover over the toolstripbutton to display the tooltip.
  4. Close the "test form".
  5. Hover over the toolstripbutton again => ObjectDisposedException

 

The bug comes from the fact that the docking panel implementation changes the parent window of the underlying ToolTip control assigned to the DataGridView. If the previous parent window object has been disposed, hovering over the DataGridView after its parent window has changed raises the exception.

 

 

An idea for the fix

 

Fortunately, I found a workaround here on the ActiproSoftware forum. Like DevExpress, ActiproSoftware is a Windows Forms control vendor and, unsurprisingly, they faced the problem too. The idea is to obtain the private underlying tooltip object through reflection, and then call the method RemoveAll() on it whenever the parent window changes. This forces the link to the parent window to be re-initialized. The code looks like this:

 

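// Requires: using System.Reflection; using System.Windows.Forms;
// ToolStrip exposes its underlying ToolTip through a non-public "ToolTip" property.
ToolTip t = (ToolTip)toolStrip1.GetType().GetProperty(
   "ToolTip", BindingFlags.Instance | BindingFlags.NonPublic
).GetValue(toolStrip1, null);
 
// Forces the link to the parent window to be re-initialized.
t.RemoveAll();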
  
    

A fix not that easy to implement

 

This code works well when the problem comes from a ToolStrip control but, of course, it doesn't work on a DataGridView. I wanted to use Reflector to see where the underlying ToolTip of a DataGridView was hidden, but unfortunately I couldn't find it. Indeed, DataGridView is a monster class with more than 10,000 lines of code, 1053 methods, 322 fields and 13 nested classes. I then wrote the following CQL query with NDepend to make sure that the class DataGridView uses the class ToolTip, directly or indirectly.

 

SELECT TYPES WHERE IsUsing "System.Windows.Forms.ToolTip" AND NameIs "DataGridView"

 

The query told me that DataGridView uses a class that in turn uses ToolTip. To find this intermediate class, I used the following CQL query, which asks for the classes that are directly used by DataGridView and that directly use ToolTip:

 

SELECT TYPES WHERE
IsDirectlyUsedBy "System.Windows.Forms.DataGridView" AND
IsDirectlyUsing "System.Windows.Forms.ToolTip"

 

The 2 matching classes are the public class System.Windows.Forms.ContextMenuStrip and the internal nested class System.Windows.Forms.DataGridView+DataGridViewToolTip. It was then easy to find where the pesky ToolTip object was hidden, and we wrote the following code:


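// Requires: using System.Reflection; using System.Windows.Forms;
// DataGridView keeps a private helper object in its "toolTipControl" field...
FieldInfo toolTipControlFieldInfo =
   typeof(DataGridView).GetField(
      "toolTipControl", BindingFlags.Instance | BindingFlags.NonPublic);

// ...and that helper holds the actual ToolTip in its private "toolTip" field.
FieldInfo toolTipFieldInfo =
   toolTipControlFieldInfo.FieldType.GetField(
      "toolTip", BindingFlags.Instance | BindingFlags.NonPublic);

// m_DataGridViewItems is our DataGridView instance.
object toolTipControlInstance =
   toolTipControlFieldInfo.GetValue(m_DataGridViewItems);

ToolTip toolTip =
   toolTipFieldInfo.GetValue(toolTipControlInstance) as ToolTip;

if (toolTip != null) {  // Can be null at init.
   toolTip.RemoveAll();
}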
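
Since this kind of reflection breaks silently if a private field is renamed in another framework version, a more defensive variant may be worth considering. The following is only a sketch: PrivateFieldReader and ReadChain are hypothetical names that are not part of our code base or of Windows Forms. It walks a chain of non-public instance fields and returns null instead of throwing when a link is missing.

```csharp
using System;
using System.Reflection;

// Hypothetical helper (not from the original post): reads a chain of
// non-public instance fields, tolerating missing fields and null values.
public static class PrivateFieldReader
{
    private const BindingFlags Flags =
        BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Starting from root, reads fieldNames[0], then fieldNames[1] on the
    // resulting object, and so on. Returns null as soon as a field cannot
    // be found or holds null.
    public static object ReadChain(object root, params string[] fieldNames)
    {
        object current = root;
        foreach (string name in fieldNames)
        {
            if (current == null) return null;
            FieldInfo field = FindField(current.GetType(), name);
            if (field == null) return null;
            current = field.GetValue(current);
        }
        return current;
    }

    // Private fields declared on a base class are not returned by GetField
    // on a derived type, so each type in the hierarchy is checked in turn.
    private static FieldInfo FindField(Type type, string name)
    {
        for (Type t = type; t != null; t = t.BaseType)
        {
            FieldInfo field = t.GetField(name, Flags);
            if (field != null) return field;
        }
        return null;
    }
}
```

Assuming the field names shown above, the lookup would then read ReadChain(m_DataGridViewItems, "toolTipControl", "toolTip") as ToolTip, followed by the same null check before calling RemoveAll().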

 

I know how ugly it is to rely on private implementation details, but here we have no choice.

 

 

Checking that the bug is corrected by .NET3

 

We found out that the bug was impossible to reproduce on our main development machines because it has in fact been fixed with .NET3. I explained in the post .NET 3.5 Core Stuff that, even though Microsoft decided to avoid touching the .NET Framework assemblies (such as System.Windows.Forms.dll), they took the opportunity to fix some bugs.

 

We then used the build comparison feature of NDepend to see whether one of the methods in the buggy stack trace had been modified (interestingly enough, we found that 84 methods of System.Windows.Forms.dll were changed, 202 were added and 37 were removed). Here is the CQL query that matches the changed methods in System.Windows.Forms.ToolTip:

 

SELECT METHODS FROM TYPES "System.Windows.Forms.ToolTip" WHERE CodeWasChanged

 

The result is the following…

 

Methods                                                   NbILInstructions

SetTool(IWin32Window,String,ToolTip+TipInfo+Type,Point)   232

CreateHandle()                                            201

SetToolTipInternal(Control,ToolTip+TipInfo)               147

WmPop()                                                   126

Hide(IWin32Window)                                        88

SetToolInfo(Control,String)                               59

 

 

…and indeed the method ToolTip.Hide(), shown in the buggy trace, has been changed (from 76 to 88 IL instructions). We then used Reflector to look at the code change, and indeed there is now a test…

 

 if (this.GetHandleCreated())

 

… to check the parent window when hiding the tooltip.

 

 

Making sure that our users won't be annoyed by the bug

 

With the hack described above, everything seemed to work fine. However, I was not confident, since the bug is non-deterministic and might still be lurking around other controls. We therefore decided not to pop up any exception whose stack trace contains the string
"System.Windows.Forms.ToolTip.CreateHandle()".

 

I know how ugly this last choice is, but in the real world you sometimes don't have a choice.
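
The stack trace check described above can be sketched as follows. This is a minimal illustration rather than our actual code: KnownToolTipBugFilter and its method names are hypothetical, and it assumes that the runtime formats the frame exactly as in the trace shown at the top of this post.

```csharp
using System;

// Hypothetical helper (names are illustrative): decides whether an
// unhandled exception looks like the known ToolTip bug.
public static class KnownToolTipBugFilter
{
    // The frame text to look for, as quoted in the post.
    private const string Marker = "System.Windows.Forms.ToolTip.CreateHandle()";

    // True when the given stack trace text contains the marker frame.
    public static bool StackTraceMatches(string stackTrace)
    {
        return stackTrace != null
            && stackTrace.IndexOf(Marker, StringComparison.Ordinal) >= 0;
    }

    // True when the exception should be swallowed instead of being shown.
    // StackTrace is null for an exception that was never thrown.
    public static bool ShouldSuppress(Exception ex)
    {
        return ex != null && StackTraceMatches(ex.StackTrace);
    }
}
```

A handler for unhandled UI thread exceptions could then return early when KnownToolTipBugFilter.ShouldSuppress(e.Exception) is true, and show the error dialog otherwise. Matching on a string is deliberately narrow: any other exception still pops up as before.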



Comments

Li Yang said:

Nice to see how to use NDepend and Reflector to locate a fix. Yes, wonderful demo for NDepend. This is a great post.

# September 12, 2007 9:40 PM

dama said:

Gr8 post. How did you stop the exception when the stack trace contains the string

"System.Windows.Forms.ToolTip.CreateHandle()".

thx

# September 13, 2007 6:06 AM

Patrick Smacchia said:

Dama, this is a tricky thing not very well documented.

I'll certainly write a blog post to clarify things but basically you need to handle the event

System.Windows.Forms.Application.ThreadException += UnhandledExceptionOnUIThread;

...

private void UnhandledExceptionOnUIThread(object sender,

ThreadExceptionEventArgs e) {

  // the exception is reachable in e.Exception

  // optionally show your own errorForm dialog here

  if( mustAbort) {

     m_MainForm.Close();

  }

}

This removes the default Windows Forms error dialog and gives you a chance to swallow the exception and resume the program, or to abort.

This is not obvious at all because the behavior is not the same when debugging your app.

Hope this helps.

# September 15, 2007 8:50 AM

Johann Holzel said:

You know, this kind of thing is exactly why I don't use Microsoft technologies anymore.

Back in '97, I had two really nasty bugs in the same product, one on the Windows GUI, and one on the Linux GUI. The first one, we quickly traced down to MFC; the second, to Gtk.

It only took a day or two to find exactly what was wrong with both libraries. In both cases, there was no easy way to "hook" the code to fix it (Gtk is a C API; MFC is all C++, but the relevant function in CView wasn't virtual), and a workaround would have been a nasty mess in the MFC case, maybe not even possible in the Gtk case.

So, what did I do? First I submitted a bug report, along with a patch, to Microsoft, and likewise to the Gimp/Gtk/Gnome/XCF team.

The Microsoft bug was never closed; about two years later, VC6 came out with a new version of MFC without that bug, but even then our code had to be reworked to build on VC6 and use the new MFC42. In the meantime, we had to recode big chunks of CView and the subclasses we were using, wrap them up in an MFC extension DLL, add 40% to the total size of our install, and hold our release back for two weeks while legal verified that we were actually allowed to ship.

My Gtk patch was accepted almost immediately. It was in the main 1.0 tree within a few days, and in the Redhat packages within a few weeks. By the time we shipped, most of our customers probably already had it. (Just in case, we put RPMs, raw binaries, the diff, and complete sources on our site, but nobody downloaded any of them.)

Sure, the VC form designer was more polished than Glade, but in the end, the 3 minutes saved building dialog templates were not worth the huge costs of being tied to their libraries.

Today, I still write software for Windows, and for .NET. And I use their compilers. But I don't use WinForms unless I have to. With Gtk# or Qt#, I can fix problems myself, or get them fixed quickly, and my customers don't have to install an entirely new .NET runtime; just one DLL.

# September 17, 2007 10:50 PM


About Patrick Smacchia

Patrick Smacchia is a Visual C# MVP involved in software development for over 15 years. After graduating in mathematics and computer science, he has worked on software in a variety of fields including stock exchange, airline ticket reservation system as well as a satellite base station at Alcatel. He's currently a software consultant and trainer on .NET technologies as well as the lead developer of the tool NDepend which provides numerous metrics and caveats on any compiled .NET application. He is the author of Practical .NET2 and C#2, a .NET book conceived from real world experience with 647 compilable code listings.