Sam Gentile

Parallel Computing and Concurrency on .NET

I mentioned back here that I was designing a Parallel Calculation Engine using Microsoft .NET. I am actually doing this as part of my team of three, and we have learned quite a bit in the last three weeks. I haven't done much "straight" multithreading under .NET since 2002, by which I mean multithreading outside of WCF or outside of using BackgroundWorker in our Smart Client. Most of my MT work has been in 17 years of C++/Win32 and other places. So I went and relearned multi-threaded .NET programming, as well as diving deep into the internals of Oracle for some of the optimization work we wanted to do (that should be another post). Let's just say, on the Oracle front for now, that Explain Plans and TKPROF are your best friends. I can't talk a lot about what I am doing (as a lot of it is proprietary), but I did want to share some insights and links to material that influenced my design.

Of course, Herb Sutter, among many others, has been proclaiming that "The Free Lunch Is Over" since 2005, arguing that clock speeds have stopped climbing the way they used to and that chip manufacturers are turning en masse to hyperthreading and multi-core architectures, which requires a programming revolution. Well, that time is now. When designing a financial calculation engine, sheer performance and the ability to do many things in parallel are the requirements that trump all others.

The place I started was Joe Duffy's excellent "Using Concurrency for Scalability" and Vance Morrison's "What Every Dev Must Know About Multithreaded Apps." I highly recommend that all .NET developers read both of these articles. Joe is actually doing a lot of work in the area of Parallel Computing and Concurrency; his latest post is on CLR Monitors and sync blocks. Joe's work got me fascinated enough to go deeper into the world of Parallel Computing, Parallel Algorithms/Calculations and the work that had been done in the MP and Java worlds.

I should also mention that there are a lot of books that mention or cover .NET multithreading, but the book that I found most valuable is Juval Lowy's Programming .NET Components, 2nd Edition. His multi-threading chapter is available here as a PDF: www.ftponline.com/.../NETComponents.pdf. Juval doesn't just teach you MT; he goes far beyond that to tell you what to use and what to avoid. I ended up bringing in his better substitute class for System.Threading.Thread.

There are a number of key patterns in this space that I saw my design being influenced by or coalescing around. The first key one is the Master-Slave pattern, identified by Frank Buschmann and his co-authors in POSA, a book that belongs on every developer's shelf. I have actually been more influenced by POSA than by GOF personally. The issue here, of course, is that it is hard to come up with Tasks and exploitable concurrency. You have to consider many things, such as what is CPU-bound vs. I/O-bound, the costs of threads, blocking issues and so much more.

As Joe Duffy talks about, threads just aren't free: "The cost of creating a Windows thread is approximately 200,000 cycles, whereas the cost of destroying one is about 100,000 cycles." That matters a lot in design and led me to the ThreadPool to amortize the costs of the threads in parts of the design. The ThreadPool is not a universally good thing, however, as it has a default limit of 25 worker threads per processor and you don't want to just raise the upper limit (we found that was counterproductive). You also want to consider that WCF and other parts of the framework are using the ThreadPool as well.

Back to breaking the problem down into Tasks: the best resource I found here is the very useful book Patterns for Parallel Programming. When coming up with Tasks that can be executed in parallel, the Task Decomposition, Data Decomposition, and Group Tasks patterns are all super useful.
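
To make the ThreadPool and Data Decomposition ideas above a little more concrete, here is a minimal sketch, not code from the engine itself. It assumes the work is a simple sum over an array; the class name ChunkedSum and its parameters are made up for illustration. The array is split into chunks, each chunk is queued as a ThreadPool work item so that no thread is created or destroyed per task, and the caller blocks until the last chunk has finished.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only: sums an array by splitting it
// into chunks (data decomposition) and running each chunk as a work item
// on the shared ThreadPool.
public static class ChunkedSum
{
    public static long Sum(int[] data, int chunkCount)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (chunkCount < 1) throw new ArgumentOutOfRangeException("chunkCount");
        if (data.Length == 0) return 0;

        chunkCount = Math.Min(chunkCount, data.Length);
        int chunkSize = (data.Length + chunkCount - 1) / chunkCount; // round up
        long[] partials = new long[chunkCount]; // one slot per task, so no locking
        int pending = chunkCount;               // work items that have not finished

        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            for (int i = 0; i < chunkCount; i++)
            {
                int index = i; // private copy so each closure sees its own value
                ThreadPool.QueueUserWorkItem(delegate(object state)
                {
                    int start = index * chunkSize;
                    int end = Math.Min(start + chunkSize, data.Length);
                    long sum = 0;
                    for (int j = start; j < end; j++) sum += data[j];
                    partials[index] = sum;

                    // The last work item to finish wakes up the caller.
                    if (Interlocked.Decrement(ref pending) == 0) done.Set();
                });
            }
            done.WaitOne(); // block until every chunk has been summed
        }

        long total = 0;
        foreach (long p in partials) total += p;
        return total;
    }
}
```

A call such as ChunkedSum.Sum(values, 4) would split the work four ways. Because the pool is shared with WCF and the rest of the framework, a small chunk count close to the number of cores is probably a more sensible starting point than a large one.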

After you have your Tasks, a pattern that I ended up using is the Fork/Join Pattern, where a master task forks multiple child tasks (which themselves can also fork child tasks), and each master task subsequently joins with its children at some well-defined point. One other thing that I am considering using is Software Transactional Memory and specifically Ralf's .NET Software Transactional Memory Class. This class and concept make the concurrent code look clean and free of all the gunk.
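
As a rough illustration of the Fork/Join shape described above (again, not the engine's code), the sketch below sums a range of an array recursively. The name ForkJoinSum and the forkDepth parameter are assumptions made for the example. Each level forks the left half onto a child thread, works on the right half itself, and then joins the child before combining the results. It uses dedicated threads rather than the ThreadPool because a parent that blocks in a join while sitting on a pool thread ties up one of the limited pool threads; forkDepth caps how many threads get created, since each one is expensive.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of fork/join; names and parameters are invented.
public static class ForkJoinSum
{
    // Sums data[start..end). While forkDepth > 0 the range is split in two:
    // the left half is forked onto a new thread, the right half runs on the
    // current thread, and the parent joins the child before combining.
    public static long Sum(int[] data, int start, int end, int forkDepth)
    {
        if (forkDepth <= 0 || end - start < 2)
        {
            // Small enough (or deep enough): do the work sequentially.
            long sum = 0;
            for (int i = start; i < end; i++) sum += data[i];
            return sum;
        }

        int mid = start + (end - start) / 2;
        long left = 0;

        Thread child = new Thread(delegate()
        {
            // The child may fork children of its own.
            left = Sum(data, start, mid, forkDepth - 1);
        });
        child.Start();                                   // fork
        long right = Sum(data, mid, end, forkDepth - 1); // parent keeps working
        child.Join();                                    // join at a well-defined point

        return left + right;
    }
}
```

With forkDepth set to 2, a call like ForkJoinSum.Sum(values, 0, values.Length, 2) creates three extra threads and ends up with four sequential leaf computations.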

It's been 2 and 1/2 hours writing this, so I am going to stop here :-).

Now playing School by Supertramp on album Paris (Reissue Remastered)


Posted Sun, Jul 8 2007 5:00 PM by Sam Gentile
