Better multithreading support needed in .NET

This is the kind of post you make and immediately get four comments saying the class you requested is already in the framework…

I’ve been working a lot on multithreaded applications lately. I’m not specifically talking about taking advantage of multi-core CPUs on the client, but it’s been clear for a while now that these are quickly becoming the norm and developers are going to have to adapt.

I’ve mostly been learning as I go, but it’s made me realize that the tools available to developers are lacking. To be honest, I don’t want to have to understand what the heck Greg Young is blogging about – he speaks a different language than I do – I want my tools and my programming language to help me out as much as possible. PLINQ is interesting, but it’s a minor part of the overall picture.

Java seems ahead of .NET when it comes to multithreaded programming. They have true thread-safe collections that use lock-free algorithms. If you’re doing multithreaded programming, do yourself a favor and look up Lock Free Data Structures; you’ll quickly realize that most of what you’ve read on MSDN, as well as Microsoft’s own thread-safe collections, are garbage. I’ve used Julian Bucknall’s collections with quite a bit of success. It’d also be nice to have truly atomic variables in .NET, like those found in Java’s java.util.concurrent.atomic package – there’s something unsettling about playing with the Interlocked class.
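For anyone who hasn't seen the Java package mentioned above, here is a minimal sketch of what an atomic variable looks like in practice. The class and method names (AtomicCounterDemo, countWith, atomicDouble) are made up for illustration; only AtomicInteger and its methods come from the JDK. The second method spells out the compare-and-swap retry loop that lock-free code is built on, which is roughly what you write by hand in .NET with Interlocked.CompareExchange.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    // Hypothetical helper: start `threads` workers that each bump a shared
    // counter `incrementsPerThread` times, then return the final value.
    static int countWith(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementAndGet(); // atomic read-modify-write, no lock
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counter.get();
    }

    // Hypothetical helper: an explicit compare-and-swap retry loop.
    // If another thread changed the value between get() and compareAndSet(),
    // the CAS fails and we simply try again with the fresh value.
    static int atomicDouble(AtomicInteger value) {
        while (true) {
            int current = value.get();
            int next = current * 2;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(countWith(8, 10_000)); // prints 80000
        System.out.println(atomicDouble(new AtomicInteger(21))); // prints 42
    }
}
```

With a plain int and `counter++` instead of the AtomicInteger, the first number would usually come out lower because increments get lost when threads interleave.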

Like Chris Mullins, I wish there was an exposed ThreadPool class. I don’t know if he ever solved his problem of thread starvation, but we’ve also had to write our own implementation because we couldn’t guarantee all of our operations would be asynchronous. Coincidentally, if you’re creating a socket server, there are numerous blog posts on the Conversant blog worth reading.

At the top of my wish list, though, is support for testing multithreaded code. Multithreaded applications might be hard to write, but they are a nightmare to test. It’s easy enough to run multiple threads over your code, but I haven’t yet seen (which doesn’t mean it doesn’t exist) something that lets you transparently test specific race and deadlock conditions. Was a waiting thread properly signaled? Was a deadlock properly resolved? So far my approach has been simply to put huge loads on the application and hope problems surface.
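To make the "was a waiting thread properly signaled?" question concrete, here is a rough sketch of the hand-rolled approach: use latches to force one specific ordering and then assert on the outcome with a timeout. It is written in Java only because the latch class is handy there. Gate is a hypothetical class under test, and SignalTestDemo and waiterIsReleased are made-up names. This is not the transparent tooling wished for above, just the manual version of one such check.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SignalTestDemo {
    // Hypothetical class under test: a one-shot gate built on wait/notifyAll.
    static class Gate {
        private boolean open = false;

        synchronized void await() throws InterruptedException {
            while (!open) {
                wait(); // loop guards against spurious wakeups
            }
        }

        synchronized void open() {
            open = true;
            notifyAll();
        }
    }

    // Returns true only if the waiter stays blocked before open()
    // and is released within timeoutMillis after open().
    static boolean waiterIsReleased(long timeoutMillis) {
        Gate gate = new Gate();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch released = new CountDownLatch(1);

        Thread waiter = new Thread(() -> {
            started.countDown(); // tell the test thread we are running
            try {
                gate.await();
                released.countDown(); // only reached once the gate opens
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.setDaemon(true); // a stuck waiter must not keep the JVM alive
        waiter.start();

        try {
            started.await();
            // The gate is still closed, so the waiter must not get through yet.
            boolean releasedEarly = released.await(50, TimeUnit.MILLISECONDS);
            gate.open();
            boolean releasedAfter = released.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return !releasedEarly && releasedAfter;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(waiterIsReleased(1000));
    }
}
```

The catch is that every ordering you care about needs its own scaffolding like this, and timeouts make the result depend on machine load, which is exactly why better tool support would be welcome.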
 


18 Responses to Better multithreading support needed in .NET

  1. Csharper says:

    When you are talking about programming, you have to be careful about which audience you are addressing. As an application architect, I think issues as complex as utilizing the power of multi-core CPUs most efficiently would be better handled by the virtualization effort put together by products such as VMware. For us in application development, we really do not require lock-free, thread-safe collections and variables. Using them to implement multithreaded applications would unnecessarily increase the effort required for QA and post-release support. I preferred using multiple threads ten years back, when virtualization was still a fun project and not part of the mainstream release architecture. Right now using them would unnecessarily complicate the overall architecture. It’s just like how cool analogue computers were. But dumb digital computers took over anyway.

    Thinking simple is the way to go. The idea is to create a single thoroughly tested unit, without any multithreading code of your own, that runs on a single box. You should have enough communication architecture to support heartbeats and a common communication protocol between these simple units. And just to get more power from your architecture, run as many instances of this single well-tested unit on as many virtual environments as you like. Quick, easy and reliable (given that the virtual environment you have selected is reliable too). I really don’t see any advantage that you gain by having slow-performing thread-safe collections and variables for the purpose of application programming. Hence, I do agree with a lot of people who have commented that you are not using the right tool for the job. Think simple, just like digital computers and not like the analogue computers. Otherwise you will soon find your skills extinct.

  2. Rafaelsc says:

    The October issue of MSDN Magazine is dedicated to parallelism and concurrency.
    http://msdn.microsoft.com/msdnmag/issues/07/10/default.aspx

    There is a new concurrency library for .NET called the “Parallel FX Library” (not released yet). (I think this is a new name for Microsoft CCR.)

  3. shebert says:

    Thanks guys for the followup. This is really interesting stuff and I’d love to hear more! The more details the better! :)

    To my second point – after a little more digging it’s not as critical as I thought. The effective CAS function is using InterlockedExchange(…), which is essentially a threading primitive that ensures atomicity in the address change. The OS scheduler should not abandon the thread during the exchange, and even if it did, the other threads waiting on it would effectively have a lower scheduler priority until the running thread completes the exchange (that’s my read anyway). Cool!

  4. karl says:

    Greg, I don’t mind at all. I’d actually appreciate any follow-up blog posts (even if they are detailed 😉 ) about the topic.

  5. Greg says:

    btw: Steve, for the first one … a check in/out with optimistic concurrency … or following the command pattern (UOW applies commands) with optimistic concurrency is probably your best bet.

  6. Greg says:

    Steve: I can put up some blog posts in *great detail* on the performance characteristics of lock-free (ie lock free list based) and hey-I-don’t-give-a-crap-if-its-slightly-off (ex: lock free hash tables being used as a heuristic) data structures.

    The big issues with lock-free structures are twofold. The first is that they are inherently non-deterministic (infinite loops etc). There is no assurance you save anything from a context switch, as it may still occur. The second is that starvation is a huge concern, as they are non-deterministically unfair (one giant race condition where performance is probabilistic).

    I’d be *very* careful about using lock-free structures in your code aside from (wait-free) immutable structures. My benchmarks generally favor lock-free when I hit about 6-8 processors…

    Sorry for hijacking Karl :)

  7. karl says:

    Steve:
    I don’t have an answer to your first question (how to do it without locking).

    As for the second one, I can just say that I’ve observed improvements moving to a lock-free data structure in specific cases (from having 200+ queued-up requests under heavy load to having very few). I do see what you mean, and I can only guess that the result will depend on the situation.

  8. Steve says:

    Hi Karl,

    Lock-free (as opposed to wait-free) synchronization is pretty interesting but has always been puzzling to me. It would be interesting to see an article that goes beyond the individual data structures to see how this can be applied.

    For example – I find it to be a far bigger problem to make cross-collection operations atomic than modifications to single collections. In the case where I have 4 collections (named A, B, C and D) and two UnitOfWork functions operating on these:

    UoW1: A->B->C->D
    UoW2: C->B

    How do I ensure that the end result is deterministic regardless of the overlap of these two processes without using locks?

    The other question I have revolves around the core question of “what do we really gain by this?” It’s illustrated by Bucknall’s LockFreeQueue class in the “while(! UpdatedNewLink)” loop.

    Depending on where (i.e. which core) the other updating thread is running, that routine burns processor cycles until either (1) another core happens to be executing the other updating thread and it completes or (2) the OS scheduler eventually moves on to another thread. The lock free approach relieves the cost of context switching, but to what end?

    Maybe I’m missing something here, but it’s something that has always left me scratching my head with this topic.

  9. Greg says:

    btw: LOL

  10. Greg says:

    mc# is very cool….

    Have you looked at Annex F of ECMA-335? There is some cool stuff coming, like auto-threaded loops etc.

  11. karl says:

    I guess I’ll have to look over MbUnit again.

    Brian, thanks for that. I knew about Erlang but had forgotten the name and had wanted to look it up to learn more. I can’t agree though that Erlang is the right tool for the job; not that I doubt the language, just our ability to adequately staff in it.

  12. Steve Dunn says:

    Software Transactional Memory looks good, although I haven’t used it on any projects yet (http://research.microsoft.com/research/downloads/Details/6cfc842d-1c16-4739-afaf-edb35f544384/Details.aspx).
    Testing of race conditions and deadlocks can be achieved using the thread features of MbUnit.
    Cheers,
    Steve

  13. Brian says:

    It sounds like you might be using the wrong tool for the job. I haven’t used it myself, but I have been hearing a lot of good stuff about Erlang. You might want to check it out.

    http://www.pragmaticprogrammer.com/titles/jaerlang/

  14. James Kovacs says:

    I agree that we need better multi-threaded support in our programming languages and frameworks given the coming wave of multi-core architectures, which we are just starting to see. You point out the lack of thread-safe, lock-free collections and variables as problems in .NET. Honestly I believe these are the least of our worries. Synchronizing at the level of the individual collection or variable doesn’t help us much. You’re synchronizing at the wrong level. The real multi-threaded challenge is that you need to synchronize user-meaningful operations, which often involve more than a single data structure. Once you are dealing with more than a single data structure, low-level locking is just overhead. (I’m assuming that you’re performing appropriate higher-level locking.) If you have a thread-safe hashtable and linked list, you can make no assurances about algorithms that use both being thread-safe.

    So yes, we need better support in our programming languages and frameworks with respect to multi-threading, but not at the level of collections and variables typically. Technologies like PLINQ and software transactional memory hold more promise, IMHO.

  15. test says:

    http://www.mcsharp.net/

    The MC# programming language is an extension of the C# language and is based on the .NET platform. This language is an adaptation of the basic idea of the Polyphonic C# language for the case of multi-threaded distributed computations.

  16. Scott says:

    Good points.

    I could swear I’ve read about ThreadPool improvements in 3.5 (namely the ability to create multiple pools), but I’ll be darned if I can find it right now…

  17. Tim B says:

    With the introduction of multicore chips and all that software out there that can’t leverage them, I’ve started thinking that maybe the time has come to push all threading responsibilities down into the OS layer. I have absolutely no idea how that would work out, but I imagine many of the same arguments against it were also made against managed code 10 years ago. :)