CodeBetter.Com

DevTeach session materials posted

My session materials are posted here on my Google Code repository.

 

Blackbelt configuration for new projects
Jeffrey Palermo - ARC439
Any architect knows the challenges of setting up configuration management for a new project. Architecture isn't just for the application. The manner in which source control, dependencies, and the Visual Studio solution are set up can have a profound impact on the productivity of the team. In this session, we'll set up a source control repository, a VS.Net solution and a build script to enable a team to move quickly on the project. We'll use advanced techniques to reduce friction while working with the code base on a day-to-day basis.

 

 

ASP.NET MVC Framework Submersion
Jeffrey Palermo - NET328
The move from ASP 3.0 to ASP.Net was a very dramatic move, and it forced developers to learn a completely new way of building web applications on Windows servers. From Web projects with v1.1 to websites in v2.0 and then web application projects in v2.0+, working with ASP.Net can be more difficult than necessary due to viewstate, postbacks and the control lifecycle for post-back eventing. Microsoft is providing an extension to ASP.NET to provide an easy way to implement the Model-View-Controller pattern using ASPX as a view engine (templating). With all presentation logic residing in the Controller, the View (ASPX) is left to concentrate on what it does best: rendering HTML. This new MVC framework is pluggable and testable and even allows for Controller classes to be created with your IoC container of choice. This presentation will include a primer on programming with the MVC pattern and will also cover unit testing controllers and creating controllers that use dependency injection.

Silverlight Consuming REST Services

I just finished writing the first draft of a sample I am including in my upcoming book tentatively titled Data Access with Silverlight 2 by O'Reilly. Without giving too much away yet since the final details of the contract are not set in stone, the application example consumes a REST service, manipulates it through LINQ to XML, and binds it to various controls and some composite controls. The interaction with the REST (REpresentational State Transfer) services is pretty slick and quite easy when using Silverlight and LINQ to XML. Of course there are always issues to deal with, but overall it works very nicely.

Why use REST? Well, REST services are becoming more abundant on the web. They do not expose a contract like WCF, so when you deal with this type of data you can parse the XML using LINQ to XML or some other XML tools (though LINQ to XML is so smooth, why bother with anything else in this case?). So this raw XML comes barreling into your Silverlight application asynchronously, LINQ to XML makes it fall in line, and it's bound to where it needs to go via XAML.
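To make the "LINQ to XML makes it fall in line" step concrete, here is a minimal sketch in F# (the language used for the samples further down this page). The payload shape and the names `payload`, `xn` and `parseBooks` are assumptions made up for illustration; the asynchronous download and the XAML binding are left out.

```fsharp
open System.Xml.Linq

// Hypothetical payload standing in for the body of a REST response.
let payload =
    """<books><book id="1"><title>First</title></book><book id="2"><title>Second</title></book></books>"""

// Small helper so element and attribute names read cleanly.
let xn (name: string) = XName.Get name

// Parse the raw XML and project each <book> element into an (id, title) pair.
let parseBooks (xml: string) =
    XDocument.Parse(xml).Descendants(xn "book")
    |> Seq.map (fun book ->
        (int (book.Attribute(xn "id").Value), book.Element(xn "title").Value))
    |> Seq.toList

let books = parseBooks payload
```

The resulting list of plain values is the kind of thing that could then be handed to a data-bound control.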

Sending data back via REST is also very cool. I've got that working now too. I have to be careful not to go overboard fine-tuning the examples though, or the book will never get written :) Interacting with REST from Silverlight applications is just one piece of the data access puzzle, but it's pretty cool.

 

Cross posted from johnpapa.net

devTeach: Strategic Domain Driven Design

Here is Dave Laribee's session on Strategic Domain Driven Design from devTeach.

 

 

 

I have always admired the simplicity and art of Dave's slides; this time I particularly enjoyed the Mr. Rogers slide!

 

Of course Dave provides some awesome technical content as well as good artistic taste; enjoy!

devTeach after thoughts

Just got home from the airport; what a trip. I flew out Tuesday night on a red eye, arriving at about 6:30 am in Toronto (on my way out I listened to the new altdotnet podcast; this is great stuff!). I spent all of Wednesday in a barely-able-to-walk-because-I-was-so-tired state, popping in and out of various sessions and hanging out with many of the great people at the conference.

Today I checked out a few sessions including Owen's continuous integration and Dave Laribee's Strategic Domain Driven Design (I took videos of my session and Dave's session and will be placing them up here in the next few days). I really wanted to see two of Oren's talks (DSLs and advanced DI) but was unable to make either. I also wanted to see JP's DDD session but it was unfortunately scheduled at the same time as my talk so I was unable to make it.

On a side note, I would really love it if at one of these conferences we could do a bunch of sessions like this one after another (or a pre/post con tutorial) to jump start people who are not familiar with the subject (drop me a line if something like a pre/post con tutorial would interest you, as I am curious about the level of interest). As an example, at devTeach we could have come close to this with some small schedule changes, as the sessions could almost be seen as a progression of JP->Dave->Me with each session adding to the knowledge learned in the session before. If we were to add in another 2-3 sessions there could be a huge amount of effective knowledge transfer. Maybe I will look at following up on this idea for devTeach Montreal, but there are actually 2 of these tracks coming up!

Unfortunately the two tracks are both in the same week in November :( One being Oredev in Sweden and the other being QCon SF. To be honest I don't know how Eric does it; it sounds insane to me to be presenting in Sweden and in SF within 2 days... You may notice that there is also an alt.net track listed for QCon ... cool stuff! QCon is the best/highest level conference I have ever been to, and it will be great to have a good showing from the .NET community.

As always, the discussions outside of the sessions were great. When you get so many smart people in one place good things are bound to happen!

JR and crew did a great job (as always) with the conference. Setup was good, especially the ability to find power for my laptop. I heard there were problems with the wireless early in the conference but my experience with it was great.

DevConnections Las Vegas - Nov 2008

I'll be speaking at DevConnections in Las Vegas this Fall along with others such as Julie Lerman and Paul Litwin. The 3 topics I will be presenting are:

  • Data Access with Silverlight 2
  • Integrating Enterprise Library's Data Access Application Block with your Project
  • Practical Strategies with the Entity Framework


The past few events have been awesome and getting stronger and more fun each time. If you are planning on attending the conference in the Fall, please stop by and say hi.

My upcoming book tentatively titled Data Access with Silverlight 2 should almost be available by the conference. I am targeting December, but I am hoping I can pull it off for November.

 

Cross posted from johnpapa.net

Why do we Refactor?

I've been refactoring the StructureMap codebase in preparation for wrapping up the 2.5 release.  I've been doing this work with several goals in mind.

  1. Experimentation.  I'm purposely revisiting some code just to see if I can come up with a better solution and structure.
  2. Improving the structural quality of the code in order to make the code more approachable to others.
  3. Some of the code is flat out embarrassing.
  4. Changing the structure to allow for easier extensions.  There are a couple of specific new pieces of functionality that I thought would be easier to implement if I did some refactoring first.
  5. To more closely align the architecture with its functionality and usage.  StructureMap was originally envisioned as a generic mechanism to build an object graph from some sort of Xml representation.  Today, we do vastly more things with an IoC tool than I had in mind back in the summer of 2003.

 

As the effort proceeds, I've been thinking about the reasons that we do refactoring, or more importantly, the reasons that justify refactoring.  When the topic of refactoring as a described practice was first being broached earlier in this decade, I read a lot of people upset over the idea of refactoring.  Refactoring was variously described as:

  • A result of insufficient upfront design (which I still think is silly because refactoring is often a result of doing upfront design too early and/or getting that upfront design wrong)
  • Undisciplined hacking
  • A dangerous activity that needlessly risked destabilizing working code (a very real fear that can nevertheless be mitigated by good automated test coverage with well written code and well written tests)
  • A waste of resources.  Useless goldplating, or as my buddy Chad Myers would say, "polishing the nosecone."

That being said, what are the responsible reasons that lead us to refactor our code?  I can think of two major reasons.

  1. To remedy a deficiency in code, design, or architectural quality.
    1. We shouldn't declare a coding task complete until it reaches the quality demanded by the team.  These refactorings are generally small, cheap items like renaming methods and extracting methods that can be performed safely and quickly with an automated tool.
    2. I'm doing a lot of work on StructureMap as a result of static analysis results from NDepend.  NDepend helpfully pointed out some flaws in the class structure that I've since remedied.  Specifically, I've been working on reducing the complexity of any class with a high Cyclomatic Complexity (all the classes are now under 30 with a few exceptions that I'm ignoring because the methods are all simple).  I would also worry about efferent and afferent coupling plus the size of the classes and methods in your system.
  2. To make a forthcoming change to the code easier. 
    1. If I'm changing a big method or complex class I'll often start by refactoring towards a Composed Method by extracting methods or even smaller, more focused classes.  On a project last year I did some quick renaming of variables from "m0, m1, m3" to more descriptive names.  My goal in these cases is to simply make the existing code easier to understand *before* I start to make modifications.
    2. For a new feature, I might need part of the functionality of an existing method or a class, but not the rest of it.  Rather than duplicate that functionality by copying and pasting, I'll extract the functionality to be shared into a new class or method so that it can be easily reused by the code for the new feature.
    3. In several cases, I've wanted to change one aspect of the behavior of an existing class while leaving the rest of the class's behavior intact.  In that case, I think the class is violating the Single Responsibility Principle, and I've split the class up along responsibility lines to isolate the changes to a smaller surface area. 
    4. Before making a change, you may need to decrease or even eliminate coupling on an existing code structure.  In StructureMap, I've frequently found myself wanting to change the internal structure of a class for various reasons, but first being blocked because other classes are coupled to internal details of the existing code structure.  In many cases, the culprit has been a Law of Demeter violation. 
    5. Introducing new abstractions often gives us a new seam in the class structure that we can exploit to add new behavior with minimal change to existing classes
    6. If we have to change any functionality that has duplicated representations in the code, it's often worth the while to first refactor the code to eliminate that duplication before making the change.
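As a small illustration of the "refactor towards a Composed Method before changing it" item above, here is a minimal sketch in F#. The `Order` type, the pricing rules and every function name are hypothetical, invented only to show the shape of the refactoring; nothing here comes from StructureMap.

```fsharp
// Hypothetical order type, used only for this illustration.
type Order = { Quantity: int; UnitPrice: decimal; IsRush: bool }

// Before: one function mixes the subtotal, discount and shipping rules.
let totalBefore (order: Order) =
    let subtotal = decimal order.Quantity * order.UnitPrice
    let discounted = if order.Quantity >= 10 then subtotal * 0.9m else subtotal
    if order.IsRush then discounted + 15m else discounted + 5m

// After: each step is extracted and named, so the top-level
// function reads as a summary of the steps it composes.
let subtotal (order: Order) = decimal order.Quantity * order.UnitPrice

let applyVolumeDiscount (order: Order) (amount: decimal) =
    if order.Quantity >= 10 then amount * 0.9m else amount

let addShipping (order: Order) (amount: decimal) =
    if order.IsRush then amount + 15m else amount + 5m

let totalAfter (order: Order) =
    subtotal order |> applyVolumeDiscount order |> addShipping order
```

The behavior is unchanged, which is the point: a later change to, say, the shipping rule now touches one small function instead of the whole calculation.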

 

Either way, your main goal with refactoring is to make the code easier to work with in the future.  In some cases (the second category), that desired change is immediate.  In the first category, you're simply keeping the code cleaner to handle any type of extension and make the system easier to understand.  In terms of Lean thinking, you only refactor on a project when the proposed refactoring will lead to better throughput of the development work.  If a structural problem in the code isn't causing any immediate harm or friction, you probably leave it alone for now.

 

Any thoughts?  What did I miss?

Concurrency with MPI in .NET

In my previous post, I looked at some of the options we have for concurrent programming in .NET applications.  One of the more interesting, if specialized, options is the Message Passing Interface (MPI).  Microsoft took the initiative to get into the high performance computing space with the Windows Server 2003 Compute Cluster Server SKU.  This allowed developers to run their algorithms using MPI on a massively parallel scale.  And now with the Windows Server 2008 HPC SKU, it is a bit improved with WCF support for scheduling and such.  If you're not part of the beta and are interested, I'd urge you to go through Microsoft Connect.

When Is It Appropriate?

When I'm talking about MPI, I'm talking in the context of High Performance Computing.  This consists of having the application run within a scheduler on a compute cluster which can have tens or hundreds of nodes.  Note that I'm not talking about grid computing such as Folding@Home, which distributes work over the internet.  Instead, you'll find plenty of need for this in the financial sector, in the insurance sector for fraud detection and data analysis, in the manufacturing sector for testing and calculating limits, thresholds and whatnot, and even in rendering computer animation for film.  There are plenty of other scenarios out there, but it's not for your everyday business application.

I think the real value with .NET comes from being able to read from databases and communicate with other servers through WCF or some other communication protocol, instead of being stuck in the C or Fortran world to which the HPC market has been relegated.  Developers can cut down on the code necessary for a lot of these applications by using the built-in functions that we get with the BCL.

MPI in .NET

The problem has been that running these massively parallel algorithms left us limited to Fortran and C systems.  This was OK for most things that you would want to do, but cobbling together class libraries wasn't my ideal.  Instead, we could use a lot of the things that we take for granted in .NET, such as strong types and object oriented and functional programming constructs.

The Boost libraries were made available for MPI in C++ very recently by Indiana University.  You can read more about it here.  This allowed the MPI programmer to take advantage of many of the C++ constructs that you can't use in regular C, such as OOP.  Instead of dealing with functions and structs, there is a full object model for dealing with messaging.

At the same time as the Boost C++ Libraries for MPI were coming out, a .NET implementation based upon the C++ design was made available through MPI.NET.  It's basically a thin veneer over msmpi.dll, which is Microsoft's MPI implementation, itself based on MPICH2.  For a list of all operation types supported, check the API documentation here for the raw MSMPI implementation.  This will give you a better sense of the capabilities than the .NET implementation can.

The way to think of this is that several nodes will each be running an instance of your program at once.  So, if you have 16 nodes assigned through your scheduled job, it will spin up 16 instances of the same application.  When you do this on a test machine, you'll notice 16 instances of it in your task manager.  Kind of cool actually.  Unfortunately, MPI.NET is missing a lot of the neat features in MPI, including "Ready Sends" and "Buffered Sends", but it does include nice things such as the Graph and Cartesian communicators, which are essential in MPI.

You'll need the Windows Server 2003/2008 HPC SDK in order to run these examples, so download it now, and then install MPI.NET to follow along.

Messaging Patterns

With this, we have a few messaging patterns available to us.  MPI.NET has given us a few that we will be looking at, along with how best to use them.  I'll include samples in F# as it's pretty easy to do, and I'm trying to get across the point that F# is a better language than C# for expressing the messaging we're doing.  But these simple examples are not hard to switch back and forth between the two.

To execute these, just type the following:

mpiexec -n <Number of Nodes You Want> <Your program exe>

Broadcast

A broadcast is an operation in which a single process (such as the head node) sends the same data to all nodes in the cluster.  We want to be as efficient as possible when sending out this data for all to use, without having to loop through all sends and receives.  This is good when a particular root node has a value that the rest of the cluster needs before continuing.  Below is a quick example in which the head node sets the value to 42 and the rest will receive it.

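#light

#r "D:\Program Files\MPI.NET\Lib\MPI.dll"

open System
open MPI

let main(args:string[]) =
  // MPI.Environment sets up MPI for the lifetime of the using block
  using(new Environment(ref args))(fun _->
    let commRank = Communicator.world.Rank

    // Only the root (rank 0) sets the value before the broadcast
    let intValue = ref 0
    if commRank = 0 then
      intValue := 42
     
    // After this call every process holds the root's value
    Communicator.world.Broadcast(intValue, 0)
    Console.WriteLine("Broadcasted {0} to all nodes", !intValue)
  )
// Qualified because `open MPI` also brings an Environment type into scope
main(System.Environment.GetCommandLineArgs())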

Blocking Send and Receive

In this scenario, we're going to use the blocking send and receive pattern.  This will not allow the program to continue until I get the message I'm looking for.  This is good for times when you need a particular value before proceeding to your next function from the head node or any other particular node.

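#light

#r "D:\Program Files\MPI.NET\Lib\MPI.dll"

open System
open MPI

let main (args:string[]) =
  using(new Environment(ref args))( fun _ ->
    let commRank = Communicator.world.Rank
    let commSize = Communicator.world.Size
    let intValue = ref 0
    match commRank with
    | 0 ->
      // Head node: block on one receive per other process, from any source and any tag
      [1 .. (commSize - 1)] |> List.iter (fun i ->
        Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
        Console.WriteLine("Result: {0}", !intValue))
    | _ ->
      // Every other process computes its value and sends it to rank 0 with tag 0
      intValue := 4 * commRank
      Communicator.world.Send(!intValue,0, 0)
  )

// Qualified because `open MPI` also brings an Environment type into scope
main(System.Environment.GetCommandLineArgs())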

What I'm doing here is letting the head node, rank 0, do all the receiving work.  Note that I don't particularly care where the source was, nor what the tag was.  I can, however, specify that I wish to receive only from a certain node and with a certain data tag.  If it's a slave process, then I'm going to calculate the value and send it back to the head node, rank 0.  The head node will wait until it has received that value from any node and then print it out.  The methods that I'm using for send and receive are generic methods.  Behind the scenes, in order to send, the system will serialize your object into an unmanaged memory stream and throw it on the wire.  This is one of the fun issues when dealing with marshaling to unmanaged C code.

Nonblocking Send and Receive

In this scenario, we are not going to block on sending as we did before.  We want the ability to continue doing other things while the value is being sent, even though the receivers might need that value before continuing.  Eventually we can force completion of the send through the returned communication status, and at a certain point we can set up a barrier so that nobody can continue until every process has hit that point in the program.  The sample below sends a multiplied value without blocking and lets the sender continue.  The head node waits for each message to arrive, and then everyone waits at the barrier until the job is done.

let main (args:string[]) =
  using(new Environment(ref args))( fun _ ->
    let commRank = Communicator.world.Rank
    let commSize = Communicator.world.Size
   
    let intValue = ref 0
    if commRank = 0 then
      [1 .. (commSize - 1)] |> List.iter (fun _ ->
        Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
        Console.WriteLine("Result: {0}", !intValue))
    else
      intValue := 4 * commRank
      let status = Communicator.world.ImmediateSend(!intValue,0, 0)
      status.Wait() |> ignore
     
    Communicator.world.Barrier()
  )
 
main(System.Environment.GetCommandLineArgs())

Gather and Scatter

The gather operation takes a value from each process and sends them all to the root process as an array for evaluation.  This is a pretty simple operation for taking all values from all nodes and combining them on the head node.  What I'm doing is a simple calculation: gathering the value commRank * 3 from every process and sending the results to the head node for evaluation.

let main (args:string[]) =
  using(new Environment(ref args))( fun e ->
    let commRank = Communicator.world.Rank
    let intValue = commRank * 3
   
    match commRank with
    | 0 ->
      let ranks = Communicator.world.Gather(intValue, commRank)
      ranks |> Array.iter(fun i -> System.Console.WriteLine(" {0}", i))
    | _ -> Communicator.world.Gather(intValue, 0) |> ignore
  )
 
main(System.Environment.GetCommandLineArgs())

Conversely, scatter does the opposite: it takes an array from the given head process and splits it apart, handing one element to each process.  In this exercise I will create a mutable array that only the head node will modify.  From there, I will scatter it across the rest of the nodes to pick up and do with whatever they please.

let main (args:string[]) =
  using(new Environment(ref args))( fun e ->
    let commSize = Communicator.world.Size
    let commRank = Communicator.world.Rank
    let mutable table = Array.create commSize 0
   
    match commRank with
    | 0 ->
      table <- Array.init commSize (fun i -> i * 3)
      Communicator.world.Scatter(table, 0) |> ignore
    | _ ->
      let scatterValue = Communicator.world.Scatter(table, 0)
      Console.WriteLine("Scattered {0}", scatterValue)
  )
 
main(System.Environment.GetCommandLineArgs())

There is an AllGather method as well which performs a similar operation to Gather, but the results are available to all processes instead of only the root process.
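To pin down who ends up holding what, here is a plain F# model of the three collectives. It makes no MPI.NET calls and is only a sketch of the data movement: index i of each array stands for the process with rank i, and the function names (`gatherModel`, `allGatherModel`, `scatterModel`) are made up for this illustration.

```fsharp
// Gather: every rank contributes one value; only the root ends up with the array.
let gatherModel (root: int) (valuePerRank: int[]) =
    valuePerRank
    |> Array.mapi (fun rank _ ->
        if rank = root then Some (Array.copy valuePerRank) else None)

// AllGather: same collection step, but every rank ends up with the full array.
let allGatherModel (valuePerRank: int[]) =
    valuePerRank |> Array.map (fun _ -> Array.copy valuePerRank)

// Scatter: the root holds one element per rank; rank i receives element i.
let scatterModel (rootTable: int[]) =
    rootTable |> Array.mapi (fun rank _ -> rootTable.[rank])

// Four processes, each contributing rank * 3 as in the samples above.
let contributions = Array.init 4 (fun rank -> rank * 3)

let gathered = gatherModel 0 contributions      // only rank 0 holds 0, 3, 6, 9
let allGathered = allGatherModel contributions  // every rank holds 0, 3, 6, 9
let scattered = scatterModel contributions      // rank i holds i * 3
```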

Reduce

Another collective algorithm similar to scatter and gather is the reduce function.  This allows us to combine the values from all processes by performing an operation on them, whether it be to add, multiply, find the maximum, the minimum and so on.  The value is only available at the root process though, so I have to ignore the result for the rest of the processes.  The following example shows a simple sum of the ranks of all processes.

let main (args:string[]) =
  using(new Environment(ref args))( fun _ ->
    let commRank = Communicator.world.Rank
    let commSize = Communicator.world.Size
   
    match commRank with
    | 0 ->
      // Root: receives the combined value of every process's rank
      let sum = Communicator.world.Reduce(Communicator.world.Rank, Operation<int>.Add, 0)
      Console.WriteLine("Sum of all ranks is {0}", sum)
    | _ ->
      // Non-root processes take part in the reduction but discard the result
      Communicator.world.Reduce(Communicator.world.Rank, Operation<int>.Add, 0) |> ignore
  )
 
// Qualified because `open MPI` also brings an Environment type into scope
main(System.Environment.GetCommandLineArgs())

There is another variation called AllReduce, which performs the same kind of operation as Reduce but makes the value available to all processes instead of just the root one.  There are more operations and more communicators such as Graph and Cartesian, but this is enough to give you an idea of what you can do here.
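The difference between the two can be modelled in a few lines of plain F#. As before, this is only a sketch with no MPI.NET calls: index i of each array stands for rank i, and `reduceModel` and `allReduceModel` are names invented for the illustration.

```fsharp
// Reduce: combine every rank's value with op; only the root sees the result.
let reduceModel (op: int -> int -> int) (root: int) (valuePerRank: int[]) =
    let combined = Array.reduce op valuePerRank
    valuePerRank
    |> Array.mapi (fun rank _ -> if rank = root then Some combined else None)

// AllReduce: same combination, but every rank sees the result.
let allReduceModel (op: int -> int -> int) (valuePerRank: int[]) =
    let combined = Array.reduce op valuePerRank
    valuePerRank |> Array.map (fun _ -> combined)

// Four processes, each contributing its own rank as in the sample above.
let ranks = [| 0 .. 3 |]

let summedAtRoot = reduceModel (+) 0 ranks   // rank 0 holds 0 + 1 + 2 + 3 = 6
let maxEverywhere = allReduceModel max ranks // every rank holds 3
```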

LINQ for MPI.NET

During my search for MPI.NET solutions, I came across a rather interesting one called LINQ for MPI.NET.  I don't know too many of the details, since the author has been pretty aloof about providing the complete design details.  But it has entered a private beta if you do wish to contact them for more information.

The basic idea is to provide some scope models, which include the current scope, the world scope, root and so on.  Also, it looks like they are providing some sort of multi-threading capabilities as well.  Looks interesting and I'm interested in finding out more.

Pure MPI.NET?

Another implementation of MPI in .NET has surfaced through PureMPI.NET.   This is an implementation of the MPICH2 specification as well, but built on WCF instead of MSMPI.dll.  It does not rely on the Microsoft Compute Cluster service for scheduling and instead uses remoting and such for communication purposes.  There is a CodeProject article which explains it a bit more here.

More Resources

So, you want to know more, huh?  Well, most of the interesting information is out there in C, so if you can read and translate it to the other APIs, you should be fine.  However, there are some good books on the subject which not only provide some decent samples, but also some guidance on how to make the most of the MPI implementation.  Below are some of the basic ones which will help on learning not only the APIs, but the patterns behind their usage.


Wrapping It Up

I hope you found some of this useful for learning how MPI can help with massively parallel applications.  The patterns learned here, as well as the technologies behind them, are pretty powerful in helping you think about how to make your programs a bit less linear in nature.  There is more to come in this series on concurrency in .NET, so I hope you stay tuned.
