One of the small costs of transparency

If you’ve been following, you’ll have noticed a flurry of activity over the last couple of weeks, culminating in the release of Web API preview 5.  If you’ve been following our CodePlex repository for a while, you’ll also have noticed that this flurry was preceded by months of inactivity.  This wasn’t because we were sitting around waiting for the code to update itself.  We had wanted to update the repo for a while.  The problem was that the process for updating the code was hard and entirely manual, which also made it very error prone.

This is largely because our source code is completely self-contained in its internal repository.  Usually, you would think of self-containment as a good thing.  However, when you want to share your code with others (on CodePlex, for example), this property becomes a major problem.  Code in our repository takes zero dependencies on anything outside of that repository, and that includes things like .NET Framework assemblies.  Additionally, our build files contain all sorts of includes for custom tools, custom build targets and properties, and so on.  As a result, getting code out of our repository and then updating it so that it will actually build elsewhere has been a non-trivial task.  Until now.

I went through this manual process once and quickly decided that I would never do it again.  I already had an interest in PowerShell, so I decided to kill multiple birds with one stone and automate the process using PowerShell.  Additionally, because one of the major steps in the automation workflow is transforming the project files to remove internal assembly and tool references, I called into some XSLT from my PowerShell script.

At a high level, the workflow looks as follows:

  1. Sync our product code from our internal repository
  2. Build our product code (this will generate needed files that rely on internal build tools)
  3. Create a new clone of the CodePlex repository
  4. In the CodePlex clone, delete everything in the source and test folders (this makes it easier to identify deleted files in the hg repo when committing)
  5. Copy all source and test files from our internal repo to our hg clone
  6. Run XSLT over all *.csproj files in the hg repository to remove all internal references
  7. Run RegEx over all AssemblyInfo.cs files to strip the delay-signing (strong name) info from InternalsVisibleTo attributes
  8. Run RegEx to clean up some assembly strong name references in some test config files
  9. Open the solution in Visual Studio and make sure that all projects build and that all tests run
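
Steps 7 and 8 are essentially a targeted find-and-replace. The sketch below illustrates the idea in Python rather than PowerShell, and it is not the script used for the actual sync. The pattern, the function names, and the assumption that the key appears as a single `PublicKey=<hex>` suffix inside the attribute string are all illustrative guesses about what an AssemblyInfo.cs file looks like.

```python
import re
from pathlib import Path

# Assumed shape of the input:
#   [assembly: InternalsVisibleTo("Some.Assembly, PublicKey=0024...")]
# Group 1 keeps everything up to the friend assembly name, group 2 keeps the
# closing quote and parenthesis; the ", PublicKey=<hex>" part in between is dropped.
IVT_PATTERN = re.compile(
    r'(InternalsVisibleTo\(\s*"[^",]+?)\s*,\s*PublicKey\s*=\s*[0-9A-Fa-f]+\s*("\s*\))'
)


def strip_ivt_public_key(source: str) -> str:
    """Return source with the PublicKey suffix removed from InternalsVisibleTo."""
    return IVT_PATTERN.sub(r"\1\2", source)


def clean_assembly_info_files(root: Path) -> int:
    """Rewrite every AssemblyInfo.cs under root; return how many files changed."""
    changed = 0
    for path in root.rglob("AssemblyInfo.cs"):
        # Work on bytes so that line endings and any BOM are preserved.
        # Assumption: the files are UTF-8 encoded.
        original = path.read_bytes().decode("utf-8")
        cleaned = strip_ivt_public_key(original)
        if cleaned != original:
            path.write_bytes(cleaned.encode("utf-8"))
            changed += 1
    return changed
```

Attributes that carry no key are left untouched, so running the function twice over the same tree is harmless.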

As you can see, there are a few things going on here – even with the automation – so imagine what the manual process looked like. 

As I mentioned earlier, this automation project gave me an excuse to do something that I had been wanting to do anyway: learn PowerShell.  As such, I want to show you the PowerShell script, in part because you may currently (or one day) be facing a similar challenge, and in (larger) part because I want your feedback on how I can improve the PowerShell (like I said, I was learning).

Anyways, here’s the PowerShell:

Additionally, a big part of the workflow is transforming project files to remove any dependencies on internal libraries and tools.  For this workflow task, XSLT was the right fit – and that XSLT looks like this:
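
As a rough illustration of what such a project-file transformation does, here is a minimal sketch in Python (using the standard library's ElementTree instead of XSLT, and not the stylesheet referred to above). It covers only one case, the one described in the comments below: turning a static file reference to a .NET Framework assembly into a plain reference that resolves from the GAC, by dropping its `HintPath` element. The function names and the rule for deciding what counts as a framework assembly (`mscorlib`, `System`, or a name starting with `System.`) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# XML namespace used by classic MSBuild project files (*.csproj).
MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"


def is_framework_assembly(include: str) -> bool:
    """Hypothetical rule: treat mscorlib, System and System.* as framework assemblies."""
    # Include may be "System.Xml" or "System.Xml, Version=..., PublicKeyToken=...".
    simple_name = include.split(",", 1)[0].strip()
    return simple_name in ("mscorlib", "System") or simple_name.startswith("System.")


def use_gac_references(csproj_xml: str) -> str:
    """Remove HintPath from framework references so they resolve from the GAC."""
    # Serialize MSBuild elements without a prefix (xmlns="..." on the root).
    ET.register_namespace("", MSBUILD_NS)
    root = ET.fromstring(csproj_xml)
    # Materialize the iterator first, because the tree is modified in the loop.
    for reference in list(root.iter(f"{{{MSBUILD_NS}}}Reference")):
        if not is_framework_assembly(reference.get("Include", "")):
            continue  # leave third-party references alone
        for hint in reference.findall(f"{{{MSBUILD_NS}}}HintPath"):
            reference.remove(hint)
    return ET.tostring(root, encoding="unicode")
```

One caveat with this approach: ElementTree discards XML comments and the XML declaration when it round-trips a file, so a real sync would need to check that the rewritten project files are still acceptable. That is one reason a dedicated transformation language such as XSLT can be the more natural fit here.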

I hope that this proves helpful if you ever find yourself having to regularly keep code in sync between two different repositories (or even within the same repository with different folder structures).  And like I said, I welcome feedback on how I can improve my PowerShell.

About Howard Dierking

I like technology...a lot...

  • Anonymous

    I’m not removing any references or IVT attributes.  For the references, I’m simply replacing static file references for .NET Framework assemblies with the GAC equivalent (I’m also updating the path details for project references).  For the IVT attributes (as well as some assembly references in config files), I’m not removing them – just stripping out the strong name information.

    “Why not just always use only codeplex, instead of having two repos?”

    Glad you mentioned that – this is something that’s being floated around the team.  At the moment, the barrier is just inertia – but hopefully this will change over time.

  • Anonymous

    As Justin said, the problem is not CodePlex.  We’re just using CP as a host for Mercurial, the same way we would use a provider like Bitbucket.  The problem is the huge chasm between how code is structured in our internal source control vs. how it needs to be structured in order to simply be sync’d/downloaded and built.

  • Justin Chase

    It sounds like they’d have the same issue with those other sites as well. It’s not CodePlex that’s the problem, it’s all the dependencies they don’t want to push into the public repositories.

  • Justin Chase

    I don’t understand how you can delete references to things and remove InternalsVisibleTo and still have it build and have unit tests pass. That sounds like either that stuff isn’t truly necessary or you’re removing some important stuff. No? Why not just push all that stuff into the repository too? Why not just always use only codeplex, instead of having two repos?

  • Scott White

    Why even use CodePlex if this is what is required?  I mean seriously, these sites like Google Code or GitHub are supposed to be a means to an end.  I tried CodePlex once and I could tell right away it was going to be a pain in the ass and support was weak, so I migrated off a long time ago.