When Surrogate Keys Attack (and validation isn’t there)

I found an interesting bug today in Live Mesh that I wanted to share because I think it illustrates a problem that we see time and time again as we’re building and reviewing systems. I won’t go into all the details about how I stumbled my way into finding this, but here’s the cause (and how you can reproduce it).

First, on your Live Mesh desktop, create a new folder. When you create the folder, take a look at the synchronization settings: you will see that by default, the folder is not set to sync with any of your registered devices. That is not really a big deal, because usually by this point the Mesh client has created that cool faded folder icon on your desktop, and when you double-click that icon, you’re prompted to specify synchronization settings. However, let’s bypass that step for the moment.

[Screenshot 1]

Now, go to the folder on your file system where you want your Mesh folder to be located (My Documents, in my case) and create a new folder with the exact same name.  Then, from that folder’s context menu, select “Add folder to Live Mesh…”.

[Screenshot 2]

Now, flip back to your Live Mesh desktop and you’ll notice that you have two folders with the exact same name, something that’s generally considered a bad idea in file system design…

[Screenshot 3]

Now, just to test whether validation was completely non-existent or whether it was simply being bypassed in my scenario, I went to the Live Mesh desktop and tried creating another folder named “New Folder”.  As you can see, the behavior is correct in this context.

[Screenshot 4]

Also, if you were paying attention, you’ve likely noticed that halfway through the walkthrough, I switched the name of the folder from “Some New Folder” to “New Folder”. This is because part of the problem illustrated here seems to be one of timing (e.g., the client hasn’t yet had a chance to sync up with the server to learn that the new folder has been created). By the time I came around to reproducing the error again (to get screenshots), I wasn’t able to use the same folder names. Nonetheless, even with the folder name switch, you should still be able to get a sense of the bug I’m talking about.

So the moral of the story: as apps become more and more distributed and enable multiple entry points and user experiences, it is more important than ever to get the fundamentals of code organization and architecture right. I have absolutely no knowledge of how the Live Mesh codebase is organized. However, one very real possibility is that there’s no central domain model that enforces basic rules around folder names. This leaves it up to the various teams of developers to enforce basic rules, such as name uniqueness, in each entry point separately, and as we all know, that’s often a recipe for disaster.
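
To make the idea of a central domain model concrete, here is a minimal sketch in Python. It is not how Live Mesh is built; every name in it (FolderCatalog, DuplicateFolderName, and the two entry-point functions) is hypothetical, and treating names as case-insensitive is an assumption borrowed from Windows file systems. The point is only that both entry points go through one object that owns the uniqueness rule, while each folder still gets its own surrogate key.

```python
import uuid


class DuplicateFolderName(Exception):
    """Raised when a folder name is already taken (compared case-insensitively)."""


class FolderCatalog:
    """Hypothetical domain model: the only place where folders get created."""

    def __init__(self):
        self._names_by_id = {}  # surrogate key -> display name
        self._taken = set()     # normalized names already in use

    def add_folder(self, name):
        display_name = name.strip()
        key = display_name.casefold()  # assumption: names are case-insensitive
        if not key:
            raise ValueError("folder name must not be empty")
        if key in self._taken:
            raise DuplicateFolderName(display_name)
        folder_id = uuid.uuid4().hex  # surrogate key, independent of the name
        self._names_by_id[folder_id] = display_name
        self._taken.add(key)
        return folder_id


# Two hypothetical entry points. Neither re-implements the rule;
# both delegate to the same catalog.
def create_from_mesh_desktop(catalog, name):
    return catalog.add_folder(name)


def add_from_shell_menu(catalog, name):
    return catalog.add_folder(name)
```

With this shape, creating "New Folder" from the Mesh desktop and then adding "new folder" from the shell context menu raises DuplicateFolderName on the second call, no matter which team wrote which entry point.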

There’s another potential cause for this behavior, one that I’m going to raise more as a question than as a statement. Because this is an application running in the cloud across multiple, distributed servers, it seems possible that the new folder created on the Live Mesh desktop hadn’t yet been replicated to, or otherwise wasn’t yet available to, the server/service that was talking to the desktop client. I don’t have a great sense of what kinds of circumstances could contribute to this sort of problem, but it seems plausible. So I’ll throw that out as a follow-on question: have any of you run into this kind of back-end synchronization problem, and what did you do to work past it?
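
As a toy illustration of that hypothesis (and nothing more), the Python sketch below models two servers that each validate a new folder name against their own local view only. The Replica class and the merge and duplicate_names helpers are invented for this example and say nothing about how Live Mesh actually replicates data. Each replica's check passes on its own, and the conflict only becomes visible once the two views are merged.

```python
from collections import defaultdict


class Replica:
    """Hypothetical server node that validates against its own local view only."""

    def __init__(self, label):
        self.label = label
        self.folders = {}  # folder id (surrogate key) -> name
        self._next = 0

    def create(self, name):
        if name in self.folders.values():
            raise ValueError(f"{name!r} already exists on replica {self.label}")
        self._next += 1
        folder_id = f"{self.label}-{self._next}"
        self.folders[folder_id] = name
        return folder_id


def merge(*replicas):
    """Combine the replicas' views; the ids are unique, so nothing is overwritten."""
    merged = {}
    for replica in replicas:
        merged.update(replica.folders)
    return merged


def duplicate_names(folders):
    """Return {name: [ids]} for every name used by more than one folder."""
    ids_by_name = defaultdict(list)
    for folder_id, name in folders.items():
        ids_by_name[name].append(folder_id)
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


a, b = Replica("a"), Replica("b")
a.create("New Folder")
b.create("New Folder")  # accepted: b has not seen a's folder yet
print(duplicate_names(merge(a, b)))  # {'New Folder': ['a-1', 'b-1']}
```

If something like this is what happened, validating at creation time cannot be the whole answer; the system also needs a rule for what to do when the duplicate is discovered after the fact.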

So even if the cause for the bug I discovered has to do with some rocket-science-level distributed architectural issue, I think it’s still reason enough to emphasize the need for a solid domain model which encapsulates all of the rules that matter for the application.  Because while many apps don’t have the level of complexity that Live Mesh does, many seem to have this same problem.

About Howard Dierking

I like technology...a lot...
  • jeremy

    yes, every Live Folder is assigned a unique identifier and is accessed via a unique URI. see http://msdn.microsoft.com/en-us/library/dd136539.aspx for exhaustive docs on how this all works.

  • Alex Cavnar

    I am sure there are keys in the background there, like unique IDs in a database table or something. You *see* “New Folder”, but you’re really accessing “Folder with ID of 222741”, or something similar.

    If we were talking about filenames on a filesystem, I’d totally agree. Maybe they need to suffix the name of the folder with the original user who created it, for clarity?

  • Jeremy Mazner (http://blogs.msdn.com/livemesh)

    Hi Howard, two quick comments from the Mesh team.

    First, Mesh is designed to support sharing scenarios. So I might create a “Great Music” folder for myself, you might create a folder with the same title independently, and then some time later invite me to share it with you. We don’t want to block this sharing scenario just because both folders were created with the same name, but that means you can in fact see duplicate names for a Live Folder.

    Second, we designed our platform to support multiple data types, so under the covers we store everything as feeds (think of RSS or Atom). We made a specific design decision to not enforce rules beyond what the existing feed formats specify. There’s no rule in Atom that says two entries can’t have the same title, so we have to allow that. For feeds that are marked as representing file system folders, we have extra app logic that detects when we get into an inconsistent state (duplicate filenames, hierarchy violations like cycles or orphans, etc.) and displays UX to the user about that.

  • Ryan Riley (http://wizardsofsmart.net/author/riles/)

    Hear, hear!