Architecting LINQ To SQL Applications, part 5

Let’s return to the series on how to architect applications that use LINQ To SQL. First of all, for those you who missed it, a summary of where we have been:


Part 1, Introduction
Part 2, Layered Architectures
Part 3, DAOs and Repositories
Part 4, Dynamic Queries


Introduction


I had a couple of requests to provide some sample code. So I delayed this installment to start building something that will work to that purpose as the backbone for these articles. In the example we are modelling a system that allows us to store usernames and passwords for systems we own, just because that is a side project I am working on. It will go up on Google code toward the end of the series. I also had a number of requests to talk more about mapping and TDD, so this installment returns to that territory in more detail.


First, imagine that we have built some domain objects through Test-Driven Development. In our model an ITSystem has a collection of Keys (username and password) to systems and belongs to a Category.


   public class ITSystem
    {
        public Category Category {get;set;}
        public string Comments {get;set;}
        public string Name {get;set; }
        public List<Key> Keys {get;set;}
       
        …
    }

We’ve built our domain model before we create a database schema to store it against. We know we do need to store it, but we want to be malleable until we do.


Aggregates and Repositories 


Within our domain we have repositories, which represent a collection of our objects and negotiates with infrastrtucture services to get that list of objects. We don’t define a repository  for all of our entities, but for the root entities within an aggregate:


“An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes. Each AGGREGATE has a root and a boundary. The boundary defines what is inside the AGGREGATE. The root is a single, specific ENTITY contained in the AGGREGATE. The root is the only member of the AGGREGATE that outside objects are allowed to hold references to, although objects within the boundary may hold references to each other. ENTITIES other than the root have local identity, but that identity needs to be distinguishable only within the AGGREGATE, because no outside object can ever see it out of the context of the root ENTITY” [Evans]


In our example the root is ITSystem and the boundary includes the Keys for that system. We don’t include Category within the boundary. Although an ITSystem has a category it does not manage the lifecycle of a Category. Deleting an ITSystem does not delete the Category it belongs to.  As a corollary we might delete a Category, without deleting the IT System, if, for example, we were re-categorizing our systems.


Within our Category and ITSystem aggregates we can hold references to each other’s root objects, but nothing below them. This does not make a huge difference here, because there are only two aggregates we care about: ITSystem and Category, but the point remains.


When working with an ORM I tend to view the point of crossing an aggregate boundaries as a good candidate for lazy loading. As well as using the collection based lazy loading common in ORMs you might also want to think about using a ghost [Fowler]. Of course beware the select N+1 problem if you know that you are likely to load a collection.


Working in memory 


So we need a repository that allows us to retrieve/persist the ITSystem root and its child Keys. We can write a unit test for that:


    [TestClass]
    public class ITSytemRepositoryFixture
    {
        [TestMethod]
        public void Should_Be_Able_To_Find_All_Systems()
        {
            //setup
            MockRepository mocks = new MockRepository();
            ISession session = mocks.Stub<ISession>();
            session.Systems = CreateSystemList();


            SystemRepository systemRepository = new SystemRepository(session);


            //exercise
            List<ITSystem> systems = systemRepository.FindAll();


            //verify
            AssertSystemListMatches(session, systems);                       
        }
    }


We are not showing the creation method CreateSystemList or the verification method: AssertSystemListMatches here, for simplicity.


The session will end up abstracting DataContext. It is our unit of work and has both the methods we need to submit changes, and contains the collections that we are accessing.


Remember how in part 3 we stated that when we crossed a layer boundary we wanted to depend upon an abstraction and not a concrete type. A proof here is that depending on an abstract type allows us to replace the real data context for unit tests. In the unit test we replace the dependency from the DataContext based repository with a in-memory stub implementation which acts as a Test Double for Db operations.


Note that I am not accessing the Db here, I am just building up the functionality of the repository. Within the implementation we use LINQ to return the result set. Note that this is so simple a case, as an example, that the use of LINQ may seem overcomplex, but we don’t want to distract with complexity here.


My implementation looks like this:


    public class SystemRepository
    {
        private ISession session;


        /// <summary>
        /// Initializes a new instance of the SystemRepository class.
        /// </summary>
        /// <param name=”session”></param>
        public SystemRepository(ISession session)
        {
            this.session = session;
        }


        /// <summary>
        /// Find all the systems
        /// </summary>
        /// <returns></returns>
        public List<ITSystem> FindAll()
        {
            return
                (from s in session.Systems
                    select s)
                    .ToList<ITSystem>();
        }
    }


Mapping 


We can go a long way working with this in-memory representation, building up the queries our repository manages for us against the  domain layer without ever touching peristence to a Db. Because we are not touching the Db as yet we are very malleable. We don’t have to worry about versioning Db schemas. Where there is a lack of resistance to change, we tend to make more effort to keep our code clean.


Of course at some point we want to write the code that will persist our domain model. Again we can drive with tests.


Because these tests will be slow, we are touching the Db, we put them in a seperate assembly and tag them as Slow or Acceptance tests. Developers may want to skip running these tests when confirming their refactoring up until they check-in, so they don’t break the rhythm.


We could have an equivalent test to that above:


    [TestClass]
    public class ITSytemRepositoryFixture: ITSystemTests
    {
        [TestMethod]
        public void Should_Be_Able_To_Find_All_Systems()
        {
            //setup
            ISession session = new KeySafeSession(new DataContext(ConfigurationManager.ConnectionStrings["KeySafe"].ConnectionString));
  
            SystemRepository systemRepository = new SystemRepository(session);


            //exercise
            List<ITSystem> systems = systemRepository.FindAll();


            //verify
            AssertSystemListMatches(session, systems);                       
        }
    }


But note how this time we are using a concrete session implementation initialized from our DataContext.


In order to get our domain classes working with DataContext we have to tell DataContext how to map them. We can add mappings in a seperate mapping file, or as attributes.


Which you prefer is somewhat to personal preference. One tradeoff is between polluting your domain class with mapping information and managing XML files. I have always leaned toward the mapping file, because I like to leave my domain clean of persistence knowldege, others are happy with attributes for what is an orthogonal concern. Here we will show use of attributes.


One thing to point out here though is that as we are building from our domain first, we have no schema to map to at this point. That’s fine we are going to tag our domain classes for persistence, and then generate the Db from them.


First we need to identify the table that our domain class maps to. We simply add the Table attribute to our class to do that. [Table] allows us to map to a different name.


Although there are limitations to mapping (lack of implicit support for many-to-many, single table for inheritance hierachies) working from a clean domain gives us one of the simplest options for using LINQ To SQL as we are defining how to persist our domain, not trying to figure out what domain classes map from an existing schema. Working with a legacy schema with LINQ To SQL will always shape our domain to that schema.


We do not have an inheritance hierachy, so we don’t need to use the [InheritanceMapping] attribute.


We need to add this to all of the classes we wish to persist.


    [Table]
    public class ITSystem


Next we tag those fields that we wish to persist as columns. Again we can rename if we don’t like the domain name.


        [Column]
        public string Comments {get;set;}
        [Column]
        public string Name {get;set; }


Our entities must have an identity, that seperates them from other instances of the class. We need to tag this identity so that we can use it to identify instances of our identity. The Db will use a primary key for this so we tag a column as our primary key.


We can modify the name property of ITSystem as follows:


        [Column(IsPrimaryKey= true)]
        public string Name {get;set; }


Not all of our fields will be primitives. Where they are instances of another class the Db will need to represent this as a foriegn key pointing to a row in a table to which we persist the other class. We can show this mapping by using the attribute [Association]. We need to tell it how to hook up to our child i.e what property of the child holds the parent identifier. Our Key class looks like this at the minute:


    [Table]
    public class Key
    {
        [Column(IsPrimaryKey=true)]
        public string UserName {get;set;}
        [Column]
        public string Password {get;set; }


        public Key() {}


        public Key(string userName, string password)
        {
            UserName = userName;
            Password = password;
        }
    }


So we add a SystemName to Key so that we have the child knows its parent, and in instance of the ITSystem System so we can navigatge the relationship bidiectionally. Now we add an Association attribute to ITSystem to indicate how to map the child collection.


        [Association(OtherKey="SystemName")]
        public List<Key> Keys {get;set;}


And then we can add an Association in Key for the ITSystem so that we can traverse the relationship in both directions.


Note we need to switch to using  EntitySet for the backing storage for the many of side of our associations. We also need to use EntityRef to support the one side of our associations. This supports lazy loading, but pushes a persistence specific class info into our domain. I’m tolerant to this because it can be hidden in the public interface. We only intend to support lazy loading across an aggregate boundary, so from our ITSystem we will lazy load the category, but eager load the Keys. We will use DataLoadOptions to eager load where required. Read the documentation to see the restrictions here. So we end up with this:


        private EntitySet<Key> keys;


        [Association(OtherKey="SystemName")]
        public IList<Key> Keys
        {
            get{return keys;}
            set{keys.Assign(value);}
        }


Where we have an EntityRef<> or EntitySet<> we need to initialize the collection in the constructor, for example:


   category = default(EntityRef<Category>);


If you are too uncomfortable with using EntityRef and EntitySet see also Mathew Charles’s notes on how to work around this restriction:  (http://blogs.msdn.com/digital_ruminations/archive/2007/08/28/linq-to-sql-poco-support.aspx) on supporting Persistence Ignorance with LINQ To SQL.


Building the Db 


Once we have all of our classes mapped out, we can generate a Db to support them. I add a utility console project to do this.


We build a KeySafeContext that looks like this, to help us get generation done. We’ll come back to talk about linking it up to our ISession implementation KeySafeSession.


    public class KeySafeContext : DataContext
    {
        private const string DbName = “KeySafe”;
        public Table<ITSystem> Systems { get; set; }
        public Table<Key> Keys { get; set; }
        public Table<Category> Categories { get; set; }


       
        /// <summary>
        /// Initializes a new instance of the KeySafeContext class.
        /// </summary>
        public KeySafeContext() : base(ConfigurationManager.ConnectionStrings[DbName].ConnectionString) 
        {
            Systems = GetTable<ITSystem>();
            Keys = GetTable<Key>();
            Categories = GetTable<Category>();
        }
    }


we can then use this with a simple script to build our Db:


    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(“Generate new version of Db”);


            try
            {
                KeySafeContext db = new KeySafeContext();
               
                if (db.DatabaseExists())
                {
                    Console.WriteLine(“Deleting old version”);
                    db.DeleteDatabase();
                }


                db.CreateDatabase();


                Console.WriteLine(“Database created successfully.”);
            }
            catch(Exception ex)
            {
                Console.WriteLine(“Error creating Db…”);
                Console.WriteLine(ex.Message);
                throw;
            }
        }
    }


This will pop out any errors for you in your mapping as you build, so its a great way to check that your domain is configured correctly. I certainly got one or two errors when I set this up for the first time.


Now you can use the generated Db to check how your mappings are coming out. You have a fair amount of flexibility to specify details of how you want your final mapping to look in your attributes. Review the docs. You can use the script above to modify and build until you are happy that your domain model is being mapped out in your preferred form.


For example the default storage for our strings is nvarchar(4000), which is way larger than we want, so we specify the size and constraints of the columns directly:


        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false, IsPrimaryKey= true)]
        public string Name {get;set; }
       
I can keep adjusting the metadata in the attributes, and re-generating the Db using the utility until I get something I like.


This may also be the point where I decide to use an Id field as a key, instead of relying on natural keys, or add a version/timestamp field to support optimistic currency.


As it is possible that someone might choose to change the names of the systems and categories, I opt to add an Id column and use that as the primary key instead of the name. That change flows through to child objects, which now need to store an Id to represent the foriegn key portion of the association.


        [Column(AutoSync = AutoSync.OnInsert, IsDbGenerated = true, IsPrimaryKey = true)]
        public int Id { get; set; }


At the same time I need to adjust the mappings to reflect this. First the association with the Key class:


       [Association(OtherKey="SystemId")]
        public IList<Key> Keys {… }


and the Key class itself



    [Table]
    public class Key
    {
        [Column(AutoSync = AutoSync.OnInsert, IsDbGenerated = true, IsPrimaryKey = true)]
        public int Id { get; set; }
        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false)]
        public string Password { get; set; }
        [Column]
        public int SystemId { get; set; }
        private EntityRef<ITSystem> system;
        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false)]
        public string UserName { get; set; }
    …
   
            [Association(OtherKey="Id", ThisKey="SystemId")]
        public ITSystem System
        {
            get{return system.Entity;}
            set{system.Entity = value;}
        }


    }


I also decide that as this is intended to be a multi-user system I’m going to support a versions to make optimistic concurrency work better i.e. not through comparing every field.


        [Column(AutoSync=AutoSync.Always, IsDbGenerated=true,    IsVersion=true)]
        public byte[] Version {get;set;}


Finally, where I want the field within the Db to be nullable, I switch to using a nullable type on my domain.


Summarizing the Mapping


Because things may have got a little confusing through all that let’s review the mappings:


    [Table]
    public class ITSystem
    {
        [Column]
        public int CategoryId { get; set; }
  private EntityRef<Category> category = default(EntityRef<Category>);
        [Column]
        public string Comments {get;set;}
        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false)]
        public string Name {get;set; }
        [Column(AutoSync = AutoSync.OnInsert, IsDbGenerated = true, IsPrimaryKey = true)]
        public int Id { get; set; }
        private EntitySet<Key> keys = new EntitySet<Key>();
        [Column(AutoSync=AutoSync.Always, IsDbGenerated=true, IsVersion=true)]
        public byte[] Version {get;set;}
        …
     
              [Association(OtherKey="SystemId")]
        public IList<Key> Keys
        {
            get{return keys;}
            set
            {
                foreach (Key key in value)
                {
                    key.System = this;
                }
                keys.Assign(value);
            }
        }



        [Association(OtherKey="Id", Storage="category", ThisKey="CategoryId")]
        public Category Category
        {
            get
            {
                return category.Entity;
            }
            set
            {
                this.CategoryId = value.Id;
                this.category.Entity = value;
            }
        }
      }
     
    [Table]
    public class Key
    {
        [Column(AutoSync = AutoSync.OnInsert, IsDbGenerated = true, IsPrimaryKey = true)]
        public int Id { get; set; }
        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false)]
        public string Password { get; set; }
        [Column]
        public int SystemId { get; set; }
        private EntityRef<ITSystem> system = default(EntityRef<ITSystem>);
        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false)]
        public string UserName { get; set; }
        [Column(AutoSync=AutoSync.Always, IsDbGenerated=true, IsVersion=true)]
        public byte[] Version {get;set;}
        …
     
                [Association(OtherKey="Id", ThisKey="SystemId")]
        public ITSystem System
        {
            get{return system.Entity;}
            set{system.Entity = value;}
        }
    }


    [Table]
    public class Category
    {
        private EntitySet<Category> children = new EntitySet<Category>();
        [Column(AutoSync = AutoSync.OnInsert, IsDbGenerated = true, IsPrimaryKey = true)]
        public int Id { get; set; }
        [Column(DbType="NVarChar(50) NOT NULL", CanBeNull=false)]
        public string Name {get;set; }
        [Column(DbType="Int", CanBeNull=true)]
        public int? ParentId { get; set; }
        private EntityRef<Category> parent;
        private EntitySet<ITSystem> systems = new EntitySet<ITSystem>();
        [Column(AutoSync=AutoSync.Always, IsDbGenerated=true, IsVersion=true)]
        public byte[] Version {get;set;}
        …
   
            [Association(ThisKey="ParentId", OtherKey="Id")]
        public IList<Category> Children
        {
            get{return children;}
            set
            {
                foreach (Category child in value)
                {
                    child.Parent = this;
                }
                children.Assign(value);
            }
        }


        [Association(ThisKey="ParentId")]
        public Category Parent
        {
            get{return parent.Entity;}
            set
            {
                if (value == null)
                {
                    ParentId = null;
                }
                else
                {
                    ParentId = value.Id;
                    parent.Entity = value;
                }
            }
        }
    }


The good, the bad, and the ugly 


Mapping can be a little tricky, so put aside sufficent time in your project to make it happen. A couple of tips are that when you get null reference exception when submitting changes, consider commenting out parts of the mapping, until you can isolate the mapping that causes the error. In particular, watch your foreign keys, its easy to get confused and end up with a broken mapping. Secondly, once you have built the Db you can always create a new solution and within that use the designer to generate the suggested mappings and re-apply anything that is missing to your domain.


Some folks are going to look at the work involved in mapping from domain to Db, panic, and just adopt a schema first approach. Don’t forget however that by going domain first, we focus on the ubiquitous language we share with the user, and by using TDD we have tests to preserve our domain behavior while we refactor to support our mapping.


As you can see, as the attributes build up, they can cause an accretion of ‘noise’ in our domain model whose pupose is orthogonal to the domain itself. If that bothers you, use an XML based mapping file instead.


We could probably, and should, revise these mappings a little more, but they will do for now.


LINQ and designing for testability


Once we have our Db generated we need to review our Session. It currently exposes its member collections as an IList<T>. While this is good enough for LINQ To Objects, LINQ To SQL needs an IQueryable. Our KeySafeContext works with a Table<T> for each class we are persisting.


For this purpose I have a helper library of classes, called Strain, that I have written that exposes a common interface, IQuerySource, which derives from both IQueryable<T> and IEnumerable<T> and adds Add and Remove methods. Both Table<T> and List<T> sourced collections can implement this:


    public interface IQuerySource<T> : IQueryable<T>, IEnumerable<T>
    {
        void Add(T item);
        void Attach(T item);
        void Remove(T item);
        void RemoveAll(IEnumerable<T> items);
    }


We expose the IQueryable<T> from a List<T>  by virtue of the conversion availabe from the Queryable.AsQueryable() method.


        public ListQuery(List<T> source)
        {
            impl = source;
            queryableImpl = impl.AsQueryable();
        }


and then just implement the IQueryable<T> in terms of this.


A Table<T> is already an IQueryable<T> so we don’t need to do this conversion. We can simply implement the Add, Attach, etc. methods we are after from the Table e.g.


      public void Add(T item)
        {
            dataTable.InsertOnSubmit(item);
        }


I don’t show the code for this helper library here, but I’ll post it at a later date. However it should be possible to implement this yourself from this.


This allows us to modify our ISession to expose an IQuerySource<T> instead of a List<T> so that we can source it from either.


We also ensure that our context has the DataLoadOptions we need to populate our eager-loaded collections.



        public KeySafeContext() : base(ConfigurationManager.ConnectionStrings[DbName].ConnectionString) 
        {
            Systems = GetTable<ITSystem>();
            Keys = GetTable<Key>();
            Categories = GetTable<Category>();


            DataLoadOptions dataLoadOptions = new DataLoadOptions();
            dataLoadOptions.LoadWith<ITSystem>(s=>s.Keys);
        }


Our acceptance tests will also need to populate the data we intend to select into the Db by adding data to the repository and then calling SubmitChanges. To avoid issues with shared fixture state between tests, we wrap the modifications in a transaction that we never commit.


Our final test looks like this:



        [TestMethod]
        public void Should_Be_Able_To_Find_All_Systems()
        {
            //setup
            KeySafeContext context = new KeySafeContext();
            ISession session = new KeySafeSession(context);
            SystemRepository systemRepository = new SystemRepository(session);
            ITSystem system = CreateITSystem();


             //exercise
            using (new TransactionScope())
            {
                systemRepository.Systems.Add(system);


                session.SubmitChanges();


                List<ITSystem> systems = systemRepository.FindAll();


                //verify
                AssertSystemListMatches(session, systems);
            }
        }


With that, the tests pass. That is enough for this installment, but next time, I’ll give some examples around dynamic querying options. After that we’ll move on to look at the place for ORMs in distributed systems.

About Ian Cooper

Ian Cooper has over 18 years of experience delivering Microsoft platform solutions in government, healthcare, and finance. During that time he has worked for the DTi, Reuters, Sungard, Misys and Beazley delivering everything from bespoke enterpise solutions to 'shrink-wrapped' products to thousands of customers. Ian is a passionate exponent of the benefits of OO and Agile. He is test-infected and contagious. When he is not writing C# code he is also the and founder of the London .NET user group. http://www.dnug.org.uk
This entry was posted in Featured, LINQ, Mocks, Stubs, TDD. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • http://'h1BUG/h1' ‘"><h1>BUG</h1><"’

    ‘”>

    BUG

    <"'

  • http://'h1BUG/h1' ‘"><h1>BUG</h1><"’

    ‘”>

    BUG

    <"'

  • http://shumelkd.com/ Qhkkfouj

    u6m7zc

  • elberon5

    What exactly is an ISession? Did I miss an article in this series? Is this part of LINQ to SQL?

  • Ian Cooper

    @Justin

    You don’t need to modify Table just use an adapter that does implement IQuerySource. I usually implement in terms of Table like so:

    public class TableQuery : IQuerySource where T : class
    {
    public Table
    dataTable;
    public IQueryable
    queryableTable;

    public TableQuery(Table
    dataTable)
    {
    this.dataTable = dataTable;
    queryableTable = dataTable as IQueryable
    ;
    }
    }

  • Justin Voshell

    Doesn’t the fact that System.Data.Linq.Table is a sealed class prevent it from ever being able to implement IQuerySource?

  • Ian Cooper

    @Greg As I point out above the XML option will take the pain of polluting domain classes away for you. My gut says that I would prefer to use XML mappings over attributes, but I can appreciate the arguments of people who resist the growth of XML configuration. To be frank my major reason for using attributes here was to help those people who had thought about marking up their own classes, but had only seen the designer generated code make the steps into tagging their own model more easily.

    That said I have begun to drift to using attributes with NHibernate of late so there must be something in it

  • Greg

    Do you think its a good idea to have all of this persistence information polluting the domain? What happens when we need to persist our model to two different data sources?