CodeBetter.Com

Ian Cooper [MVP]

March 2008 - Posts

  • Architecting LINQ To SQL Applications, part 7

    Previously: Architecting LINQ To SQL Applications, part 6

    The topic of managing entity lifetimes is an important one, as many of the issues that people have when using an ORM for the first time relate to a lack of understanding of how an ORM manages objects loaded from the Db, or that are to be inserted into the Db. In addition, over the next few installments we will begin to talk about some of the issues related to multi-tier scenarios. It is important to understand how lifetime is managed, because many of those issues come from working against the ORM rather than with it in these circumstances.

    What are Entity and Value types

    An Entity is a type which has an identity that remains unique and consistent throughout its lifetime. It is unique in the sense that it must always be possible to distinguish one entity from another. It is consistent in that even if the attributes change, the entity will still retain its identity. Consider a Customer type. As an entity we must be able to distinguish one customer from another, so we need to define Customer identity through a member. We avoid using natural members like Name because they are not individually unique and may change over time. Instead we base the identity of our entity on a surrogate member. In this case we might define an Id member for our Customer that assigns each customer a unique value within our organization.

    Within the Db a primary key field distinguishes an entity, which is represented by a row on a table. LINQ To SQL piggybacks on this to provide support for an Entity, via a type mapped via a Table attribute or mapping, that corresponds to a table that has a primary key. So we map our Customer class to a Customer table and our Id member to an Id primary key field on that table. Again natural keys are allowed, but should be avoided in favour of surrogates to ensure that our entity remains unique and identifiable throughout its lifetime.

    The counterpart to an Entity type is a Value type. A Value type takes its identity from the value of its attributes. Two values that have the same attributes compare as equal. For example, two postcodes that have the value "ABC 123" compare as equal, so we consider them to have the same identity. A value type might have just the one attribute, as with our postcode example, but might also have multiple attributes. An example of a value type with multiple attributes would be a Money type that has both an Amount and a Currency. We want two instances of Money for GBP 123.74 to compare as equal.
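    To make the distinction concrete, here is a minimal sketch of what a Money value type might look like. The class and its members are hypothetical and chosen only for illustration; the point is that equality is defined by Amount and Currency together, not by a surrogate Id.

```csharp
using System;

// Hypothetical value type: identity comes from its attributes
// (Amount and Currency), not from a surrogate Id.
public sealed class Money : IEquatable<Money>
{
    public decimal Amount { get; private set; }
    public string Currency { get; private set; }

    public Money(decimal amount, string currency)
    {
        if (currency == null) throw new ArgumentNullException("currency");
        Amount = amount;
        Currency = currency;
    }

    // Two Money instances are equal when every attribute matches.
    public bool Equals(Money other)
    {
        return other != null
            && Amount == other.Amount
            && Currency == other.Currency;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Money);
    }

    // Equal values must produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return (Amount.GetHashCode() * 397) ^ Currency.GetHashCode();
        }
    }
}
```

    With this in place, two separately constructed instances for GBP 123.74 are equal even though they are different objects in memory, which is exactly the opposite of how we treat two Customer entities.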

    Within the DB a Value is represented as one or more columns within the row of an entity. LINQ To SQL only supports primitives as Values i.e. string, int, etc. and has no support for mapping user-defined value types. Thus there is no direct support in LINQ To SQL for types like Money. Hopefully this is a limitation that will be addressed in future versions.

    Entity Lifetime

    A data context is a unit of work. It tracks changes and submits updates to the Db when flushed. Because of this you should only keep it around long enough to do the work ...but no longer. When working with a DataContext we need to distinguish between, to borrow Hibernate terminology, transient and persistent entities.

    Persistent objects have been loaded by LINQ To SQL and we have a reference to them in the DataContext’s identity map. The DataContext tracks changes to a persistent entity against its state at the time it was loaded. Future requests for that object will return the object in the cache. Changes to that entity can be submitted to the Db. Lazy loading of associations uses the same DataContext we loaded the entity with originally.

    An entity that is not in the identity map of the DataContext is a transient entity. It has a lifetime equivalent to the running application. To ensure that the entity persists we need to add it to the matching table on the DataContext using InsertOnSubmit and flush it using SubmitChanges on the DataContext, causing the entity to become persistent.

    MyContext context = new MyContext();

    //persistent object
    MyObject myOldObject = context.MyObjects.Where(m => m.Name == "Old").Single();
    //changes are tracked
    myOldObject.Name = "Changed Name";

    //new entity; transient
    MyObject myNewObject = new MyObject();
    //no need to track; no Db row to update yet
    myNewObject.Name = "New Name";
    //queue the new entity for insertion on the next submit
    context.MyObjects.InsertOnSubmit(myNewObject);

    //make new object persistent and flush changes to old objects
    context.SubmitChanges();

    This is not always intuitive:

    MyContext anotherContext = new MyContext();

    //new entity; transient
    MyObject myNewObject = new MyObject();
    myNewObject.Name = "New Name";

    //new entity is still transient until we actually submit
    anotherContext.MyObjects.InsertOnSubmit(myNewObject);

    //not persistent, won't be found
    var results = anotherContext.MyObjects.Where(m => m.Name == "New Name");

    It is important to note that although an identity map is sometimes called a first-level cache, its purpose is not to optimize retrieval of entities from the Db. Because an ORM such as LINQ To SQL does not know the results of a query in advance, it cannot determine whether two calls to a query will return the same result set, even if it is the same query, as other users may have updated the Db in between. For this reason it must always bring the result set back, and then check for the existence of those entities within the identity map. If the map contains the item, it must return that instance instead, because you might have made changes to the entity and we want to preserve them throughout the unit of work. So the cache is not about optimisation but about ensuring you do not lose changes during your unit of work. It is possible that queries that request an entity by primary key could be satisfied from the map directly, where the entity is already loaded, but you should not rely on this optimization.
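    The lookup described above can be sketched with a simple in-memory map. This is an illustration of the pattern only, not LINQ To SQL's actual implementation; the Customer and IdentityMap types and the Resolve method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical, simplified identity map keyed on the primary key.
public class IdentityMap
{
    private readonly Dictionary<int, Customer> loaded =
        new Dictionary<int, Customer>();

    // Called for each row materialized from a query result.
    public Customer Resolve(Customer fromDb)
    {
        Customer existing;
        if (loaded.TryGetValue(fromDb.Id, out existing))
        {
            // Already tracked: return the tracked instance so that
            // in-memory changes survive; the fresh row is discarded.
            return existing;
        }

        // First time we have seen this key: start tracking it.
        loaded.Add(fromDb.Id, fromDb);
        return fromDb;
    }
}
```

    Note that the query still ran and the row still came back from the Db; the map only decides which instance the caller sees. That is why it protects your changes rather than saving you a round trip.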

    Because we hold the object within the identity map for the lifetime of the unit of work there is a danger of concurrency errors, where another user updates the Db while we have the object. For this reason the identity map stores the original version of our object as well as the current one. This allows LINQ To SQL to compare the original against the state of the Db when it issues an update or delete query, and raise an optimistic concurrency violation error if the row has changed. Of course, if you use a timestamp column and set UpdateCheck.Never on the other columns in your mapping, to inform LINQ To SQL that it should not check those fields when looking for concurrency errors, you can optimize this feature as well. However the optimal SQL issued by LINQ To SQL during an update still depends on knowing what has changed.
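    The role of the original version can be sketched in memory as follows. This is a conceptual illustration only: LINQ To SQL performs the comparison in the WHERE clause of the statement it sends to the Db rather than in code like this, and CustomerRow, ConcurrencyViolationException and OptimisticUpdate are hypothetical names.

```csharp
using System;

public class CustomerRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical exception standing in for a concurrency violation.
public class ConcurrencyViolationException : Exception
{
    public ConcurrencyViolationException(string message) : base(message) { }
}

public static class OptimisticUpdate
{
    // original: snapshot taken when the entity was loaded
    // current:  the entity as edited during the unit of work
    // dbRow:    what the Db holds at the moment we try to update
    public static CustomerRow Apply(
        CustomerRow original, CustomerRow current, CustomerRow dbRow)
    {
        // If the Db no longer matches our snapshot, someone else
        // has changed the row since we loaded it.
        if (dbRow.Name != original.Name)
        {
            throw new ConcurrencyViolationException(
                "Row " + original.Id + " was changed since it was loaded.");
        }

        // Safe to write our version.
        return new CustomerRow { Id = current.Id, Name = current.Name };
    }
}
```

    With a version or timestamp column the same check collapses to comparing a single value, which is why the other columns no longer need to take part in it.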

    LINQ To SQL supports a Refresh method on its DataContext to force a reload of an entity or attributes of an entity. The purpose here is to allow you to resolve optimistic concurrency errors or to reload an object during a unit of work if you are aware that the Db has been changed by a mechanism outside of the purview of LINQ To SQL. The Refresh just goes back to the Db to find the latest version of your object. Note that refresh targets specific objects in the map or a set of objects.

    You can disable use of the identity map by setting ObjectTrackingEnabled to false. The purpose here is to optimize when you are loading read-only collections i.e. you never want to submit changes on these objects back to the Db. Remember though that another DataContext will consider these to be transient objects, so avoid assigning references loaded this way into entities loaded via DataContexts which are tracking.

    Attaching Entities

    Sometimes we have an entity that is in the backing store but not in the identity cache of our DataContext. To make our DataContext aware that this is a persistent and not a transient entity we need to Attach it to our DataContext. This puts it in our cache. However, our context cannot know whether the entity is the same as the representation on the backing store, so it must assume so, and change tracking will consider your object to be in the ‘original’ state. If your object has changed and you try to save those changes, you will get optimistic concurrency errors. This is because when we compare your ‘original’ to the Db, they do not match, which fools LINQ To SQL into thinking another user has changed it since we loaded it.

    To avoid this you must either tell the DataContext what the original state in the Db was, or set your columns to UpdateCheck.Never. If you have a Timestamp column, you can rely on that to do the right thing for you. Where you cannot use a timestamp, people sometimes suggest a strategy of UpdateCheck.Never on every column, but the danger is that we can overwrite genuine changes by another user.

    Otherwise we need to either provide the original, by maintaining the original state for any objects we may choose to detach, or adopt a load and replay strategy for detached objects where we load the current representation from the Db and then write our changes over it.

    Think carefully before heading down the detached objects route as it multiplies the complexity of what you are doing. This issue most often raises its head in multi-tier scenarios. We will talk about how to handle those in a future blog post, but for now recognize that the unit of work implies that the framework will not help you track changes outside of that context.

    Managing DataContexts

    Do not try to work with two different contexts at the same time. Entities that are persistent for one context look like transient entities to the other, because the second context did not load them and so does not have them in its identity map.

    Do not try to access an object graph loaded via LINQ To SQL outside of its DataContext if it has lazy loaded properties. This is because LINQ To SQL will access the original DataContext to load the entities.  Trying to lazy load within another context falls foul of our earlier rule not to mix our contexts.

    Finally, assume that a DataContext is not thread safe i.e. work with a DataContext only on one thread and do not try to pass entities retrieved via a DataContext on one thread, to another thread.

    When working with a web application consider creating a DataContext per HTTP request, using it to retrieve entities and then submit any changes required by that request. For a client-side application consider using a DataContext for each application transaction.

    While a DataContext is disposable, only dispose of it when you finish your request or application/transaction and are finished with the persistent entities that it loaded.

    Exercise caution around caching entities that were loaded via a DataContext. This is because when you access those elements they may still refer to the DataContext if they contain a lazy loaded association. If you want to use LINQ To SQL to load cached data, make sure you load objects that are not coupled to the DataContext, by not using EntitySet<> and EntityRef<> for associations, and disabling deferred loading on the context that you use to load them. This can be appropriate for reference data, in which case, you can also disable change tracking.

    Is LINQ To SQL deficient here?

    I read a fair number of opinions that are surprised by the behavior of LINQ To SQL. However, having used ORMs for a number of years, I find LINQ To SQL conforms to my expectations as to how an ORM should behave. Indeed, read something as old now as Martin Fowler's Patterns of Enterprise Application Architecture and you will find exactly this pattern of behavior for an ORM discussed. WORM (Wilson O/R Mapper, now open source BTW) and NHibernate both behave in a similar fashion. So some of this surprise seems to be based on expectations that don't come from experience of using ORM tools. On a recent .NET Rocks there was an opinion expressed that LINQ To SQL was somehow only fit to be a RAD tool because of multi-tier issues. I can't agree with this opinion at all. LINQ To SQL has a similar feature set to WORM, on which I have built distributed enterprise applications. Its limitations relate to the diversity of mappings that it supports (table per concrete class in an inheritance hierarchy, or value types, for example) and its lack of support for multiple Db vendors, not some perceived issues around the unit of work and identity map patterns which it implements.

  • Architecting LINQ To SQL Applications, part 6

    Previous: Architecting LINQ To SQL Applications, part 5

    Mapping with XML files instead of Attributes

    Greg Young pointed out in the comments to the last post that using attributes can clutter your domain objects. Although it is simpler to show attributes first, so that you can relate rolling your own mappings to the designer generated code, I did not want to leave the story incomplete without showing you how to move those mappings into a file in order to keep your domain objects clean.

    The correspondence between the mapping file and the attributes is straightforward. Instead of attributes we just have XML elements and instead of properties on those attributes, we have attributes on our XML elements.

    First of all we need to create a text file to hold our mappings. We call it Keysafe.map. Next we need to indicate the xml encoding:

    <?xml version="1.0" encoding="utf-8"?>

    Now we need to open up a Database element, which will form the root of our mapping.

    <Database Name="northwind" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
    </Database>

    Within the Database element we need to add a Table element for each entity we wish to map (equivalent to our [Table] attribute).

    <?xml version="1.0" encoding="utf-8"?>
    <Database Name="northwind" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
     <Table Name="dbo.Category">
     </Table>
    </Database>

    Because we are not using an attribute we have to tell LINQ To SQL what type our table maps to explicitly:

    <?xml version="1.0" encoding="utf-8"?>
    <Database Name="northwind" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
     <Table Name="dbo.Category">
      <Type Name="KeySafeDomain.Category">
      </Type>
     </Table>
    </Database>

    Then we need to map out our Columns. Again because we are not associating our attribute with a member, we have to explicitly indicate the member.

    <?xml version="1.0" encoding="utf-8"?>
    <Database Name="northwind" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
     <Table Name="dbo.Category">
      <Type Name="KeySafeDomain.Category">
       <Column Name="Id" Member="Id" DbType="Int NOT NULL IDENTITY" IsPrimaryKey="true" IsDbGenerated="true" UpdateCheck="Never" AutoSync="OnInsert" />
       <Column Name="Name" Member="Name" DbType="NVarChar(50) NOT NULL" CanBeNull="false" UpdateCheck="Never" />
       <Column Name="ParentId" Member="ParentId" DbType="Int" UpdateCheck="Never" />
       <Column Name="Version" Member="Version" DbType="rowversion NOT NULL" CanBeNull="false" IsDbGenerated="true" IsVersion="true" AutoSync="Always" />
      </Type>
     </Table>
    </Database>

    As before the DbType information is there to help us generate the Db from our domain model.

    We also need to map out the associations between our classes. Again the conversion between the attribute based model and our XML model is straightforward.

    <Association Member="Children" Storage="children" ThisKey="ParentId" OtherKey="Id"/>

    <Association Member="Parent" Storage="parent" ThisKey="ParentId"/>

    <Association Member="Systems" Storage="systems" OtherKey="CategoryId"/>

    At this point it is just grunt work to translate our previous attribute based mappings into an XML mapping file.

    In the end our mapping looks like this:

    <?xml version="1.0" encoding="utf-8"?>
    <Database Name="northwind" xmlns="http://schemas.microsoft.com/linqtosql/mapping/2007">
     <Table Name="dbo.Category">
      <Type Name="KeySafeDomain.Category">
       <Column Name="Id" Member="Id" DbType="Int NOT NULL IDENTITY" IsPrimaryKey="true" IsDbGenerated="true" UpdateCheck="Never" AutoSync="OnInsert" />
       <Column Name="Name" Member="Name" DbType="NVarChar(50) NOT NULL" CanBeNull="false" UpdateCheck="Never" />
       <Column Name="ParentId" Member="ParentId" DbType="Int" UpdateCheck="Never" />
       <Column Name="Version" Member="Version" DbType="rowversion NOT NULL" CanBeNull="false" IsDbGenerated="true" IsVersion="true" AutoSync="Always" />
       <Association Member="Children" Storage="children" ThisKey="ParentId" OtherKey="Id"/>
       <Association Member="Parent" Storage="parent" ThisKey="ParentId"/>
       <Association Member="Systems" Storage="systems" OtherKey="CategoryId"/>
      </Type>
     </Table>
     <Table Name="dbo.ITSystem">
      <Type Name="KeySafeDomain.ITSystem">
       <Column Name="CategoryId" Member="CategoryId" DbType="Int NOT NULL" UpdateCheck="Never" />
       <Column Name="Comments" Member="Comments" DbType="NVarChar(4000)" UpdateCheck="Never" />
       <Column Name="Name" Member="Name" DbType="NVarChar(50) NOT NULL" CanBeNull="false" UpdateCheck="Never" />
       <Column Name="Id" Member="Id" DbType="Int NOT NULL IDENTITY" IsPrimaryKey="true" IsDbGenerated="true" UpdateCheck="Never" AutoSync="OnInsert" />
       <Column Name="Version" Member="Version" DbType="rowversion NOT NULL" CanBeNull="false" IsDbGenerated="true" IsVersion="true" AutoSync="Always" />
       <Association Member="Keys" Storage="keys" OtherKey="SystemId"/>
       <Association Member="Category" Storage="category" OtherKey="Id" ThisKey="CategoryId"/>
      </Type>
     </Table>
     <Table Name="dbo.Key">
      <Type Name="KeySafeDomain.Key">
       <Column Name="Id" Member="Id" DbType="Int NOT NULL IDENTITY" IsPrimaryKey="true" IsDbGenerated="true" UpdateCheck="Never" AutoSync="OnInsert" />
       <Column Name="Password" Member="Password" DbType="NVarChar(50) NOT NULL" CanBeNull="false" UpdateCheck="Never" />
       <Column Name="SystemId" Member="SystemId" DbType="Int NOT NULL" UpdateCheck="Never" />
       <Column Name="UserName" Member="UserName" DbType="NVarChar(50) NOT NULL" CanBeNull="false" UpdateCheck="Never" />
       <Column Name="Version" Member="Version" DbType="rowversion NOT NULL" CanBeNull="false" IsDbGenerated="true" IsVersion="true" AutoSync="Always" />
       <Association Member="System" Storage="system" OtherKey="Id" ThisKey="SystemId"/>
      </Type>
     </Table>
    </Database>

    We can then delete all of our mappings from our domain model so that our model is clean.

        public class Category
        {
            private EntitySet<Category> children = new EntitySet<Category>();
            public int Id { get; set; }
            public string Name {get;set; }
            public int? ParentId { get; set; }
            private EntityRef<Category> parent;
            private EntitySet<ITSystem> systems = new EntitySet<ITSystem>();
            public byte[] Version {get;set;}
            ...
        }
       
        public class Key
        {
            public int Id { get; set; }
            public string Password { get; set; }
            public int SystemId { get; set; }
            private EntityRef<ITSystem> system = default(EntityRef<ITSystem>);
            public string UserName { get; set; }
            public byte[] Version {get;set;}
            ...
        }
       
        public class ITSystem
        {
            public int CategoryId { get; set; }
            private EntityRef<Category> category = default(EntityRef<Category>);
            public string Comments {get;set;}
            public string Name {get;set; }
            public int Id { get; set; }
            private EntitySet<Key> keys = new EntitySet<Key>();
            public byte[] Version {get;set;}
            ...
        }

    Managing the Mapping File 

    I often embed the file into the DLL that contains the model. The upside here is that you don't need to worry about deployment, but the downside is that you cannot change the mapping without re-issuing the DLL. If you embed the mapping file, you will need some code to read it, so that you can pass it into the DataContext. I use a helper class like this:

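        using System.Data.Linq.Mapping;
        using System.IO;
        using System.Reflection;

        public static class Mapping
        {
            //reads the embedded Keysafe.map resource and builds a mapping source from it
            public static XmlMappingSource GetMapping()
            {
                XmlMappingSource mapping;
                using (Stream stream = Assembly.GetExecutingAssembly().GetManifestResourceStream("KeySafeDomain.Mappings.Keysafe.map"))
                {
                    mapping = XmlMappingSource.FromStream(stream);
                }

                return mapping;
            }
        }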

    Then we just grab the mapping when we construct our typesafe DataContext:

            public KeySafeContext() : base(ConfigurationManager.ConnectionStrings[DbName].ConnectionString, KeySafeDomain.Mappings.Mapping.GetMapping())
            {
                Systems = GetTable<ITSystem>();
                Keys = GetTable<Key>();
                Categories = GetTable<Category>();

                DataLoadOptions dataLoadOptions = new DataLoadOptions();
                dataLoadOptions.LoadWith<ITSystem>(s=>s.Keys);
                //the options only take effect once assigned to the context
                LoadOptions = dataLoadOptions;
            }


    With that done we can re-run our tests to check that everything passes.

     

    Posted Mar 09 2008, 07:01 PM by Ian Cooper with 9 comment(s)
  • Developer Day Ireland

    A while back I posted about the future of Developer Day (called DDD here, but not to be confused with Domain Driven Design). For the international audience, DDD was inspired by US Code Camps, and has a similar by-the-community, for-the-community agenda.

    One of the issues I raised was that the UK & IE only had one Code Camp event. Although we ran it twice a year, it was always over-subscribed, suggesting that there was a huge demand for on-the-weekend community events. We have always seemed a big enough place to hold a lot more regional events. Back when I last blogged I mentioned DDD Scotland, and now I am pleased to note that there will be a DDD Ireland on Saturday May 3rd. It's great to see the community events spreading. It is good to see some ALT.NET topics in the agenda too: Ben is going to be talking about TDD, Alex Homer (who I guess breaks the rules once he goes to work at the P&P team, but I guess he gets away with it this time around) about Dependency Injection, and Mike Azocar about Scrum.

    I would still like to see more people across the UK + IE holding local altnet conferences. Come on, we Londoners can't have all the fun. They are a lot easier to organize than a DDD style event, as they are mostly self-organizing.

     

  • Some recommended books

    A couple of people at the BASTA conference asked me for book recommendations for people who were trying to get a better idea of how to do solution design. At the same time, a couple of people at my TDD Best Practices talk at the London .NET user group asked me to remind them about book recommendations. I promised to put the titles we discussed on my blog, so here we go.

    This list is not meant to be exhaustive, but I think it represents a good starting point. I am sure there are many books I have left out that people like, but I wanted to keep the list to a manageable number for people to think about tackling. They are all ones I have found real value in and keep referring to (usually via Safari nowadays).

    Design
    Head First Design Patterns
    Agile Principles, Patterns, and Practices in C#
    Patterns of Enterprise Application Architecture
    Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions
    Domain-Driven Design: Tackling Complexity in the Heart of Software
    Applying Domain-Driven Design and Patterns: With Examples in C# and .NET
    Object Design: Roles, Responsibilities, and Collaborations

    Test-Driven Development
    Test Driven Development: By Example
    Working Effectively with Legacy Code
    Refactoring: Improving the Design of Existing Code
    xUnit Test Patterns: Refactoring Test Code

