Previously: Architecting Linq to SQL, part 9
End of the line
This is intended to be the last part in this series and I wanted to take the opportunity to talk about a number of related if diverse topics. I would like to look at what I would like to see in the next version, and talk about when and where I intend to use Linq to SQL and other ORMs such as NHibernate.
I will try to get some of the code that goes with this series up onto google code over the coming months, schedules permitting. Roger Jennings requested that I give more than a trivial example of how to do messaging for n-tier scenarios. I'm flattered by Roger's confidence, though I feel that Greg or Udi would be better placed to do a introductory piece on messaging. But if there is demand I will give it a try.
What would I like to see in the next version?
The following are, I think, the priorities:
Support for Value Types. In a fine grained object model we may have classes that are not entities (have a distinct identity independent of their state). A common example would be Money, which has a value and currency. We do not want to map these to rows in a table, but to columns. Right now with Linq to SQL we have to represent money as two fields, amount and currency in an entity. We would like to represent them as one type, which can be mapped independently. As an aside a lot of systems overuse primitive types directly. Often we have something that is not a string or an int even if we can represent it as such. Our systems become clearer if we can wrap these primitives in a name appropriate to the domain such as ShippingReference. This only works if we can map value types easily. This is fairly straightforward if we assume that the column names used by these value types remain the same on any entity that stores them.
Support for changing loading options. As of today we can only alter the default loading behavior of a DataContext before we use it. This assumes that we can determine what we want to eager load once for a context. The reality is that we may want to set this before we run any query. So it must be possible for us to set eager loading options each time we run a query. An alternative would be to do something more akin to Hibernate's HQL language's ability to add a fetch to the query expression so that we can tell the query to load the relationship eagerly.We also want support for eager loading multiple child associations, not just the one we have now.
Support for ordered relationship types. Right now an association is treated as a set - an unordered collection. However often are children are ordered, particularly being in a map where we have both a key and a value. The key should be both a primitive type and another entity or value type. While this type of mapping is less common in the relational world, within our domain we often want to use ordered mappings, and support for mapping these to relational tables gives us increased flexibility when mapping domain to Db.
Support for table per sub-class mapping. Sometimes we do not want to allow fields on sub-classes to be nullable. Unfortunately this is a requirement of table-per-class-hierarchy mapping strategies. Allowing table per sub-class, using a shared key strategy would allow us to avoid this issue. Table per-subclass with shared key avoids some of the performance issues from moving away from a single table, which might be incurred if we took a union approach to combining data from multiple tables to support subclassing.
There are also some things I would like to see, though I am less optimistic that they will happen
Expose the provider model. Allow LINQ to SQL to target multiple Dbs. It exists but was never exposed at RTM. I suspect the resources were not allocated because the Entity Framework became the way to work if you had a non-SQL Server back end. Given EF not being positioned as an ORM let's open it up so we can take LINQ to SQL forward.
Include an explicit in-memory provider. This will make TDD a breeze. Once we have an in-memory provider it would be easy to swap out the Db for unit testing purposes.
Support for second-level caching. MS now has a second level caching technology in Velocity. It would be nice to see support for working with a second level cache within LINQ to SQL (the first level cache is the identity map).
What I would be cautious about in the next version
There are also some things that I would be disappointed about a disproportionate amount of much effort being expended on:
Support for serialized entities. I hope I have managed to explain why serializing an entity across tiers is a bad architectural style. Instead of corrupting LINQ to SQL with support for this practice, I would like to see an emphasis from the patterns on practices team on dissuading people from approaching n-tier design in this style. We do not want to pollute entities with change tracking or serialize a DataContext.
More advanced designer options. I appreciate that some folks like designers, but I think that they may be a red herring here. If you work domain-first then you might as well use attributes to mark up your domain model, or hand code your xml mappingfile. If you are going to work in a data first approach, I would push extending SQLMetal with those capabilities instead of a designer.
In the data first case the design is done in the RDBMS, not in the domain model, so by the time we get to LINQ to SQL we are just generating our entity model from our Db. All the designer gives us is the ability to select a sub-set of tables to generate. A fairly simplistic UI, such as a dropdown list to add tables, could configure the options for a SQLMetal call. Flashy drag and drop layout seems a little bit wasteful. Even better if the property based approach is just a wrapper around a SQLMetal call that makes that command you have configured available. That allows folks to use the command they have created throough the designer in their build scripts to call SQLMetal. This would give more resources for the new functionality people want from their data-first designer such as file per entity, update an existing set of files for changes etc.
I understand this may not be popular, but command line tools are cheaper to author and can deliver a lot more bang for your buck if your team has limited resources. In addition a designer can blind you to an over-complex approach to mapping. If you cannot easily map by hand, if you require that designer, then I believe that you may have lost your way.
I understand that these suggestions will be unpopular with some people, but both of them represent dead ends to me, that do not provide us with the ability to write better software. Of course your mileage may vary.
Linq to SQL over Entity Framework for your ORM
For my part, and that is of course based of my school of software development, LINQ to SQL is a better ORM than the Entity Framework. That may come as no shock to the EF team who have a bigger vision for their product than ORM. For me, LINQ to SQL get a lot right: support for persistence ignorance, single mapping file that is by-hand authorable, lazy loading as a default strategy. If MS intends to provide an offering in the ORM space, as opposed to whatever space the EF is defining, and thus fulfill the vision that Anders gave us of simplifying the development experience by having data access as part of the language. To me LINQ to SQL is the best MS contender for the crown. Given the resources, LINQ to SQL could become a great tool. I hope that MS continue to allocate a fair share of resources to it.
LINQ to SQL vs. NHibernate
To be honest, I have to say that my next project will use NHibernate for its persistence technology instead of LINQ to SQL. Why? It is a large project, with a significant number of entities, and we want to support fine-grained object models and table per sub-class mapping strategies. We also wanted the insurance of being able eager fetch on a query-by-query basis and have a 2nd level cache. It is an old adage but 'there is no silver bullet'. I'm picking one tool out of the kit, it does not mean the others are not valuable.
At the same time LINQ to SQL still forms part of our strategy, because we believe it to be simpler to approach for many projects.So we also have and will be using LINQ to SQL. If anything LINQ to SQL replaces WORM for us which we used for a number of projects where we had good table to entity affinity. Ironically perhaps WORM was an implementation of the proposed interface for ObjectSpaces. ObjectSpaces was the MS ORM for .NET 2.0, that never saw the light of day. ObjectSpaces became LINQ to SQL, and Matt has full the story here, so it seems a natural inheritor. Let us hope it does not meet the ObjectSpaces fate of being sidelined for a more grandiose vision of data access.
A valid question might be to ask why I want improve LINQ to SQL, why I do not just tell everyone to use NHibernate. Some of this is a recognition of the market, many people will not use a non-MS ORM and LINQ to SQL is is a solid ORM. Pragmatically we are likely to get more .NET developers who know LINQ to SQL available in the market place than NHibernate developers. But I also believe that with the expressiveness of LINQ MS have a real chance to move the ORM market forward in the .NET space. LINQ to SQL is like ASP.NET MVC, it is a welcome acknowledgement from MS of what developers want, and we should commend them when they do get it right.
I will be posting a series on NHibernate going forward, so that you can make your own judgements on which to use and when.
Posted
Tue, Jul 1 2008 1:40 PM
by
Ian Cooper