Previously: Architecting LINQ To SQL Applications, part 7
A layer, such as we discussed in Part 2, is not a tier. A layer is a logical unit of division; a tier is a physical one. The two need not be the same: software may be layered within a single process running on one machine, or across a number of processes running on different machines. Layering is about separation of concerns; physical distribution is about sharing and scalability. An RDBMS, for example, tends to live on a server because we want to share data among many users in a scalable fashion. Some rules of thumb are:
Application and Domain Layer: This may run in the same place as the presentation layer, which is the most responsive option. An application server runs the application and domain logic on a server instead. It allows the functionality exposed by the domain to be shared across multiple clients or with other applications. Running an application server can also lower the cost of upgrades in a rich-client environment, because you update the domain logic in a smaller number of places.
Infrastructure Services: Because the application and domain layer depend on the infrastructure services, these services tend to run alongside the application and domain layer or on a server of their own.
(As an aside, layers do not constrain packaging decisions (i.e. how to divide your application into assemblies), and packaging decisions are likewise separate from tiers. I do not intend to cover packaging decisions here. One of the best discussions is in Robert Martin’s Agile Principles, Patterns, and Practices in C#. The NDepend tool is invaluable in helping you make sensible packaging decisions.)
The Laws of Distribution
The first law of distributing your application into tiers is don’t.
Distribution is a complexity multiplier, and increased complexity means increased cost. Software is an economic proposition to the buyer: the benefit they receive must offset the cost of purchase and ownership. The benefits of distribution come from sharing, scalability, and quality of service, so always make tier decisions with those criteria in mind. David Hayden has talked about this issue on CodeBetter before.
Distribution will make your application more complicated, and crossing process boundaries will make it slower (calls across process or machine boundaries are orders of magnitude slower than in-process calls). So why would we distribute our application at all? Many application servers seem to do little beyond passing requests between the UI and the Db, with little or no processing in between. What are people using a server for here? Connection pooling is the major driver: if all our data access happens from within the same process, we can share connections from a pool, reducing the cost of creating and destroying connections.
However, for ASP.NET applications there is little to be gained here. Indeed, a lot of people separate their domain layer into an application server in this context without needing to. By virtue of being hosted within ASP.NET we already gain the benefits of connection pooling that come from running in a single process, so there is no need to introduce a separate application server for connection pooling. For smart clients we can gain benefits from pooling by doing data access from a server tier, because otherwise every client will have its own connection to the Db. So at a certain point a Windows Forms application may gain a performance benefit from using an application server, but it’s worth bearing in mind the need to trade this benefit against the performance costs of distribution in the first place.
The connection pooling example extends to anything with expensive creation: we may find it more efficient to create a resource once on the server and then allow multiple clients to share it. Indeed, if materialization of our objects from the Db were an issue, then caching the objects so created and returning the cached copy from our server would relieve pressure on the Db, by allowing clients to retrieve the copy instead of re-issuing the request. However, we would then need to make decisions about how live our data needs to be and how our cache would be refreshed if it became invalid.
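As a minimal sketch of that idea (everything here is hypothetical illustration, not LINQ To SQL API — the loader delegate stands in for a query that materializes an entity from the Db, and the staleness policy is a simple absolute expiry):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical server-side cache of materialized entities. The load
// delegate stands in for a Db query; maxAge expresses how live our
// data needs to be before we re-issue the request.
public class EntityCache<TKey, TEntity>
{
    private readonly Dictionary<TKey, (TEntity Entity, DateTime LoadedAt)> _cache
        = new Dictionary<TKey, (TEntity, DateTime)>();
    private readonly Func<TKey, TEntity> _load;
    private readonly TimeSpan _maxAge;

    public EntityCache(Func<TKey, TEntity> load, TimeSpan maxAge)
    {
        _load = load;
        _maxAge = maxAge;
    }

    public TEntity Get(TKey key)
    {
        if (_cache.TryGetValue(key, out var entry) &&
            DateTime.UtcNow - entry.LoadedAt < _maxAge)
        {
            return entry.Entity;              // serve the cached copy
        }
        var entity = _load(key);              // re-issue the request to the Db
        _cache[key] = (entity, DateTime.UtcNow);
        return entity;
    }
}
```

Because all clients of the server share one cache, the Db sees one materialization per key per expiry window rather than one per client.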
In addition, where we are performing intensive activity on the server we may want to improve the performance of our application by splitting the work up and then scaling out or up. In this case we put the work into a server and share it out among a pool of servers. Where the application server does a lot of work, for example complex calculations, adding an application server lets us increase the number of nodes where that calculation can be performed.
Sharing creates pressure for a middle tier, especially if we want a service oriented architecture where there is one authority within our enterprise for certain operations. For example, in an insurance company a number of applications might require access to our rating information. By hosting this information in a service we make it possible for any application within the enterprise to obtain it. If it were locked into a smart client (for example an Excel spreadsheet) it would be difficult for other applications to share that information. A rich client cannot share the knowledge of the domain it captures, or the results of calculations, and in this circumstance you are always forced into moving this knowledge onto the server.
Quality of Service
We may want to provide our application with multiple nodes to improve its fault tolerance; if we lose one node, other nodes can continue to carry out work. We might want to run a service our application uses in a different security context from that of the application itself. We might want our application to use reliable messaging for an operation. All of these issues can be loosely grouped as quality of service concerns, and we may want to distribute our application to avail ourselves of them.
The Last Responsible Moment
Beware premature decisions here. Just because you may need to expose this knowledge at some point does not mean that you need to do so from the beginning. A well-layered application is amenable to refactoring into a multi-tiered one when you need it to be, so I would defer the cost of distribution until you know you need it. Bear in mind, for example, that a web application just uses the HTTP protocol to exchange HTML with a browser. Adding a second ‘presentation’ layer that exposes XML over HTTP to a calling application instead of a browser should be a straightforward change to your existing architecture. Indeed, identifying the services the application provides may be easier if we look at what ‘services’ we already provide via HTML.
Why is there pain going multi-tier with LINQ To SQL?
Assuming that you have worked through the options and decided you need an N-Tier application, the question becomes: what issues will you hit when you try to use LINQ To SQL in this context?
It is not straightforward to serialize an entity between two tiers.
The first issue is that the XmlSerializer fails on circular references, so XML serialization will tend not to work where you are managing a parent-child relationship with an EntitySet-EntityRef pair. The designer does have a workaround for this in WCF, so that you don’t serialize both sides of the relationship. There is a good summary of the issues here and here.
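To make the problem concrete, here is a hedged illustration with plain classes (Customer and Order are stand-ins for generated entities; the designer’s WCF workaround marks the association as unidirectional rather than using [XmlIgnore], but the effect is the same — only one side of the relationship is serialized):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Stand-ins for a generated parent-child pair. Serializing both sides of
// the association creates a cycle (Customer -> Orders -> Customer), which
// makes XmlSerializer throw, so we suppress the child-to-parent side.
public class Customer
{
    public string Name { get; set; }
    public List<Order> Orders { get; set; } = new List<Order>();
}

public class Order
{
    public int Id { get; set; }

    [XmlIgnore] // without this, the back-reference to the parent recurses
    public Customer Customer { get; set; }
}

public static class Demo
{
    public static string Serialize()
    {
        var customer = new Customer { Name = "Acme" };
        customer.Orders.Add(new Order { Id = 1, Customer = customer });

        var serializer = new XmlSerializer(typeof(Customer));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, customer);
            return writer.ToString(); // the Customer and its Orders, no back-reference
        }
    }
}
```

The cost of this workaround is that when the child is deserialized on the other tier, it no longer knows who its parent is.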
The second issue is the DataContext. Remember how we talked about the DataContext as a unit of work in which we should manage all of our interaction with entities. Passing an entity to another tier removes it from its extant unit of work. Assuming that you intend to work with the entity on another tier and then pass it back to update the Db with any changes, you will need to bear in mind that you are converting a persistent entity to a transient one and back again. There are two ways of dealing with this round trip. The first is to attach the entity to a DataContext when you deserialize it, to let the DataContext know that it is really a persistent object. As previously discussed, to effect this you need to think about how you will manage the ‘prior’ state, for example by tracking the old state of the entity yourself. The second is to think of your deserialized object as a set of deltas to the current Db representation of the entity: load the entity from the Db within the current DataContext and change it to reflect the object you have deserialized.
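The two options can be sketched as follows. This is a sketch only: NorthwindDataContext, Order, and their properties are hypothetical stand-ins for generated classes, and a mapped Db connection is assumed, so it is not runnable as-is — but Table&lt;T&gt;.Attach and SubmitChanges are the real LINQ To SQL API.

```csharp
using System.Linq;

public class OrderService
{
    // Option 1: attach the deserialized entity to a fresh DataContext,
    // supplying its original state so change tracking can compute the UPDATE.
    // (Here we assume the caller round-tripped a copy of the old state.)
    public void SaveViaAttach(Order modified, Order original)
    {
        using (var context = new NorthwindDataContext())
        {
            context.Orders.Attach(modified, original); // Table<T>.Attach(current, original)
            context.SubmitChanges();                   // issues UPDATE for the differences
        }
    }

    // Option 2 ('replay'): treat the deserialized object as a set of deltas.
    // Reload the entity inside the current DataContext and copy the edited
    // values onto it, so the DataContext tracks the changes itself.
    public void SaveViaReplay(Order deserialized)
    {
        using (var context = new NorthwindDataContext())
        {
            var current = context.Orders.Single(o => o.Id == deserialized.Id);
            current.ShipAddress = deserialized.ShipAddress; // apply each edit
            context.SubmitChanges();
        }
    }
}
```

Option 1 trades an extra copy of the old state on the wire against the extra Db read that Option 2 performs on the way back in.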
This surprises some people who are used to working with DataSets, which formed a unit of work and were serialized with both the old and current state of the rows they contained, allowing the updates to be played through a DataAdapter when they were passed back to the middle tier. LINQ To SQL does not support this model ‘out-of-the-box’, which tends to shock people who expect the features of DataSets to somehow extend to it.
We’ll cover solutions to these issues next time, but as a teaser I would point out that the ‘replay’ model above is common in web applications. There we create a DataContext and load any entities required in response to a request for a page, compose an HTML response to that request from those entities, and then dispose of our DataContext. If the user is editing, their browser sends back a message with the changes they want applied to the entity. In response to that postback we load the entity from the Db again, using a fresh DataContext, and apply the edits the user has made.
Note how we are exchanging messages with the browser, in the form of HTML, instead of serializing objects. This basic pattern of message exchange forms the heart of how we should architect our n-tier applications when using LINQ To SQL.