This discussion came up on the alt.net list last night. I had this post about halfway done, so I agreed to push it to the top of my stack and finish it today, given the timeliness of the information. I apologize that I lied about what the next post would be. The follow-up to the last DDDD post will come next.
Consider the following code:
Repository<Customer> repository = new Repository<Customer>();
foreach (Customer c in repository.FetchAllMatching(CustomerAgeQuery.ForAge(19))) { }
The intent of this code is to enumerate all of the customers in my repository that match the criterion of being 19 years old. The code expresses its intent in a way that is readable to people with varying levels of experience with the codebase. It is also highly factored, allowing for aggressive reuse.
Largely because of this aggressive reuse, code like the above is commonly seen in domains. Developers are trained that reuse is good and therefore tend towards designs where reuse is applied. The reuse here is two-fold. The first part is the definition of an IRepository<T> interface, something like:
interface IRepository<T> {
    IList<T> FindAllBy(IQuery<T> query);
    void Add(T item);
    void Delete(T item);
    ...
}
The second part is that people using Object-Relational Mappers such as Hibernate will tend to write a single generic implementation of this interface, for example Repository<T>, since the ORM does most of the heavy lifting for them.
Show me the polymorphism!
A main reason to favor a generic contract that is possibly specialized, as in the example of IRepository<T>, is that one could write code that operates on IRepository<T> directly, perhaps to build something like a “generic object editor”. That is, it uses the various repositories in a polymorphic fashion.
Quite simply, where and, more importantly, why would someone want to do this? Finding a place or a reason to use this polymorphism is extremely difficult under the guise of domain-driven design. Perhaps in some sort of admin interface, but even this would fail the forms-over-data complexity test and is likely better done with another approach such as Active Record.
As if the utter lack of necessity for a shared interface were not enough, introducing such an interface actually causes further issues.
C(?)R(?)U(?)D(?)
Some objects have different requirements than others. A Customer object may not be deleted, a PurchaseOrder cannot be updated, and a ShoppingCart object can only be created. When one is using the generic IRepository<T> interface, this obviously causes problems in implementation.
Those implementing the anti-pattern will often implement the full interface and then throw exceptions from the methods they don’t support. Aside from violating numerous OO principles, this breaks their hope of using the IRepository<T> abstraction effectively, unless they also add methods to it that report whether a given operation is supported, and then implement those as well.
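To make the problem concrete, here is a minimal sketch of what that tends to look like. The interface is trimmed to two members, and the in-memory list, the Count property, and the GenericEditor helper are all assumptions made for illustration, not code from any real project.

```csharp
using System;
using System.Collections.Generic;

public class Customer { }

// The generic contract from above, trimmed to two members for brevity.
public interface IRepository<T> {
    void Add(T item);
    void Delete(T item);
}

// Hypothetical implementation: customers may never be deleted,
// so the only option left is to throw from Delete.
public class CustomerRepository : IRepository<Customer> {
    private readonly List<Customer> items = new List<Customer>();

    public int Count { get { return items.Count; } }

    public void Add(Customer item) { items.Add(item); }

    public void Delete(Customer item) {
        throw new NotSupportedException("Customers cannot be deleted.");
    }
}

public static class GenericEditor {
    // Code written against the abstraction cannot know that Delete is
    // unsupported until the call fails at run time.
    public static bool TryDelete<T>(IRepository<T> repository, T item) {
        try {
            repository.Delete(item);
            return true;
        } catch (NotSupportedException) {
            return false;
        }
    }
}
```

The contract promises Delete for every T, yet the only way a caller finds out otherwise is by catching the exception, which is exactly the polymorphic use the interface was supposed to enable.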
A common workaround is to move to more granular interfaces such as ICanDelete<T>, ICanUpdate<T>, ICanCreate<T>, and so on. While this works around many of the OO problems that have sprung up, it also greatly reduces the amount of code reuse, as most of the time one will no longer be able to use the concrete Repository<T> class.
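A rough sketch of the granular approach might look like the following. Only the interface names come from the text above; the member names (Create, Update, Delete), the in-memory lists, and the assumption that a PurchaseOrder may be created and deleted are mine.

```csharp
using System;
using System.Collections.Generic;

public class ShoppingCart { }
public class PurchaseOrder { }

// Granular capability interfaces; the member names are assumed.
public interface ICanCreate<T> { void Create(T item); }
public interface ICanUpdate<T> { void Update(T item); }
public interface ICanDelete<T> { void Delete(T item); }

// A shopping cart can only be created, so only that capability is exposed.
public class ShoppingCartRepository : ICanCreate<ShoppingCart> {
    private readonly List<ShoppingCart> carts = new List<ShoppingCart>();

    public int Count { get { return carts.Count; } }

    public void Create(ShoppingCart item) { carts.Add(item); }
}

// A purchase order cannot be updated; creating and deleting it is assumed
// to be allowed for the sake of the example.
public class PurchaseOrderRepository : ICanCreate<PurchaseOrder>, ICanDelete<PurchaseOrder> {
    private readonly List<PurchaseOrder> orders = new List<PurchaseOrder>();

    public int Count { get { return orders.Count; } }

    public void Create(PurchaseOrder item) { orders.Add(item); }

    public void Delete(PurchaseOrder item) { orders.Remove(item); }
}
```

No method has to throw any more, but each repository now picks its own set of interfaces, so a single Repository<T> that implements everything no longer fits.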
Revisiting the intent.
What exactly was the intent of the repository pattern in the first place? Looking back to [DDD, Evans], one will see that it is to represent a series of objects as if they were a collection in memory, so that the domain can be freed of persistence concerns. In other words, the goal is to put collection semantics on the objects in persistence.
The key here is that, as a rule, there should be no persistence logic within the domain. This allows the domain to be tested more easily, tested independently of persistence, and moved easily from one persistence mechanism to another (this matters more as a long-term maintainability goal than as “we want to use XML files now”).
Simply put, the contract of the Repository represents the contract of the domain to the persistence mechanism for the aggregate root that the repository supports. The realization that the Repository is less of an object and more of a contract to infrastructure is both subtle and important.
The importance of the “Contract”
The Repository represents the domain’s contract with a data store (another common word for it is the seam). This is extremely important, as one can tell every possible way that the domain interacts with the data store by looking at the contract. When it comes time to optimize the database for performance, for example, one can look at the repositories and figure out what the domain requires of the data store.
This unfortunately is only useful if the contract is narrow and specific. Consider the conceptual difference between the following.
Repository<T>.FindAllMatching(QueryObject o);
CustomerRepository.FindCustomerByFirstName(string);
In terms of figuring out what the contract to the data store actually is, the first example gives us no information. It could literally be running any query that can be expressed within the QueryObject (read: any possible query). To figure out what the contract actually entails, one would have to go look through the domain (and possibly the UI (ugh), depending on where the query objects originate).
Simply put: allowing query objects to be passed into the repository widens the contract to a point of uselessness.
But reuse is good?
None of us like writing the same code over and over. However, a repository contract, as an architectural seam, is the wrong place to widen the contract to make it more generic.
You will note that nothing so far has precluded the use of Repository<T>, only really of IRepository<T>. So the answer here is to still use a generic repository, but to use composition instead of inheritance and not expose it to the domain as a contract. Consider:
public interface ICustomerRepository {
    IEnumerable<Customer> GetCustomersWithFirstNameOf(string _Name);
}
In the customer repository composition would be used.
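public class CustomerRepository : ICustomerRepository {
    // The generic repository is an implementation detail, never exposed to the domain.
    private readonly Repository<Customer> internalGenericRepository = new Repository<Customer>();

    public IEnumerable<Customer> GetCustomersWithFirstNameOf(string _Name) {
        // Could be HQL or whatever.
        return internalGenericRepository.FetchByQueryObject(new CustomerFirstNameOfQuery(_Name));
    }
}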
The key here is that the seam exposed to the domain is a very specific architectural seam, as opposed to an open, generic seam that allows us to do anything. Composition is used instead of inheritance to gain reuse while minimizing the width of the repository contract within the domain.
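For anyone who wants to try the idea end to end, here is a self-contained sketch of the composition approach. It is only an illustration: the IQuery<T> member IsSatisfiedBy, the in-memory Repository<T>, the constructor injection, and the FirstName property are all assumptions, since the real generic repository would normally sit on top of an ORM.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer {
    public string FirstName { get; }
    public Customer(string firstName) { FirstName = firstName; }
}

// Hypothetical query abstraction; the member name is assumed.
public interface IQuery<T> { bool IsSatisfiedBy(T candidate); }

public class CustomerFirstNameOfQuery : IQuery<Customer> {
    private readonly string firstName;
    public CustomerFirstNameOfQuery(string firstName) { this.firstName = firstName; }
    public bool IsSatisfiedBy(Customer candidate) { return candidate.FirstName == firstName; }
}

// In-memory stand-in for the generic repository an ORM would normally back.
public class Repository<T> {
    private readonly List<T> items = new List<T>();
    public void Add(T item) { items.Add(item); }
    public IEnumerable<T> FetchByQueryObject(IQuery<T> query) {
        return items.Where(item => query.IsSatisfiedBy(item)).ToList();
    }
}

// The narrow contract the domain sees.
public interface ICustomerRepository {
    IEnumerable<Customer> GetCustomersWithFirstNameOf(string name);
}

// Composition: the generic repository is wrapped, not inherited or exposed.
public class CustomerRepository : ICustomerRepository {
    private readonly Repository<Customer> inner;
    public CustomerRepository(Repository<Customer> inner) { this.inner = inner; }

    public IEnumerable<Customer> GetCustomersWithFirstNameOf(string name) {
        return inner.FetchByQueryObject(new CustomerFirstNameOfQuery(name));
    }
}
```

Domain code that holds an ICustomerRepository can ask only for customers by first name. The query object still exists and is still reused, but it never crosses the seam.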
Hint: Minimize the complexity at the entry/exit seams of your domain by making the seams as explicit as possible.
Posted Fri, Jan 16 2009 12:16 PM by Greg