Classically, an ORM persists an object model to a relational model. Typically
we use it just to retrieve or persist the state of the objects in our model,
not to provide integration between models. Obviously some translation between
the models occurs in the act of persistence, and the greater the distance
between the two models, the greater the effort we have to expend on the
persistence mapping.
This becomes an issue when we are tempted to use a shared database schema
as a mechanism for system integration. The approach is simple in concept: two
systems with differing models communicate by virtue of the fact that they
share a relational schema. In Enterprise Integration Patterns, Hohpe and Woolf
call this integration style Shared Database. The two systems don't exchange
information; they share it.
At first, Shared Database seems an attractive way to integrate systems within
your enterprise. You do not have the complexity multiplier of distributed
systems and the skill sets required to make them work. Corporate MI does not
have to map between models to report; it simply points at a reporting store
created from the shared repository (or, in poor implementations, at the
transactional store itself).
There are a number of problems that flow out of shared database approaches.
The applications have both data and behaviour. Interpreting what a field
such as premium means to source systems often requires an understanding of the behaviour
to eliminate ambiguities. Is a field called premium net or gross of tax, for
example? Looking at the source system may be required to determine the
properties of the field. The source system will also contain rules about the
validity of records, information that we often can’t capture with constraints.
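To make the ambiguity concrete, here is a minimal sketch (the table, column
names, and 10% tax rate are all invented for illustration): two consumers read
the same premium column, one assuming it is gross of tax and one assuming it
is net, and the same row yields two incompatible answers.

```python
import sqlite3

# A shared table that both systems read; the schema alone cannot say
# whether 'premium' is net or gross of tax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy (id INTEGER PRIMARY KEY, premium REAL)")
conn.execute("INSERT INTO policy VALUES (1, 110.0)")

TAX_RATE = 0.10  # hypothetical tax rate

def billing_total(db):
    # Billing assumes 'premium' is already gross of tax.
    (premium,) = db.execute("SELECT premium FROM policy WHERE id = 1").fetchone()
    return premium

def reporting_total(db):
    # Reporting assumes 'premium' is net, and adds tax on top.
    (premium,) = db.execute("SELECT premium FROM policy WHERE id = 1").fetchone()
    return premium * (1 + TAX_RATE)

print(billing_total(conn))    # 110.0
print(reporting_total(conn))  # ~121 -- same row, two incompatible answers
```

Nothing in the schema can arbitrate between the two readings; only the
behaviour of the source system tells you which one is right.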
We can try to work around this by ensuring that we document the usage along
with the schema, but documentation tends to fall behind the system it
describes, which leads to an insidious growth in inaccuracy. If we are to
create a schema that multiple applications can use to store their data, we
need to remove any ambiguity. As if that were not hard enough, we also need a
schema that is a superset of all their data, and this often makes the
representation complex. Those complex representations are not just hard to
comprehend; they often perform poorly.
In addition the database schema now has multiple owners, so we need to
negotiate how changes to the model impact consumers. We might have an adequate
test suite to allow us to use continuous integration, but often the systems
involved in a shared database include legacy applications that lack the tests
needed to understand the impact of changes across all consumers of the model.
That makes change very expensive, and so tends to lead to software atrophy as
we are no longer able to keep our software in synchronization with business
needs.
These complex models prove difficult for developers and domain experts
alike. Domain experts may find that the shared schema does not represent their
model of the problem domain. In Domain Driven Design (DDD) terms we have two bounded
contexts, the model for the application and the model for the shared database.
The developer is then confronted with some unpalatable choices. Does he
conform to the model in the Shared Database, hydrating an object model that is
essentially an OO representation of the ER model? This will prevent him from
using the principle of ubiquitous language to share the same model as the
domain expert within the implementation of the software. This is not just an
academic issue; it becomes a significant overhead to translate the
requirements from the domain expert into their expression in the model, and
the translation is often lossy and error prone. Even something as simple as
responding to a live issue becomes
more complicated when the model bears little relation to the problem
experienced by the end user. “My broker’s address details are not being
refreshed correctly” becomes much more complicated if you do not have an
entity called broker in your system but a Party with a PartyRole of Broker.
Someone needs to understand the translation to solve the issue.
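The translation that someone has to perform might look like the following
sketch (the schema and data are invented for illustration): answering a
question about a "broker" means knowing that Broker exists only as a PartyRole
value, and knowing the join that recovers it.

```python
import sqlite3

# Hypothetical shared schema: there is no 'broker' table,
# only Party and PartyRole.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Party (party_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE PartyRole (party_id INTEGER, role TEXT);
INSERT INTO Party VALUES (1, 'Acme Brokers', '1 High St');
INSERT INTO Party VALUES (2, 'Jane Doe', '2 Low Rd');
INSERT INTO PartyRole VALUES (1, 'Broker');
INSERT INTO PartyRole VALUES (2, 'Insured');
""")

def broker_address(db, name):
    # To answer "what is this broker's address?" the support engineer
    # must know to join Party to PartyRole and filter on the role name.
    row = db.execute(
        """SELECT p.address FROM Party p
           JOIN PartyRole r ON r.party_id = p.party_id
           WHERE r.role = 'Broker' AND p.name = ?""",
        (name,),
    ).fetchone()
    return row[0] if row else None

print(broker_address(db, "Acme Brokers"))  # 1 High St
```

The end user's simple question has become a schema-archaeology exercise.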
So it can seem tempting to try to use our ORM tool to elide this problem,
by mapping from the ER model into the OO model. After all, an ORM tool allows
model-to-model mapping, so why not leverage it? Some ORM tools, such as Linq
to SQL, are not really sophisticated enough to handle this kind of
translation. Others are, though, and this kind of sophistication can be
tempting. Indeed, one of the supposed benefits of the Entity Framework is its
ability to put some kind of anti-corruption layer between the relational
schema and the OO model. The problem is that this leads to a growth in
complexity that contradicts our supposed reason for using a shared database in
the first place: low cost.
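Sketched in code (all names invented for illustration), an anti-corruption
layer is hand-written translation between the two models, and every entity
that crosses the boundary needs a piece of it; this is exactly the extra
complexity that undercuts the "cheap" promise of the shared schema:

```python
from dataclasses import dataclass

# Rows as the shared ER model exposes them (hypothetical shapes).
party_rows = [
    {"party_id": 1, "name": "Acme Brokers", "address": "1 High St"},
]
party_role_rows = [{"party_id": 1, "role": "Broker"}]

# The domain model the team actually wants to speak in.
@dataclass
class Broker:
    name: str
    address: str

def load_brokers(parties, roles):
    # The anti-corruption layer: translate Party + PartyRole rows
    # into Broker objects, keeping the ER model out of the domain.
    broker_ids = {r["party_id"] for r in roles if r["role"] == "Broker"}
    return [Broker(p["name"], p["address"])
            for p in parties if p["party_id"] in broker_ids]

brokers = load_brokers(party_rows, party_role_rows)
print(brokers[0].name)  # Acme Brokers
```

Each such mapping must be written, tested, and renegotiated whenever the
shared schema changes.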
Eric Evans points out in Domain Driven Design that:
“Technically, the relational table design does not have to reflect the
domain model. Mapping tools are sophisticated enough to bridge significant
differences. The trouble is, multiple overlapping models are just too
complicated. Many of the same arguments presented for MODEL-DRIVEN DESIGN—avoiding separate analysis and
design models—apply to this mismatch. This does entail some sacrifice in the
richness of the object model, and sometimes compromises have to be made in the database design (such as selective
denormalization), but to do otherwise is to risk losing the tight coupling of
model and implementation.” – Eric Evans, Domain Driven Design
The danger with better tooling is that it can make approaches that we once
found too difficult to swallow seem more digestible. However, just because the
cost to implement drops does not mean that the cost of ownership drops along
with it; in fact, it often increases.
Try to hold to the idea that it is 'turtles all the way down' and keep the
design of your relational model the same as your object model. In addition,
avoid the use of a shared database as an integration strategy within the
enterprise and prefer messaging. What you gain in implementation cost you will
lose many times over in ownership cost.
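As a contrast, here is a minimal sketch of the messaging alternative (the
channel and event shape are invented for illustration): the owning system
publishes an event in its own ubiquitous language, and each consumer
translates it into its own model, so no schema is ever shared.

```python
import json
import queue

# An in-memory stand-in for a real message channel.
channel = queue.Queue()

def publish_address_changed(broker_name, new_address):
    # The owning system publishes an event in its own terms.
    channel.put(json.dumps({
        "event": "BrokerAddressChanged",
        "broker": broker_name,
        "address": new_address,
    }))

def consume(local_store):
    # A consumer maps the message into *its* model; the two systems
    # are coupled only through the message contract.
    event = json.loads(channel.get())
    if event["event"] == "BrokerAddressChanged":
        local_store[event["broker"]] = event["address"]

crm = {}
publish_address_changed("Acme Brokers", "3 New Square")
consume(crm)
print(crm["Acme Brokers"])  # 3 New Square
```

The message contract still has to be versioned, but each side keeps its own
bounded context intact.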