It’s turtles all the way down

Classically an ORM persists an object model to a relational model. Typically
we just use it to retrieve or persist the state of the objects in our model, not
to provide integration through models. Though obviously there is some translation
between the models in the act of persistence, the greater the distance between the
two models, the greater the effort we have to expend on the persistence mapping.

This becomes an issue when we are tempted to use a shared database schema
as a mechanism for system integration. The approach is simple in concept: two
systems with differing models communicate by virtue of the fact that they
share a relational schema. In Hohpe’s Enterprise Integration Patterns, Martin
Fowler calls this integration style Shared Database. The two systems don’t
exchange information, they share information.

Shared Database seems an attractive way to integrate systems within your
enterprise at first. You do not have the complexity multiplier of distributed
systems and the skill sets required to make them work. Corporate MI does not
have to map between models to report, it just points to a reporting store
created from the shared repository (or in poor implementations to the
transactional store itself).

There are a number of problems that flow out of shared database approaches.

The applications have both data and behaviour. Interpreting what a field
such as premium means to source systems often requires an understanding of the behaviour
to eliminate ambiguities. Is a field called premium net or gross of tax, for
example? Looking at the source system may be required to determine the
properties of the field. The source system will also contain rules about the
validity of records, information that we often can’t capture with constraints.
We can try to work around this by ensuring that we document the usage along
with the schema, but documentation tends to fall behind the system, which leads to
an insidious growth in inaccuracy. If we are to create a schema that
multiple applications can use to store their data, we need to remove any ambiguity.
If that were not hard enough, we need a schema that is a superset
of all their data, and this often makes the representation complex. Those
complex representations are not just hard to comprehend, they often perform
badly.
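
A minimal sketch of the premium ambiguity, in Python for brevity. Everything here is hypothetical: the PolicyRow shape, the two reading systems, and the 12% tax rate are assumptions made up for illustration. The point is only that the same stored value yields different answers depending on behaviour that lives in the applications, not in the schema.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical tax rate, chosen only for illustration.
TAX_RATE = Decimal("0.12")


@dataclass(frozen=True)
class PolicyRow:
    """A row in the shared schema. The column name says nothing about tax."""
    policy_id: int
    premium: Decimal


def gross_premium_per_underwriting(row: PolicyRow) -> Decimal:
    # Assumption: this system wrote premium net of tax, so it adds tax on read.
    return row.premium * (1 + TAX_RATE)


def gross_premium_per_billing(row: PolicyRow) -> Decimal:
    # Assumption: this system believes premium is already gross of tax.
    return row.premium


row = PolicyRow(policy_id=1, premium=Decimal("100.00"))
print(gross_premium_per_underwriting(row))
print(gross_premium_per_billing(row))
```

Both functions read the same column and both are plausible. No constraint on the table can tell us which one is right.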

In addition the database schema now has multiple owners, so we need to
negotiate how changes to the model impact consumers. We might have an adequate
test suite to allow us to use continuous integration, but often the systems
involved in shared database include legacy applications that have no such
suite, leaving us unable to understand the impact of changes across all consumers of the model.
That makes change very expensive and so tends to lead to software atrophy as we
are no longer able to keep our software in synchronization with business needs.

These complex models prove difficult for developers and domain experts
alike. Domain experts may find that they do not represent the model of the
problem domain. In Domain Driven Design (DDD) terms we have two bounded
contexts, the model for the application and the model for the shared database.
The developer is then confronted with some unpalatable choices. Does he
conform to the model in the Shared Database, hydrating an object model that is
essentially an OO representation of the ER model? This will prevent him from
using the principle of ubiquitous language to share the same model as the
domain expert within the implementation of the software. This is not just an
academic issue, it becomes a significant overhead to translate the requirements
from the domain expert to their expression in the model, and is often lossy and
error prone. Even something as simple as responding to a live issue becomes
more complicated when the model bears little relation to the problem
experienced by the end user. “My broker’s address details are not being
refreshed correctly” becomes much more complicated if you do not have an
entity called broker in your system but a Party with a PartyRole of Broker.
Someone needs to understand the translation to solve the issue.
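
The translation in the broker example can be sketched as follows, again in Python for brevity. The PartyRow and Broker shapes, their field names, and the to_broker function are all hypothetical, invented here to show the kind of mapping somebody has to carry in their head (or in code) when the shared schema speaks of Party and PartyRole while the domain expert speaks of brokers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartyRow:
    """Hypothetical shape of a row in the shared ER model."""
    party_id: int
    name: str
    party_role: str
    address: str


@dataclass(frozen=True)
class Broker:
    """The entity the domain expert actually talks about."""
    broker_id: int
    name: str
    address: str


def to_broker(row: PartyRow) -> Optional[Broker]:
    # Only a Party playing the Broker role is a broker in the domain language.
    if row.party_role != "Broker":
        return None
    return Broker(broker_id=row.party_id, name=row.name, address=row.address)


row = PartyRow(party_id=7, name="Acme Brokers", party_role="Broker",
               address="1 Example Street")
print(to_broker(row))
```

Even in this toy form the translation is one more thing to maintain, and every consumer of the shared schema needs its own copy of it.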

So it can seem tempting to try and use our ORM tool to elide this problem,
by mapping from the ER into the OO model. After all an ORM tool allows
model-to-model mapping, so why not leverage it? Some ORM tools such as LINQ to
SQL are not really sophisticated enough to handle this kind of translation.
Others are, though, and this kind of sophistication can be tempting. Indeed, it
seems to me that one of the supposed benefits of the EF is its ability to put
some kind of anti-corruption layer between the relational schema and the OO
model.

The problem is that this leads to a growth in complexity that seems to contradict
the supposed benefit of using shared database in the first place: low cost.

Eric Evans points out in Domain Driven Design that:

“Technically, the relational table design does not have to reflect the
domain model. Mapping tools are sophisticated enough to bridge significant
differences. The trouble is, multiple overlapping models are just too
complicated. Many of the same arguments presented for MODEL-DRIVEN DESIGN—avoiding separate analysis and
design models—apply to this mismatch. This does entail some sacrifice in the
richness of the object model, and sometimes compromises have to be made in the database design (such as selective
denormalization), but to do otherwise is to risk losing the tight coupling of
model and implementation.” – Eric Evans, Domain Driven Design

The danger with better tooling is that it can sometimes make approaches that
we once found too difficult to swallow seem more digestible. However, just
because the cost to implement drops does not mean that the cost of ownership drops
along with it; in fact it often increases.

Try to keep in mind that it is ‘turtles all the way down’ and keep the
design of your relational model the same as your object model. In addition, avoid
the use of shared database as an integration strategy within the Enterprise and
prefer messaging. What you gain in implementation cost you will lose many times
over in ownership cost.
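
As a rough illustration of preferring messaging, here is a minimal Python sketch. It is not a real messaging stack: an in-process queue stands in for the channel, and the BrokerAddressChanged message, its fields, and the ReportingStore consumer are hypothetical names made up for the example. What it tries to show is that each system keeps its own model and its own store, and only the message contract is shared.

```python
import json
import queue

# An in-process queue standing in for a real message channel (assumption).
channel = queue.Queue()


def publish_broker_address_changed(broker_id: int, address: str) -> None:
    # The sending system publishes a message in terms of its own domain language.
    message = {
        "type": "BrokerAddressChanged",
        "broker_id": broker_id,
        "address": address,
    }
    channel.put(json.dumps(message))


class ReportingStore:
    """Hypothetical consumer that keeps its own model, private to itself."""

    def __init__(self) -> None:
        self.addresses = {}

    def handle(self, raw: str) -> None:
        message = json.loads(raw)
        if message["type"] == "BrokerAddressChanged":
            self.addresses[message["broker_id"]] = message["address"]


publish_broker_address_changed(42, "1 Example Street")

store = ReportingStore()
while not channel.empty():
    store.handle(channel.get())

print(store.addresses)
```

Either side can now change its relational model without negotiating with the other, as long as the message contract holds.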

About Ian Cooper

Ian Cooper has over 18 years of experience delivering Microsoft platform solutions in government, healthcare, and finance. During that time he has worked for the DTi, Reuters, Sungard, Misys and Beazley delivering everything from bespoke enterprise solutions to 'shrink-wrapped' products to thousands of customers. Ian is a passionate exponent of the benefits of OO and Agile. He is test-infected and contagious. When he is not writing C# code he is also the founder of the London .NET user group. http://www.dnug.org.uk
This entry was posted in Architecture, DDD, Object-Orientation, ORM.
  • Daniel Fernandes

    That is all very wise Ian.

    I would add, though, that while messaging is indeed a way of integrating two systems, it is not the only one.
    I think the important thing to do is to realize first that there is an integration requirement. So for instance, a large system can grow so big that it needs breaking down into smaller systems. Therefore each of these vertical slices would need integrating with others, in the same way the payment system would need to integrate with the ordering system.

    Therefore in my opinion Messaging is merely a simple mechanism to implement asynchronous integration and as a matter of fact Messaging might be the best solution but there are others as well.
    For instance, one can generate Fact based data in some system that has been modeled to be directly consumed for integration purposes.

    The main thing to remember, as you clearly pointed out, is that if one uses an ORM then the database objects it relies upon should clearly be marked as part of the application database and should in no way be directly used for integration purposes.

    Daniel

  • http://codebetter.com/members/Ian-Cooper/default.aspx Ian Cooper

    @Frans

    That section is written by Martin if you check the credits. Agreed it is confusing, but just trying to be accurate.

  • http://weblogs.asp.net/fbouma Frans Bouma

    “Hohpe’s Enterprise Integration Patterns, Martin Fowler calls this integration style Shared Database”
    Didn’t Gregor write that book instead of Martin Fowler? ;)

    About the article: O/R mapping systems are used to make sure that two projections of the same abstract model (namely 1) code and 2) relational schema) are representing the same thing. As long as that’s possible, it will work: an entity instance (== data) transferred from one projection to another keeps its meaning. The moment it gets a different meaning, one should indeed move to a projection (be it its own code model or its own relational model) which fits the other projection better so the meaning of what an entity instance ‘means’ is the same in both projections.