Impedance Mismatch Reframing

 

This is a reply to Stephen Forte’s post Impedance Mismatch from a ways back. I would have posted about it sooner, but sadly I only saw it today when a co-worker, Stefan Moser, linked it over to me. I know that this debate has become quite heated throughout the community, so I will refrain from personal attacks (such as those unfortunately experienced by Julia Lerman) and focus solely on the technical merits of the post.

 

My first problem with ORMs in general is that they force you into a “objects first” box. Design your application and then click a button and magically all the data modeling and data access code will work itself out. This is wrong because it makes you very application centric and a lot of times a database model is going to support far more than your application.

 

Well, I wouldn’t say that this is a problem with ORMs per se, but a problem with some tools. Those who are using Domain Driven Design are certainly not using this methodology. One of the main reasons I tell people to use DDD is that they can design their data storage mechanisms in parallel to their domain model, seeking an optimal solution to each. In other words, we should be embracing the impedance mismatch and doing what is best on both sides. The paragraph then continues with:

 

In addition an SOA environment will also conflict with ORM.

 

I do not agree with this in any way, shape or form, but am happy to leave it open to “the many definitions of SOA”. I think it can quite easily be done if you follow solid command query separation. Udi Dahan gives a nice discussion of this on his blog.

Later in the article (I am jumping around a bit to keep my own post coherent):

 

One of the biggest hassles I see with LINQ to SQL is the typical many-to-many problem. If I have a table of Ocean Liners, vessels,  and ports, I’ll typically have a relational linking table to connect the vessels and ports via a sailing. (Can you tell I am working with Ocean Freight at the moment?) The last thing I want at the object layer is three tables! (And then another table to look up the Ocean Liner that operates the vessel.) Unfortunately, this is what most tools give me. Actually I don’t even want one table, I want to hook object functionality to underlying stored procedures. I really want a port object with a vessel collection that also contains the ocean liner information.

 

The author discusses his experiences with Linq2Sql and then applies them to “what most tools give me”. This is an unfortunate fallacy or a lack of research on available tooling. Linq2Sql is not a real “mapper”, nor is what the author is referring to “mapping”; it is simply an Active Record implementation that is not using self-serving objects. This is what happens when mappers stay too close to the relational structure: they suck in terms of domain language and structure.

If, however, we were to use a real mapper (let’s say the one those notorious mafia guys are using), a quite different scenario would exist: a domain that sounds almost exactly like what is described as being wanted. This paragraph is also key in showing that the author has not researched Domain Driven Design. I would bet that Stephen and Eric could have some really interesting discussions at the Advisory Council, as Eric uses this exact problem domain as a naive starting point for examples in about half of his book.

A more serious problem, though, is the author’s relational bias, shown when domain objects are called “tables”. Why would anyone have a domain full of “tables”? These are behavioral objects. Until this misunderstanding of what a domain model is gets corrected, the rest of what a domain model is or does will never make any sense.
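As a rough sketch of what “behavioral objects” might look like for the ocean freight example, here is one possible shape in Python. Every name below (OceanLiner, Vessel, Sailing, Port, and the duplicate-sailing rule) is hypothetical and chosen for illustration; none of it comes from Stephen’s post or from Eric’s book. The linking table shows up as a Sailing concept, and the port exposes a vessel collection that carries the operator information:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OceanLiner:
    """The company that operates a vessel (hypothetical value object)."""
    name: str


@dataclass(frozen=True)
class Vessel:
    name: str
    operator: OceanLiner


@dataclass(frozen=True)
class Sailing:
    """A vessel calling at a port on a date: the 'linking table' as a domain concept."""
    vessel: Vessel
    arrives: date


class Port:
    def __init__(self, code: str):
        self.code = code
        self._sailings = []

    def schedule(self, vessel: Vessel, arrives: date) -> Sailing:
        """Behavior lives on the object: an assumed rule rejects duplicate calls."""
        sailing = Sailing(vessel, arrives)
        if sailing in self._sailings:
            raise ValueError(f"{vessel.name} already calls at {self.code} on {arrives}")
        self._sailings.append(sailing)
        return sailing

    def vessels(self) -> list:
        """The 'port object with a vessel collection' asked for in the quote."""
        return [sailing.vessel for sailing in self._sailings]
```

Nothing in this model knows how many tables, stored procedures or documents sit underneath it; that is left to whatever mapping is chosen.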

A further lack of understanding of Domain Driven Design is shown with the statement:

 

ORM is real good for CRUD and real bad at other things.

 

Again, I believe the author has confused ORM with Active Record for some reason. I would never under any circumstances recommend that someone use Domain Driven Design for a CRUD app, as there are easier ways (like using Active Record). DDD is hard and often painful; it is costly up front and should only be used in domains that can justify its up-front costs through maintainability.

 

Although it may be surprising, it is my belief that the author is actually a Domain Driven Design aficionado who has just not realized it yet.

 

I prefer to build the application’s object model and the data model at about the same time, with a “whiteboarding” approach that outlines the flows of data and functionality across the business process and problem set.

 

It is quite common in an “object first” perspective to do database and code modeling either in small iterations or in parallel, where a team of object experts focuses on the domain model and the best way to model the data in order to support transactional behaviors, while a team of database experts focuses on how best to store the data given their own set of requirements. These types of sessions would in fact be prescribed in an agile team, and the small “whiteboarding” sessions are absolutely prescribed by Domain Driven Design.

 

Maybe it is the MBA talking but I tend to be “business and customers first” when I design a system. (Those of you that know me know that I have designed some very large and scalable systems in my day.)

 

This is one of the core beliefs of Domain Driven Design. The primary example would be the creation of a Ubiquitous Language in order to ease communication between the “business and customers” and the team.

 

What I am saying (and have been saying for a long time) is that we should accept, no, embrace the impedance mismatch!  While others are saying we should eradicate it, I say embrace it.

 

Again we are back in agreement with Domain Driven Design. I like to look at Domain Driven Design as being an orthogonal architecture: my domain survives anything that is moved around it, as it is the core of my business and where the largest amount of my investment has gone…

 

 

We come now to where the author is unfortunately not in line with DDD but perhaps can be moved. The only way that one can reach an orthogonal architecture is to ensure the purity of the domain model. The OLTP RDBMS will eventually decline in popularity. What happens when I want to move to, say, “the cloud” and just store my aggregate roots as XML? This is a perfectly valid and extremely effective architecture. If I favor the RDBMS side of the impedance mismatch too heavily, then this change will not be orthogonal to my domain and will as such be extremely costly. The author may disagree with my reasoning, as he points out:

 

ORM tools should evolve to get closer to the database, not further away.

and

Developers who write object oriented and procedural code like C# and Java have trouble learning the set-based mathematics theory that govern the SQL language. Developers are just plain old lazy and don’t want to code SQL since it is too “hard.” That is why you see bad T-SQL: developers try to solve it their way, not in a set-based way.

and

So ORMs are trying to solve the issue of data access in a way that C# and VB developers can understand: objects, procedural, etc.  That is why they are doomed to fail. The further you abstract the developer from thinking in a set-based way and have them write in a procedural way and have the computer (ORM) convert it to a set-based way, the worse we will be off over time.

 

Well, I think I have already discussed the first of these points pretty well: by moving closer to the database we break our hopes of an orthogonal architecture. The second comment, albeit sounding like it came from a grand and mighty SQL wizard sent down by the gods to lift us heathens from our sinful ways, is actually a red herring, as is the third when framed properly.

I do know relational algebra (yes, I can tell you what an anti-join is) and I challenge anyone to show me its notation for an insert. While one could argue it can be involved with, say, a delete by PK/FK or an update by PK, it is for all intents and purposes useless in the process of writing to a properly normalized database; these operations tend to be procedural regardless. I will admit there are times where it can come in handy, but they are by far the minority. Relational algebra is focused on reading data and manipulating sets.

As many who have had long post-conference talks over beer with me know, I consider any query complex enough to require thinking about relational algebra to be a report. Reports are not expressed within my domain and may or may not be read from the same data source (I often use an eventually consistent reporting model specifically for the purpose of running such queries). I often take this to extremes: in an ideal world my repositories have a single read method, FetchAggregateByUniqueId. Anything that searches in a more complex way is deemed a report and sits outside of the domain (usually as a small mapper that returns DTOs that match screen shapes, not domain shapes, but provide the appropriate aggregate ids for writes to be possible). My “reports” all make very strong use of SQL and relational algebra; my domain has no need to know that they exist, as it is essentially a write-only model. I could go much more into this, but it is another post.
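To make the single-read-method idea a little more concrete, here is a minimal sketch in Python. It is illustrative only: the Cart aggregate, the XmlCartRepository class and the in-memory dict standing in for a blob or key/value store are all assumptions of mine, and fetch_aggregate_by_unique_id is simply a Python-style spelling of FetchAggregateByUniqueId. Each aggregate is stored whole, as one XML document keyed by its id:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Cart:
    """A hypothetical aggregate root; items maps product id -> quantity."""
    cart_id: str
    items: dict = field(default_factory=dict)

    def add(self, product_id: str, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.items[product_id] = self.items.get(product_id, 0) + quantity


class XmlCartRepository:
    """Stores each aggregate as one XML document keyed by its id.

    The dict stands in for any blob or key/value store; no SQL is needed here.
    """

    def __init__(self):
        self._blobs = {}

    def save(self, cart: Cart) -> None:
        root = ET.Element("cart", id=cart.cart_id)
        for product_id, quantity in cart.items.items():
            ET.SubElement(root, "item", product=product_id, quantity=str(quantity))
        self._blobs[cart.cart_id] = ET.tostring(root, encoding="unicode")

    def fetch_aggregate_by_unique_id(self, cart_id: str) -> Cart:
        """The single read method: load one whole aggregate by its id."""
        root = ET.fromstring(self._blobs[cart_id])
        items = {e.get("product"): int(e.get("quantity")) for e in root.findall("item")}
        return Cart(root.get("id"), items)
```

Because the only read is by id, swapping the dict for a table, a document store or files would not change the Cart class at all, which is the orthogonality I am after.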

Getting back to the article, the author does however end off with a great quote from Ted Neward:

 

“Developers [should] simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as “raw” JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems.”

 

This is great advice … just remember, if you do it, to hide it from your domain and to use it sparingly. You may not always have an RDBMS sitting behind you, and if you don’t, these set-based operations may be quite difficult to implement.
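One way to keep raw SQL hidden from the domain is to put it in a small read-side class that hands back screen-shaped DTOs. The sketch below assumes Python with the standard sqlite3 module and an invented two-table schema; CustomerOrderRow, SqlCustomerReport and build_demo_database are hypothetical names used only for illustration:

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerOrderRow:
    """A screen-shaped DTO; it carries the aggregate id so a later write is possible."""
    customer_id: str
    customer_name: str
    order_count: int


class SqlCustomerReport:
    """Read-side query object; the domain model never imports or calls it."""

    def __init__(self, connection: sqlite3.Connection):
        self._connection = connection

    def customers_with_at_least(self, minimum_orders: int) -> list:
        rows = self._connection.execute(
            """
            SELECT c.id, c.name, COUNT(o.id) AS order_count
            FROM customers AS c
            LEFT JOIN orders AS o ON o.customer_id = c.id
            GROUP BY c.id, c.name
            HAVING COUNT(o.id) >= ?
            ORDER BY c.name
            """,
            (minimum_orders,),
        ).fetchall()
        return [CustomerOrderRow(*row) for row in rows]


def build_demo_database() -> sqlite3.Connection:
    """Creates an in-memory database with made-up sample data."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(
        """
        CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id TEXT NOT NULL);
        INSERT INTO customers VALUES ('c-1', 'Ada'), ('c-2', 'Brook');
        INSERT INTO orders (customer_id) VALUES ('c-1'), ('c-1'), ('c-2');
        """
    )
    return connection
```

If the store behind this ever stops being relational, only this class has to be rewritten (which may well be painful, as noted above); the domain model is untouched.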


20 Responses to Impedance Mismatch Reframing

  1. Rob Tweed says:

    Eric your comments on Mumps made me smile, given our current blog which you can see at:

    http://www.outoftheslipstream.com/node/124

    Relational database may be here to stay but people are increasingly finding that it’s hierarchical and schemaless databases that are needed for many of the new breeds of application.

  2. Wow as well. Great post and very helpful.

    Also just watched your http://www.viddler.com/explore/bsimser/videos/2/ presentation at Alt.Net Canada – and your reiteration here of separating command and query behaviours (write only domain vs. advanced searching and reporting). This is definitely my “Ah” moment for the week – maybe even the month. Will be reading Eric’s book – and would love to see another post on this topic if you can find the time.

  3. Eric says:

    Greg,

    Thanks for the response. I’m sorry it was formatted so horribly. I thought the line breaks would transfer, but they didn’t.

    I have a few questions about your domain repository preferences:

    Assume a simple E-commerce store. We have a Customer object, that has a reference to a Cart object which has a collection of CartItem objects which each in turn have references to a Product object.

    Which is the aggregate root? In a web application, I would argue you have many different roots, depending on the page/view you are displaying (a customer view/edit page, an order submission page, viewing/modifying the cart, etc). Based on your current needs (the data you’re going to view/edit on the current page), you could slice the objects and their relations into any number of combinations. It just seems wasteful to me to load more data than you need for a single page.

    There are definitely times when it makes sense to have a domain model with rich “business logic”, but for the most part all we’re doing is a lot of CRUD actions. We’ve replaced our recordsets with strongly typed data objects, and have lost the flexibility that the recordset offered… being able to only grab the data we needed for the current page.

    Of course a lot of this is centered around web development (and older minicomputer/mainframe development), and using something like nHibernate with clearly defined aggregate roots and client state makes much more sense to me.

    Ugh, maybe I should just get out of web development… :)

  4. Greg says:

    Eric all of the points that you bring up are quite true … I addressed many of these however in my stating that there should be a separation between the OLTP database and the OLAP database (command vs query separation). Every issue that you bring up is a problem due to most people lacking this separation.

    As I stated in an ideal world my domain repositories contain 1 method, FetchAggregateByUniqueId …

    I have no problem with using an RDBMS and it helps deal with a lot of the issues that you discuss, I just use the RDBMS for what it’s great at, reporting through set manipulation. For a transactional system things like storing XML in the cloud, storing events, or using an OODBMS make far more sense.

  5. Eric says:

    > The OLTP RDBMS will eventually leave in popularity, what
    > happens when I want to move to say “the cloud” and just
    > store my aggregate roots as XML, this is a perfectly valid and
    > extremely effective architecture.

    They’ve been saying that for YEARS. I don’t see it happening. Object databases exist today. Hierarchical databases existed long before relational databases. In fact people still use them today. MUMPS/CACHE is one example. I’ve used it extensively in the past and it’s quite terrible for ad-hoc reporting and querying compared to a relational database. The problem with storing a chunk of data or XML by aggregate root is that you have to traverse each root and examine it in order to decide if it meets your filter criteria or not. There were a bunch of hacks in MUMPS to create indexed globals that would allow you to store a reference back to the aggregate root, but you had to manage them yourself, you didn’t get them for “free” like you did with a relational database. Sorry folks, but the relational database is here to stay…

  6. Jeremy Gray says:

    re: Stefan Moser’s points – Ding, ding, ding! I’m glad someone finally mentioned that. For all but the most trivial applications, I would argue that using a generational approach in _either_ of the available directions is a fast track to an unmaintainable project. (Schema versioning being what usually falls down first in the schema-generational option, regardless of what the vendor/creator would promise you (and always after you ship), with object model fidelity being what usually crashes down in the model-generational option, again regardless of what the vendor/creator would promise you.)

    Generation in either direction is for tiny projects and/or bootstrapping a project in its extremely early stages. Any time after that (or heck, even before it, as even using generational at the start can lead to problems later) it is time to go into parallel model mode (’cause that’s what they are) and use the ORM to bridge the gap (as that is exactly what it is for. Big surprise, huh? 😉)

  7. Stefan Moser says:

    “My first problem with ORMs in general is that they force you into a “objects first” box. Design your application and then click a button and magically all the data modeling and data access code will work itself out.”

    Funny how Stephen then whines that he doesn’t like a tool like Linq to SQL which does just the opposite, design your database and then click a button and magically all the object modeling and data access code will work itself out. That doesn’t seem to work either, does it.

    The whole point of an ORM is to embrace the impedance mismatch and have some flexibility between how your objects and database tables are modelled. We need to get away from the object first vs data first debate because both of those approaches will result in identical models with whichever model comes second being sub-optimal. As Greg says, design both models in parallel and use an ORM to bridge the impedance mismatch between them just as the tool was intended to do in the first place.

  8. I see this debate from both sides. For a long time I worked in a team where design was done in the database and then the modules that made up the various parts of the system were created. We did not use ORMs; we simply coded a data access class and used stored procedures. The schema design was clean and the end product worked very well and was developed very quickly. Recently I have started using monorail with Active Record to build a simple CRUD website and I am impressed at how fast I can code up new screens without writing any stored procs at all. I think both approaches can work. Moving away from this debate, lots of people have been predicting the demise of the RDBMS for some time, but with the exception of a few scenarios I have yet to encounter a serious alternative. Databases can be awkward to deal with but the things they are able to do are still impressive. My biggest bugbear with SQL is that it is impossible to elegantly reuse code. Why is it still not possible to abstract a piece of SQL into an object and reuse that in your stored procedures without compromising performance???

  9. karl says:

    Anyone who thinks ORM force you into object-first design obviously hasn’t used or doesn’t know how to use an ORM. Steve’s arguments seem based on cursory knowledge of the other side of the argument, so I’m not sure that there’s much value

    ORM is good for CRUD. Yes it is. And, like you yourself say it’s good for at least 60% of all other data access scenarios, so why WOULDN’T you use it? duh

  10. Colin Jack says:

    Formatting on my last comment didn’t make it live but hoping this one displays correctly…

    @Ian
    “Every time we take this approach we find that the class we surface represents an insight into the domain.”

    Definitely agree on that, and not a topic discussed enough in DDD (the book).

  11. Greg says:

    actually ian, can you drop a link to a complex mapping I can’t seem to find one. I don’t consider renaming fields or supporting xml as opposed to attributes (which I can find) to be moving away from AR.

  12. Greg says:

    Ian I stand corrected, I have just only seen it used in this way (and as described by Stephen). It is not my ORM of choice and as such I have been going on observations of how others use it.

    Thanks for the clarification.

  13. Ian Cooper says:

    Hi Greg,

    “Linq2Sql is … simply an Active Record implementation that is not using self-serving objects.”

    Not true. L2S does not support value types, but is not an implementation of Active Record. It is a Data Mapper that allows both attributes and xml mapping files and supports persistence ignorance approaches to design.

    We need to support L2S as it is much closer to what we want than EF and may represent a more accessible ORM than NHibernate for some development shops.

    Now L2S does not support many-to-many relations, but a better response to Stephen here would be to point out that it is far better to decompose a many to many relationship into two one-to-many relationships and surface the mapping as a domain concept. Every time we take this approach we find that the class we surface represents an insight into the domain.

    So I would fundamentally disagree with Stephen that support for many-to-many relationship mapping is critical.

  14. Colin Jack says:

    Great stuff, as always.

    “The OLTP RDBMS will eventually leave in popularity, what happens when I want to move to say “the cloud” and just store my aggregate roots as XML, this is a perfectly valid and extremely effecting architecture”

    I was thinking about this recently, seems attractive not least as the hierarchical nature of XML suits well.

    “I take this often to extremes, my repositories in an ideal world have a single read method, FetchAggregateByUniqueId. Anything that is searching in a more complex nature is deemed a report and sits outside of this (usually as a small mapper that returns DTOs that are bound for the screen).”

    Never thought of things that way, does make sense though.

    Having said that I do sometimes “bind” my domain model to the display, not directly but it is the domain model classes that are updated based on the user input (ducks behind cover).

  15. Wow, just wow.

    This is a great read for anyone who questions the value of DDD and ORM tools and the thought process behind them. You answer a lot of tough questions here and provide real reasons behind your decisions. Kudos.

    Sorry this just seems like a rah-rah, but what more can be added?

  16. Jeremy Gray says:

    Ugh. There were things like line breaks and paragraphs and such in that comment. There really were. Damn you, codebetter.com comment engine! Damn you! 😉

  17. Jeremy Gray says:

    Since Stephen’s blog is refusing my comment regardless of using FF or IE, I’m going to post it here. :)

    “Late to the game, I know, but I just spotted this post as it is getting some fresh link love from a few parties today.

    “a dialog was started only because of the professionalism of the EF team”

    Hardly. Many, many folk made repeated attempts to pursue dialog with the EF team regarding the EF’s many issues long, long ago, as far back as when each and every one of those issues first hit the light of day. Their attempts at dialog resulted in not much more than hand-waving (e.g. lazy loading) and a steadfast refusal to even slightly alter course. The only real response to come out of the EF group so far essentially boils down to “We hear you, but everything you raise is out of scope.” or, in other words, “We hear you, but aren’t _listening_.”

    One really has to wonder just what is behind all of this fact-twisting and whitewashing that is going on around the EF. When did brushing off the community become “starting a dialog”? When did concerned community members become a “mafia”?”

    That Stephen could not only make such statements but also have such confusion in terms of what is and isn’t ORM and what he is and isn’t looking for and what developers can and cannot handle, I really wonder why it is that he is considered one of the “advisors” to the EF group on these issues. Ugh.

  18. Thanks for this, Greg!

    I find it so depressing how .net community status quo folks leverage their visibility as credibility and boldly frame perspectives as authoritative commentary without having gained experience or done research.

    This – to me – is a serious social crisis.
