CodeBetter.Com

John Papa [MVP C#]

.NET Code Samples, Data Access, and Other Musings

Entity Framework Thoughts

I agree that there is quite a bit of work to do yet with the Entity Framework. Earlier today Scott posted his thoughts on the challenges that the Entity Framework faces. He makes some good points which really show how the approach to application development greatly influences which tools and development strategies developers will choose. I agree that applications have a better chance to succeed when they are "business rule" driven.

I prefer to build a data model and a business model together, not one before the other. In its current state the Entity Framework can be generated from an existing data model (or can be created by hand with XML). The data team at Microsoft knows this is a limitation and is working on solutions to make building and tweaking entities, mapping the conceptual model to a logical model, and overall design of the EDM much more fluid. I am very excited about seeing what they produce on this end.

Building off of the example that Scott worked, assume you have an entity that you are working with and realize that you need to add a new property (or field) to it to establish a business rule (or to write a test if you are using TDD). You want to add the field or property to the entity and keep moving along with your business rule (probably using ReSharper or the like to create the field/property combination). The idea is to keep your brain patterns flowing with the business rule and not to step outside of that to implement a new field/property and populate it. You want to create the additional property, use it in your code, and map it to the models quickly.

With the Entity Framework in its current state you could add a new partial class and add the field and property to it, but then you are deviating from the EDM with a field that you will eventually want to persist in the database anyway. Another option with the Entity Framework is to add the property to the conceptual model and map it to a database field (that you must create). This option makes sense but is the most difficult to implement at this time. I believe this is one area where the Entity Framework has room to grow and the data team can add more value to the development community: having a way to easily amend business entities, have the changes flow through the conceptual model, and map them through a logical model to the database.

There is a lot left to work out, and the more people who use the Entity Framework, participate in the forums, test the CTPs, and provide feedback where they can, the better the end result will be. I believe that the data teams at Microsoft are absolutely committed to this technology and I am personally very excited about the Entity Framework ... more excited than I have been about a technology in a long, long time.



Comments

Frans Bouma said:

What I wonder is: do your entities fall from the sky? Or do you think about them before adding them? I mean: if you think about them before you start out hammering code, you WILL understand they are persisted.

Now, the thing is that with a big enterprise application, you can't just simply step in and add a field somewhere. What if the app is already in production and this change requires major data migrations?

Thinking from code and a couple of classes, OK, it makes sense not to worry about the db that much as it won't have a big impact. However in a big application with lots of entities (the average hospital systems I've seen have at least 1000 entities, sometimes even 2000) you need to think about the model up front. You can't simply 'design' your entities in a code editor, you need overview, and analysis results.

Then adding a field follows:

- add to relational model

- agree that it is the best place to add it there

- migrate data if applicable

- update code to match the new relational model
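As a rough illustration of those four steps, here is a minimal sketch using Python's built-in `sqlite3` module. The `customer` table, the `credit_limit` field, and the default value of 1000 are all hypothetical and not taken from any real system; a production migration would look very different.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO customer (name) VALUES (?)", [("Ada",), ("Grace",)])

# Step 1: add the field to the relational model.
# (Step 2, agreeing that this is the best place for it, is a human decision.)
conn.execute("ALTER TABLE customer ADD COLUMN credit_limit INTEGER")

# Step 3: migrate data, here by giving existing rows an assumed default.
conn.execute("UPDATE customer SET credit_limit = 1000 WHERE credit_limit IS NULL")
conn.commit()


# Step 4: update the code to match the new relational model.
@dataclass
class Customer:
    id: int
    name: str
    credit_limit: int


def load_customers(connection):
    """Read every customer row into a Customer object, ordered by id."""
    rows = connection.execute(
        "SELECT id, name, credit_limit FROM customer ORDER BY id"
    )
    return [Customer(*row) for row in rows]


customers = load_customers(conn)
```

On a two-row in-memory table every step is trivial; the cost lives in steps 2 and 3 once the table is large and in production.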

No matter how hard people want to automate this, it won't be possible to automate this in a lot of cases.

So I'm not that convinced what you try to describe in this post is that important, as the current EDM approach is tackling this perfectly.

# March 22, 2007 9:40 AM

Jeremy D. Miller said:

Frans,

Can you answer a couple questions?  Why can't any upfront analysis be in terms of the actual business classes instead?  Why are you so insistent that it has to be the database first?  As long as I'm using an O/R Mapper that doesn't suck and isn't intrusive, I will be able to persist my objects.

Plus, in a larger system like a hospital system, wouldn't you really want to worry more about designing the business logic classes to support the desired behavior upfront instead of getting tied down in the storage details?  An application is often much more than just pushing around bags of data.

It goes the other way around as well Frans, I don't think you've made an adequate argument for your case.  I think the current EDM approach isn't ideal for systems with a lot of business logic.  It's always going to be faster to change an object model, get what you want, then change the relational model as soon as you're happy.  The db is always going to be more expensive in terms of change management.

# March 22, 2007 10:15 AM

John Papa said:

Frans,

You make some good points. But it makes me think that I must have miscommunicated .... I agree that with a big enterprise application, you should not simply step in and add a field somewhere for several reasons (However, sometimes it is needed when a client makes modifications after the fact and it is not a new design ... but I digress).

When I design an application I keep both the business model and the database in mind. I don't design a database and then build entities on top of it with business rules. And I don't go the other way and first design rules that drive my entities that drive my database. I take the approach that both are important and vitally linked to the success of the application.

I agree that the EDM approach is going to tackle this perfectly. I do not agree that it is there yet, though. For example, modifying the models in XML is just not conducive to efficiently building an application. I firmly believe that a visual modelling tool is needed to make the process quicker and easier (so we spend less time tweaking XML syntax and more time building).

On a similar note ... I am very interested in hearing your thoughts on the Entity Framework given your experience. If you have further thoughts as you explore the EF, please feel free to shoot me an email. I'd love to pick your brain too :)

# March 22, 2007 10:25 AM

Frans Bouma said:

I'm not talking about tables, I'm talking about entities, which are abstract definitions, i.e. at the NIAM/ORM level. I'm not thinking in classes either, but in abstract types, a class is just a technical construct.

What I'm trying to say is that you shouldn't use a table-> class 1:1 mapping as such, what I'm saying is that you should reverse engineer the model of the tables into an abstract entity model if you don't have that model, to the level of NIAM. This means that when I say 'customer' I'm not talking about a table and fields and rows, I'm talking about the entity 'customer' and physically I can implement that in a table (or 2 or 10 tables, who knows) and also as a class (or 2 or 10 classes, (inheritance)). That doesn't really matter as that's a technical implementation of the actual concept of the entity.

I think that's why there is a bit of a misunderstanding: no, reverse engineering your abstract entity model from a physical datamodel isn't bad, it gives you a model that is at the same level of abstraction as your own domain model classes.

Jeremy, that's why I think we miscommunicate: you apparently think I want to use a physical table and work with that. No, I don't want to work with a physical table either. The physical table is the result of a conversion from an abstract model, just as your domain classes are. These domain classes aren't falling out of the sky either: you do analysis to get these classes defined, what goes in which class, which is an aggregate, which is a value object, etc. That's not something to do with code, that's conceptual, on an abstract level. If I talk about 'customer entity' and it has attributes A, B and C, and you do the same, we talk about the same thing. FROM THERE, you create a domain class 'Customer' and I do too, though via a different route, as I don't want to type these all in as I already have the definitions.

The entity framework's main goal isn't to provide a persistence tier. The main goal is to provide a new physical database model no matter what the original model looked like. You then talk to that physical database model.

On top of that, they defined a .NET feature so you can consume that new model in .NET code. This is the usual classes to table mapping framework we've seen a lot of in the past years. So the entity framework is actually a 2 part system: you have the datamodel A mapped onto datamodel B part and you have the .NET classes mapped onto datamodel A part. You of course can skip the first, or don't have to use the latter. That's up to you.
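To make the two parts concrete, here is a loose sketch in Python with `sqlite3`. It is only an analogy and is not how the Entity Framework itself works: a SQL view stands in for the "datamodel A mapped onto datamodel B" part, and a plain dataclass stands in for the ".NET classes mapped onto datamodel A" part. All table, view, and class names are hypothetical.

```python
import sqlite3
from dataclasses import dataclass

conn = sqlite3.connect(":memory:")

# "Datamodel B": the original physical tables (hypothetical).
conn.executescript("""
CREATE TABLE person  (person_id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE address (person_id INTEGER, city TEXT);
INSERT INTO person  VALUES (1, 'Ada Lovelace');
INSERT INTO address VALUES (1, 'London');
""")

# Part 1: "datamodel A" mapped onto "datamodel B".
# A view plays the role of the mapping layer in this sketch.
conn.execute("""
CREATE VIEW customer_entity AS
SELECT p.person_id AS id, p.full_name AS name, a.city AS city
FROM person p JOIN address a ON a.person_id = p.person_id
""")


# Part 2: a class mapped onto datamodel A, never onto the raw tables.
@dataclass
class Customer:
    id: int
    name: str
    city: str


def load_customer(connection, customer_id):
    """Load one Customer through the view, or None if there is no match."""
    row = connection.execute(
        "SELECT id, name, city FROM customer_entity WHERE id = ?",
        (customer_id,),
    ).fetchone()
    return Customer(*row) if row else None


customer = load_customer(conn, 1)
```

Code that skips part 2 can query `customer_entity` directly; code that skips part 1 would map `Customer` straight onto `person` and `address`.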

Conceptually I do like the entity framework, though they are very slow in developing it. I'm sure they do their best to get everything ironed out but the speed in which they do it is too slow to be anywhere near impressive, so the features they present aren't that mindblowing as well. On top of that, the communicated theory is very deep and hard to grasp for newbies, so I have little hope for them it will appeal to a large group of people.

Coming back to my abstract model story: the reason I bring that up is that it's often much easier to change your abstract model and FROM THERE generate your E/R model for tables and classes for code than it is from the other side: one change in your abstract model might have the effect of changing a couple of tables. Doing it the other way around is thus not that clever: modifying tables is only allowed if you know the change made to them comes from a model change, and beware: the abstract entity model isn't a 1:1 copy of the resulting E/R model, not by far!

I hope I've explained it enough. It might seem we're on the opposite sides of the spectrum, but we're actually on exactly the same side.

# March 22, 2007 12:13 PM

Frans Bouma said:

"When I design an application I keep both the business model and the database in mind. I dont design a database and then build entities on top of it with business rules. And I dont go the other way and first design rules that drive my entities that drive my database. I take the approach that both are important and vitally linked to the success of the application."

Exactly. And here is where tooling is actually lacking. The thing is that you can start from the middle, as I described in my essay which is in Jimmy's latest book, e.g. via NIAM/ORM, though MS doesn't seem to provide that tooling inside VS.NET, only via Visio. This is unfortunate, considering the fact that the father of NIAM/ORM is an employee of MS, prof. T.A. Halpin.

So if you could have an abstract modeling tool which drives both classes and tables FROM a higher level model which allows you to define which relates to what and when, it would be the best choice.

About my thoughts on the EF: when I first saw the publications, I wasn't that happy. Though after a while I realized it was actually OK, and a good opportunity for us as well to branch out our work. Which we will do in the second half of this year (also towards NHibernate btw). What I find missing (but I'm glad they didn't add it ;)) is more support for entity management. Current mature o/r mapper frameworks aren't just persistence tiers, they manage much more for you. This is lacking in their work, but can of course be added by a 3rd party tool :)

# March 22, 2007 12:23 PM

ScottBellware said:

> So if you could have an abstract modeling tool which drives both classes

> and tables FROM a higher level model which allows you to define which

> relates to what and when, it would be the best choice.

... best choice for coarse-iteration development, but poor for micro-incremental styles.  Contemporary development styles are increasingly micro-incremental, and these styles preclude coarse-grained Big Gulp modeling workflows.

# March 22, 2007 6:48 PM

Raymond Lewallen said:

@John,

"When I design an application I keep both the business model and the database in mind. I dont design a database and then build entities on top of it with business rules. And I dont go the other way and first design rules that drive my entities that drive my database. I take the approach that both are important and vitally linked to the success of the application."

I think everybody keeps all layers in mind, but I prefer to design behavior and domain models first, then move to repository. I'm certainly not as savvy as both you and Frans in the realm of data repositories, so I was curious if you could elaborate on your approach and how you feel it produces better software. I've tried it in the past, and to me, as requirements and the design evolve in the application layer, it just causes more work for me to go back and modify the repository (this was also prior to me being as familiar with ORM tools as I am today). I prefer to have solidified the domain models of the current iteration prior to implementing the OR mappings and repository for those models.

@Frans,

Good explanations from you on this subject.  I feel like you and I relate more closely regarding these types of approaches towards development after reading your comments, but I'm still very much on the behavior first design approach.

"what I'm saying is that you should reverse engineer the model of the tables into an abstract entity model"

I'll have thousands of lines of code and tests long before I ever know where the data is going to reside. I just don't ever think about it until I feel my domain models and behaviors are solid. It certainly might not be right in all scenarios, but so far it's worked well for me and I've never run into serious obstacles, performance issues or extensibility problems.

I agree with Scott on his response to your statement.  Test first design and short, compact iterations just don't allow for high level modeling tools to fit in very nicely.  Again, that's just the flavor I choose to eat and it tastes good to me.  Evolving design without high level abstraction models.

# March 22, 2007 8:08 PM

John Papa said:

Ray,

Generally with a design I take the approach where I sit down with the business users and determine what all of their requirements are. We go through how their business works and jot down notes on data points, workflows, business rules, junctures, and so on. Once I go through all of this (which in itself is an iterative process) I like to work up a business model and a database model. I like to look at it from both angles because sometimes from the business model point of view I can see things that I would not have from just looking at the data points (for example, rules of course and data items that may not be obvious). I also like to hit it from the data model point of view so I make sure that I am not building in all data points even if the rules I hit seem to cover it all. I think the simplest way to put it is that I like to use an iterative model where I gather business needs, design entities, rules and data models and then throw it against the wall and see if it sticks. Then I repeat the cycle.

Sorry if this is not clear ... it is hard to explain what's in my head when I design a system ... I have often thought about writing a book on my philosophy on how to do it. :)

# March 22, 2007 9:35 PM

John Papa said:

Frans,

"So if you could have an abstract modeling tool which drives both classes and tables FROM a higher level model which allows you to define which relates to what and when, it would be the best choice. "

That indeed would be a very cool modelling tool.

# March 22, 2007 9:47 PM

Raymond Lewallen said:

John,

I do the same thing, but at what point in your iteration do you develop domain models and database models? Which one do you do first?

# March 22, 2007 10:29 PM

Frans Bouma said:

"... best choice for coarse-iteration development, but poor for micro-incremental styles.  Contemporary development styles are increasingly micro-incremental, and these styles preclude coarse-grained Big Gulp modeling workflows."

True in the sense that if you opt for pure TDD and 'design by the keyboard' approach, you will run into problems, but as I said and will repeat: entities don't fall out of the sky. Even if you use pure TDD, you WILL have to model the world your work represents and applies to in your software, be it via classes or other ways. Mind you: when you create a customer class, is that just popping up in your head like "Hey, let's do a customer class today!"? No it's a result of analysis of the problem domain, and your domain approach will dictate you to create a customer class, so the class itself represents something and has a reason to be there.

_THAT_ is what I'm trying to say: it doesn't matter how you define that model, but the reason the RESULTS of the model exist has to be the model and what the model represents, be it something in your head or on paper, that's not important (although I'd prefer something on paper or in a designer, as the software maintenance cycle (which takes up to 70% of every software project!) can be smoother after you've left for another project).

Also, it might be you want to close your eyes for what is inevitable: namely that your data will be persisted in a database, but that won't make the database go away. The database isn't something with infinite speed and no memory consumption, it's an expensive resource and your application will spend a truckload of time inside that database, whatever you do.

Take the example of a central base class with 2 fields and 50 entities all deriving from that base class and all classes (so all 51 of them) have a table mapped, and you use inheritance in your o/r mapper to get the hierarchies mapped out correctly.

Will this be a top performer? Not necessarily, as the base class table will be VERY big over time and you have to join it in every filter, every query you will run on the database.
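A scaled-down sketch of that mapping, using Python's `sqlite3` with two derived tables instead of 50, shows both effects. The table names and columns are hypothetical, and the script illustrates the shape of the problem without measuring performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per class: a base table with 2 fields, plus one table per
# derived entity that shares its primary key with the base table.
conn.executescript("""
CREATE TABLE entity_base (id INTEGER PRIMARY KEY, created_on TEXT);
CREATE TABLE customer (id INTEGER PRIMARY KEY REFERENCES entity_base(id), name TEXT);
CREATE TABLE invoice  (id INTEGER PRIMARY KEY REFERENCES entity_base(id), total INTEGER);
""")


def insert_entity(table, column, value):
    """Insert a derived row; it always needs a matching base-table row."""
    cur = conn.execute("INSERT INTO entity_base (created_on) VALUES ('2007-03-23')")
    conn.execute(
        f"INSERT INTO {table} (id, {column}) VALUES (?, ?)",
        (cur.lastrowid, value),
    )


for i in range(3):
    insert_entity("customer", "name", f"customer-{i}")
for i in range(5):
    insert_entity("invoice", "total", 100 * i)

# The base table holds one row for every entity of every derived type.
base_rows = conn.execute("SELECT COUNT(*) FROM entity_base").fetchone()[0]

# Reading a complete customer means joining the base table every time.
customers = conn.execute("""
SELECT c.id, c.name, b.created_on
FROM customer c JOIN entity_base b ON b.id = c.id
ORDER BY c.id
""").fetchall()
```

Here `entity_base` ends up with 8 rows for 3 customers and 5 invoices; with 50 derived types it grows with every insert in the system, and every query for any of those types joins against it.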

Perhaps you don't want to deal with these kinds of problems or you bluntly don't care about these problems as they're probably the concern of the team taking over when you leave for another project, but this IS a real life scenario which will happen if you're not carefully considering that the data might end up in an external persistent storage.

I'm not saying you should micro-optimize everything, that's not good, but you should take into account that your data will end up in a database, which tends to be slow at some actions and fast at others. Of course you can refactor afterwards, but that is not easy if your db is in production with 10 million rows or more.

Raymond: the model is often created by the architect or a team of DBAs. Large applications with big databases often have a team of DBAs who manage the db fortress and help design the relational model.

Classes first works OK in development, but after deployment, refactorings which have big implications for the relational model (class changes -> table changes) aren't trivial, as a table change has high impact on the running system, so a data migration has to take place. These things aren't simple in a lot of situations with big databases. Not knowing WHY the tables have to change that way is then a bit of a problem, as over time people come and go and why a table is formulated the way it is gets more and more unclear, making maintenance a pain.

# March 23, 2007 4:34 AM

John Papa said:

Ray ... I generally create the entities on paper (not the classes) and the data model on another. I don't necessarily start on either first. Doing them at the same time helps me see the disparity between the 2.

If I had to choose, I would start with the entities first (again, abstract, not classes nor code). This is because I like the business to drive the models. However, the one pitfall I have seen people fall into when going this route is that you have to be careful not to design the model just to work for a single application. Many shops I work with have multiple purposes for their models. This is just something I keep in mind during the design.

# March 23, 2007 9:09 AM

Raymond Lewallen said:

"you have to be careful not to design the model just to work for a single applicaiton."

One reason why I build the application layer through behavior first, so that the models reflect what the business requirements are. If I understand you correctly, then I've experienced the opposite, which is developers creating domain models that are too coupled to other layers, especially the data layer. I consult to a company now where we are creating models and workflows that work for many business processes, and with proper design it's all easy to handle. The one thing we don't want to do is think too quickly about repository and persistence and make poor design decisions that couple us to a specific solution. In my experience models and behaviors have been designed to work well in different application scenarios. It's the service layers, agents, entity translators, controllers, views etc. that all get more specific to a type of application. One pitfall I have run into many times is getting multiple business divisions within large companies to agree on what the abstraction and definitions really are that the models are designed to represent.

It really all comes down to experience and what your pain points have been in the past.  I was just curious about how quickly you focus on the database model during the iteration and if you focused on it first.

# March 23, 2007 10:05 AM

Console.Write(this.Opinion) said:

This week was not as lively as last week.. ASP.Net Sébastien Just put together a compilation of the

# March 26, 2007 10:36 AM


About John Papa

John (C# MVP and MCSD.NET) has been working with Microsoft distributed architectures for over 10 years. He has enterprise experience architecting and developing with .NET technologies including ASP.NET as well as WebForms using both C# and VB.NET. He is a baseball fanatic who spends most of his summer nights rooting for the Yankees with his family and his faithful dog, Kadi. John has authored or co-authored several books on ADO, ADO.NET, XML, and SQL Server, is the author of the Data Points column in MSDN Magazine, has presented MSDN WebCasts and can often be found speaking at industry conferences such as VSLive and DevConnections. Check out Devlicio.us!