David Hayden [MVP C#]

Sponsors

The Lounge

News

  • CodeBetter.Com Home

Other Links

Teas

Patterns & Practices

Florida .NET Developer

Book Reviews

Tampa ASP.NET MVC Developer Group

Advertisement

Images in this post missing? We recently lost them in a site migration. We're working to restore these as you read this. Should you need an image in an emergency, please contact us at imagehelp@codebetter.com
LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios

In the last post:

LINQ To SQL Query Tuning for Performance Optimization

Frans pointed me to a post on the MSDN Forums that discusses the chattiness that can occur in LINQ To SQL when you attempt to prefetch more complex object graphs.

Take the example in the forum post that discusses grabbing the Customers from the Northwind Database and prefetching their Orders and OrderDetails at the same time:

 

 

which you could code like the following:

 

using (NorthwindDataContext context = new NorthwindDataContext())
{               
    DataLoadOptions options = new DataLoadOptions();
    options.LoadWith<Customer>(c => c.Orders);
    options.LoadWith<Order>(o => o.Order_Details);

    context.LoadOptions = options;

    IEnumerable<Customer> customers = context.Customers.ToList<Customer>();
}

 

Opening up SQL Server Profiler shows a rather ugly set of queries ( 40+ queries in total ) that breakdown as:

  1. 1 Query to First Retrieve all the Customers in Northwind
  2. 40+ Queries - 1 Query for each Customer to get its Orders and OrderDetails.

 

Again, you have that startling number of database roundtrips in LINQ To SQL that typically you want to avoid if possible to improve performance.

Since Frans mentioned it :), I wrote similar code to do the same thing using LLBLGen Pro and Self-Servicing Entities:

 

CustomerCollection customers = new CustomerCollection();
IPrefetchPath prefetchPath = new PrefetchPath((int)EntityType.CustomerEntity);
prefetchPath.Add(CustomerEntity.PrefetchPathOrders).SubPath.Add(OrderEntity.PrefetchPathOrderDetails);
customers.GetMulti(null, prefetchPath);

 

which resulted in only 3 queries ( roundtrips ) that first grabbed all the Customers, then all the Orders, and then all the OrderDetails. This is more inline with what I was expecting from LINQ To SQL.

I couldn't find a way to tweak LINQ To SQL to reduce the number of queries, but maybe Scott Guthrie can come up with a more practical solution in his next installment of LINQ To SQL. This is just something to keep in mind when using the current version of LINQ To SQL. Always review the database communication with SQL Server Profiler to determine what is actually being generated.

by David Hayden

 


Posted 08-06-2007 3:14 PM by David Hayden

[Advertisement]

Comments

Chris Moseley wrote re: LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios
on 08-06-2007 4:47 PM

This is interesting. I actually put this question to Scott Guthrie a few weeks ago as a comment on his blog. One of the first things I tried to do with Linq to Sql as an experiment in beta 1 was convert a small project from LLBLGen. In my LLBLGen projects I'm pretty big on using prefetch paths so I came across this problem almost strait away. The fact that I can't seem to get a decent answer from anyone at Microsoft about it has been bothering me a little ever since.  Scott’s answer was “I think you can call the LoadWith() statement multiple times.  So to setup relationships across multiple tables just call it multiple times with each one.”. I tried this, and as already stated here, it doesn’t work. Scott did mention that he hadn’t had much time to play with this though so maybe when he does he’ll be able to apply some pressure to his team and get some sort fix implemented in time for release. He should as this is a pretty severe limitation to an ORM.

ScottGu wrote re: LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios
on 08-06-2007 10:47 PM

In cases where the relationship between two entities is (n:1), LINQ to SQL does a JOIN as opposed to separate queries since these tend to be safe from a data explosion perspective.  

In the case of a 1:n associations, LINQ to SQL only supports joining-in one 1:n association per query.

So for example, if you wrote code like below:

           DataLoadOptions options = new DataLoadOptions();

           options.LoadWith<Product>(c => c.Category);

           options.LoadWith<Product>(c => c.OrderDetails);

           options.LoadWith<OrderDetail>(o => o.Order);

           db.LoadOptions = options;

           IEnumerable<Product> products = db.Products.ToList<Product>();

You'd find that only 1 query is executed against the SQL database for everything (even though you are bringing back all of the Products and their associated Category and OrderDetail information).  This is because the product->category relationship is (n:1) and the OrderDetail->Order relationship is (n:1) and so both are automatically joined in.

The reason the relationship in your blog post above is generating multiple queries is that you have two (1:n) relationship (Customers->Orders) and (Orders->OrderDetails).  If you just had one (1:n) relationship (Customer->Orders) or (Orders->OrderDetails) LINQ to SQL would optimize and grab it in one query (using a JOIN).  

Hope this helps,

Scott

Christopher Steen wrote Link Listing - August 6, 2007
on 08-07-2007 1:07 AM

Link Listing - August 6, 2007

FransBouma wrote re: LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios
on 08-07-2007 3:50 AM

Scott: isn't that sloppy coding? I mean: the LoadOptions advertise as if they will do a proper job in fetching the graph. Because that's what the developer is interested in.

However, it won't do any efficient fetching at all. Not only is a join in an 1:n relation pretty inefficient (as every row contains a duplicate), it's also limiting the graph fetching, as you can't fetch multi-path graphs efficiently.

If I fetch 500 customers and their 100 latest orders and the order lines for these orders + the employee rows related to the orders, I'm in for a lot of fun, while it's so unnecessary as the database also supports subqueries which often result in the same execution plan PLUS in path fetching they are more efficient in almost all cases.

Furthermore, you can optimize the queries for each node pretty good with a threshold which you use to choose whether you would do:

1) select * from child where fkToParent in (@c1, @c2, ..., @cn)

or

2) select * from child where fkToParent in (select pk from parent where <parent filter>)

How the implementation is now is very very limited. While a limited implementation could be OK, the thing with this is that it is advertised as if it is able to fetch a graph efficiently (I asked you about this in a reply to your linqdatasource article, and you said one could efficiently fetch a graph using loadoptions, but IMHO that's thus not the case in many situations) but it can't do that in a truckload of situations. The WORST part of this is is that this problem will likely be popping up in production, where the dataset is big and due to that performance problems will arise. The developer won't notice these things very likely because the dataset the developer works with is often small enough to be performant even with a lot of queries.

(although the simple customer-order-orderdetails graph was pretty slow on my northwind db. )

David Hayden wrote re: LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios
on 08-07-2007 4:31 PM

Thanks for the clarification, Scott.

This type of information needs to be in a LINQ To SQL Best Practices Document so developers are aware of when LINQ To SQL may generate less-than-ideal SQL. If a query like above is one of the more active queries in an application, LINQ To SQL wouldn't be an ideal candidate and it would be nice to know that upfront.

A lot of developers, including myself, are looking for guidance on how to use LINQ To SQL in n-layer / n-tier applications as well as when it is efficient and when not. I hope Microsoft is working on a document and real-world samples to help developers, because with the go-live licenses we are building applications now :)

Thanks.

Jim Wooley wrote re: LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios
on 08-07-2007 11:45 PM

As Scott mentions, the recommendation at this point is to use the LoadOptions (formerly DataShape, formerly Including). If MSFT isn't working on a best practices, I'm including a discussion of it in my upcoming LinqInAction book at least (blatent plug...)

David Hayden wrote re: LINQ To SQL - Query Tuning Appears To Break Down In *More Advanced* Scenarios
on 08-08-2007 11:02 AM

Jim,

Unfortunately the LoadOptions doesn't *help* with multiple 1:m associations as we discovered above and Scott confirms.

Have you found a way to have LoadOptions help in those cases?

wagnerblog.com » What are you NOT doing…..??? wrote wagnerblog.com &raquo; What are you NOT doing&#8230;..???
on 08-16-2007 10:39 AM

Pingback from  wagnerblog.com &raquo; What are you NOT doing&#8230;..???

Community Blogs wrote Developing Linq to LLBLGen Pro, part 14
on 03-07-2008 1:30 PM

(This is part of an on-going series of articles, started here ) We&#39;re getting closer to the goal

.Net World wrote Developing Linq to LLBLGen Pro, part 14
on 04-14-2008 5:15 PM

Update: I made a mistake in the first Linq to Sql query. It&#39;s not that slow as I previously posted

Hardcode wrote Interesting Linq Discussion
on 11-07-2008 12:08 PM

Here on DataLoadOptions implementation. I'm debugging a website and it seems that indeed in some cases

David Hayden - Florida .NET Developer - C# and SQL Server wrote Performance Profiling O/R Mapper Calls to the Database
on 01-06-2009 10:02 PM

It is important to understand what the O/R Mapper is doing in the background and act accordingly.