Seemingly innocent Linq problem that had me baffled for ages

Important: I really don’t recommend doing this now, as there is a Nuget package that makes it all soooo much easier. Read more…

I came across an innocent-looking Linq problem the other day that really had me baffled for some time.

It’s easiest explained using the ubiquitous Northwind database (although just about any other relational database would probably do). Suppose you want a list of customers with their orders. Pretty easy eh? You do something like…

from c in context.Customers.Include("Orders") select c;

As you have included the Orders navigation property, this will include every order for the customer.

Now, what happens if you only want to include orders from this year? Sounds like a pretty simple request eh? Well, it isn’t! If you were to do this in SQL, you could simply do something like…

select c.CustomerID, c.CompanyName, o.OrderDate from Customers as c
  inner join Orders as o on o.CustomerID=c.CustomerID
  where o.OrderDate > '1 Apr 1998'
  order by CompanyName, OrderDate

Admittedly, this would return a flat dataset, with one row for each order, but you could fix this with some data shaping without too much problem.

Now the obvious thought would be to do something like this in Linq…

from c in context.Customers.Include("Orders")
  where c.Orders.OrderDate > new DateTime(2011, 1, 1) select c;

The problem is that this won’t compile as the Orders collection doesn’t have an OrderDate property. Orders is a collection of Order entities.

Well, this one had me going for ages. I posted a question up in two separate forums (fora?), and didn’t get very far. Finally, a very helpful person posted an answer that worked.

What you have to do is create a query that creates an anonymous type that includes the Customer entity, and the Orders that you want. Then you create a second query that selects just the Customer entities from the first query…

using (NorthwindEntities context = new NorthwindEntities()) {
  context.ContextOptions.LazyLoadingEnabled = false;
  var customers =
    from c in context.Customers.Include("Orders")
      where c.CompanyName.StartsWith("A")
      select new {
        Customer = c,
        Orders = c.Orders
          .Where(o => o.OrderDate > new DateTime(1998, 1, 1))
      };
  IEnumerable<Customer> customers2 =
    customers.AsEnumerable().Select(c => c.Customer);
}

What happens behind the scenes is that the Entity Framework uses relationship fixup to sort out which Order entities should be included in the query. I’m not 100% clear what’s going on here, but it works.

Two words of warning here. First, this won’t work in the rather excellent LinqPad, as it doesn’t support fixup. Second, you need to disable lazy loading.

If you want to read more about relationship fixup, see page 345 in Programming Entity Framework by Julia Lerman. I couldn’t find a decent explanation of it on-line.

Be First to Comment

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.