The Weekly Source Code 52 - You keep using that LINQ, I dunna think it means what you think it means.
Remember good developers don't just write source code, they also READ it. You don't just become a great poet by writing lots of poems. Read and absorb as well. Do check out the Source Code category of my blog here, there is (as of today) 15 pages of posts on Source Code you can check out.
Recently my friend Jonathan Carter (OData Dude, my name for him) was working with a partner on some really weird stuff that was happening with a LINQ to SQL query. Remember that every abstraction sometimes leaks and that the whole port of an abstraction is "raise the level" so you don't have to worry about something.
Plumbing is great because it abstracts away water delivery. For all I know, there's a dude with a bucket who runs to my house when I turn on the tap. Doesn't matter to me, as long as I get water. However, sometimes something goes wrong with that dude, and I don't understand what's up with my water. This happened to JC and this partner.
In this example, we're using the AdventureWorks Sample Database to make this point. Here's some sample code the partner sent us to reproduce the weirdness.
protected virtual Customer GetByPrimaryKey(Func<customer, bool> keySelection)
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
return (from r in context.Customers select r).SingleOrDefault(keySelection);
}
[TestMethod]
public void CustomerQuery_Test_01()
{
Customer customer = GetByPrimaryKey(c => c.CustomerID == 2);
}
[TestMethod]
public void CustomerQuery_Test_02()
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
Customer customer = (from r in context.Customers select r).SingleOrDefault(c => c.CustomerID == 2);
}
CustomerQuery_Test_01 calls the GetByPrimaryKey method. That method takes a Func
Well, if you run this in Visual Studio - and in this example, I'll use the Intellitrace feature to see the actual SQL that was executed, although you can also use SQL Profiler - we see:
Here's the query in text:
SELECT [t0].[CustomerID], [t0].[NameStyle], [t0].[Title],
[t0].[FirstName], [t0].[MiddleName], [t0].[LastName],
[t0].[Suffix], [t0].[CompanyName], [t0].[SalesPerson],
[t0].[EmailAddress], [t0].[Phone], [t0].[PasswordHash],
[t0].[PasswordSalt], [t0].[rowguid], [t0].[ModifiedDate]
FROM [SalesLT].[Customer] AS [t0]
Um, where's the WHERE clause? Will LINQ to SQL kill my pets and cause me to lose my job? Does Microsoft suck? Let's take a look at the second query, called in CustomerQuery_Test_02():
SELECT [t0].[CustomerID], [t0].[NameStyle], [t0].[Title],
[t0].[FirstName], [t0].[MiddleName], [t0].[LastName],
[t0].[Suffix], [t0].[CompanyName], [t0].[SalesPerson],
[t0].[EmailAddress], [t0].[Phone], [t0].[PasswordHash],
[t0].[PasswordSalt], [t0].[rowguid], [t0].[ModifiedDate]
FROM [SalesLT].[Customer] AS [t0]
WHERE [t0].[CustomerID] = @p0
OK, there it is, but why does the second LINQ query cause a WHERE clause to be emitted but the first doesn't? They look like basically the same code path, just one is broken up.
The first query is clearly returning ALL rows to the caller, which then has to apply the LINQ operators to do the WHERE in memory, on the caller. The second query is using the SQL Server (as it should) to do the filter, then returns WAY less data.
Here's the deal. Remember that LINQ cares about two things, IEnumerable stuff and IQueryable. The first lets you foreach over a collection, and the other includes all sorts of fun stuff that lets you query that stuff. Folks build on top of those with LINQ to SQL, LINQ to XML, LINQ to YoMomma, etc.
When you are working with something that is IQueryable; that is, the source is IQueryable, you need to make sure you are actually usually the operators for an IQueruable, otherwise you might fall back onto an undesirable result, as in this database case with IEnumerable. You don't want to return more data from the database to a caller than is absolutely necessary.
From JC, with emphasis mine:
The IQueryable version of SingleOrDefault, that takes a lambda, actually takes an Expression
, whereas the IEnumerable version, takes a Func > . Hence, in the below code, the call to SingleOrDefault, is treating the query as if it was LINQ To Objects, which executes the query via L2S, then performs the SingleOrDefault on the in memory collection. If they changed the signature of GetByPrimaryKey to take an Expression it would work as expected. >,
What's a Func and what's an Expression? A Func<> (pronounced "Funk") represents a generic delegate. Like:
Func<int,int,double> divide=(x,y)=>(double)x/(double)y;
Console.WriteLine(divide(2,3));
And an Expression<> is a function definition that can be compiled and invoked at runtime. Example"
Expression<Func<int,int,double>> divideBody=(x,y)=>(double)x/(double)y;
Func<int,int,double> divide2=divideBody.Compile();
write(divide2(2,3));
So, the partner doesn't want a Func
protected virtual Customer GetByPrimaryKey(Expression<Func<customer,bool>> keySelection)
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
return (from r in context.Customers select r).SingleOrDefault(keySelection);
}
[TestMethod]
public void CustomerQuery_Test_01()
{
Customer customer = GetByPrimaryKey(c => c.CustomerID == 2);
}
[TestMethod]
public void CustomerQuery_Test_02()
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
Customer customer = (from r in context.Customers select r).SingleOrDefault(c => c.CustomerID == 2);
}
Just changed that one line, so that GetByPrimaryKey takes a Expression
Someone famous once said, "My code has no bugs, it runs exactly as I wrote it."
Layers of Abstraction are tricky, and you should always assert your assumptions and always look at the SQL that gets generated/created/executed by your DAL before you put something into production. Trust no one, except the profiler.
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
About Newsletter
Since IQueryable inherits from IEnumerable, and since an Expression<Func<>> and a Func<> seem to behave exactly the same way, it's pretty easy to inadvertently retrieve a whopping helping of records from your database without even realizing it--especially if you're new to L2S.
A sidenote which perhaps further illustrates the point: "from r in context.Customers select r" is equivalent to "context.Customers". So, if I'm not mistaken, you could just as easily write (assuming you're referencing System.Linq and System.Data.Linq):
context.Customers.SingleOrDefault(keySelection);
<Expression<func><int, int, double>> should be <Expression<Func<int, int, double>>
and
</int, int, double> and some other weird closing tags showing up in a couple of places.
Most of the typos are in the three code examples before the last screenshot from SQL Profiler, and in the text right before them.
Keep up the good job!
Love the timing :)
Kidding aside, thanks for a very informative post. Again. Isn't it at least slightly annoying being so awesome all the time? Knowing that it would might alleviate the pain I feel every day over not being as awesome as some of you guys...
A optional parameter with default value? A empty string for a bool? I replace it with Func<Customer, bool>, and get the different result in my machine! Everything works well, I get "where" statement without using Expression!
I'm curious as well. Again, my un-awesomness is shining through here, but I don't quite get that part.
Getting back to LINQ... To give a little more background on why the Func-only approach was not working is because the Expression wrapper allows the LINQ-provider (in this case LINQ to SQL0 to walk through your Func delegate and generate a SQL string. And without the ability to walk through the delegate and figure out that you have a Customer object with a property of CustomerId that should be equal to 2, it cannot generate the necessary SQL.
The key to understanding LINQ, more specifically IQueryable, is that LINQ is not compiled to SQL it's interpreted to SQL. And it's done so by creating a binary tree, you remember those from COMP SCI 101, of expressions that is walked by the provider. And each of these nodes can be any expression in this namespace.
http://msdn.microsoft.com/en-us/library/system.linq.expressions.aspx
It's very interesting stuff, defintily worth checking out.
Thank you Scott!
Ultimately, as you noted, the func argument that he passed in resulted in the c# compiler using the IEnumerable extension method. Internally, that method will call IEnumerable.GetEnumerator, resulting in the query without the where clause.
I'd say that it was a misunderstanding of the core concepts that make linq work that caused this issue for Mr. Carter's partner, as opposed to a leaky abstraction.
Also, I'm guessing that the whole "point" of an expression is to raise the level; you currently say that it's the whole port.
Apart from copyediting details, there are more problems here than your coders' grasp of L2S. Me, I had a sinking feeling from the first line, when they'd written a function called GetByPrimaryKey() that does not actually get by primary key, but merely applies whatever the heck function you give it. Then, when their second statement initialized an IDisposable that was not explicitly disposed, I knew we weren't exactly in for a tour de force of meticulousness.
But yeah, that's how L2S works. Yep. [Hock, spit.] Expressions, not strictly parsed until runtime. Is there any other way to understand it? (There are clearly some ways of programming it without understanding; I'm not asking about those.)
I'm with Charles Strahan. This doesn't seem like an example of two coders with a minor misunderstanding of the framework; the lesson here isn't that L2S is creaky and full of gotchas (though it is). This particular example seems like two coders with only a superficial understanding of what L2S is and does.
As far as Jon Skeet is concerned, you're talking about a piece of code that is not valid C#. So, to prove your case, do you have an example of a Func with an equals sign in it that does actually compile?
I feel like I'm asking for a tape of the Abominable Snowman here...
I also have a post talking about the detail of how does the IQueryable<T> query methods' source code like, and how does the above whole things work:
Understanding LINQ to SQL (10) Implementing LINQ to SQL Provider
http://weblogs.asp.net/dixin/archive/2010/05/12/understanding-linq-to-sql-10-implementing-linq-to-sql-provider.aspx
Comments are closed.