Way too often I come across code like this:
IEnumerable<Product> products = GetProducts();
}
What’s wrong with this is that Count() is a LINQ extension method that literally iterates through every item in an enumerable. In other words it won’t short circuit after it finds more than zero items -- it will continue iterating through every single item. And while that may be a fast operation in dev, it may be a really slow operation in production.
But it gets worse. Because of LINQ’s deferred execution the results of .Count() are not cached. So when there is at least one product you are guaranteed to enumerate every single product twice: once for .Count() and once for .DataSource = products.
Do Any() Solutions Exist?
One possibility is to instead use IEnumerable.Any(). So you could rewrite the above code like this:
IEnumerable<Product> products = GetProducts();
The results will be better regardless of if your data source is in memory or a database, but for illustration purposes LINQ to SQL outputs the following SQL if you run the code smell version when there is at least one product in the database:
SELECT COUNT(*) AS [value]
And it outputs this SQL if you run the better version when there is at least one product:
SELECT
SELECT NULL AS [EMPTY]
Looks a little uglier, but if you run the execution plan you’ve converted two table scans into an index scan and a table scan (well, depending on your schema of course). Regardless, much better execution-wise.
Undeferring deferred execution
Now what if your data source is a database and you really want to optimize the heck out of this and only perform one SQL statement? You could disable deferred execution by doing this:
IList<Product> products = GetProducts().ToList();
The .ToList() function forces enumeration to happen immediately and cache the results in a list. This achieves the result of converting two queries into one, but be warned it also destroys one of the nicest benefits of LINQ’s deferred execution: the ability to append .Sort() or .Skip().Take() (paging) statements at the very end and have them execute in your original query, in SQL, in the database, where they belong.
Summary
IEnumerable.Count() seems like a pleasant and friendly convenience function; it feels so similar to the innocuous List.Count -- but don’t be fooled. It’s a mistake. One you (or the person who maintains the system when you leave) may soon regret.
Comments
var things = //linq query here
for(var i = 0; i < things.count(); i++)
things.ToList()[i]
I wish I was joking.
You wouldn't want to count data length with a stream either.