Thursday, July 22, 2010

Death to the DAO and How to Test LINQ

Occasionally I hear complaints that LINQ is hard to unit test. These complaints aren’t about LINQ to Objects, mind you; they’re specific to the flavors of LINQ that use expression trees to turn C# code into something else, like SQL or CAML. The most common of these technologies are LINQ to SQL, the Entity Framework, and, in my case at the moment, LINQ to SharePoint. In this post I’m going to propose a technique that makes testing LINQ not just easy, but downright elegant – assuming you’re OK with extension methods – lots of extension methods. And assuming you’re ready to kill your Data Access Object (DAO) tier.

The Unit Testing Problem

Any architecture needs a place to put code that finds entities, for instance FindBySocialSecurityNumber(). In a traditional architecture we might put a method like this in a DAO layer. If so, our method will look something like this:

public class EmployeesDao {
    public Employee FindBySSN(Context ctx, string ssn) {
        return ctx.Employees.SingleOrDefault(e => e.Ssn == ssn);
    }
}

So how would we go about unit testing this?

One fairly typical solution would be to use an in-memory database. That approach works if our data store is a database, but it certainly doesn’t work if the data store is something less traditional like SharePoint. But even if our store is a database, we’ll still have the hassle of setting up the in-memory database.

Another solution might be to use a mock Context that returns an IQueryable. But wouldn’t it be wonderful if we could avoid mocking altogether?

Killing the DAO

The first question is why we even have a DAO tier to begin with. The original idea was that we wanted a place to put code specific to a particular data store. In other words, we wanted to isolate the code that would need to change should the data store switch from SQL Server to Oracle. But isn’t that exactly what LINQ does? At this point I’d be pretty surprised if just about any data store didn’t have a decent LINQ provider, one that requires no more than minimal code changes to adopt. So why not embrace LINQ and consider alternatives to a DAO tier?

One alternative that I’ve been using for over a month now is to switch to extension methods. To give credit where it’s due, the idea originated in a conversation with fellow Near Infinity employee Joe Ferner. And I'm sure the idea isn't particularly original (please post in the comments if you know others who use this approach).

Using this technique our code changes from something like this:

var employeeDao = new EmployeesDao(); // or use IOC of course
employeeDao.FindBySSN(ctx, "111-11-1111");

To something like this:

ctx.Employees.FindBySSN("111-11-1111");

Among other things, I find this far more aesthetically pleasing because each of the three elements of the statement represents a successive filtering of the data. It's a more functional way of looking at things.

We could implement this off of the Employees property of the context if we had control over that (which I don't with spmetal). Instead, we can implement it as an extension method like this:

public static class EmployeeExtensions {
    public static Employee FindBySSN(this IQueryable<Employee> employees, string ssn) {
        return employees.SingleOrDefault(e => e.Ssn == ssn);
    }
}

We now have something that’s considerably easier to unit test.

Testing It

Once we’ve refactored our function as an extension method that filters down the corpus of entities, we can test the code using in-memory objects with a call to .AsQueryable(). For instance:

public void FindBySSN_OneSsnExists_EmployeeReturned() {
    var employees = new[] { new Employee { Ssn = "111-11-1111" } };
    var actual = employees.AsQueryable().FindBySSN("111-11-1111");
    // Assumes an NUnit- or MSTest-style Assert class.
    Assert.AreEqual("111-11-1111", actual.Ssn);
}

Notice we didn’t have to mock anything.
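
The same trick covers the edge cases too. Here is a rough sketch of how that might look, written without any test framework so it stands on its own. The Employee class below is only an assumed shape (the real one would be generated), and the FindBySsnEdgeCases class and its method names are made up for illustration:

```csharp
using System;
using System.Linq;

// Assumed shape of the entity; the real class would be generated (e.g. by spmetal).
public class Employee {
    public string Ssn { get; set; }
}

public static class EmployeeExtensions {
    public static Employee FindBySSN(this IQueryable<Employee> employees, string ssn) {
        return employees.SingleOrDefault(e => e.Ssn == ssn);
    }
}

// Hypothetical checks; port them to whatever test framework you use.
public static class FindBySsnEdgeCases {
    // No employee has the SSN: SingleOrDefault returns null.
    public static bool ReturnsNullWhenNoMatch() {
        var employees = new[] { new Employee { Ssn = "111-11-1111" } };
        return employees.AsQueryable().FindBySSN("222-22-2222") == null;
    }

    // Two employees share the SSN: SingleOrDefault throws InvalidOperationException.
    public static bool ThrowsWhenSsnIsDuplicated() {
        var employees = new[] {
            new Employee { Ssn = "111-11-1111" },
            new Employee { Ssn = "111-11-1111" }
        };
        try {
            employees.AsQueryable().FindBySSN("111-11-1111");
            return false;
        } catch (InvalidOperationException) {
            return true;
        }
    }
}
```

Keep in mind these checks exercise LINQ to Objects semantics. A real provider may behave a little differently once the expression is translated, so they complement rather than replace an occasional integration test.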

Testability, but at What Cost?

This technique works great for the example above, but how does it scale to harder problems and what other downsides are there?

As for scalability, I’ve found this technique works great for every scenario I’ve run across in the month I’ve been using it. It works for joins, aggregations, and even for inserts, updates, and deletes.
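
To give a feel for the join and aggregation cases, here is a minimal sketch. Everything in it beyond Ssn is an assumption made up for illustration: the Department entity, the DepartmentId and Salary properties, and the InDepartment and AverageSalary method names.

```csharp
using System.Linq;

// Hypothetical entities; the property names are assumptions for illustration.
public class Department {
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Employee {
    public string Ssn { get; set; }
    public int DepartmentId { get; set; }
    public decimal Salary { get; set; }
}

public static class EmployeeQueryExtensions {
    // Join: narrow employees down to those in the named department.
    public static IQueryable<Employee> InDepartment(
        this IQueryable<Employee> employees,
        IQueryable<Department> departments,
        string departmentName) {
        return from e in employees
               join d in departments on e.DepartmentId equals d.Id
               where d.Name == departmentName
               select e;
    }

    // Aggregation: returns 0 for an empty sequence instead of throwing.
    public static decimal AverageSalary(this IQueryable<Employee> employees) {
        return employees.Select(e => (decimal?)e.Salary).Average() ?? 0m;
    }
}

public static class JoinAndAggregateExample {
    public static decimal AverageEngineeringSalary() {
        var departments = new[] {
            new Department { Id = 1, Name = "Engineering" },
            new Department { Id = 2, Name = "Sales" }
        }.AsQueryable();
        var employees = new[] {
            new Employee { Ssn = "111-11-1111", DepartmentId = 1, Salary = 100m },
            new Employee { Ssn = "222-22-2222", DepartmentId = 1, Salary = 200m },
            new Employee { Ssn = "333-33-3333", DepartmentId = 2, Salary = 900m }
        }.AsQueryable();

        // Each call filters further, and the whole chain runs in memory.
        return employees.InDepartment(departments, "Engineering").AverageSalary();
    }
}
```

Because InDepartment returns an IQueryable<Employee>, it chains with other extension methods the same way FindBySSN hangs off of ctx.Employees. Whether a particular provider can translate a given join is a separate question that in-memory tests won't answer.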

As for downsides, the astute reader may be wondering about mockability. For instance, what if we want to mock the call to FindBySSN and hand it the exact Employee that will be returned? This scenario is admittedly harder. But what I've found is that far more often than not I don’t really need to mock the types of things that used to live in the DAO tier. Instead I just mock the Employees property off of the context to return in-memory objects and make my tests slightly larger in scope. Most of the time I find the larger scope increases the usefulness of the test. In the occasional case where I really do want to mock the "DAO" tier I use a technique described in either this post or this post by Daniel Cazzulino.
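
Here is roughly what swapping in in-memory objects at the context level can look like. This is only a sketch under some assumptions: IEmployeeContext is a hypothetical seam (a generated context may not implement an interface, in which case you'd need a thin wrapper or a mocking library), and BadgePrinter and the Name property are invented stand-ins for whatever code sits above the old DAO tier.

```csharp
using System;
using System.Linq;

// Assumed entity shape; Name is added purely for this illustration.
public class Employee {
    public string Ssn { get; set; }
    public string Name { get; set; }
}

public static class EmployeeExtensions {
    public static Employee FindBySSN(this IQueryable<Employee> employees, string ssn) {
        return employees.SingleOrDefault(e => e.Ssn == ssn);
    }
}

// Hypothetical seam over the real context.
public interface IEmployeeContext {
    IQueryable<Employee> Employees { get; }
}

// Hand-rolled stand-in that serves in-memory objects.
public class InMemoryEmployeeContext : IEmployeeContext {
    private readonly Employee[] _employees;

    public InMemoryEmployeeContext(params Employee[] employees) {
        _employees = employees;
    }

    public IQueryable<Employee> Employees {
        get { return _employees.AsQueryable(); }
    }
}

// Hypothetical code under test; it calls the real FindBySSN, not a mock of it.
public class BadgePrinter {
    private readonly IEmployeeContext _ctx;

    public BadgePrinter(IEmployeeContext ctx) {
        _ctx = ctx;
    }

    public string BadgeTextFor(string ssn) {
        var employee = _ctx.Employees.FindBySSN(ssn);
        return employee == null ? "UNKNOWN" : employee.Name.ToUpperInvariant();
    }
}
```

A test then builds an InMemoryEmployeeContext with a couple of employees, passes it to BadgePrinter, and checks the result. The test exercises both BadgePrinter and FindBySSN, which is the slightly larger scope mentioned above.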


Obviously there is more to this architecture. For instance, how do you handle insert and update operations? The short answer is that it’s easy, but I’ll save that topic for a future post. For now, why not give this approach a try? You weren’t really happy with that useless old DAO tier anyway, were you? I say we eradicate it and never look back.