Sunday, September 13, 2009

Sporm - Out of the Depths of SharePoint's XML Hell

In my last post I described how a new open source tool called sporm significantly simplifies unit testing SharePoint. Making SharePoint unit testable is my absolute favorite feature of sporm, because SharePoint is notoriously hard to unit test. But sporm provides other benefits as well, and its ability to pull us out of the depths of verbose, loosely typed XML hell and into LINQ excellence is next on my list of favorite features. So in this post I’ll describe the pre-sporm technique of querying with CAML, how to query data using sporm, and finally how sporm supports SharePoint’s unique architecture of allowing multiple content types per list and what that means to you.

Camls Are Ugly

Warning: if you’re new to SharePoint then what you’re about to see may shock and upset you. If, like me, you hate both XML and loose typing then you will agree that CAML is awful. But bear with me; I promise sporm will make it better. Much better.

CAML, or Collaborative Application Markup Language, is how one queries for data in SharePoint. A simple query might look like this:

<Query>
  <Where>
    <And>
      <And>
        <Eq>
          <FieldRef Name='First_Name' />
          <Value Type='Text'>Lee</Value>
        </Eq>
        <BeginsWith>
          <FieldRef Name='Last_Name' />
          <Value Type='Text'>Rich</Value>
        </BeginsWith>
      </And>
      <Leq>
        <FieldRef Name='Dob' />
        <Value Type='DateTime'>2009-01-01T00:00:00Z</Value>
      </Leq>
    </And>
  </Where>
</Query>

Simple, right? ;) In case you didn’t catch the meaning from the slightly, uh, verbose query, it asks for records with a First_Name of “Lee”, a Last_Name starting with “Rich”, and a Dob less than or equal to January 1, 2009.

There are a couple of things to note about this query:

  • Field names are loosely typed. If Dob were ever renamed to DateOfBirth the query would fail at runtime (probably during a demo to a customer, or if you’re lucky during integration tests), but certainly not at compile time.
  • And is a binary operator. This forces explicit precedence and removes the need for parentheses, but at the cost of readability.
  • Types must be explicitly defined. I guess this is necessary since it’s XML, but somehow I just don’t feel like this should be necessary.
  • It’s XML. Ok obviously, but the point is you have to type everything twice. <BeginsWith> </BeginsWith>. Ouch, so verbose, so angley, I so hate XML.

Now, there are tools that make this better. U2U’s free CAML Query Builder tool significantly improves the experience of querying SharePoint data.

But it makes you wonder if something is wrong when you need a tool to retrieve data from your data store. Do you typically use tools to assist when you’re writing SQL? Probably not. But like I said, hang in there; sporm makes things better.

Unlinq my Caml

If you were to write the same query as above using sporm it would look like this:

IQueryable<Employee> employees = GetEmployees().Where(e =>
      e.FirstName == "Lee"
      && e.LastName.StartsWith("Rich")
      && e.Dob <= new DateTime(2009, 1, 1));

Just a little easier to read than CAML, right? A couple of things to note:

  • Fields are strongly typed. If you were to rename FirstName the compiler would catch every single instance at design time.
  • Operators are standard C# operators. &&, .StartsWith(), and <= are all familiar and concise and use a standard, known precedence.
  • Types are standard C# types. You never have to explicitly say that "Rich" is a string, it just is.
  • Your query is actual C# code. The lambda (closure) and the fact that it uses deferred execution might throw off a junior developer in some scenarios, but the query is readable and works exactly the same as if you were querying in-memory objects or querying a database with LINQ to SQL or the Entity Framework.
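Since deferred execution is the part most likely to surprise someone, here is a minimal sketch of it using plain LINQ to Objects. Nothing here touches sporm or SharePoint; the DeferredExecutionDemo class and the sample names are made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; plain LINQ to Objects, no sporm involved.
public static class DeferredExecutionDemo
{
    public static List<string> Run()
    {
        var lastNames = new List<string> { "Richards", "Smith" };

        // Defining the query does not execute it; it only describes the filter.
        IEnumerable<string> query =
            lastNames.Where(n => n.StartsWith("Rich", StringComparison.Ordinal));

        // This item is added after the query was defined...
        lastNames.Add("Richardson");

        // ...but the filter only runs here, when the results are enumerated,
        // so the late addition is still considered.
        return query.ToList();
    }

    public static void Main()
    {
        foreach (string name in Run())
        {
            Console.WriteLine(name);
        }
    }
}
```

The same rule applies to any IQueryable: the work happens when you enumerate the results, not when you write the Where clause.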

Nice! What's beautiful about this is that sporm converts your C# directly into CAML using C# 3.0's expression trees feature. What sporm can't convert to CAML it executes in memory transparently to you. Sporm uses log4net and outputs its CAML queries to the console by default, so it is a good idea to watch the output if you're concerned about performance.
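To give a feel for how a lambda can become CAML, here is a minimal sketch of walking an expression tree. This is not sporm's actual code. The Employee class, the CamlSketch name, the reuse of property names as field names, and the handling of only &&, ==, <= and StartsWith are all assumptions made for illustration.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;
using System.Xml.Linq;

// Hypothetical entity for this sketch; not a class generated by sporm.
public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime Dob { get; set; }
}

// Illustrative translator: handles only &&, ==, <= and string.StartsWith.
public static class CamlSketch
{
    public static XElement ToCaml<T>(Expression<Func<T, bool>> predicate)
    {
        return new XElement("Where", Visit(predicate.Body));
    }

    private static XElement Visit(Expression node)
    {
        switch (node.NodeType)
        {
            case ExpressionType.AndAlso:
            {
                // a && b && c parses as (a && b) && c, which gives nested <And> elements.
                var and = (BinaryExpression)node;
                return new XElement("And", Visit(and.Left), Visit(and.Right));
            }
            case ExpressionType.Equal:
            {
                var eq = (BinaryExpression)node;
                return Comparison("Eq", eq.Left, eq.Right);
            }
            case ExpressionType.LessThanOrEqual:
            {
                var leq = (BinaryExpression)node;
                return Comparison("Leq", leq.Left, leq.Right);
            }
            case ExpressionType.Call:
            {
                var call = (MethodCallExpression)node;
                if (call.Method.DeclaringType == typeof(string)
                    && call.Method.Name == "StartsWith"
                    && call.Arguments.Count == 1)
                {
                    return Comparison("BeginsWith", call.Object, call.Arguments[0]);
                }
                break;
            }
        }
        throw new NotSupportedException("No CAML translation in this sketch for: " + node);
    }

    private static XElement Comparison(string op, Expression field, Expression value)
    {
        var member = field as MemberExpression;
        if (member == null)
        {
            throw new NotSupportedException("Left side must be a property: " + field);
        }

        // Evaluate the right-hand side (constant, closure variable, or constructor call).
        // This fails if the right-hand side refers to the lambda parameter itself.
        object raw = Expression.Lambda(value).Compile().DynamicInvoke();

        // Assumption: only DateTime and text values; time zones are ignored.
        string type = raw is DateTime ? "DateTime" : "Text";
        string text = raw is DateTime
            ? ((DateTime)raw).ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture)
            : Convert.ToString(raw, CultureInfo.InvariantCulture);

        // Assumption: the property name doubles as the SharePoint field name.
        return new XElement(op,
            new XElement("FieldRef", new XAttribute("Name", member.Member.Name)),
            new XElement("Value", new XAttribute("Type", type), text));
    }
}

public static class CamlSketchProgram
{
    public static void Main()
    {
        XElement caml = CamlSketch.ToCaml<Employee>(e =>
            e.FirstName == "Lee"
            && e.LastName.StartsWith("Rich")
            && e.Dob <= new DateTime(2009, 1, 1));

        Console.WriteLine(caml);
    }
}
```

The point to notice is that the compiler hands over the lambda as data rather than as compiled code, so a library is free to inspect it and emit whatever query language it likes. A real provider also has to map property names to SharePoint's internal field names and cope with far more operators than this.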

Now the following isn’t relevant to the comparison with CAML, but I would be negligent if I didn’t explain how the GetEmployees() function works.

SP != DB B/C of Content Types

First of all GetEmployees looks like this:

private IQueryable<Employee> GetEmployees() {
      return MySPDataContext
            .GetCurrent()
            .GetList<Employees>()
            .OfType<Employee>();
}

How it works is that the static GetCurrent() method retrieves sporm’s context object, which knows how to query a SharePoint site, from either the web context or thread local storage; next the GetList() method tells sporm which list you want to query; and the OfType() method tells sporm which content type within the list you want to query. This last part is important because sporm supports SharePoint’s ability to have multiple content types per list, which other LINQ providers like LINQ to SharePoint do not. But what are content types and why should you care?

SharePoint’s List/Content Type architecture seems odd at first, but it allows interesting scenarios not available in a traditional database. For instance you might have a calendar list that contains multiple types of records (list items) in the same list. Your calendar list might contain some combination of the following three record types: meetings with unique fields like Organizer (a person); iterations with unique fields like DeployedToProduction (a Boolean); and actions with unique fields like RelatedEmployee (a reference to another list). This architecture allows SharePoint to show all three types of records in a single view, such as a per-month calendar view. The data might look like this:

Title            | Start        | End     | ContentType | Organizer      | Deployed To Production | Related Employee
Iteration16      | 1/12/09      | 1/19/09 | Iteration   |                | false                  |
Stakeholder Demo | 1/19/09 3 PM |         | Meeting     | Lee Richardson |                        |
Tag Trunk        | 1/19/09      |         | Action      |                |                        | Lee Richardson

Sporm’s unique architecture supports this scenario by allowing you to retrieve iterations like this:

return MySPDataContext
      .GetCurrent()
      .GetList<Calendar>()
      .OfType<Iteration>();

Or get meetings from the same list like this:

return MySPDataContext
      .GetCurrent()
      .GetList<Calendar>()
      .OfType<Meeting>();
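If the idea of several record types living in one list still feels abstract, here is a rough in-memory analogy using LINQ's standard Enumerable.OfType. It does not use sporm at all. The CalendarItem, Iteration, Meeting and ActionItem classes are hypothetical stand-ins for content types, filled with the sample rows from the table above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes standing in for content types; not sporm-generated code.
public abstract class CalendarItem
{
    public string Title { get; set; }
}

public class Iteration : CalendarItem
{
    public bool DeployedToProduction { get; set; }
}

public class Meeting : CalendarItem
{
    public string Organizer { get; set; }
}

public class ActionItem : CalendarItem
{
    public string RelatedEmployee { get; set; }
}

public static class CalendarSketch
{
    // One "list" holding three different record types, like the calendar above.
    public static List<CalendarItem> SampleItems()
    {
        return new List<CalendarItem>
        {
            new Iteration { Title = "Iteration16", DeployedToProduction = false },
            new Meeting { Title = "Stakeholder Demo", Organizer = "Lee Richardson" },
            new ActionItem { Title = "Tag Trunk", RelatedEmployee = "Lee Richardson" }
        };
    }

    public static void Main()
    {
        List<CalendarItem> calendar = SampleItems();

        // Every record shares the common fields, so one view can show them all.
        foreach (CalendarItem item in calendar)
        {
            Console.WriteLine(item.Title);
        }

        // OfType narrows the list to one record type and exposes its unique fields.
        foreach (Meeting meeting in calendar.OfType<Meeting>())
        {
            Console.WriteLine(meeting.Title + " organized by " + meeting.Organizer);
        }
    }
}
```

The difference with sporm is where the filtering happens: here it runs in memory over objects that are already loaded, whereas sporm's job is to get only the requested content type back from SharePoint.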

While you may only occasionally use multiple content types per list, you can feel comfortable knowing that sporm will support you when you need it. The rest of the time you can arrange your architecture to accommodate your 90% scenario of one content type per list. I’ll discuss the architecture I’m using on my current project in my next post.

For now I hope this has clarified some of the benefits of using sporm, and I hope that you’ll consider using it on your next SharePoint project.
