Friday, December 5, 2008

An Exciting Future: C# 4.0, Silverlight 3, MVC Dynamic Data, Live Mesh, VS 2010, ...

Last week I attended the Public Sector Developer Conference in Reston, Virginia. Summary: I can barely contain my excitement for just about everything Microsoft is doing right now for software development. In particular the first talk "What I've learned about Visual Studio 2010, .NET Framework 4.0 (and beyond), Silverlight 3 (and beyond)" was so exciting I almost gave a standing ovation.

What Marc learned about Visual Studio 2010, .NET Framework 4.0, Silverlight 3

So believe it or not this talk by Marc Schweigert (who is a fabulous presenter by the way) was a 90 minute distillation of everything that happened at PDC (which sadly I missed). Consequently it flew like a jet engine, gave barely a minute or two to every topic, and required either massive amounts of caffeine or a good amount of prerequisite reading. Fortunately I had both.

To summarize the presentation I'll just list my favorite upcoming technologies, sorted roughly in decreasing order of my excitement:

Technology: ASP.Net MVC Dynamic Data
Overview: Ruby on Rails for .Net
Impressions: As if ASP.Net MVC wasn't cool enough. This looks awesome! I want to try it on a real project.

Technology: ADO.Net Data Services (Astoria)
Overview: Expose LINQ ORM data (e.g. LINQ to Entities) to web services, now with better BLOB support
Impressions: Powerful, exciting (just a little scary)

Technology: Astoria Offline
Overview: Write disconnected apps that sync to ADO.Net Data Services
Impressions: Frikin' awesome. I can't wait to see/write disconnected apps.

Technology: ASP.Net AJAX - jQuery Support
Overview: ASP.Net will abandon its custom "roll your own" JavaScript framework and adopt a leading open source one.
Impressions: Amazing. I couldn't be happier.

Technology: Live Mesh Web Apps
Overview: Offline capable, in browser or on desktop apps (HTML/AJAX or Silverlight) that run "in the mesh"
Impressions: Wow, this is some really cool stuff, I'll definitely be keeping an eye on this technology

Technology: SharePoint Integration
Overview: Build web parts, wsp files, etc directly in VS without an add-on
Impressions: Fantastic! Awesome! This was a major point.

Technology: Task Parallel Library (TPL)
Overview: Simplify parallelization and avoid programming in threads by using delegates
Impressions: This looks absolutely amazing, I can't wait

Technology: Parallel LINQ
Overview: Parallelize LINQ queries
Impressions: So easy, so powerful, so exciting
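
For a taste of these two, here is a minimal sketch assuming the Parallel.For and AsParallel() calls as they appear in .NET 4. The class and method names (ParallelSketch, SumOfSquares, Squares) are made up for illustration and were not part of the talk.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSketch {
    // PLINQ: AsParallel() lets the runtime partition the range across cores.
    public static long SumOfSquares(int count) {
        return Enumerable.Range(1, count)
            .AsParallel()
            .Select(n => (long)n * n)
            .Sum();
    }

    // TPL: the loop body is a delegate, and each index writes only its own
    // slot in the array, so no locking is needed.
    public static long[] Squares(int count) {
        long[] results = new long[count];
        Parallel.For(0, count, i => { results[i] = (long)(i + 1) * (i + 1); });
        return results;
    }
}
```

Neither method creates a thread by hand; the library decides how to split up the work.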

Technology: In Process Side By Side Model
Overview: Run both 2.0 and 4.0 CLR code in the same process
Impressions: Wow. Minimize existing app upgrade pain; free developers to use new features in old code bases; give language designers more flexibility in refactoring the C# language.

Technology: C# 4.0
Overview: Named optional parameters and dynamic types
Impressions: Named parameters: cool, dynamic types: controversial, but I suspect a very good thing
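
A minimal sketch of both features, assuming the syntax as it shipped in C# 4.0. CSharp4Sketch, Greet, and LengthOf are hypothetical names invented for this example.

```csharp
using System;

public static class CSharp4Sketch {
    // 'greeting' and 'punctuation' are optional: callers may omit either one.
    public static string Greet(string name, string greeting = "Hello", string punctuation = "!") {
        return greeting + ", " + name + punctuation;
    }

    // 'dynamic' defers member lookup to run time, so Length is resolved
    // against whatever object is actually passed in.
    public static int LengthOf(dynamic value) {
        return value.Length;
    }
}
```

Calling CSharp4Sketch.Greet("Lee", punctuation: "?") skips the middle parameter by naming the last one. LengthOf works on both a string and an array, and fails at run time (not compile time) for an object with no Length member, which is exactly the controversial part.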

Technology: F#
Overview: Functional programming language to ship by default with VS
Impressions: I'm playing with this now, it's very interesting, but I'm not prepared to comment yet

Technology: Managed Extensibility Framework
Overview: Massively simplify creating desktop app add-on frameworks.
Impressions: This has the potential to change desktop apps to very light containers that load functionality as necessary. I am hopeful and excited.

Technology: VS 2010 IDE Improvements
Overview: WPF & XAML Editor better, Multi-Monitor Support, Built on WPF
Impressions: Cool, looking forward to trying. I like the dog-food approach, although as I've pointed out earlier dog-food isn't everything.

Technology: ADO.Net Entity Framework V2
Overview: The DBMS agnostic LINQ based ORM now with N-tier improvement, DDL generation, caching
Impressions: DDL generation was interesting, but apparently won't support deltas, that's for another product, maybe Oslo. Caching support looks fabulous.

Technology: WPF
Overview: WPF to have more controls, better usability in VS, and simplify ribbon programming
Impressions: Cool, but .. am I the only person who hates the ribbon?

Technology: Silverlight 3
Overview: 3D & GPU support; will work as well as WPF in VS; H.264 support
Impressions: 3D = Very cool; VS WPF = Cool, although I may still program in XAML; and H dot two sixty what?

Technology: Web Development in VS 2010
Overview: JavaScript IntelliSense improvements; view-state improvements; simplified deployment (e.g. Web.Production.Config & setup packages); new controls (chart)
Impressions: Very nice.

Technology: ASP.Net AJAX – Client Templates
Overview: Simplifying AJAX without an UpdatePanel
Impressions: I'm lazy so I'll probably stick with UpdatePanel & server based approach, but I'm very happy to see the new client based focus as an additional option

Technology: Workflow Foundation (WF)
Overview: More controls, faster, better debugging
Impressions: I haven't really had a need for WF, but the enhancements sound good.

Technology: Oslo
Overview: Create DSLs that do things like help manage databases
Impressions: I fail to see what problem this solves, but Martin Fowler thinks it's cool, so I'll keep an eye on it.

Technology: SQL Data Services
Overview: Basically Amazon S3 (cloud stuff)
Impressions: In case you didn't know I love referential integrity, so I'm not a big fan. What did impress me is Microsoft will merge the LINQ serialization approach with ADO.Net Data Services. Nice benefit to multiple technologies within one company.

Summary

There's so much going on it's a little overwhelming and I'm sure I missed stuff, but you can see why I'm so excited. I vow to never miss a PDC again. Anyway I hope this post helped sort out the vast number of exciting new technologies coming out soon.

Wednesday, November 26, 2008

Write My Rhino Mocks Expect Statement

Mocking multiple calls to a complicated external dependency (using RhinoMocks for instance) can be challenging if you want to return realistic data. But if you have access to the external dependency and it has decent test data why can’t your mocking tool just call the real dependency and give you the exact Expect() and Return() statements that you need to mock its real state?

Well, now it can with WriteMyExpectStatement on CodePlex. How much would you expect to pay for this power? $1,000? $100? How about absolutely free! Alright, guess I've been watching too much late night TV. Let’s examine the problem via an example.

Example: Mocking a Database

Suppose you have a Northwind style database with products and orders and you want to re-order products whose inventory is low:

public class NorthwindService {
  ...
  public void ReorderLowInventoryProducts() {
    using (SqlCeConnection cn = NorthwindDao.GetConnection()) {
      NorthwindDao.Connection = cn; // simple IOC
      IEnumerable<Product> products = NorthwindDao.GetLowInventoryProducts();
      // cache lookup tables in memory
      Dictionary<int, Supplier> suppliers = NorthwindDao.GetAllSuppliers()
        .ToDictionary(k => k.SupplierID); // yayyyy, LINQ

      foreach (Product product in products) {
        Supplier supplier = suppliers[product.SupplierId];
        supplier.OrderMoreProduct(product);
      }
    }
  }
}

It’s a simple example, but already there is a dependency between the data of the two calls that need to be mocked. Specifically GetLowInventoryProducts() specifies a SupplierId whose value must be returned by GetAllSuppliers(). This isn’t so complicated that you couldn’t mock it yourself, but you can imagine a more complicated example with multiple calls and multiple data dependencies.

RhinoMocks.PleaseJustWriteMyExpectStatementforMe = true

In order to mock the above using Rhino Mocks you would first need some code like the following:

private static void ReorderLowInventoryProductsMocked() {
  NorthwindService northwindService = new NorthwindService();

  MockRepository repository = new MockRepository();
  NorthwindDao northwindDaoMock = repository.StrictMock<NorthwindDao>();
  northwindService.NorthwindDao = northwindDaoMock;

  // put .Expect() statements here

  repository.ReplayAll();

  northwindService.ReorderLowInventoryProducts();
}

And assuming NorthwindDao’s methods are virtual you’ll get

ExpectationViolationException NorthwindDao.GetConnection(); Expected #0, Actual #1.

But if you let WriteMyExpectStatement at it using the following:

try {
  northwindService.ReorderLowInventoryProducts();
} catch (ExpectationViolationException ex) {
  NorthwindDao northwindDaoReal = new NorthwindDao();
  using (SqlCeConnection cn = northwindDaoReal.GetConnection()) {
    northwindDaoReal.Connection = cn;
    MyExpectStatement.Write(ex, "northwindDaoMock", northwindDaoReal);
  }
}

You’ll literally get an exception that looks like this:

NorthwindDao.GetConnection(); Expected #0, Actual #1.
WriteMyExpectStatement:
SqlCeConnection sqlCeConnection = new SqlCeConnection {ConnectionString = "Data Source=Northwind.sdf;Persist Security Info=False;", }
Expect.Call(northwindDaoMock.GetConnection()).Return(sqlCeConnection);

Cool huh? Now if you paste that code at “// put .Expect() statements here” and run it again you’ll get:

NorthwindDao.GetLowInventoryProducts(); Expected #0, Actual #1.
WriteMyExpectStatement:
IEnumerable<Product> products = new List<Product> {new Product {ReorderLevel = 17, UnitsInStock = 25, ProductName = "Chang", Discontinued = false, ProductID = 2, SupplierId = 1, }, … };
Expect.Call(northwindDaoMock.GetLowInventoryProducts()).Return(products);

Just keep copying and pasting until all of your expect statements are there and you’re done.

Limitations

WriteMyExpectStatement knows how to generate the code to do most things and it will recurse into any complex objects you throw at it (e.g. notice how it picked up the fields in the Product class that it had no a priori knowledge of). What it can’t do as well is call methods that take complicated parameters. For instance through reflection it needs to call GetLowInventoryProducts() on northwindDaoReal. If we abandoned InversionOfControl (IOC) and had SqlCeConnection be a parameter to GetLowInventoryProducts(), then WriteMyExpectStatement would have absolutely no idea that it would need to call SqlCeConnection.Open() and so would fail. So as long as you primarily use primitives as parameters to the things you mock you should be able to use WriteMyExpectStatement. And if you have bigger requirements let me know (“codeplex@l” + “eerichardson.c” + “om”), I’d be happy to let you in to the project.
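
To give a feel for the recursion, here is a minimal sketch of reflection-based initializer generation. It is not the actual WriteMyExpectStatement code: InitializerWriter and SampleProduct are hypothetical names, it only understands strings, bools, and ints plus objects with public read/write properties, and it would recurse forever on cyclic object graphs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;

public static class InitializerWriter {
    // Emits C# source that would recreate 'value'.
    public static string Write(object value) {
        if (value == null) return "null";
        if (value is string) {
            string escaped = ((string)value).Replace("\\", "\\\\").Replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
        if (value is bool) return (bool)value ? "true" : "false";
        if (value is int) return ((int)value).ToString(CultureInfo.InvariantCulture);

        // Anything else is treated as a bag of public read/write properties,
        // and each property value is written recursively.
        Type type = value.GetType();
        IEnumerable<string> assignments = type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.CanWrite && p.GetIndexParameters().Length == 0)
            .OrderBy(p => p.Name, StringComparer.Ordinal)
            .Select(p => p.Name + " = " + Write(p.GetValue(value, null)));
        return "new " + type.Name + " { " + string.Join(", ", assignments.ToArray()) + " }";
    }
}

// Hypothetical stand-in for the Product class used in the example above.
public class SampleProduct {
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public bool Discontinued { get; set; }
}
```

Passing in new SampleProduct { ProductID = 2, ProductName = "Chang" } yields the text new SampleProduct { Discontinued = false, ProductID = 2, ProductName = "Chang" }, with the properties in alphabetical order because of the OrderBy.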

Summary

I realize most situations don’t really need code generation for their expect statements, but if for instance you have existing integration tests that you want to convert to mocks, then something like this is essential. It was for me anyway.

Wednesday, November 12, 2008

Applications Can't Use SharePoint Master Pages

This is the story of a stupid SharePoint problem and an ugly, kludgy, and embarrassing solution. On our project we have the need for an application that looks and feels like a SharePoint subsite. Specifically, we need it to inherit from SharePoint's masterpage. But we also need it to be a separate IIS application.

Problem: SharePoint dislikes Applications

We get the following error when we try to dynamically set the MasterPageFile in code:

System.ArgumentException: The virtual path '/_layouts/application.master' maps to another application, which is not allowed.

Several sites say we just can't do what we're trying to do. We considered a Virtual Path Provider, but decided not to even go down that path because SharePoint's master page undoubtedly has hooks into web.config and httpmodules and such, so we came up with the following hack:

Solution: An Ugly SharePoint Text Manipulation Hack

Deploy a nearly blank page into /_layouts/ (we do this in our WebAppManifest.xml & WebAppSolution.ddf files). In PageBase (that all pages inherit from) we override Render to retrieve the blank page and do text manipulation to insert our content into it. Here's the code:

protected override void Render(HtmlTextWriter writer) {
  HttpWebRequest request = (HttpWebRequest)WebRequest.Create(GetUrlToBlankPage());
  request.Credentials = CredentialCache.DefaultCredentials;
  using (WebResponse webResponse = request.GetResponse()) {
    StreamReader streamReader = new StreamReader(webResponse.GetResponseStream());
    string pageHostHtml = streamReader.ReadToEnd();

    using (StringWriter stringWriter = new StringWriter())
    using (HtmlTextWriter htmlTextWriter = new HtmlTextWriter(stringWriter)) {
      base.Render(htmlTextWriter);
     
      string renderPageHtml = stringWriter.ToString();

      string title = GetTitleFromHtml(renderPageHtml);
      // strings are immutable, so the result of Replace must be assigned back
      pageHostHtml = pageHostHtml.Replace("PAGE TITLE", title);

      string content = GetContentFromHtml(renderPageHtml);
     
      pageHostHtml = RemoveFormData(pageHostHtml);
      pageHostHtml = pageHostHtml.Replace("DO NOT MODIFY THIS PAGE", content);
    }
    writer.Write(pageHostHtml);
  }
}

private static string GetContentFromHtml(string html) {
  Match match = Regex.Match(html, "<body>(.*)</body>",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);
  return match.Success ? match.Groups[1].Captures[0].Value : "";
}

private static string GetTitleFromHtml(string html) {
  Match match = Regex.Match(html, "<title>(.*)</title>", RegexOptions.IgnoreCase);
  return match.Success ? match.Groups[1].Captures[0].Value : "";
}

private static string RemoveFormData(string html) {
  string replaced = Regex.Replace(html, "<form[^>]*>", "",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);
  replaced = replaced.Replace("</form>", "");
  return replaced;
}

Conclusion

I’m sure there are plenty of optimizations that we can do (e.g. caching), but that’s the basic idea. But this solution makes me feel so dirty. So please, please, dear reader, tell me there is a better way. I want to feel clean again.

Thursday, November 6, 2008

Eight Miserable TFS Features

I'd prefer to post a positive, happy, or ideally emotion-agnostic technical post, but today Microsoft Visual Studio Team Foundation Server (TFS)'s source control pissed me off one too many times. I could go on for pages, but this list represents the top eight reasons why you should never pick Team Foundation Server for source control (in decreasing order of annoyance).

1. Can't see changes on get latest

If you like to publicly humiliate co-workers when they violate coding standards, or even if you just want to keep an eye on what they're doing, then TFS is absolutely not for you. I'm completely spoiled by Tortoise SVN which allows you to

  1. Get latest
  2. See which files have changed; and
  3. Quickly view the diff

I suppose I should be happy TFS at least allows #1.

2. TFS drops Solution Items

Solution Items are files and folders that exist outside of projects, but within solutions. For instance my current project has an etc directory that contains database scripts and such. First of all it would be nice if Visual Studio automatically recognized new items in the etc directory. Ironically you have to "add existing item" even though it already exists. Fine, whatever. But more importantly it would be nice if Visual Studio didn't periodically delete said files from the solution. As convenient a feature as that may sound, it has caused numerous problems. A "Blame" has determined that nearly every person on the project has activated this feature, leading me to believe it is truly a bug. Brilliant.

3. Checked In Files Are Marked Read-only

Suppose you have a file you want to open outside of Visual Studio (gasp). With TFS if you don't already have Visual Studio open you have to start the IDE in order to check the file out to remove the read only bit. So ten minutes later you're ready to go. At least TFS does allow multiple check-out.

4. TFS server name in project file

TFS embeds the server name in every single one of your project files. The beauty of this approach becomes evident when one team member uses an IP address, while another uses the domain name or computer name. You end up fighting over the project files and automatically checking them out when all you want to do is view them.

5. Can't Access Source Control Outside of Visual Studio

Want to check in or get latest without opening Visual Studio? Yeah, you can't. Hope you don't have any Java (non-Microsoft) code on your project.

6. Reconciling, Painfully Slow

TFS actually does a decent job of auto merging conflicting files. And when it can't auto-merge conflicting files you can view each file side by side and click which change you want to keep. But the process is painful and slow. For instance if you forget to get latest before check in and there are conflicts you can resolve them, but your check in will always fail. Get latest seems to take much longer when there are conflicts. Auto merge is really slow. If there are un-reconcilable conflicts hitting the "resolve" button takes forever and must be done for each and every conflict. It's like it downloads the entire history of the file. And that wait is mandatory even if you simply want to discard either your or the server's changes.

7. *.vsmdi files

This is just a frustrating known bug where if you use TFS for unit testing (Team Foundation Server Test System) and TFS for source control, Visual Studio will take the .vsmdi file, which is essential for unit testing, and either duplicate it or corrupt it. There's a lot of ink spilled on this vsmdi bug and there are some workarounds, but it's a pain.

8. Moving a folder opens all files in the folder

This was the last straw and the impetus for writing this post. Today I wanted to move a "solution item" into a project. I opened the Source Control Explorer (which is impossible to find in the first place), right clicked, selected move, and entered my destination. TFS in its wisdom then opened every file in the folder, nearly bringing my machine to its knees. There are hundreds of small annoyances like this every day with TFS.

In Summary

TFS actually has some nice features that I like over SVN. The merge tool is pretty good, and being able to see everyone who is currently working on a file is nice. But overall the cons heavily outweigh the pros. And if it's any indication how bad TFS is for source control (or how much developers hate it) CodePlex has even implemented an SVN to TFS bridge on their server.

It's ridiculous to me that Microsoft uses this tool internally and sells it and people buy it! I must be missing something. If so please post comments to this post. Because the entire thing boggles my mind.

Thursday, October 23, 2008

Forget Burndown Use Burnup Charts

Agile projects traditionally use burndown charts to visually show work remaining over time. This could be for the current iteration or it could be for the duration of the project. Either way they can help managers (or the Product Owner in Scrum) track velocity, estimate either the project or iteration completion date, or find trends in past performance. But burndown charts have a major shortcoming: they fail to show what makes agile projects agile – new requirements. And that’s where burnup charts come in. But first let’s examine burndown charts.

The Problem with Burndown

A typical burndown for the life of a project might look like this:

It shows the project started with 100 points of work in the backlog; it’s completed eight iterations; the team accomplishes about ten points an iteration; and if everything continues at the current velocity the project will complete all work within another two iterations. Great.

But what happened in iteration six? Very little appears to have been accomplished. Maybe the team all took a vacation. Maybe there was a major problem or the team incorrectly estimated complexity. Or perhaps a large set of new requirements were added to the backlog because the customer decided what they thought they wanted wasn’t what they really needed: namely the exact scenario agile was designed for.

Burnup Charts

The problem is that burndown charts lack two essential pieces of information. First, how much work was actually accomplished during a given iteration (as opposed to how much work remains to be completed) and second how much total work the project contains (or if you prefer how much scope has increased each iteration). A burnup chart for the exact project above might look like this:

We can now clearly see that the team did not take a breather in iteration six. They continued to complete about ten points per iteration, but during the sixth iteration the scope increased by about twenty points.

One could imagine the opposite happening as well. Later in the project the team might delete old user stories that were envisioned during project inception and thus decrease the total scope. The burndown chart would incorrectly show such a scenario as a sudden increase in velocity.
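
The arithmetic behind the two charts can be sketched in a few lines. This is only an illustration: the BurnCharts class is hypothetical and the numbers in the note below are made up, not read off the charts above.

```csharp
using System;

public static class BurnCharts {
    // Work remaining after each iteration: the single line a burndown chart plots.
    public static int[] Remaining(int initialScope, int[] completed, int[] scopeAdded) {
        int[] remaining = new int[completed.Length];
        int current = initialScope;
        for (int i = 0; i < completed.Length; i++) {
            current += scopeAdded[i] - completed[i];
            remaining[i] = current;
        }
        return remaining;
    }

    // Running total of per-iteration changes: used for both burnup lines
    // (work completed starting from 0, and total scope starting from the
    // initial backlog size).
    public static int[] RunningTotal(int start, int[] deltas) {
        int[] totals = new int[deltas.Length];
        int current = start;
        for (int i = 0; i < deltas.Length; i++) {
            current += deltas[i];
            totals[i] = current;
        }
        return totals;
    }
}
```

Take a 100 point backlog, eight iterations of 10 points completed each, and 20 points of scope added in iteration six. Remaining gives 90, 80, 70, 60, 50, 60, 50, 40, so the burndown line actually goes up in iteration six. RunningTotal(0, completed) climbs steadily from 10 to 80, while RunningTotal(100, scopeAdded) sits at 100 and then steps up to 120. The burnup chart plots those last two lines, and the burndown line is just the gap between them.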

Summary

Either way the burndown chart hides essential information. I propose we throw it away and show the slightly more complicated, but infinitely more useful burnup chart. After all you wouldn’t want upper management thinking you were lazy in iteration six would you?

4/10/2010 EDIT

For more information on how to produce burn-up charts see my video: How to Create Burnup and Burndown Charts in SharePoint.

Monday, August 11, 2008

Silverlight: Cannot specify both Name and x:Name attributes

I received the following error while upgrading from Silverlight beta 1 to beta 2: Cannot specify both Name and x:Name attributes. I’m sure there can be many causes, but since there isn’t much written on this, here is one cause.

If you have a custom control you can no longer put an x:Name on the root element if you also have an x:Name on an instance of the control. A crappy work-around is to declare the root node in code and set it to the parent of a child element. For instance:

<Canvas x:Class="MyNamespace.MyCustomControl"
    xmlns="http://schemas.microsoft.com/client/2007"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    x:Name="LayoutRoot"
    >
    <StackPanel x:Name="_stackPanel">

Canvas LayoutRoot;

/// <summary>
/// Constructor
/// </summary>
public MyCustomControl() {
    InitializeComponent();

    // set LayoutRoot after InitializeComponent
    //     so _stackPanel is initialized
    LayoutRoot = (Canvas)_stackPanel.Parent;
}

This is pretty fragile, so I’d love to hear any other solutions.

Saturday, August 9, 2008

DEFCON 16 Day One

It would probably be more interesting to blog about my numerous traveling mishaps (summary: never stay at Circus Circus however close it may be to your conference), but I thought a quick bullet point version of the first day at DEFCON 16 would be more relevant. Incidentally this trip was possible because of a generous training budget benefit provided by Near Infinity.

Making the DEFCON 16 Badge, Joe “Kingpin” Grand

  • The default firmware makes the badge function as a TV-B-Gone (it really turned on/off my hotel room TV ... cool)
  • You should be able to transfer files between SD card readers in badges (haven’t tried)
  • Later in the day I soldered for my first time ever and added a mini-USB port to the card in the “Hardware Hacking Village.” This was a rush, very cool. Thanks to Joe and Andrew for the assistance.

Deciphering Captcha, Michael Brooks

  • It never occurred to me you could generate all possible outcomes of a captcha algorithm to break it, cool idea
  • Never read a script at a conference, however interesting the topic may be, I left after 5 minutes

Whitespace: A Different Approach to JavaScript Obfuscation, Kolisar

  • Problem: Your Cross Site Scripting Code needs to evade human and/or automatic detection
  • Solution: Embed JavaScript code in whitespace as tabs (1) and spaces (0)
  • This was a really cool demo

New Tool for SQL Injection and DNS Exfiltration, Robert Ricks

  • SQL Injection is still a widespread vulnerability
  • Once you get in you can do all kinds of things, their demo included viewing the database schema
  • They showed a tool that automatically exploits vulnerabilities really fast, very cool

Living in the RIA (Rich Internet Application) World, Alex Stamos, David Thiel & Justine Osborne

  • A review of security vulnerabilities in five new RIA technologies: Adobe AIR, MS Silverlight, Google Gears, Mozilla Prism, and HTML 5
  • All technologies significantly increased attack surface and had vulnerabilities
  • Surprisingly Silverlight seemed to come out on top (go MS) followed by Adobe AIR
  • Google Gears, Mozilla Prism, and HTML 5 all have a long way to go

Bringing Sexy Back: Breaking in with Style, David Maynor & Robert Graham

  • Penetration testing is generally boring work, just run a few tools
  • Banks and federal agencies need to do more serious and creative pen-testing
  • Scenario: Russian czar hires developers with a 1 million budget and wants to break in to your company
  • Approach 1: Put iPhone in box, attach external power supply, turn on, FedEx to target company; while box is in mail room, TTY to phone, bypass physical security, attack wireless network
  • Approach 2: “Spear Phishing” – Start fake company, get domain name, get SSL certificate, send “New 401(k) Provider” e-mail to target company, link to bogus website, sign malicious ActiveX control so there aren’t warnings when they download
  • Wow, this talk was really cool, really eye opening

Keeping Secret Secrets Secret & Sharing Secret Secrets Secretly, Vic Vandal

  • As a presenter always err on the side of going too fast, and assuming your audience is smarter than you, this presentation sucked
  • Steganography is more than least significant bit image modification, it generally is about hiding things in plain sight such as hiding data after the EOF character in text files

Free Anonymous Internet Using Modified Cable Modems, Blake Self & Durandal

  • I didn’t actually attend this, I’m hoping Joe or Andrew will blog about it, because man did it sound cool. I’ll post a link if I see anything.

All in all this conference is absolutely amazing and eye opening. It really does give you a completely different perspective on the security industry.

Tuesday, May 27, 2008

A Major Silverlight PITA and Two Annoying 3.0 Limitations

Pardon my rant, but the thing I currently hate most about Silverlight (besides copious XML) is the Visibility property. Any sane framework would implement Visibility as a Boolean. Not Silverlight though. Its creators, in their undoubtedly infinite wisdom, implemented it as an enumeration. The values of the enumeration? There are two: Visible and Collapsed. Hmmm.

Of course this causes superfluous verbosity in common everyday code:

button1.Visibility = makeVisible ? Visibility.Visible : Visibility.Collapsed;

Or worse when things get a little more complex:

// don't display the panel if its buttons aren't visible
panel1.Visibility = (button1.Visibility == Visibility.Visible || button2.Visibility == Visibility.Visible) ? Visibility.Visible : Visibility.Collapsed;

Clearly this was done to keep Silverlight compatible with the Windows Presentation Foundation (WPF) which has three values in its enumeration property (Visible, Hidden, and Collapsed). But that’s just as ridiculous. Why WPF couldn’t use two properties, Visible (Boolean) and NotVisibleBehavior (enumeration) is beyond me.

It’s ok though, because C# 3.0 gave me a cure to any Framework shortcomings: Extension Methods. A syntactic sugar cure for all my bitterness:

public static void SetVisible(this FrameworkElement element, bool visible) {
    element.Visibility = visible ? Visibility.Visible : Visibility.Collapsed;
}

public static bool IsVisible(this Visibility visibility) {
    return visibility == Visibility.Visible;
}

Fantastic, now my "complex" example becomes:

// don't display the panel if its buttons aren't visible
panel1.SetVisible(button1.Visibility.IsVisible() || button2.Visibility.IsVisible());

Still not quite as nice as a Boolean visible property, but certainly doable.

3.0 Limitation #1, By Ref Extension Methods

But wait. Isn’t it best practice in Silverlight to use binding for these types of things? Separation of logic from presentation and all. So I should do:

<StackPanel Visibility="{Binding IsPanelVisible}">

And then:

public class DisplayStuff : INotifyPropertyChanged {
    public event PropertyChangedEventHandler PropertyChanged;

    public Visibility IsPanelVisible { get; private set; }

    public void UpdateStatus(bool makeVisible) {
        IsPanelVisible = makeVisible ? Visibility.Visible : Visibility.Collapsed;
        // make sure to notify the control that the property has changed
        // (the event is null until something subscribes to it)
        if (PropertyChanged != null) {
            PropertyChanged(this, new PropertyChangedEventArgs("IsPanelVisible"));
        }
    }
}

And we can set the DataContext of some parent element to an instance of DisplayStuff and all the children including our panel magically databind. That’s cool, but the ugliness is back (well, not as bad since I removed the buttons to simplify the example, but you can pretend). This is because we extended FrameworkElement not Visibility. No problem, just extend Visibility right?

public static void SetVisible(this Visibility visibility, bool visible) {
    visibility = visible ? Visibility.Visible : Visibility.Collapsed;
}

Except this doesn’t work. Can you spot the problem?

It compiles. It runs. But the value of IsPanelVisible never changes. Oh yeah, C# is pass by value by default. And now the C# 3.0 limitation. This isn’t possible:

public static void SetVisible(this ref Visibility visibility, bool visible) {

You get "The parameter modifier 'ref' cannot be used with 'this'." Grr.

Limitation #2, By Ref Automatic Properties

Ok, so remove “this”, and go back to C# 2.0 helper functions which extension methods are syntactic sugar for anyway:

public static void SetVisible(ref Visibility visibility, bool visible) {

And now our class can do:

ExtensionMethods.SetVisible(ref IsPanelVisible, makeVisible);

Right? Not so fast I’m afraid. Compile error. “A property or indexer may not be passed as an out or ref parameter”. And I guess this is reasonable. You can’t pass the address of a function, which is what a property is in the background. So you should pass the private variable that backs the property.

Except that I don’t have one! I used an automatic property. And .Net doesn’t let me access the private variable backing the automatic property. So I’m stuck!

And this is C# 3.0 limitation #2. Automatic properties are wonderful until you try to do much with them. Why couldn’t the compiler notice that I’m using an automatic property and pass the variable that I can’t access by ref to my function?

And now I find myself back in a C# 2.0 world because all the features I like so much in 3.0 are more sugar than substance.
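
For the record, that fallback looks roughly like the sketch below. To keep it compilable on its own it declares a stand-in Visibility enum instead of using System.Windows.Visibility, the names VisibilityHelper and DisplayStuffWithField are hypothetical, and the PropertyChanged plumbing is left out.

```csharp
using System;

// Stand-in for System.Windows.Visibility so the sketch compiles on its own.
public enum Visibility { Visible, Collapsed }

public static class VisibilityHelper {
    // Plain static helper: 'ref' is allowed here because there is no 'this'.
    public static void SetVisible(ref Visibility visibility, bool visible) {
        visibility = visible ? Visibility.Visible : Visibility.Collapsed;
    }
}

public class DisplayStuffWithField {
    // Explicit backing field: unlike an automatic property, it can be passed by ref.
    private Visibility _isPanelVisible = Visibility.Collapsed;

    public Visibility IsPanelVisible {
        get { return _isPanelVisible; }
    }

    public void UpdateStatus(bool makeVisible) {
        VisibilityHelper.SetVisible(ref _isPanelVisible, makeVisible);
    }
}
```

It works, but the price is the hand-written field and getter that automatic properties were supposed to eliminate.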

Conclusion

Allowing automatic properties to pass by reference or allowing access to the private member behind them would be nice. Allowing extension methods to change the instance they extend would be nice. But ultimately none of this would be a problem if Visible had been implemented as a Boolean. The way every other framework in the world does. </Complaining>

Thursday, March 27, 2008

Expression Trees: Why LINQ to SQL is Better than NHibernate

In my last post I described how the Where() function works for LINQ to Objects via extension methods and the yield statement. That was interesting. But where things get crazy is how the other LINQ technologies, like LINQ to SQL use extension methods. In particular it’s their use of a new C# 3 feature called expression trees that makes them extremely powerful. And it’s an advantage that more traditional technologies like NHibernate will never touch until they branch out from being a simple port of a Java technology. In this post I’ll explain the inherent advantage conferred on LINQ technologies by expression trees and attempt to describe how the magic works.

What’s so Magic about LINQ to SQL?

LINQ to SQL (and its more powerful unreleased cousin LINQ to Entities) is a new Object Relational Mapping (ORM) technology from Microsoft. It allows you to write something like the following:

IEnumerable<Product> products = northwindDataContext.Products.Where(
      p => p.Category.CategoryName == "Beverages"
      );

Which as you’d expect returns products from the database whose category is Beverages. But wait, aren’t you impressed? If not read over that code again, you should be very impressed. In the background that C# code is converted into the following SQL:

SELECT [t0].[ProductID], [t0].[ProductName], ...
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN [dbo].[Categories] AS [t1]
ON [t1].[CategoryID] = [t0].[CategoryID]
WHERE [t1].[CategoryName] = @p0

In other words it’s pretty smart. It isn’t just returning all products and filtering them in memory using the LINQ to Objects version of Where() I discussed previously.

Doing something like that using NHibernate Criteria would require something like this:

ICriteria c = session.CreateCriteria(typeof(Product));
c.Add(Expression.Eq("Category.CategoryName", "Beverages"));
IEnumerable<Product> products = c.List<Product>();

You could use HQL too, but both NHibernate options suffer from the same problem. Did you spot it?

The LINQ to SQL version is taking actual strongly typed C# code and somehow smartly converting it to useful SQL. The NHibernate version does the same thing, but always using a weakly typed alternative. In other words the column “CategoryName” in NHibernate is a string. If it or its data type change in NHibernate you won’t find out until runtime. And that is the beauty of LINQ to SQL: you’ll find more errors at compile time. And if you’re like me you want the compiler to find your mistakes before the unit tests that you (or your fellow developers) may or may not have written do.

So you’re probably now wondering if you can put strongly typed C# in your where clause and it somehow magically gets converted to SQL, what’s the limit? If you put in a String.ToLower() or StartsWith() will it get converted to equivalent SQL? What about a loop or conditional? A function call? A recursive function call? At some point it has to break down and either return all products and filter them in memory or just fail right? Before answering those questions we need to understand what’s going on.

Understanding the Magic

The magic happens in a class called Expression<TDelegate>. It takes a generic argument that must be a delegate type, usually one of the built-in Func delegates. The interesting part is that the compiler will only convert a lambda expression to it. That’s right, not a delegate instance or an anonymous method, only a lambda expression. In my deferred execution post, where I explained what lambda expressions are, I said they were essentially syntactic sugar for anonymous methods. Well, the emphasis is on the essentially, because here they really aren’t sugar at all. When you assign a lambda expression to an Expression<TDelegate>, the compiler, rather than generating the IL to evaluate the expression, generates IL that constructs an abstract syntax tree (AST) for the expression! You can then walk the tree and perform actions based on the code in the lambda expression.

Below is an example adapted from the .Net Developer’s guide on MSDN that shows how this works:

// convert the lambda expression to an abstract syntax tree
Expression<Func<int, bool>> expression = i => i < 5;

ParameterExpression param = (ParameterExpression)expression.Parameters[0];
// this next line would fail if we change the Lambda expression much
BinaryExpression operation = (BinaryExpression)expression.Body;
ParameterExpression left = (ParameterExpression)operation.Left;
ConstantExpression right = (ConstantExpression)operation.Right;

Console.WriteLine("Decomposed expression: {0} => {1} {2} {3}",
      param.Name,
      left.Name,
      operation.NodeType,
      right.Value
      );

This outputs “Decomposed expression: i => i LessThan 5”. The first line is the most important. It defines an Expression that takes a delegate with a single int parameter and a return type of bool. It then instantiates the Expression to a simple lambda expression.  Incidentally this would also work if we defined our own Delegate:

public delegate bool LessThanFive(int i);

public static void DoStuff() {
      Expression<LessThanFive> expression = i => i < 5;
}

It would, however, not work if we used an anonymous method:

Expression<Func<int, bool>> expression = delegate(int i) { return i < 5; };

While that looks legal it actually results in the compile time error “An anonymous method expression cannot be converted to an expression tree.”
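
To make the "compiler builds a tree instead of IL" idea a little more concrete, here is a rough sketch of building the same i => i < 5 tree by hand with the factory methods on System.Linq.Expressions.Expression. The class name ManualTree is made up for illustration, and this is only an approximation of what the compiler emits, not a dump of its actual output:

```csharp
using System;
using System.Linq.Expressions;

public static class ManualTree {
    // Builds, node by node, a tree equivalent to the lambda: i => i < 5
    public static Expression<Func<int, bool>> Build() {
        ParameterExpression i = Expression.Parameter(typeof(int), "i");
        ConstantExpression five = Expression.Constant(5, typeof(int));
        BinaryExpression body = Expression.LessThan(i, five);
        return Expression.Lambda<Func<int, bool>>(body, i);
    }

    public static void Main() {
        Expression<Func<int, bool>> expression = Build();

        // A tree is just data until you compile it into a delegate
        Func<int, bool> compiled = expression.Compile();
        Console.WriteLine(compiled(3));  // True
        Console.WriteLine(compiled(7));  // False
    }
}
```

The point of the sketch is that an expression tree is an ordinary object graph. A LINQ provider can inspect it and translate it, or you can call Compile() and run it like any other delegate.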

There is a lot of complexity in parsing the AST, far beyond the scope of this article. However, MSDN does have a nice diagram that helps explain the tree for the following slightly more complicated lambda expression, which determines whether a number is greater than the length of a string:

Expression<Func<string, int, bool>> expression =
    (str, num) => num > str.Length;
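
In lieu of a diagram, the shape of that tree can be sketched in code. This is a minimal example that assumes the lambda stays exactly as written (the casts would fail otherwise); TreeWalk and Describe are names invented for the illustration:

```csharp
using System;
using System.Linq.Expressions;

public static class TreeWalk {
    // Returns a one-line description of the tree behind
    // (str, num) => num > str.Length
    public static string Describe() {
        Expression<Func<string, int, bool>> expression =
            (str, num) => num > str.Length;

        // The body is a binary node: left operand, operator, right operand
        BinaryExpression body = (BinaryExpression)expression.Body;
        ParameterExpression left = (ParameterExpression)body.Left;

        // str.Length is a member access node whose target is the str parameter
        MemberExpression right = (MemberExpression)body.Right;
        ParameterExpression target = (ParameterExpression)right.Expression;

        return string.Format("{0} {1} {2}.{3}",
            left.Name, body.NodeType, target.Name, right.Member.Name);
    }

    public static void Main() {
        Console.WriteLine(Describe());  // num GreaterThan str.Length
    }
}
```

So the root is a GreaterThan node with the num parameter on the left and, on the right, a member access node that in turn points at the str parameter.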

How Deep Does The Rabbit Hole Go?

So LINQ to SQL uses this Expression Tree technique to parse a plethora of possible code that you could throw at it and turn it into smart SQL. For instance check out a couple of the following conversions that LINQ to SQL will (or will not) perform:

p => p.Category.CategoryName.ToLower() == "beverages"

Results In:

SELECT [t0].[ProductID], ...
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN [dbo].[Categories] AS [t1] ON [t1].[CategoryID] = [t0].[CategoryID]
WHERE LOWER([t1].[CategoryName]) = @p0

Not bad, huh? How about:

p => p.Category.CategoryName.Contains("everage")

That results in the following SQL snippet:

WHERE [t1].[CategoryName] LIKE @p0

And it sets @p0 to “%everage%”. Pretty cool. Ok this will get it to fail though, right?

public static string GetCat() {
    return "Beverages";
}

IEnumerable<Product> products = northwindDataContext.Products.Where(
      p => p.Category.CategoryName == GetCat()
      );

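It turns out this works too, though not because LINQ to SQL looks inside GetCat(). The expression tree only records that GetCat() is called, and since the call doesn’t depend on p, LINQ to SQL evaluates it locally and passes the result to the database as a parameter. Alright, there’s no way it can do complicated conditionals: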

p => p.Category.CategoryName ==
    "Beverages" ? p.UnitsInStock < 5 : !p.Discontinued

This should pick up Beverages that have fewer than 5 items in stock, regardless of whether they are discontinued, plus any other products that aren’t discontinued. Would you believe that it runs a single SQL statement?

SELECT [t0].[ProductID], ...
FROM [dbo].[Products] AS [t0]
LEFT OUTER JOIN [dbo].[Categories] AS [t1] ON [t1].[CategoryID] = [t0].[CategoryID]
WHERE (
    (CASE
        WHEN [t1].[CategoryName] = @p0 THEN
            (CASE
                WHEN [t0].[UnitsInStock] < @p1 THEN 1
                WHEN NOT ([t0].[UnitsInStock] < @p1) THEN 0
                ELSE NULL
             END)
        ELSE CONVERT(Int,
            (CASE
                WHEN NOT ([t0].[Discontinued] = 1) THEN 1
                WHEN NOT NOT ([t0].[Discontinued] = 1) THEN 0
                ELSE NULL
             END))
     END)) = 1

Wow, it sure isn’t pretty, but it scales to multiple conditionals, and most importantly it didn’t return all products and process them in memory. Not bad.
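
Going back to the GetCat() example above, here is a small sketch of what a provider actually has to work with. It needs no database: Product is a hypothetical stand-in for the Northwind entity, and EvaluateLocally is my own illustration of evaluating a sub-expression on the client, not LINQ to SQL's real code:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the Northwind entity; no database involved
public class Product {
    public string CategoryName { get; set; }
}

public static class LocalEvaluation {
    public static string GetCat() {
        return "Beverages";
    }

    // Returns the right-hand side of the == exactly as it appears in the tree
    public static MethodCallExpression GetCallNode() {
        Expression<Func<Product, bool>> filter =
            p => p.CategoryName == GetCat();
        BinaryExpression body = (BinaryExpression)filter.Body;
        return (MethodCallExpression)body.Right;
    }

    // Illustration only: a sub-expression that uses no lambda parameters
    // can be wrapped in a parameterless lambda and run on the client
    public static string EvaluateLocally(Expression node) {
        return Expression.Lambda<Func<string>>(node).Compile()();
    }

    public static void Main() {
        MethodCallExpression call = GetCallNode();
        Console.WriteLine(call.Method.Name);       // GetCat
        Console.WriteLine(EvaluateLocally(call));  // Beverages
    }
}
```

The tree contains a node that says "call GetCat()", not the body of GetCat(). Because that node never touches p, it can be reduced to a plain string before any SQL is generated.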

Conclusion

I asserted up front that using expression trees and the strong typing that comes with them is the reason LINQ to SQL is inherently better than NHibernate. I really can’t make that claim without admitting one of LINQ to SQL’s biggest shortcomings: It currently does not support multiple table inheritance. Ultimately, however, it’s a short term fault since the forthcoming LINQ to Entities does. And I stand by my claim because from a long term perspective as long as technologies like NHibernate remain pure ports of Java code they will never realize the full benefits of equivalent LINQ technologies that take advantage of .Net's native strengths, like expression trees.

Friday, March 14, 2008

How System.Linq.Where() Really Works

After writing my last blog entry on Deferred Execution in LINQ I had a conversation with Seth Schroeder who rightly pointed out among other things that I really didn't show how LINQ's deferred execution works internally. So in this post I wanted to implement my own LINQ Where() extension method based off of the one in the System.Linq namespace. So I'll show you the code, explain interesting parts of how it works including collection initializers and extension methods, and then explain where the deferred execution behavior comes from (i.e. the yield statement). I will only explain in the context of LINQ to Objects since that's far simpler than the other LINQ flavors. I will implement a Where() like LINQ to SQL does in a later blog post (that's where things get really crazy).

Implementing MyWhere()

Let's start out with some code. The first question is does this compile?

using System;
using System.Collections.Generic;
using MyExtensionMethods;

namespace PlayingWithLinq {
    public class LinqToObjects {
        public static void DoStuff() {
            IList<int> ints = new List<int>() {9,8,7,6,5,4,3,2,1};

            IEnumerable<int> result = ints.MyWhere(i => i < 5);

            foreach (int i in result) {
                Console.WriteLine(i);
            }
        }
    }
}

namespace MyExtensionMethods {
    public static class ExtensionMethods {
        public static IEnumerable<TSource> MyWhere<TSource>(
            this IEnumerable<TSource> source,
            Func<TSource, bool> predicate
            ) {

            foreach (TSource element in source) {
                if (predicate(element)) {
                    yield return element;
                }
            }
        }
    }
}

Side note: putting two namespaces in one file is far from a best practice, but yes that is allowed.

Lambdas and Collection Initializers

If you're new to C# 3.0 then your first thought may be that:

IList<int> ints = new List<int>() {9,8,7,6,5,4,3,2,1};

is not allowed. Actually it is. It's the collection initializer syntax that I initially whined about in my post C# 3.0: The Sweet and Sour of Syntactic Sugar (ironically I actually like this syntax the more I use it.)

Your next thought may be that:

i => i < 5

is not legitimate. This is in fact a Lambda Expression, and as I explained in Deferred Execution, The Elegance of LINQ it conceptually compiles down to an anonymous method. Incidentally those that know Groovy (myself not included) or Lisp may know this as a closure since as we'll see later it can access local variables.

Extension Methods

Ok, the .Net Framework certainly has no MyWhere() function on the List object so this certainly wouldn't compile in C# 2. But that's where C# 3's Extension Methods come in. The "this" in:

MyWhere<TSource>(this IEnumerable<TSource> source,

says that MyWhere() can be applied to any generic IEnumerable. If you want to, you can still call MyWhere() normally:

IList<int> ints = new List<int>() {9,8,7,6,5,4,3,2,1};
ExtensionMethods.MyWhere(ints, i => i < 5);

And in fact this is what the compiler does in the background when you call MyWhere() off of an IEnumerable. But now with extension methods you don't have to.

But does MyWhere() now exist on all IEnumerable objects everywhere? No, it turns out you only get MyWhere() when you import the namespace it exists in (MyExtensionMethods). Incidentally unlike Groovy and Ruby there is no way to add an extension method to a class itself, only to instances.

Who's got the Func()?

The last two questionable parts of the code are the Func<TSource, bool> and the yield. Func is pretty easy. It's simply one of several new predefined delegate types (method signatures) that come with the .Net framework in the System namespace. The two generic argument version above will match any function that takes the first generic argument as a parameter and returns the second. It looks like this:

delegate TResult Func<T, TResult>(T arg1);

So rather than using a Lambda expression in my initial example I could have been very explicit about the delegate instance (myFunc):

public static void DoStuff() {
      IList<int> ints = new List<int>() {9,8,7,6,5,4,3,2,1};

      Func<int, bool> myFunc = IsSmall;
      IEnumerable<int> result = ints.MyWhere<int>(myFunc);

      foreach (int i in result) {
            Console.WriteLine(i);
      }
}

public static bool IsSmall(int i) {
      return i < 5;
}

And that would have done the same thing. Notice I specified the generic type on the call to MyWhere(). That's optional here: the compiler can infer TSource from ints, just as it did in the lambda version.

Yield

Now the really interesting part: yield. Yield is what makes deferred execution work. It actually was introduced with C# 2.0, but I don't think anyone really used it (I didn't know about it until recently). So because MyWhere() returns an IEnumerable (and because it isn't anonymous and doesn't have ref or out parameters) it is allowed to use the yield statement. When a method has a yield return (or yield break) statement, then execution of the method doesn't even begin until a calling method first iterates over the resulting IEnumerable. Execution then begins in the method and runs to the first yield statement, returns a result, and passes execution back to the caller. When the calling method iterates to the next value execution continues in the method where it left off until it gets to the next yield statement and then it passes execution back to the caller again and so on. Weird huh? Joshua Flanagan has a nice article that explains this in more detail along with some of the nice benefits like a smaller memory footprint.
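
The back-and-forth between caller and iterator is easier to see with a trace. Below is a minimal sketch (YieldTrace, Numbers and Run are names I made up) that records each step in a list so the order is visible:

```csharp
using System;
using System.Collections.Generic;

public static class YieldTrace {
    // Logs each step so the interleaving with the caller shows up
    public static IEnumerable<int> Numbers(List<string> log) {
        log.Add("iterator started");
        yield return 1;
        log.Add("resumed after 1");
        yield return 2;
        log.Add("iterator finished");
    }

    public static List<string> Run() {
        List<string> log = new List<string>();

        IEnumerable<int> numbers = Numbers(log);
        log.Add("after call");  // nothing inside Numbers() has run yet

        foreach (int n in numbers) {
            log.Add("caller got " + n);
        }
        return log;
    }

    public static void Main() {
        foreach (string line in Run()) {
            Console.WriteLine(line);
        }
    }
}
```

Running it prints "after call" first, then "iterator started", "caller got 1", "resumed after 1", "caller got 2" and finally "iterator finished". The call to Numbers() does no work at all; each pass through the foreach runs the iterator only as far as its next yield.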

So here's a quiz. What happens when you execute the following code?

IList<int> ints = new List<int>() {9,8,7,6,5,4,3,2,1};

IEnumerable<int> result = ints.MyWhere<int>(i => i < 4);

ints.Add(0);

foreach (int i in result) {
      Console.WriteLine(i);
}

Without the yield you'd get the numbers 3 through 1 since you added 0 after the call to MyWhere(). But since the yield in MyWhere() (and the Where() in System.Linq) defers execution until the foreach statement, you actually get 3 through 0. Ready for a little more mind bending? How about this:

IList<int> ints = new List<int>() {9,8,7,6,5,4,3,2,1};

int j = 4;

IEnumerable<int> result = ints.MyWhere<int>(i => i < j);

ints.Add(0);
j = 3;

foreach (int i in result) {
      Console.WriteLine(i);
}

Does the state of j get captured? My intuition would say yes. If so you'd expect 3 through 0. Well, closures in anonymous methods and lambdas capture the variable itself, not its value at the time the lambda was created: the compiler moves j into a hidden class that both the lambda and the enclosing method share. So consequently the lambda always sees the most up to date value of the variable. So if your intuition works like mine you'd be wrong. You actually get the numbers 2 through 0. Crazy huh? And definitely something I hope I won't run into in someone's code (JetBrains ReSharper actually warns you if you do something crazy like this).
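
To see why the later assignment to j is visible, it helps to write out by hand roughly what the compiler does with a captured local. This is only a sketch: Captured and IsLessThanJ are my own stand-ins for the compiler-generated class and method, whose real names you can't type in C#:

```csharp
using System;
using System.Collections.Generic;

public static class ClosureSketch {
    // Hand-written stand-in for the hidden class that holds captured locals
    private class Captured {
        public int j;

        public bool IsLessThanJ(int i) {
            return i < j;
        }
    }

    public static List<int> Run() {
        List<int> ints = new List<int>() { 9, 8, 7, 6, 5, 4, 3, 2, 1 };

        // "int j = 4;" becomes a field on a shared object
        Captured locals = new Captured();
        locals.j = 4;
        Func<int, bool> predicate = locals.IsLessThanJ;

        ints.Add(0);
        locals.j = 3;  // the predicate reads this same field later

        List<int> result = new List<int>();
        foreach (int i in ints) {
            if (predicate(i)) {
                result.Add(i);
            }
        }
        return result;
    }

    public static void Main() {
        foreach (int i in Run()) {
            Console.WriteLine(i);  // 2, then 1, then 0
        }
    }
}
```

Since the method and the delegate both go through the same Captured instance, there is only ever one j, and whatever value it holds when the predicate finally runs is the one that counts.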

Conclusion

If this made sense then you should have a pretty solid grasp of how most of LINQ to Objects works. Understanding extension methods, Func delegates, and yield statements covers the majority of what LINQ to Objects does. Well, except for expression trees. But that's a topic for another post. Please post if this doesn't make sense or if I got it all wrong, I'd love to hear from you.