Monday, March 23, 2015

Increase Maturity: Start a Sprint Closeout Checklist

If a sprint closeout checklist sounds excessively pedantic, consider this: how effective are you at resolving retrospective issues? If you aren't using a closeout checklist, the answer is probably: not as well as you think.

This is because there's a special class of retrospective item that you can only reliably close by addressing it on a recurring basis.



New Year's Resolution? Lies, all lies.


For instance, suppose during your regular retrospective developers raise the issue that they don't know when to mark stories complete because the stories don't contain enough detail. Or the problem could manifest itself as testers frequently rejecting vaguely written stories, to the chagrin of developers.

In either case you could address the issue by promising to write stories with more detail or (better) to include acceptance criteria. Sounds great: the retrospective ends, and everyone is happy since things will be much better going forward.

Only they won't. This solution is a mistake. Like most New Year's resolutions, promises "to do better" are a lie. Promises like the above last for a sprint, maybe two. But as time elapses and stress increases, we forget our promises. We revert.

In the example above, consider a customer mentioning a new feature they want as they pass you in a hallway. You quickly capture the need as you run off to another meeting, intending to fill in the details later. An hour and four emergencies later, you've already forgotten. The mistake wasn't breaking the promise; it was making it in the first place.

(Lists) to the Rescue


So how do we truly solve this issue? The answer is by adjusting our closeout process. Sprint closeout is the perfect time to address issues like this because we can block out time to correct problems introduced during those hectic periods. Furthermore, it's much easier to take a step back and focus on quality when you're not sprinting.

But if our process is based purely on some static, published process that isn't customized to our team and circumstances, or on the way we did things last time, then we lack the means to consistently and reliably adjust it over time. We need a different tool.

One such tool is a closeout checklist. The list might start with nothing more than the meetings from a traditional scrum process:

  • Perform Sprint Review
  • Perform Sprint Planning
  • Perform Sprint Retrospective

But over time (more specifically, after retrospectives) we can add items to this list, customizing it to the specific needs of our team.

In the case of the example above, we could add an item prior to Sprint Planning such as "Clarify New Stories". We can keep the list simple, or move it to a spreadsheet with columns like responsible team members (e.g. just the scrum master, developers only, the whole team), implementation details, due date, and date completed.
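For instance, a first cut at the spreadsheet version might look something like this (the rows are purely illustrative):

  • Task: Clarify New Stories; Who: whole team; Details: add acceptance criteria to each new story; Due: last day of sprint
  • Task: Perform Sprint Review; Who: whole team; Due: last day of sprint
  • Task: Delete orphan branches; Who: scrum master; Details: remove merged or abandoned branches; Due: last day of sprint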

Sample Checklist Items


If this still sounds a little vague, here are some sample retrospective items I've gathered over my career, along with how one might adjust a closeout checklist to address the underlying issue:

Issue: Post-production, stories are taking longer than expected.
Resolution: If the team failed to take data migration into account, try adding an "identify migration issues" task to the checklist. Possibly limit it to more experienced developers to keep the meeting short.

Issue: The CI server takes too long to build.
Resolution: If there are lengthy UI automation or performance tests, consider moving them out of the continuous integration cycle into a nightly build. Then add a "check status of acceptance tests" task to the checklist. Don't close the sprint unless they are all passing.

Issue: Too many orphan branches in the source control server.
Resolution: Add a "delete orphan branches" task to the checklist, and maybe assign it to a specific person.

Issue: Lack of adherence to coding conventions.
Resolution: In Code Reviews: A Sane Approach I argue for a per-sprint code review. The checklist is the perfect place to ensure this happens consistently.

Issue: Code coverage is low.
Resolution: Add a "publish code coverage numbers" task and include the delta from the previous sprint. Sometimes increasing visibility is enough to remedy issues like these.

Issue: Effort estimation takes too long.
Resolution: If eliminating estimation altogether isn't an option, try adding "auto-point defects to 1, no discussion allowed" to the details of the "Perform Sprint Review" item.

Issue: Developer databases are out of sync.
Resolution: You could solve the problem once, or better yet add a "re-baseline databases with production" task to the checklist to ensure it doesn't happen again.

Summary


So while a closeout checklist may sound pedantic at first, it can be an excellent tool in your tool belt. Give it a try and you may find that such a list increases your maturity, becomes the linchpin that holds your processes together, and ensures that continuous process improvement happens for your team today and far into the future.

So what checklist items would you add to the list above? Feel free to post in the comments or ping me on Twitter.

Thursday, February 12, 2015

An Illustrated Guide to Parameter Passing in JavaScript

I thought it would be fun to do a back-to-basics kind of post on the JavaScript language, based on my second most popular blog post of all time: Parameter Passing in C#.

In other words this post will present some JavaScript brain teasers around parameter passing and visually illustrate the solutions.

Since JavaScript doesn't have the ref or out keywords like C# we'll just be looking at:

  • Primitive Values
  • Reference Values
  • Strings (Immutable Values)
  • Passing Primitives
  • Passing References

Primitive Values


Starting simple, what's the result of the following, and why?

var i = 5;
var j = i;
j = 10;
alert(i);

Let's take it line by line.  On the first line we declare a variable i and assign it the value 5.  5 is a value of type 'number', and number is one of a handful of primitive types, along with undefined, null, boolean, and string.

On the second line we declare a new variable j and assign it the value held by the variable i.  If we think of variables as small containers (boxes) that can only contain primitives or references, then we can picture the first two lines of code like this:




If JavaScript were pass by reference then j might contain a pointer to i.  But because JavaScript is pass by value for primitives the second line copies the value in i into the box for j (copying is illustrated with the blue line).

Consequently when we assign j the value 10, it doesn't affect what's in the box for i. 


Thus the answer to alert(i) is 5.

Reference Values

So what is the result of the following, and why?

var i = { a: 5 };
var j = i;
i.a = 10;
alert(j.a);

In this example we declare a variable i and assign an object to it.  Consequently i contains a reference value.  Reference values are essentially pointers to objects (which are in turn aggregations of properties).  In a language like C# they'd be like a pointer on the stack to an object on the heap (although technically the JavaScript specification doesn't define a stack or a heap).

So on line two we then instantiate a variable j and assign to it the value that's in i. Illustrating the pointer as an arrow we end up with this:


If JavaScript were pure pass by value, then j might have contained a complete copy of the object.  But JavaScript, like Java, Python, Ruby, and C#, is technically call by sharing, which means the pointer itself is passed by value, and so the object it points to behaves as if it were passed by reference.

Consequently when we assign anything to j.a we follow the pointer to the shared object, and get this:



And that's why alert(j.a) returns 10.

Strings (Immutable Values)


How about this:

var s1 = 'hello';
var s2 = s1;
s2 = 'world';
alert(s1);

In C# strings are immutable reference types.  In JavaScript, however, they're just primitive values, which are all immutable.  So it's kind of a trick question and this turns out to be identical to our first example.


Thus alert(s1) is simply 'hello'.

Passing Primitive Values


Let's expand what we've learned about primitives to functions:

function Change(j) {
    j = 10;
}
var i = 5;
Change(i);
alert(i);

If you guessed that this is almost identical to our first example, you're absolutely right.  Illustrated roughly as the interpreter would see it, it looks like this:


Thus alert(i) is 5.

Passing Reference Values


Finally we can explore how reference values get passed with something like this:

function Change(j) {
    j.a = 10;
}
var i = { a: 5 };
Change(i);
alert(i.a);

If you're thinking this is nearly identical to our second example, you're right.  Illustrated roughly as the interpreter would see it, it looks like this:


Thus alert(i.a) gets us 10.

Double Bonus Points


You're tied up in double overtime with one second left on the clock:

function Change(j) {
    j = { a: 10 };
}

var i = { a: 5 };
Change(i);
alert(i.a);

Go ahead, hit F12 .... nothing but net! ... and the crowd goes wild!  (If you said 5, you nailed it: reassigning j inside Change merely points j's box at a brand new object, while i still points at the original.)

Hope you had fun reading this and hopefully learned something new.  Please post in the comments or hit me up on Twitter if something doesn't look right.

Saturday, January 24, 2015

A Surprise Synchronization Context Gotcha

I got into an interesting argument conversation with a co-worker last week about whether async/await was multi-threaded.  He thought I was bonkers for suggesting it was not.  So I did some research.

First off, obviously if you're doing async/await it's probably because you want some multi-threaded behavior, like network IO or file IO, where some other thread does some work for you while freeing your UI thread to handle UI stuff (or in the case of IIS, releasing your thread to handle other incoming requests, thus giving you better throughput).  So my co-worker was right that 99% of the time async/await will probably involve multiple threads.

However, if async/await were multi-threaded by its very nature, then it should be impossible to write a program using async/await that was single-threaded.  So let's try to write a method that we can prove is single-threaded that also uses async/await.  How about this:

public async void HandleClickEvent()
{
    await Task.Yield();
    var j = 1;
    while (j != 0)
    {
        if (j == -1) j++;
        j++;
    }
    await Task.Yield();
}

It took some work to come up with an infinite loop that looked normal to the compiler, but that's what the while loop is doing.  If async/await were multi-threaded, then we might think that the UI thread would hit the first Task.Yield and spawn off a new thread.  Then the infinite loop would be run on a new thread and the UI would work great, right?

If we actually run that code in a Windows Store app the UI freezes.  Why?  Because, according to MSDN:

The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread, but a background thread doesn't help with a process that's just waiting for results to become available.
So when I claimed async/await wasn't multi-threaded, I was thinking of that.  What's basically happening is that the UI thread is a message pump that processes events.  When you await within the UI thread's synchronization context, you yield control to the message pump, which allows it to process UI events and such.  When your awaited call returns, it posts a message back to the UI thread's message pump, and the UI thread gets back to your method when it's done with anything else it's working on.
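You can watch this happen with a sketch like the following (my own illustration, not from MSDN; it assumes it's invoked on a UI thread, e.g. as a button click handler, so there's a synchronization context to capture):

public async void ClickHandler(object sender, RoutedEventArgs e)
{
    // e.g. prints 1: we're on the UI thread
    Debug.WriteLine(Environment.CurrentManagedThreadId);
    // yields to the message pump; no new thread is created for our code
    await Task.Delay(1000);
    // prints 1 again: the captured synchronization context marshaled us back
    Debug.WriteLine(Environment.CurrentManagedThreadId);
}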

But after some research I realized that I didn't know nearly enough about synchronization contexts, so I spent the morning reading about them.  I finally found a great description of how all this works under the covers, and if you get the chance I highly recommend reading C# MVP Jerome Laban's awesome series C# 5.0 Async Tips and Tricks.

In particular, one thing I learned is that if you start a new Task, you throw away the UI thread's synchronization context.  If you await when there is no synchronization context, then by default WinRT will give you some random thread from the thread pool, which may be different after each await.  In other words, if you do this:

public async Task RefreshAvailableAssignments()
{
    await Task.Run(async () =>
    {
        Debug.WriteLine(Environment.CurrentManagedThreadId);
        await Task.Yield();
        Debug.WriteLine(Environment.CurrentManagedThreadId);
    });
}

You will (usually) get a different thread after the yield than you did before it.  That can lead to trouble if you aren't careful or aren't aware of it.  It can be especially dangerous if you're deep in the guts of something and aren't 100% sure whether you're being called from the UI thread or from some other thread.  It can be particularly bad if someone after you decides to put your code into a Task.Run while you were dependent upon the UI thread's synchronization context without being aware of it.  Nasty, huh?
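One cheap defensive trick (my own sketch, not something from Jerome's series) is to check SynchronizationContext.Current before relying on it, since it's null on thread-pool threads:

public async Task ShowContextLoss()
{
    // called from the UI thread: prints the UI synchronization context
    Debug.WriteLine(SynchronizationContext.Current);
    await Task.Run(() =>
    {
        // inside Task.Run: prints True, the UI context is gone
        Debug.WriteLine(SynchronizationContext.Current == null);
    });
}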

It makes me like more and more the idea introduced in Jason Gorman's post Can Restrictive Coding Standards Make Us More Productive?, where he describes ways of discouraging team members from starting new threads (or Tasks on my project, since WinRT doesn't give us Threads) unless there is a really good reason for doing so.

It goes back to a most excellent statement my co-worker made:
Async/await is very powerful, but we all know what comes with great power.
So that was fun.  I look forward to having lots more constructive arguments conversations like this one in the future.  :)

Wednesday, January 7, 2015

The TFS 2013 + Git API Disaster

Don't get me wrong, the addition of Git to TFS is huge, and it actually removes all of my previous complaints about the platform.  Sadly, the APIs for it aren't up to par yet, the documentation is poor, and the technology is so young that the Internet is completely silent on how to programmatically accomplish just about anything with it.



So after about 24 total hours of wasted time searching the Internet, decompiling source, watching network traffic in Fiddler, and triggering builds, I have some working code (and a bit of a rant) that I wanted to share to help fill out the Internet (because clearly it doesn't contain enough ranting; but at least this rant comes with working code).

API #Fail


As my regular readers no doubt know, I occupy my spare time running Siren of Shame, a build monitor, USB siren, and CI gamification engine.  When a continuous integration build is triggered, the software needs to determine which check-in triggered the build and give that user credit (or heap a little light-hearted shame on them on failure).

For every other major CI server in the world this is pretty easy.  TFS 2013 + Git?  Not so much.  If it worked the way it should you could simply do this:

var query = _buildServer.CreateBuildDetailSpec(buildDefinitionUris);
query.MaxBuildsPerDefinition = 1;
query.Status = Microsoft.TeamFoundation.Build.Client.BuildStatus.All;
query.QueryOrder = BuildQueryOrder.FinishTimeDescending;
// this gets changesets (TFVC) as well as commits (Git)
query.InformationTypes = new[] { "AssociatedChangeset", "AssociatedCommit" };

var buildQueryResult = _buildServer.QueryBuilds(query);

var buildDetail = buildQueryResult.Builds[0];

var commits = buildDetail.Information.GetNodesByType("AssociatedCommit");

And it wouldn't even require a second web request to get the triggering commit.

Sadly, the above only works for completed builds.  In-progress builds return nothing in AssociatedCommit().

That's the older, strongly typed API that requires referencing Microsoft.TeamFoundation.Build.Client.dll (which you can find in the GAC).  With TFS 2013 there is now also a TFS Web API.  Sadly, even the equivalent new Web API methods have the same limitation.  For example, if build 5 were in progress, then this:

GET http://myserver:8080/defaultcollection/project/_apis/build/builds/5/details?api-version=1.0&types=AssociatedCommit&types=AssociatedChangeset

Wouldn't return the associated commit until it completed.

So, for in-progress builds you're stuck doing a second query.

More API #Fail


Ideally at this point you would use the powerful and convenient QueryHistory() method.  Using it looks something like this:

var workspaceMappingServerUrls = _buildDefinition.Workspace.Mappings
    .Where(m => m.MappingType != WorkspaceMappingType.Cloak)
    .Select(m => m.ServerItem)
    .ToArray();
var workspaceMappingServerUrl = workspaceMappingServerUrls[0];
// p.s. GetService() is a dumb way to get services, why not just make
//     it dynamic, it's just as undiscoverable
var versionControlServer = _tfsTeamProjectCollection.GetService<VersionControlServer>();
// notice the workspace server mapping url is a parameter. This facilitates one web call
var changesets = versionControlServer.QueryHistory(workspaceMappingServerUrl,
    version: VersionSpec.Latest,
    deletionId: 0,
    recursion: RecursionType.Full,
    user: null,
    versionFrom: null,
    versionTo: VersionSpec.Latest,
    maxCount: 1,
    includeChanges: true,
    slotMode: false,
    includeDownloadInfo: true);

Sadly, this only works for changesets, in other words traditional Team Foundation Version Control (TFVC) check-ins.  It doesn't work for Git, even though what we want to accomplish is so very, very similar (i.e. couldn't they just throw in an overload that asks which branch you're querying against?).

But Wait, There's More


As far as I can tell there is only one remaining option: the new TFS REST API.

There are two ways to use it.  The documentation says to use an HttpClient, but there's also a nice convenience wrapper that you can get by adding a reference to Microsoft.TeamFoundation.SourceControl.WebApi.dll, which you can find in the GAC.  Using this approach you could write something like this:

var vssCredentials = new VssCredentials(new WindowsCredential(_networkCredential));
var client = new GitHttpClient(projectCollectionUri, vssCredentials);

// unnecessary web request #1: get the list of all repositories to get our repository id (guid)
var repositories = await client.GetRepositoriesAsync();

// sadly the workspace server mapping in the build definition barely resembles the repository Name, thus the EndsWith()
var repository = repositories.FirstOrDefault(i => workspaceMappingServerUrl.EndsWith(i.Name));
var repositoryId = repository.Id;

// unnecessary web request #2: the workspace server mapping told us which server path triggered the build, but it #FAIL'ed to tell us which branch, so we have to scan them all!!!
var branches = await client.GetBranchRefsAsync(repositoryId);

List<GitCommitRef> latestCommitForEachBranch = new List<GitCommitRef>();
foreach (var branchRef in branches)
{
    // branchRef.Name = e.g. 'refs/heads/master', but GetBranchStatisticsAsync() needs just 'master'
    var branchName = branchRef.Name.Split('/').Last();
    // Ack! Unnecessary web requests #3 through (number of branches + 2)!!!
    // p.s. repositoryId.ToString()? Can we please be consistent with data types!?
    var gitBranchStats = await client.GetBranchStatisticsAsync(repositoryId.ToString(), branchName);
    latestCommitForEachBranch.Add(gitBranchStats.Commit);
}

var lastCheckinAcrossAllBranches = latestCommitForEachBranch
    .Aggregate((i, j) => i.Author.Date > j.Author.Date ? i : j);

I've documented everything I hate about this in comments, but the most important point is this: the workspace mapping API for build definitions (which says which folder(s) trigger the build) fails to include a branch property.  This is true even for the Web APIs.  For instance:

http://tfsserver:8080/tfs/DefaultCollection/_apis/build/definitions/1?api=1.0

Fails to tell us anything about the workspace mappings.  This API omission forces you to query all branches, which requires lots of web requests.  Specifically it requires the number of pushed branches plus two web requests to find the latest check-in across all branches: one to list repositories, one to list branch refs, and one per branch for branch statistics, so with ten pushed branches that's twelve requests.  This could be insanely expensive, and it might not even be correct in some circumstances.

Is There No Better Way?


As nice as the strongly typed API approach sounds, it turns out to be missing a number of APIs that you can get to if you use a WebClient to request them manually.  Specifically, if you use the web API directly you can issue a single request against the commits endpoint to get the latest commit across all branches.
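In other words, a single request like this (the same endpoint the code below builds up; {repositoryId} is a placeholder for the guid retrieved from the repositories endpoint):

GET http://tfsserver:8080/tfs/DefaultCollection/_apis/git/repositories/{repositoryId}/commits?top=1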

Sadly, the authentication via WebClient is a bit tricky and depends on whether you are using a locally hosted TFS or Visual Studio Online.  For this reason you're better off with some helper methods:

/// <summary>
/// This method handles requests to the TFS api + authentication
/// </summary>
public async Task<T> ExecuteGetHttpClientRequest<T>(string relativeUrl, Func<dynamic, T> action)
{
    using (var webClient = GetRestWebClient())
    {
        string fullUrl = Uri + relativeUrl;
        var resultString = await webClient.DownloadStringTaskAsync(fullUrl);
        dynamic deserializedResult = JsonConvert.DeserializeObject(resultString);
        return action(deserializedResult.value);
    }
}

public WebClient GetRestWebClient()
{
    var webClient = new WebClient();
    if (MyTfsServer.IsHostedTfs)
    {
        SetBasicAuthCredentials(webClient);
    }
    else
    {
        SetNetworkCredentials(webClient);
    }
    webClient.Headers.Add(HttpRequestHeader.ContentType, "application/json; charset=utf-8");
    return webClient;
}

/// <summary>
/// Using basic auth via network headers should be unnecessary, but with hosted TFS the NetworkCredential method
/// just doesn't work.  Watch it in Fiddler and it just isn't adding the Authentication header at all.
/// </summary>
/// <param name="webClient"></param>
private void SetBasicAuthCredentials(WebClient webClient)
{
    var authenticationHeader = GetBasicAuthHeader();
    webClient.Headers.Add(authenticationHeader);
}
public NameValueCollection GetBasicAuthHeader()
{
    const string userName = "username";
    const string password = "password";
    string usernamePassword = Convert.ToBase64String(System.Text.Encoding.ASCII.GetBytes(string.Format("{0}:{1}", userName, password)));
    return new NameValueCollection
    {
        {"Authorization", "basic" + usernamePassword}
    };
}
private void SetNetworkCredentials(WebClient webClient)
{
    var networkCredentials = new NetworkCredential("username", "password");
    webClient.UseDefaultCredentials = networkCredentials == null;
    if (networkCredentials != null)
    {
        webClient.Credentials = networkCredentials;
    }
}

Wow.  That's a lot of boilerplate setup code.  Now to actually use it to retrieve check-in information associated with a build:

// Get all repositories so we can find the id of the one that matches our workspace server mapping
var repositoryId = await _myTfsProject.ProjectCollection.ExecuteGetHttpClientRequest<Guid?>("/_apis/git/repositories", repositories =>
{
    foreach (var workspaceMappingServerUrl in workspaceMappingServerUrls)
    {
        foreach (var repository in repositories)
        {
            string repositoryName = repository.name;
            if (workspaceMappingServerUrl.EndsWith(repositoryName))
            {
                return repository.id;
            }
        }
    }
    return null;
});
// now get commits for the repository id we just retrieved.  This will get the most recent across all branches, which is usually good enough
var getCommitsUrl = "/_apis/git/repositories/" + repositoryId + "/commits?top=1";
var commit = await _myTfsProject.ProjectCollection.ExecuteGetHttpClientRequest(getCommitsUrl, commits =>
{
    var comment = commits[0].comment;
    var author = commits[0].author.name;
    return new CheckinInfo
    {
        Comment = comment,
        Committer = author
    };
});
return commit;

Is this absolutely terrible?  Perhaps not.  But it is a lot of code to do something that used to be quite simple with TFVC and is quite simple with every other build server (or at least those I have experience with, specifically: Hudson, Jenkins, TeamCity, Bamboo, CruiseControl, and Travis).

Summary

So that's my story and I'm sticking to it.  If any readers find a better approach, please post in the comments, send me a note at @lprichar, issue a pull request against my CheckinInfoGetterService.cs (where you can find the full source for this article), and/or comment on the SO article where I originally started this terrible journey.  Hopefully this will save someone else some time; if not in the solution, then perhaps in the following advice: if you value your time, avoid the TFS Git API.

Thursday, October 2, 2014

Cowboys vs Perfectionists vs … Body Builders

I've worked with many developers over my fifteen years of professional software development.  Sometimes I just can't help but classify them.  But I bet you've noticed this too: there tend to be two main types.

Cowboys


If you've never heard of unit testing, or you avoid it because it slows you down, you're probably a cowboy.  Cowboys can prototype concepts to prove viability before you've finished describing the idea.  They can hit unrealistic deadlines and still have plenty of time to gold plate.  A cowboy is guaranteed to get you to market before your competition.  But God help you if you put their code in the hands of end users: you'll have more bugs than a bait shop.

Perfectionists


If your goal is 100% code coverage, you only code TDD, or you spend more time refactoring than producing customer-centric code, you might be a perfectionist.  Perfectionists write pristine, maintainable, refactorable, low-defect code.  Applications built by perfectionists will stand the test of time.  The problem is that it'll take them twice as long to get to production as it should, and if their project doesn't run out of money first, their competition is liable to beat them to market, invalidating all that beautiful code.

Body Builders


Quick quiz: how do the most successful body builders maximize muscle mass while minimizing body fat?  Steroids of course.  Ok, how do they do it legally?  Maximizing muscle mass takes lots of lifting, lots of protein, and lots of calories.  Minimizing body fat requires cardio work and dieting.  But dieting and consuming calories are mutually exclusive.  So the answer is that body builders alternate between periods of cutting (dieting) and bulking (eating and lifting).  Neither activity alone will allow them to reach their goals.

Cowboy + Perfectionist = Cowfectionist?

So what do body builders have to do with software developers?  A fantastic software developer I once worked with used to say that developers should be paid based on the lines of code they delete.  I always enjoyed his deleting-code theory (clearly he was more of a perfectionist than a cowboy), but it struck me recently that deleting (refactoring) is to writing code as cutting is to bulking.  In other words, if you try to do one exclusively, or (worse) both simultaneously, you're setting yourself up for failure.

Summary

We all have tendencies toward cowboy or perfectionist, but to become the most successful software developers our goal should be to alternate between cutting and bulking, refactoring and producing.  Obtaining a high-muscle, low-fat code base that gets to market on time with high maintainability requires neither a cowboy nor a perfectionist: it requires avoiding extremism, embracing moderation, and recognizing the strengths of those different from ourselves.  If we can find that balance we can truly accomplish great things.