The “Leafing” Method of Structuring Unit Tests

I’ve been practicing Test-Driven Development for the last five years, and in a conversation this past week, I realized that I’ve internalized a bunch of habits that are really helpful for me, yet generally aren’t written down anywhere on the internet.

The main thing I’m going to discuss in this post is the “leafing” method. I have never heard the term used in mainstream TDD communities; I think my brain just glommed onto it because it may have been used casually when this method was taught to me. By the way, it was taught to me by Anand Gaitonde of Pivotal, when we paired for a week on the Cloud Foundry CLI, a little open source command-line tool written in Go. I think many of the Pivotal teams that worked on Cloud Foundry in the San Francisco office used this method, though!

“The Leafing Method”

The Leafing Method broadly means that the order of your tests should reflect the logical flow of your function. The tests closer to the top of the test file should represent “early bailout” conditions: early returns, showstopping error cases, and so on. The tests at the bottom of your test file should represent logical cases that return at the end of the function.

It’s also worth briefly noting that Go and Ruby have different idiomatic styles, and even within each language, different companies have their own preferred flavors. When I worked at Pivotal, our general preference for Go functions was that only the happy path makes it to the bottom of the function. Error cases should be checked rigorously along the way and returned immediately with an if err != nil check. We would never return a non-nil error at the bottom; the last line would always return the actual result (or the zero value of the return type) and a nil error. This also implies that we’d usually invert conditionals so that the unhappy path was checked first (though we’d prefer early returns over nested conditionals anyway).

Essentially, this method helps you to track (and eventually reduce) the cyclomatic complexity of your code. It can also help identify opportunities for optimization: if common showstopping errors are checked only after a lot of computation has been performed, then the test file will probably look verbose and clunky because of all the setup you need just to reach that point of execution. In these scenarios, early returns, or guard clauses, can help boost readability and reduce some complexity.
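
Here’s a minimal sketch of that difference. The invoice example, the method names, and the data shapes are all hypothetical, made up purely for illustration. In the first version, the showstopping check sits below the computation, so a test for the error case has to build valid line items just to reach it. In the second, guard clauses move the cheap checks to the top.

```ruby
# Hypothetical example: names and data shapes are illustrative only.

# Late check: the work runs before the showstopping error is noticed, so a
# test for the missing-customer case must still pass in usable line items.
def invoice_total_late_check(customer, line_items)
  total = line_items.sum { |item| item[:price] * item[:quantity] }
  raise ArgumentError, "customer is required" if customer.nil?

  total
end

# Guard clauses: cheap, showstopping checks come first, and only the happy
# path reaches the bottom of the method.
def invoice_total(customer, line_items)
  raise ArgumentError, "customer is required" if customer.nil?
  return 0 if line_items.empty?

  line_items.sum { |item| item[:price] * item[:quantity] }
end
```

With the guard-clause version, the first test in the file can pass nil for both arguments and still hit the customer check. With the late-check version, that same call blows up inside the computation instead, which is the kind of clunky setup problem that shows up in the test file.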

tl;dr Every subsequent test should make it a little further down the function than the previous test.

Let’s go through a toy example. Here is Conway’s Game of Life, in Ruby:

def next_tick(cell)
  if cell.dead?
    if cell.neighbors.count == 3
      cell.revive!
    end
  end

  if cell.alive?
    if cell.neighbors.count < 2
      cell.die!
    end

    if cell.neighbors.count > 3
      cell.die!
    end
  end
end

To test this top to bottom, if you collapsed the test blocks in your editor, they’d look something like:

context "cell status changes" do
  test "dead cells with three neighbors revive" do
  end
  test "dead cells otherwise stay dead" do
  end
  test "live cells with fewer than 2 neighbors die" do
  end
  test "live cells with more than 3 neighbors die" do
  end
  test "live cells otherwise stay alive" do
  end
end

What’s happening here?

The first test “trips” the top conditional and the condition nested inside it. The second trips only the outer conditional and, in this example, falls through to the bottom. Then we repeat for the second set of conditionals. It’s kind of weird to reason about a “happy path” in Conway’s Game of Life, but I think it would end with a cell that’s alive :-)
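
To make the outline concrete, here is one way those five tests could be filled in, sketched with Minitest rather than the context/test syntax above. The Cell class is a hypothetical stand-in (it assumes neighbors holds only the live neighbors), and next_tick is restated using Conway’s standard overcrowding rule, where a live cell dies with more than three live neighbors. Minitest shuffles execution order by default, so the leafing order here is about how the file reads, not the order in which the tests run.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for the cell object; `neighbors` is assumed to
# contain only the live neighbors.
class Cell
  attr_reader :neighbors

  def initialize(alive:, neighbors:)
    @alive = alive
    @neighbors = neighbors
  end

  def alive?
    @alive
  end

  def dead?
    !@alive
  end

  def revive!
    @alive = true
  end

  def die!
    @alive = false
  end
end

def next_tick(cell)
  if cell.dead?
    cell.revive! if cell.neighbors.count == 3
  end

  if cell.alive?
    cell.die! if cell.neighbors.count < 2
    cell.die! if cell.neighbors.count > 3
  end
end

# Tests appear in the same order as the branches of next_tick.
class NextTickTest < Minitest::Test
  # Builds a cell, advances it one tick, and returns it for assertions.
  def cell_after_tick(alive:, live_neighbors:)
    cell = Cell.new(alive: alive, neighbors: Array.new(live_neighbors, :live))
    next_tick(cell)
    cell
  end

  def test_dead_cell_with_three_neighbors_revives
    assert cell_after_tick(alive: false, live_neighbors: 3).alive?
  end

  def test_dead_cell_otherwise_stays_dead
    assert cell_after_tick(alive: false, live_neighbors: 2).dead?
  end

  def test_live_cell_with_fewer_than_two_neighbors_dies
    assert cell_after_tick(alive: true, live_neighbors: 1).dead?
  end

  def test_live_cell_with_more_than_three_neighbors_dies
    assert cell_after_tick(alive: true, live_neighbors: 4).dead?
  end

  def test_live_cell_otherwise_stays_alive
    assert cell_after_tick(alive: true, live_neighbors: 2).alive?
  end
end
```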

This ordering scheme might feel fairly obvious with an easy example like this! But with real codebases under active development, oftentimes, new features involve slotting in logic somewhere in the middle of existing functions. If the tests are already in a random order, then most people would pop new tests into the bottom of the test file, by default. The nice thing about using an ordering scheme like this is that it’s very easy to figure out where new tests should go, and as a bonus, the commit diff will be easy to explore, with context, while you’re code-reviewing: you can just click the little expander in your Git-hosting SaaS platform of choice to see the lines above and below the diff.

Use a “Greedy Algorithm” to structure conditional flow

You might be wondering, “Okay, I get the top-to-bottom thing. But how do I decide what order to write my guard clauses in?”

I’ve found that thinking about contextualized usage and greedy algorithms provides a useful way to reason about how code should be structured.

Before anything else, let’s review how greedy algorithms behave.

Greedy algorithms try to aggressively reduce the “search space”, or set of possible solutions, by taking the biggest bites possible, as soon as possible. The textbook example is making change using the fewest coins possible:

Imagine I give you four quarters, four dimes, four nickels, and a handful of pennies, and I ask you to make 57 cents of change. A greedy algorithm works pretty well here, and it’s also how most people actually solve this change-making problem already, intuitively: first, you give me two quarters, then a nickel, then two pennies.
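
The change-making steps above can be sketched in a few lines of Ruby. This is only an illustration: the method name, the constant, and the penny count (the “handful” is assumed to be ten) are my own choices.

```ruby
# Coin values in cents, mapped to how many of each coin are on hand.
# The penny count is an assumption; "a handful" is taken to mean ten.
COINS_ON_HAND = { 25 => 4, 10 => 4, 5 => 4, 1 => 10 }.freeze

# Returns a hash of coin value => number of coins handed over.
def make_change(amount, coins = COINS_ON_HAND)
  raise ArgumentError, "amount must be non-negative" if amount.negative?

  remaining = amount
  handed_over = {}

  # Largest denomination first: each step takes the biggest bite available.
  coins.sort_by { |value, _count| -value }.each do |value, count|
    take = [remaining / value, count].min
    next if take.zero?

    handed_over[value] = take
    remaining -= take * value
  end

  raise ArgumentError, "cannot make #{amount} cents with these coins" unless remaining.zero?

  handed_over
end

# Two quarters, one nickel, two pennies.
change = make_change(57)
```

Greedy works here because of how US coin denominations are chosen. With an arbitrary set of denominations, grabbing the largest coin first does not always give the fewest coins.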

When it comes to thinking about control flow, first think about how this method will be used in the context of your domain. That’s the part I can’t help you with, since I don’t know your domain, your common error cases, or the integrity of the data flowing through your systems. In the Cloud Foundry domain, a very common class of errors is malformed configuration files: sometimes people pass in a file that doesn’t parse as valid YAML, or that’s missing required keys, and so on. Whatever your “showstopping errors” are, figure out which are the most common ones and chomp those away first. Those are the biggest, earliest bites to take as you greedily stop error-causing scenarios from flowing further down your function’s logic. For example, if I write an API method and I know that the API accepts user-supplied input, you can bet your bottom dollar that I’m going to be aggressively handling naughty input values at the very top of the endpoint. Looking at you, Mrs. Tables.
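
As a rough sketch of that ordering for the malformed-configuration case, here is what greedy guard clauses might look like in Ruby. Everything specific is an assumption for illustration: the ConfigError class, the load_config method, and the required keys are hypothetical and not taken from the Cloud Foundry CLI. Only YAML.safe_load and Psych::SyntaxError come from Ruby’s standard library.

```ruby
require "yaml"

# Hypothetical error class and required keys, for illustration only.
class ConfigError < StandardError; end

REQUIRED_KEYS = %w[name memory].freeze

def load_config(yaml_text)
  # Biggest bite first: the text is not valid YAML at all.
  begin
    config = YAML.safe_load(yaml_text)
  rescue Psych::SyntaxError => e
    raise ConfigError, "invalid YAML: #{e.message}"
  end

  # Next bite: it parsed, but not into a mapping of keys to values.
  raise ConfigError, "expected a mapping at the top level" unless config.is_a?(Hash)

  # Next bite: required keys are missing.
  missing = REQUIRED_KEYS - config.keys
  raise ConfigError, "missing required keys: #{missing.join(', ')}" unless missing.empty?

  # Happy path: only a usable configuration reaches the bottom.
  config
end
```

Following the leafing method, the tests for this would read in the same order: invalid YAML first, then a non-mapping document, then missing keys, and finally the happy path.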
