Maintainable test setup with scenario pipelines

Shared test setup operating at the application level or below always made my test code hard to maintain. I stopped using test framework mechanics for this, in favor of concise repetitious setup pipelines at the start of each test.

Abandoning a bad habit hardly ever works in a subtractive fashion: few people manage to stop smoking from one day to the next, just by sheer force of will. It is much easier to overwrite bad behaviour with a new, pleasurable, convenient one.

I was unable to stop a bad testing habit until I discovered a new way of doing things that just clicked. But first, let me give you some background.

Common test framework mechanics

I learned testing using the RSpec test framework, when working on Ruby on Rails applications. It allowed me to write beautiful assertions and, most importantly, to organize and group my tests in flexible ways. Grouping or scoping tests is good for two things:

  • tests that share a common theme can be kept together for clarity
  • tests can share setup code, avoiding duplication

The same holds true in Elixir’s built-in testing framework, ExUnit. It differs in many respects; for example, it does not provide assertions as sophisticated as RSpec’s. For the scope of this post, most differences between ExUnit and RSpec do not matter.

Here is a skeleton of tests in ExUnit, demonstrating scoping and setup:

```elixir
defmodule MyTest do
  use ExUnit.Case, async: true

  setup do
    # This setup code is always executed. It is run first
    ...
    :ok
  end

  describe "the first scope" do
    setup do
      # This setup code is executed for tests within this `describe` block
      ...
      :ok
    end

    test "first test in first scope" do
      ...
    end

    test "second test in first scope" do
      ...
    end
  end

  describe "the second scope" do
    setup do
      ...
      :ok
    end

    test "first test in second scope" do
      ...
    end

    ...
  end

  test "an unscoped test" do
    ...
  end
end
```

There is a bit more to this, but I will get into this later.

The dangers of relying on what the testing framework offers

Scroll up and have another look at the list of reasons to use test scoping. Notice something? The feature of test scoping mixes two concerns, notably one targeted at the human (logical grouping of tests related by a concept), and one, more technically oriented, targeted at the necessity of coupling tests that need to share a setup.

I do not want to enter the DRY vs. some-level-of-repetition-is-ok debate here. Having a means of grouping tests that need to change for the same reason and collapsing the location of the change to one point (the shared setup) is certainly a good thing. I want to point out that mixing these two concerns in the same feature is problematic, for the same reasons that multiple levels of test scoping are a bad idea, as explained in a post by José Valim, the creator of Elixir. ExUnit’s nesting restriction, the topic of that post, helps to reduce setup complexity.

I want to demonstrate that this might not be enough, due to the fact that we want the two things mentioned above from our test scoping. Time for a contrived example: Users can up-vote blog posts, but each user can up-vote a post only once:

```elixir
describe "upvoting of a blog post" do
  setup do
    # set up the blog post and a user
  end

  test "a user can up-vote a blog post" do
    # invoke the upvoting, assert vote count
  end

  test "an additional upvoting by the same user has no effect" do
    # invoke upvoting twice, assert vote count stays the same around the second voting
  end
end
```

Now we add an additional requirement: Posts can be locked, in which case a user cannot upvote it:

```elixir
  ...

  test "a user cannot up-vote a locked post" do
    # * make sure the post is locked
    # * invoke upvoting, assert vote count is untouched
  end
```

Here we have a problem: the setup has already happened, so we have a post that is not locked (because otherwise the other tests would break).

Possible attempts at a solution

Allow for modification of set-up data in that one test

In the example above, we could change the locked flag in the database for the one test affected. This introduces new problems:

  • we handle setup in separate places, once “officially” in the setup block and once within the test block, when “fixing up” the post
  • a cursory scanning of the test file might give the impression that all tests within the describe block deal with unlocked posts, which is wrong.

Also, this strategy only works for simple cases and quickly gets messy. Imagine deleting a number of rows from the database the setup block has just inserted…

Move the new test out of the describe block

That means we let the technical necessity for shared setup win over the necessity of logical grouping by feature. We would need to re-phrase the description of the first describe block, because it has become too general. I can’t count the number of times I forgot to do so when the problem scope of such a block had changed… In effect, the test scopes lied to me and led me to make errors further down the road.

Nest the new test in another describe level

This is like the previous solution, just that the messy in-place update of the post can happen inside a setup block (misusing the word “setup”).

Also, the setup mechanics and the problem scope of that block have diverged slightly, which is easy to miss when adding another test in the future. This is the ugliest solution, and as mentioned in the discussion, ExUnit does not support this by design.

Always have two levels of scoping

The intention is to separate logical grouping (occurring at the top level) from technical separation of setup (at the next level).

Go try to enforce this convention in a team, I bet it won’t fly. Lazy people (like me) will skip this step when the technical necessity is not there yet, because “it’s easy to refactor later”. Somebody will need to add a test with a varying setup at some later time and figure out all the “solutions” in this post…

Again, note that nesting of describe is not possible in ExUnit.

Varying setup by using test framework tricks

We could make use of the information architecture of our testing framework to switch between setup variants in the setup block. Without going into details here, both RSpec and ExUnit support this by adding metadata (RSpec) or tags (ExUnit) to individual tests and groups, which the setup can query. This solution is technically the most complicated, hard to read and reason about, and difficult to get rid of. I would use this only as a last resort, and only as a temporary stepping stone when I really need to refactor my testing code toward a more sane structure (from bad via ugly to good).
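To make the mechanics concrete, here is a minimal sketch of the tag-based variant in ExUnit. `@tag` and the context argument of `setup` are regular ExUnit features. The `locked` tag name and the plain map standing in for a database-backed post are assumptions made purely for illustration.

```elixir
ExUnit.start()

defmodule TaggedSetupTest do
  use ExUnit.Case, async: true

  # The setup block inspects the tags of the test that is about to run.
  # `:locked` is a hypothetical tag name; the map stands in for a real post.
  setup context do
    locked? = Map.get(context, :locked, false)
    {:ok, post: %{title: "Hello", locked: locked?, votes: 0}}
  end

  test "posts are unlocked by default", %{post: post} do
    refute post.locked
  end

  @tag locked: true
  test "a tagged test receives a locked post", %{post: post} do
    assert post.locked
  end
end
```

Note how the outcome of the setup now depends on an annotation that sits outside the setup block. In a file of this size that is easy to follow, but it is exactly the kind of indirection that gets hard to reason about as the test file grows.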

Side note: using this feature is sometimes hard to avoid when the setup happens far away from the test code, for example when doing a very broad and general configurational setup at the framework level that requires occasional differentiation.

Move setup into tests – the most stupid approach!

Finally, we could try to get rid of setup blocks for (application or unit level) setup altogether by handling setup inside each test. This approach comes with two challenges:

  • How do I keep setup repetition at a minimum to avoid having to touch many places when things inevitably need to change?
  • And how do I cope with the fact that I now have setup code and test code in the test blocks? As a reader of test code, I do not want to search for the place where the setup ends and the real testing begins, I want a clear delineation of the two.

Back to square one, moving from stupid to inspired

This was my landscape of options until a while ago. Nothing really worked and stuck. My bad testing habits crept back into my day-to-day work out of sheer familiarity. But in retrospect, I had almost all components for a clean solution laid out before me; I just needed to re-assemble them with a twist. By focusing on the problem of setup dependencies, we are almost naturally pushed into a direction that solves these problems altogether. We do not need to compromise on legibility nor on maintainability of our tests.

Start from the last attempt, the unconventional solution of avoiding the setup block provided by the testing framework.

Avoiding the repetition: Extract.

The most pressing problem with this simple approach is probably all the setup repetition throughout the tests. Avoiding repetition and keeping setup commonalities localized has an obvious solution. How do you deal with this situation in your normal code? You extract for re-use and abstraction. In tests we can do the same: extract our setup sub-steps into small to-the-point functions, defined privately at the bottom of the file.

At first it could look like this:

```elixir
test "a user can up-vote a blog post" do
  post = insert_post()
  user = insert_user()

  Blog.Votes.upvote_post_for_user(post, user)

  assert reload(post).votes == 1
end
```

The setup ends after the last call to an insert_ function. It is delineated from the actual test code with a blank line, which is almost nice. For the test "a user cannot up-vote a locked post" we can concisely call insert_post(locked: true) if our setup function supports that.
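A setup function that supports such overrides could be sketched roughly as follows. This is an assumption-laden illustration, not the actual helper: the `PostSetup` module, the default attributes, and the plain map returned instead of a database record are all hypothetical, and in a real test file the function would be a `defp` at the bottom of the test module.

```elixir
defmodule PostSetup do
  # Hypothetical defaults; a real helper would also write to the database.
  @defaults %{title: "A post", locked: false, votes: 0}

  # Accepts keyword overrides, e.g. insert_post(locked: true).
  def insert_post(attributes \\ []) do
    Map.merge(@defaults, Map.new(attributes))
  end
end

# Without arguments we get an unlocked post, with overrides a locked one.
false = PostSetup.insert_post().locked
true = PostSetup.insert_post(locked: true).locked
```

The point of the keyword list is that each test states only the attributes it cares about, while everything else stays at its default.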

Sidenote: In this simple example we might as well just use factories. Attribute this to the fact that this demo code needs to be really simple, and imagine that insert_post and insert_user need to set up additional things which do not belong in an application-wide factory. You should strive for extracting setup steps that are meaningful in your domain, and not merely talk about the database table they touch.

Now imagine that some setup steps depend on previous setup results in a non-trivial way, and that we also might need to pass test-specific attributes. Perhaps a setup step requires multiple things to be passed in that have been set up in previous steps. Things become messier, our clear delineation between setup and test code is quickly diluted and our test code no longer focuses on what it should, namely executing our domain and asserting.

Dealing with setup dependencies: Pipeline.

Setting up for a test in interdependent steps is a problem of state transformation! We start from “nothing has been set up”, going through various partially set-up states, and think about the whole process as a series of transformations, until we reach a system state of “everything is set up as needed”. In Elixir, these are the steps to take:

  1. Rewrite the setup functions to take a simple Elixir map, dubbed scenario, and return an updated version of it
  2. Let the setup functions pick data from previous steps out of the map
  3. Place the result of the setup step inside the scenario, for further steps or the test itself to pick it up
  4. Destructure data that is needed out of the final scenario before the proper test code

Here is the final form of the test itself:

```elixir
test "contrived example" do
  %{
    user: %{id: user_id},
    # setup_comment/2 stores the comment under the :comment key
    comment: %{id: comment_id}
  } =
    setup_user()
    |> setup_post()
    |> setup_comment(moderated: true)

  # proper test code, using user_id and comment_id
end
```

And here are the setup functions needed to make this happen:

```elixir
defp setup_user(scenario \\ %{}) do
  user = insert(:user)

  Map.put(scenario, :user, user)
end

defp setup_post(scenario) do
  # Picks `user` out of the scenario
  post = insert(:post, owner: scenario.user)

  Map.put(scenario, :post, post)
end

defp setup_comment(scenario, attributes \\ []) do
  # Picks `post` out of the scenario
  comment = insert(:comment, attributes ++ [post: scenario.post])

  Map.put(scenario, :comment, comment)
end
```

The tests now strictly follow this general pattern:

```elixir
test "general pattern" do
  %{
    # destructure needed data out of the scenario
  } =
    ... # setup pipeline

  # proper test code, using data destructured out of the scenario
end
```

The execution order is of course 1) setup 2) destructuring 3) proper test, while notationally, the order is 2, 1, 3. This is unusual at first, but I quickly got used to this quirk. I try not to overburden my setup functions with responsibility; instead, I now have the freedom to swap out a setup step for a different function if needed.
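Applied to the locked-post example from earlier, swapping out a single step might look like the sketch below. All names (`UpvoteScenario`, `setup_locked_post`) are hypothetical, and plain maps stand in for database rows so that the snippet runs on its own. In a test file these would again be private functions.

```elixir
defmodule UpvoteScenario do
  # Plain maps stand in for database rows; every name here is hypothetical.
  def setup_user(scenario \\ %{}) do
    Map.put(scenario, :user, %{id: 1})
  end

  def setup_post(scenario, attributes \\ []) do
    defaults = %{owner_id: scenario.user.id, locked: false, votes: 0}
    Map.put(scenario, :post, Map.merge(defaults, Map.new(attributes)))
  end

  # Drop-in replacement for setup_post/2 in tests that need a locked post.
  def setup_locked_post(scenario) do
    setup_post(scenario, locked: true)
  end
end

# Only the middle step differs from the default pipeline.
%{post: %{locked: true, votes: 0}} =
  UpvoteScenario.setup_user()
  |> UpvoteScenario.setup_locked_post()
```

The test that needs a locked post says so in its own pipeline, so no other test is affected and nothing has to be "fixed up" after the fact.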

Things get a little more complicated when setting up lists of items. Often, this can be accomplished by maintaining a scenario structure like this:

```elixir
%{
  last_comment: the_last_comment_inserted,
  comments: comments_in_insertion_order # if necessary
}
```
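A setup step that maintains such a structure could be sketched like this. The `CommentScenario` module, the generated ids, and the in-memory maps are assumptions for the sake of a runnable example; a real step would insert into the database instead.

```elixir
defmodule CommentScenario do
  # Hypothetical step: appends a comment and remembers the most recent one.
  def setup_comment(scenario, attributes \\ []) do
    comments = Map.get(scenario, :comments, [])
    defaults = %{id: length(comments) + 1, moderated: false}
    comment = Map.merge(defaults, Map.new(attributes))

    scenario
    |> Map.put(:last_comment, comment)
    |> Map.put(:comments, comments ++ [comment])
  end
end

# Calling the step twice yields two comments in insertion order.
%{last_comment: %{id: 2, moderated: true}, comments: [%{id: 1}, %{id: 2}]} =
  %{}
  |> CommentScenario.setup_comment()
  |> CommentScenario.setup_comment(moderated: true)
```

Later steps can then default to `last_comment` as their input, which covers the common case without any extra arguments.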

In very rare cases, a setup step needs to be told which part of the scenario data it should use as input. This is possible by passing a list of keys as accepted by get_in. But since this is a super-rare exception, you could arguably prefer to break with the mandate of a single setup pipeline in this case.
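As a rough sketch of that idea: the step below takes an optional path and resolves it with `get_in/2`, which is part of Elixir's `Kernel`. The `ReplyScenario` module, the `setup_reply` step, and the shape of the scenario are hypothetical.

```elixir
defmodule ReplyScenario do
  # Hypothetical step: `path` is a list of keys as accepted by get_in/2 and
  # tells the step which comment in the scenario to reply to.
  def setup_reply(scenario, path \\ [:last_comment]) do
    parent = get_in(scenario, path)
    Map.put(scenario, :reply, %{parent_id: parent.id, body: "a reply"})
  end
end

scenario = %{
  last_comment: %{id: 2},
  pinned: %{comment: %{id: 7}}
}

# The default path picks the last comment, an explicit path picks another one.
%{reply: %{parent_id: 2}} = ReplyScenario.setup_reply(scenario)
%{reply: %{parent_id: 7}} = ReplyScenario.setup_reply(scenario, [:pinned, :comment])
```

Giving the path a sensible default keeps the common pipeline free of noise, so only the exceptional test has to spell it out.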

Conclusion

I have written a decent number of tests in this way now, and they have survived substantial change to the application code without causing any headache. Try it for yourself, and shed bad habits!

