Things I've realized about testing Elixir code

31 August 2025
Elixir, testing, learning

I’m currently on my third rewrite/refinement of Elixir File Browsing, and the book has changed significantly: from an original write-up on implementing an API client for the undocumented REST API of File Browser, to an exploration of software engineering and functional programming with Elixir, with the API client serving more as an excuse for that exploration and its codebase ending up as a mere “side-effect” of it. In fact, so much has changed since I first published the last version of the manuscript on Leanpub that “Elixir File Browsing” is now an inaccurate working title, which I’ll be updating soon, along with the subtitle.

Part of the new rewrite is also a section on spinning off some parts of the ExFileBrowser.Client module into the ExFileBrowser.Client.Token and ExFileBrowser.Client.Credentials “resource modules”. These derive the state of the JWT and of the credentials nested within the value of the :auth field of the %Client{} struct, and the capabilities they grant to the client (:login, :renew, :admin, etc.).

Before I started the rewrite of that section, I had already implemented a ClientTest module with ExUnit tests, and so was able to validate the overall concept of deriving a client’s state and capabilities from the token and the credentials the struct contains, if any. After I spun off the two modules, I arrived at a point where I needed to decide what to test, and how. The book now includes a sub-chapter answering this question. Here are my takeaways:

IEx is key for spot-checking code, even without ExUnit

Elixir gives us something most languages don’t: IEx (the REPL). Prototyping code that will end up enshrined as a private or public function in the codebase is a great way to build understanding; it’s the tightest feedback loop you can get, as you get instant results. Testing functions in IEx is the least you can do to quality-assure code. It helps you close the “knowledge gaps”, i.e. mitigate the first waste of Lean Product and Process Development (à la Allen Ward): wishful thinking.

In terms of the verdict on whether something works, there’s nothing “lesser” about testing code in IEx instead of writing an ExUnit test module. It’s just not a long-term solution, because IEx usage cannot be automated. Still, use IEx early, aggressively, and unapologetically, before writing ExUnit tests.
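For instance, a spot-check session for the Token module might look like the following. The calls are illustrative; the sample inputs are placeholders rather than the book’s actual values:

```elixir
# Illustrative IEx session; Token.info/2's exact return values are not
# shown, since they depend on the module's implementation.
iex> alias ExFileBrowser.Client.Token
iex> Token.info(nil)              # probe the nil-token clause
iex> Token.info("not-a-jwt")      # probe handling of a malformed token
iex> Token.info(some_jwt, 60)     # some_jwt: a JWT bound earlier in the session
```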

Spot-check testing in IEx doesn’t scale

You can’t rely on IEx forever. Typing the same code snippet into IEx gets old fast and cannot be automated. The “spot checks” live only in your head and in IEx’s command history, and they don’t document what works, or the intent behind the code, to your future self or to future maintainers.

Using ExUnit for what deserves to be tested is inevitable, especially for things that others will be relying on in the long run, and/or for things that are likely to change.

Automated tests with ExUnit are more than code verification; they’re executable documentation, as they capture how the system is supposed to behave, so you or others don’t have to figure things out from scratch months later, or have to run code snippets in IEx every time you or they make a change.
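A minimal sketch of what such executable documentation can look like. The module layout follows ExUnit conventions; the asserted return shape is an assumption for illustration:

```elixir
defmodule ExFileBrowser.Client.TokenTest do
  use ExUnit.Case, async: true

  alias ExFileBrowser.Client.Token

  test "a nil token grants no authentication capabilities" do
    # The %{authn: []} shape is assumed here for illustration; the point is
    # that the test states the intended behavior in executable form.
    assert %{authn: []} = Token.info(nil)
  end
end
```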

Uncertainty makes ExUnit tests inevitable

Predicting whether something you implemented today is stable or not is tricky. It’s tempting to think “hey, this function seems stable, it’ll never change, so why write a test for it?” If only we knew today what we’ll know tomorrow!

A small change in requirements, or a new maintainer misunderstanding the intent behind the Credentials or Token modules, and suddenly you will have to implement tests with ExUnit to make sure that you or they don’t introduce regressions.

So, in that light, even though I don’t agree with implementing tests for every public function right after (or before) you write it, I do find that ExUnit tests become inevitable to defend against the “known unknowns” and “unknown unknowns” throughout the life of the library.

Trivial tests have a negative ROI

Not all tests are equal. Some provide the confidence we need to move on to the next part of the code, or to defend against regressions. Others are just noise. For example, take this function, Token.ttl/1:

  defp ttl(exp) when is_integer(exp), 
    do: max(0, exp - System.system_time(:second))

  defp ttl(nil), do: 0

Could you test this with ExUnit? Yes, if you make it public (or introduce a dependency that allows you to test private functions). But should you test it, or make it testable? No: this is simple, pure logic. Will the teacher give you a gold-star sticker on your assignment if you achieve 100% code coverage by writing a test for this function? At what point does “test everything, always” become a compulsion that wastes time on things that don’t require it? Time, notably, that is better spent on more valuable things, including integration tests like those in ClientTest.

A dedicated test for ttl/1 doesn’t provide any value: it doesn’t mitigate any risks, and only adds the cost of implementing and maintaining the test. In other words: the value of testing ttl/1 is negative.
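For contrast, here is roughly what such a negative-ROI test would look like, assuming ttl/1 were made public; it merely restates the arithmetic:

```elixir
# The kind of test this section argues against: it mirrors ttl/1's
# implementation line for line and mitigates no actual risk.
test "ttl/1 returns the remaining seconds, clamped at zero" do
  now = System.system_time(:second)
  assert Token.ttl(now + 10) in 9..10
  assert Token.ttl(now - 10) == 0
  assert Token.ttl(nil) == 0
end
```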

The real risk isn’t in whether the subtraction within ttl/1 works as intended; it’s in how the rest of the system reacts to a token TTL of 0. And this will reveal itself when we test the function that utilizes it (Token.info/2) or, at the latest, by the time we run an up-to-date ClientTest and its tests.

Worse, testing trivial one-liners bloats the test suite, increases maintenance overhead, and creates the illusion of quality assurance without reducing any actual risk.

Private functions don’t need direct tests

If a private function feels complex enough that you want to test it directly, that’s a design smell. Either the public function is doing too much (too many responsibilities), or the private function deserves to be a public function in its own module, written in a way that makes it composable out of simpler private functions like the aforementioned ttl/1. That way you can test it in IEx first, and possibly write ExUnit tests for it later.

ExUnit tests guard the boundaries, i.e. the contracts the module presents to the world beyond it. If those hold, the private building blocks are working too.

Testing is design feedback

If writing a test feels like pulling teeth to cover the cases that need to be defended against, I take it as a sign that my software design is crappy to begin with. Most likely, the function I’m testing is doing too much, so I need to refactor, which, TBQH, is one of the most pleasant “gardening” experiences of developing software in Elixir.

Coverage is not the goal

As an enthusiastic beginner in the world of software engineering, I’ve been burned a few times by taking emphatically-expressed opinions online as unbreakable rules. Two of those dogmas I’ve run into online are:

  1. You should test everything, always.
  2. You should aim for 100% coverage.

Both are misleading. The former is clearly nonsense: you should not be testing private functions, and you don’t need to test functions that do trivial things. The value of testing trivialities is negative; you waste time writing code that tests a triviality without actually reducing risk.

Also, chasing 100% coverage is chasing a metric for the sake of it. What matters is a) that the system works where it counts, and b) that tests cover the parts of the code where regressions are most likely and costly.

100% coverage is meaningless if it drives the wrong behavior, but the code-coverage metric itself is not useless; it’s a (potentially) useful heuristic. If coverage is 20%, that might signal that you’ve been leaning too heavily on spot checks with IEx. If it’s something like 95%, maybe you’ve been wasting effort on quality-assuring trivialities. The problem isn’t the metric; it’s the cargo-cult dogma surrounding it, badges on GitHub repositories included, which turns it into a vanity metric!

Integration tests matter most

Those coming into the world of software engineering from non-software-centric product development and/or systems engineering, and aware of abstractions like the V-model, already know this: the most valuable tests are those that check behavior at the boundaries. In Elixir, that often means testing how modules integrate with each other, or how external inputs/outputs are handled. For the latter, we make liberal, purpose-driven use of guards and function polymorphism, perhaps with fallback clauses. For example:

  def info(token, cutoff \\ 5 * 60)

  def info(token, cutoff)
      when (is_binary(token) or is_nil(token)) and
             is_integer(cutoff) and cutoff > 0 do
    token
    |> process_payload()
    |> prepare_derivation(cutoff)
    |> derive_state()
    |> deliver_info_map()
  end

  def info(reason, _) when reason in @errors, do: {:error, reason}

Integration tests confirm that the system behaves as promised, and implicitly that the “implementation details” of its building blocks also function as expected. They’re the tests that make us confident that the “contract” of a function like Token.info/2 still holds, even if the internals of its private-function dependencies (like Token.derive_state/1 above, and the private functions it utilizes) have shifted around.
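A boundary-level ExUnit sketch for the clauses above; :invalid_token stands in for whatever atoms the module’s @errors attribute actually contains:

```elixir
# Sketch of contract-level tests for Token.info/2; the error atom is a
# placeholder, not necessarily one of the module's actual @errors.
test "info/2 passes known error reasons through unchanged" do
  assert Token.info(:invalid_token, 60) == {:error, :invalid_token}
end

test "info/2 rejects a non-positive cutoff" do
  # No clause matches a cutoff of 0, so the call raises.
  assert_raise FunctionClauseError, fn -> Token.info(nil, 0) end
end
```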

How you test depends on the code’s maturity

In the early stages, IEx spot checks are enough, especially while you’re prototyping building blocks. As the module/library/application grows, writing ExUnit tests for what truly matters becomes inevitable. Not experimenting in IEx, or not performing “percussive testing” of a module’s public API there, is a missed opportunity. Waiting until everything is set in stone to write ExUnit tests delays feedback unnecessarily. Never writing ExUnit tests hinders maintainability.

It’s not one or the other; testing in IEx and enshrining worthy, valuable tests as ExUnit tests are complementary activities.

Tests are neither free nor forever

Tests are code too! They age, they rot as the codebase evolves, and they cost both up-front and over the long run. Even if some tests are worth writing now and keeping around for some time, at some point they have outlived their usefulness. Prune them diligently as the codebase evolves.

Don’t let guilt-tripping kill the joy of coding

Something I’ve bumped into more than once: letting testing dogma drain the fun out of programming. When “you must test everything” becomes a law, the creative, exploratory side of coding gets smothered. Perhaps you, like me, have gotten guilt-tripped about developing something that doesn’t include an ExUnit test suite.

The fact that IEx exists means that programming can be playful, iterative, and exploratory, and it’s one of the reasons behind my love for Elixir. Play around, figure things out, then enshrine what deserves it as ExUnit tests.

Test what matters

We need to test what matters. Testing is an engineering practice, not an obsessive-compulsive (or guilt-tripped) ritual of writing tests and aiming for the “100% code coverage” badge on a GitHub repo, or for acclaim (or, rather, the absence of scorn) from TDD absolutists. What matters is “confidence at the boundaries”, i.e. that the interfaces between modules work as intended. If this holds, then their private functions work as intended too (and you could/should have been testing their parts with IEx anyway…).

In the book:

  • We often use IEx sessions to prototype and test the functionality of functions and their parts, even of functions that might end up as private functions. This helps us build our understanding before committing them to the code of a module.

  • We integrate these building blocks into public functions of a module. If these functions are likely to evolve, by us or by others contributing to the library, then it absolutely makes sense to write ExUnit tests for those public functions. It helps everyone test their changes and prevent regressions. If we deem the public functions likely to remain stable for a long time, we defer implementing ExUnit tests until the moment right before we begin with modifications. Until then, we can always test them in IEx as needed. Even so, writing ExUnit tests for them sooner rather than later is a good idea.

  • We indirectly test the aforementioned public functions by focusing on testing the public API that ExFileBrowser’s users will be accessing. That’s why we wrote the tests in ClientTest: we validate that the “contract” we established is working as intended, regardless of what happens with the underlying Token and Credentials modules and their implementation details.

  • When we write tests, we favor those that check the integration of the system’s components. Implementation details are hidden away within private functions that are (or could/should be) composed of simpler functions in a way that quality-assures them by construction. And, anyway, we don’t test private functions.

  • We ignore dogmatic takes and religious zealotry. The “test everything, always” and “100% coverage” mantras are (at best, charitably interpreted) well-intentioned heuristics, possibly overdone to protect novices from the other extreme: not testing at all. At their worst, they represent misleading dogma that drives us to replace critical thinking with chasing a metric, either for our own ego or for some misguided fulfillment of zealots’ context-free, dogmatic criteria of what’s “acceptable”.

There are two APIs

Most people think a library like the future ExFileBrowser (which I’ll be releasing under the Apache-2.0 license, like everything else I open-source) has one API: the one its users interact with.

But: if you look closer, you’ll realize it has two. The first is its public API, the one you document, the one users call, and the one that needs to be stable. For example, ExFileBrowser.new/1, which creates a new %Client{} struct by delegating to ExFileBrowser.Client.create/1. ExFileBrowser.new/1 is the “contract” the library makes with its users, or with the applications that might depend on it.

The second is ExFileBrowser’s internal API. This is the web of functions and data structures that its own modules use to talk to each other. For example, the Token and Credentials modules expose the info/2 and info/1 interfaces, respectively, for the Client module (which has its own info/2 function that gathers these and derives the client’s overall state and capabilities). Both “resource modules’” info functions return a map shaped like this:

%{
  req: [:protected, :public],
  perms: [:delete, :rename, :create, :share, :execute, :download, :modify],
  authn: [:renew]
}

Their functions are meant to be called from within our library, not by an external user. (Even so, the info functions are appropriately guarded so that ExFileBrowser users can still go lower-level to get info on a JWT, for example.)

So, understanding that there are two APIs (or rather, two types of API) is important for testing. The most valuable ExUnit tests we write are those that guard the boundaries of the public API, i.e. the public functions of ExFileBrowser.Client (and, later, those of ExFileBrowser). For example, ClientTest exhaustively tests different combinations of tokens (with different expiration times and other claims, or nil) and the presence or absence of credentials in the %Client{} struct being mocked, to verify that all six client states are correctly represented and, most importantly, that no illegal client states pop up.
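A sketch of one such combination; build_client/1 and expired_jwt/0 are assumed test helpers, not the book’s actual ones:

```elixir
# Illustrative only: the helpers and the asserted capability are
# assumptions about ClientTest's setup, not the book's actual code.
test "an expired token without credentials grants no :renew capability" do
  client = build_client(token: expired_jwt(), credentials: nil)
  info = ExFileBrowser.Client.info(client)
  refute :renew in info.authn
end
```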

The correctness of the internal API is implicitly tested by public-facing tests, like those in ClientTest. If a user calls ExFileBrowser.Client.info/1 and it returns the correct state, you can be confident that the underlying Token.info/2 and Credentials.info/1 functions (and their private helpers!) are also working as intended. The internal behavior is an implementation detail that’s proven correct by the external behavior.

And this is why we don’t need to test every single function, let alone make private functions public to make them testable with ExUnit, or bring in a dependency to make private functions testable. By focusing on the public-facing contract, we’re testing what matters most: whether ExFileBrowser fulfills its promises at large.

Test the contracts

We should not be thinking about functions only, but about the contracts the modules are presenting. The contracts with the users of ExFileBrowser are of the highest priority. Those that the internal API presents (e.g., Client’s public functions) also need to be quality-assured, but this is done indirectly through ExUnit tests of the external-facing contract.

So, here’s my summary of all the above: the goal isn’t to test every line of code, or to quench the feeling of guilt caused by excessive consumption of TDD evangelists’ opinions, but to ensure that each module keeps the promises it makes. Testing in IEx is obviously useful, but using ExUnit is inevitable if the codebase is to remain maintainable.

It may all sound obvious, but it wasn’t obvious to me up until a few days ago. Big thanks to all those who posted on the “Private, a lib to test private functions in 2022” thread on Elixir Forum. Some comments are golden, and helped me figure things out. As always, Elixir Forum is a goldmine!