Let’s Smell Some Tests #2 — Asserting The Internal Behavior in Java

Why your tests should verify only the observable behavior, not the implementation details

Krystian Szpiczakowski
Better Programming


Photo by Michiel Leunens on Unsplash

Hello and welcome to a new episode of the “Let’s smell some tests” series.

In this article, we’re going to consider what exactly our tests are (not) supposed to verify to keep them free of false alarms, and why sometimes less is better. For a better understanding of this topic, we’ll take a closer look at definitions of a brittle test and observable behavior, so we’ll be able to detect poorly designed tests and make them resistant to refactoring.

Let’s get started!

When your tests want to know too much

Back in the day, before I delved into the subject of automated testing, this happened to me many times: just in case, I made my tests verify more things than necessary. I used to believe that the more assert statements and similar checks my tests included, the more value they brought.

While the above approach may seem reasonable, choosing it will make life difficult for developers in the long run. I found this out the hard way, when my own tests made me come back to them more often than I expected. The reason? These tests were tied to implementation details, not observable behavior, so whenever the code was refactored, they failed even though the functionality still worked fine.

Brittle tests? Observable behavior? Implementation details?

Before we go any further, let’s define what is behind these cryptic phrases, as they are essential to understanding how to write good tests that add real value to our projects instead of becoming unnecessary baggage.

An excellent source of knowledge in terms of good testing practices is a book titled “Unit Testing: Principles, Practices, and Patterns”, written by Vladimir Khorikov.

According to the author (chapter 4.4.4), tests are brittle when

they can’t withstand a refactoring and will turn red regardless of whether the underlying functionality is broken

In other words, your functionality can still produce the right outcome after refactoring, and yet your tests may fail if they check how something works instead of checking the observable behavior.

What’s observable behavior, then? In chapter 5.2.1, the book’s author defines observable behavior in the following words:

For a piece of code to be part of the system’s observable behavior, it has to do one of the following things:

Expose an operation that helps the client achieve one of its goals. An operation is a method that performs a calculation or incurs a side effect or both.

Expose a state that helps the client achieve one of its goals. State is the current condition of the system.

Any code that does neither of these two things is an implementation detail.

So, when you are working on new functionality, think about the real goal of the client that will call your code (what behavior the client code expects from your solution, or what business case your function is supposed to cover), and forget for a moment how you want to develop that feature (the implementation details).

This approach should give you a cleaner distinction between observable behavior and implementation details.

Case Study: Leaderboard

Let’s take a closer look at the following example written in Java:

  • we are developing a leaderboard for a game, let’s call this game “Chase and Race”
  • we want our leaderboard to return the best player based on the points scored

The Player class is responsible for holding the player’s name and their score. The score is updated via the Player#updateScore function.

The Leaderboard class allows us to add players to the leaderboard’s list via the Leaderboard#addPlayer function, and to retrieve the best player of the game through Leaderboard#getBestPlayer.
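Here is a minimal sketch of how these two classes could look. Only the class and method names come from the description above; the field names, the int score, the assumption that updateScore adds points to the current score, and the stream-based lookup in getBestPlayer are illustrative choices.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of Player: holds a name and a score (field types are assumed).
class Player {
    private final String name;
    private int score;

    Player(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    int getScore() {
        return score;
    }

    // Assumption: updateScore adds the given points to the current score.
    void updateScore(int points) {
        score += points;
    }
}

// Sketch of Leaderboard in its starting state for the case study.
class Leaderboard {
    // Not private yet, so code outside the class (including tests) can reach it.
    final List<Player> players = new ArrayList<>();

    void addPlayer(Player player) {
        players.add(player);
    }

    // Returns the player with the highest score; throws if nobody was added.
    Player getBestPlayer() {
        return players.stream()
                .max(Comparator.comparingInt(Player::getScore))
                .orElseThrow(() -> new IllegalStateException("leaderboard is empty"));
    }
}
```

Note that the players list is deliberately left reachable from outside the class here, because that is the starting point of the case study.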

In the LeaderboardTest test class, we are checking whether the Leaderboard#getBestPlayer method returns the player with the highest score.
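A sketch of such a test is shown below, written as plain Java so that it runs without a test framework (in a JUnit suite the method would carry @Test and use the framework’s assertions). The compact Player and Leaderboard stand-ins, the player names, the check helper, and in particular the second check on leaderboard.players are assumptions made for illustration: it stands for the kind of “just in case” assertion that looks at internals.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Compact stand-ins for the production classes (assumed shape).
class Player {
    private final String name;
    private int score;

    Player(String name) { this.name = name; }
    String getName() { return name; }
    int getScore() { return score; }
    void updateScore(int points) { score += points; }
}

class Leaderboard {
    final List<Player> players = new ArrayList<>(); // reachable from the test

    void addPlayer(Player player) { players.add(player); }

    Player getBestPlayer() {
        return players.stream()
                .max(Comparator.comparingInt(Player::getScore))
                .orElseThrow(() -> new IllegalStateException("leaderboard is empty"));
    }
}

class LeaderboardTest {
    // With JUnit this would be an @Test method using assertSame.
    static void shouldReturnPlayerWithHighestScore() {
        Player alice = new Player("Alice");
        alice.updateScore(10);
        Player bob = new Player("Bob");
        bob.updateScore(25);

        Leaderboard leaderboard = new Leaderboard();
        leaderboard.addPlayer(alice);
        leaderboard.addPlayer(bob);

        // Observable behavior: the best player is the one with the highest score.
        check(leaderboard.getBestPlayer() == bob, "Bob should be the best player");

        // "Just in case" check on the internals: players are stored in insertion order.
        check(leaderboard.players.get(0) == alice, "Alice should be stored first");
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

Both checks hold for this version of Leaderboard, but only the first one describes something a client of the class actually cares about.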

The test for retrieving the best player passes

So far, so good — as you can see, the test report is green.

Later on, we decided to refactor the internals of the Leaderboard class, so that it sorts the list of players in descending order whenever we add a new player to it.

Implementation details have changed, and the test failed this time, even though the tested method still produces the correct output
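One way this refactoring could look is sketched below. The sort by score with a Comparator and the unchanged getBestPlayer are assumptions; the Player class is a compact stand-in with assumed fields.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Compact stand-in for the Player class (assumed shape).
class Player {
    private final String name;
    private int score;

    Player(String name) { this.name = name; }
    String getName() { return name; }
    int getScore() { return score; }
    void updateScore(int points) { score += points; }
}

class Leaderboard {
    final List<Player> players = new ArrayList<>();

    void addPlayer(Player player) {
        players.add(player);
        // New internal detail: keep the list sorted, highest score first.
        players.sort(Comparator.comparingInt(Player::getScore).reversed());
    }

    // Unchanged: still returns the player with the highest score.
    Player getBestPlayer() {
        return players.stream()
                .max(Comparator.comparingInt(Player::getScore))
                .orElseThrow(() -> new IllegalStateException("leaderboard is empty"));
    }
}
```

For example, if a player with 10 points is added before a player with 25 points, getBestPlayer still returns the latter, but players.get(0) is no longer the player who was added first. Any assertion about the stored order now breaks.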

Let’s omit the discussion of whether this change was needed or not — what I want to show you is how changing implementation details can affect the existing test.

As you can see, the test report turns red, but the observable behavior remains untouched: the Leaderboard#getBestPlayer function still does its job and returns the player with the highest score.

How to fix this? Ideally, a test written once shouldn’t need our attention again just because we refactored the codebase. To achieve that, the Leaderboard#players list should be inaccessible from outside, and marking this collection with the private modifier is enough.

But what about the test? It no longer compiles, because it still reads the players list, which is now private.

Since your tests should only verify observable behavior, we can safely delete the assertion on line 25: it checks an implementation detail, and that is what makes this test brittle.
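The end state could be sketched as follows, again in plain Java without a test framework. The class bodies, the player names, and the sort-on-add internals are assumptions for illustration; the points that matter are the private modifier on players and the single remaining check on getBestPlayer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Compact stand-in for the Player class (assumed shape).
class Player {
    private final String name;
    private int score;

    Player(String name) { this.name = name; }
    String getName() { return name; }
    int getScore() { return score; }
    void updateScore(int points) { score += points; }
}

class Leaderboard {
    // Private now: the storage and its ordering are implementation details.
    private final List<Player> players = new ArrayList<>();

    void addPlayer(Player player) {
        players.add(player);
        players.sort(Comparator.comparingInt(Player::getScore).reversed());
    }

    Player getBestPlayer() {
        return players.stream()
                .max(Comparator.comparingInt(Player::getScore))
                .orElseThrow(() -> new IllegalStateException("leaderboard is empty"));
    }
}

class LeaderboardTest {
    // With JUnit this would be an @Test method using assertSame.
    static void shouldReturnPlayerWithHighestScore() {
        Player alice = new Player("Alice");
        alice.updateScore(10);
        Player bob = new Player("Bob");
        bob.updateScore(25);

        Leaderboard leaderboard = new Leaderboard();
        leaderboard.addPlayer(alice);
        leaderboard.addPlayer(bob);

        // The only check left verifies observable behavior.
        if (leaderboard.getBestPlayer() != bob) {
            throw new AssertionError("Bob should be the best player");
        }
    }
}
```

This test keeps passing whether the list is sorted on insert, scanned on every call, or replaced with a different collection altogether.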

Conclusion

The consequences of having brittle tests in your project can be pretty severe. For instance, such tests can discourage developers from refactoring, because let’s be honest: when you finish a refactoring and end up with a bunch of failed tests, it doesn’t exactly put you in a good mood. The other possible consequence is that developers get used to tests that raise false alarms, which lowers their overall alertness and increases the chance of a bug sneaking into production.

Let’s summarize the dos and don'ts:

  • Remember to test observable behavior, not implementation details, and your life as a developer will be much easier
  • Don’t expose implementation details to the world, as the internals should be easy to replace in the future while the published functionality stays unchanged
  • Code in a test suite is as important as the production code, so pay attention to what you verify, otherwise the overall quality of the project will decrease

References

Unit Testing: Principles, Practices, and Patterns, Vladimir Khorikov
