Units are Not Classes: Improving Unit Testing By Removing Artificial Boundaries

Many developers think of unit tests as tests that test a single class. In fact, I myself once thought this way. If I wanted to write unit tests for a two-class system in which a class used another class, I’d write two unit tests. After all, if I created instances of both classes in my test, that wouldn’t really be a unit test, would it?

In recent years, I have revised my stance on this matter. This distinction between what is and is not a unit test is one I no longer draw in the same way.

Unit Tests need to be:

  • Fast – Unit Tests should never run so slowly that they discourage developers from running them all the time.
  • Focused – Unit Tests should focus on a single area so that they can isolate problems to that area when they fail.

Once upon a time, I thought ‘focused’ meant that it should be focused on a single class. Instead, it should focus on an area, but that area can include multiple classes.

An Example

Let’s say we have a simple Calculator class with some tests (normally we’d write the tests first, but since we’re talking about the tests in this post we’re going to write them afterwards). As with any example fabricated for demonstration purposes, the classes in this post have little need for the level of testing I will be applying to them, but they serve an illustrative purpose.

package com.rallydev.engblog.unittesting;

public class Calculator {

  public int calculate(char operation, int operand1, int operand2) {
    switch (operation) {
    case '+':
      return operand1 + operand2;
    case '-':
      return operand1 - operand2;
    default:
      throw new UnsupportedOperationException(String.format(
          "'%s' is not a known operation", operation));
    }
  }

}

Here is the test for this class:

package com.rallydev.engblog.unittesting;

import static org.junit.Assert.assertEquals;

import org.junit.Before;
import org.junit.Test;

public class CalculatorTest {

  private Calculator calculator;

  @Before
  public void setUp() {
    this.calculator = new Calculator();
  }

  @Test
  public void shouldAdd() {
    assertEquals(5, calculator.calculate('+', 2, 3));
  }

  @Test
  public void shouldSubtract() {
    assertEquals(1, calculator.calculate('-', 5, 4));
  }

  @Test(expected = UnsupportedOperationException.class)
  public void shouldThrowExceptionForUnknownOperation() {
    calculator.calculate('Q', 1, 2);
  }
}

But after thinking about it, we realize that the responsibility for parsing what the caller wants to do and the responsibility for actually performing that operation should be separated.

So we’re going to create a new class, CalculationProcessor, which actually handles the calculations themselves.

package com.rallydev.engblog.unittesting;

public class CalculationProcessor {
  public int add(int operand1, int operand2) {
    return operand1 + operand2;
  }

  public int subtract(int operand1, int operand2) {
    return operand1 - operand2;
  }
}

The tests for this class should be obvious, so we won’t go into them.

We will also modify our Calculator class to be injected with CalculationProcessor at construction time, which it will use to perform the operations. That looks like this:

package com.rallydev.engblog.unittesting;

public class NewCalculator {
  private CalculationProcessor calculationProcessor;

  public NewCalculator(CalculationProcessor calculationProcessor) {
    this.calculationProcessor = calculationProcessor;
  }

  public int calculate(char operation, int operand1, int operand2) {
    switch (operation) {
    case '+':
      return calculationProcessor.add(operand1, operand2);
    case '-':
      return calculationProcessor.subtract(operand1, operand2);
    default:
      throw new UnsupportedOperationException(String.format(
          "%s is not a known operation", operation));
    }
  }
}

So the question is, what should the test for NewCalculator look like?

If we follow the strict “One Unit Test, One Class” paradigm, we have to create a mock for the CalculatorProcessor and pass it into the Calculator.

One possible way it could work is by mocking out the add and subtract methods so that they have mock implementations which actually perform the operations. That solution really only works here because this example is so simple (the implementation of add is trivial). Were this problem more real-world, mocking out the responses based on the inputs may not be feasible, so let’s imagine that we cannot do this for the purposes of this post (when mocking out the responses is possible, most of this post is rendered moot, as that is usually the best way to go).

The other way to write these tests would be to create a mock and verify that our NewCalculator delegates to CalculationProcessor appropriately. It would like this:

package com.rallydev.engblog.unittesting;

import org.junit.Before;
import org.junit.Test;

import static org.mockito.Mockito.*;

public class NewCalculatorBadTest {
	private NewCalculator calculator;
	private CalculationProcessor mockCalculationProcessor;

	@Before
	public void setUp() {
		this.mockCalculationProcessor = mock(CalculationProcessor.class);
		this.calculator = new NewCalculator(mockCalculationProcessor);
	}

	@Test
	public void shouldAdd() {
		calculator.calculate('+', 2, 3);
		verify(mockCalculationProcessor).add(2,3);
	}

	@Test
	public void shouldSubtract() {
		calculator.calculate('-', 2, 3);
		verify(mockCalculationProcessor).subtract(2,3);
	}

	@Test(expected=UnsupportedOperationException.class)
	public void shouldThrowExceptionForUnknownOperation() {
		calculator.calculate('Q', 1, 2);
	}
}

I hate this test. This test does little but double-check the implementation of NewCalculator.

This test is extremely brittle. If the calculator had more responsibilities (say, multiplication, exponents, etc.) and the tests were more thorough (boundary checking, looking for division by zero, and so on) then changing the implementation of NewCalculator would require changing dozens of broken tests.

The tests are there to make us feel safe about refactoring the code, but virtually any refactoring breaks them, which requires us to manually fix the tests. When this happens, a refactoring effort loses its test harness: the code and the tests must both change together, rather than one at a time with the one verifying the correctness of the other.

Instead, I would prefer to leave the test as it was, only modifying the setUp method to correctly construct the NewCalculator with real instantiations of its dependency.

package com.rallydev.engblog.unittesting;

import org.junit.Before;
import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class NewCalculatorBetterTest {
  private NewCalculator calculator;

  @Before
  public void setUp() {
    CalculationProcessor calculationProcessor = new CalculationProcessor();
    this.calculator = new NewCalculator(calculationProcessor);
  }

  @Test
  public void shouldAdd() {
    assertEquals(5, calculator.calculate('+', 2, 3));
  }

  @Test
  public void shouldSubtract() {
    assertEquals(1, calculator.calculate('-', 5, 4));
  }

  @Test(expected = UnsupportedOperationException.class)
  public void shouldThrowExceptionForUnknownOperation() {
    calculator.calculate('Q', 1, 2);
  }
}

There are a few things to notice about this test. First and foremost, the bodies of the three test methods are now identical to how they were before the refactoring. This means that the refactoring could have taken place while using the test to ensure it was correct, though the setUp method would have had to change.

Secondly, even though a real CalculationProcessor was instantiated for the test, it was not assigned to a field. It is scoped exclusively to the setUp method. This is because the test really has no reason to know about the class beyond the fact that it needs it to construct the NewCalculator.

Blurring the Line

Is this a unit test, or is it something else? Unit-testing pedants would argue that it’s not a unit test, but an integration test. Michael Feathers may disagree:

“In the industry, people often go back and forth about whether particular tests are unit tests. Is a test really a unit test if it uses another production class? I go back to the two qualities: Does the test run fast? Can it help us localize errors quickly?” (Source: Working Effectively with Legacy Code)

The test above is extremely fast, but how effectively does it help us localize errors? If it were the only failing test for a system full of classes and tests, it would help us narrow down the error pretty quickly: there is either a problem parsing the operator, or a problem performing the operation.

If we had a separate test for the CalculationProcessor itself, but it was passing, then we would know the error is in the parsing. It didn’t help us localize the problem quite as quickly as the mocking test would have, but it is also a far stronger test, less likely to break under the strain of a refactoring.

It’s a trade-off, of course, but I now tend to lean toward creating instances of the dependency classes in the setup of the test in these situations. I have simple guidelines to help me determine when I should create a mock versus a concrete instance of a class.

Verify() is a Crutch

When you’re writing a test, you go through three phases, which BDD proponents refer to as the Given, When, and Then. First you write the Givens, which attempt to fabricate a pre-existing situation for the purposes of your test. Then you write the Whens, which are the operations being tested. Last you assert some Thens, which are the expected results of performing the operation.

In the case of our simple unit test above, the Given is that a NewCalculator exists. The When is that calculate is called with some parameters. And then Then, of course, is that we get the desired result. But how do we write our Thens?

My feeling is that, if we reach into our toolbox and pull out a verify to write our Thens, we are cheating our tests out of the chance to be truly valuable. When we reach for verify, we are creating brittle tests that resist refactoring rather than enable it. We are using verify as a crutch, but we should only reach for it as a last resort.

Instead, reach for concrete instantiations of classes where possible. Create a mock that will be used to verify method invocation for one of two reasons:

  1. Speed – If creating an instance of the class will significantly slow down the test, it should be mocked out instead. This is likely to be the case for classes that clearly turn tests into integration tests, such as classes that facilitate a connection to a database or over a network. This may also be the case for classes that perform a great deal of processing.
  2. Complexity – Create a mock if creating an instance of a class requires creating a number of additional classes that it depends on, which in turn depend on more classes, and so on. You do not want your setUp methods to contain thousands of lines of object instantiation to support your tests. I recommend creating real objects as deeply down the dependency tree as possible, then mocking out the rest. How far down the tree should you go? That’s really a matter of taste, but I recommend trying two layers deep and no more. When you do this, you are likely to find that you have no need to verify the calls on the mock objects (other tests already cover those scenarios), but instead can set up mock objects to respond in specific ways to method invocations, simulating their behavior for the purpose of the test so that your tests can verify actual results.

Going Forward

Writing tests without the hard Unit/Class restriction allows us to write tests that enable the safe refactoring of classes, but come at the potential price of slowing down our tests and complicating the setUp methods for those tests. There is a delicate balance there, but in general I recommend trying to err on the side of concrete class instantiation in tests, at least over method invocation verification (mocking out responses is always still preferred).

This view is, of course, subject to change (as it was when I felt that the restriction was valuable), but since blurring this line my tests have remained fast and, more importantly, allowed me to code faster by letting me refactor when necessary.

Agree? Disagree? Leave a comment.


Wedding Photos Online




JB0959

Originally uploaded by rodhilton

The photos from my wedding are up on Flickr. Check ‘em out here. Our photographer was excellent. I highly recommend her.

The wedding was great. Virtually nothing went wrong – we even just barely avoided some rain but avoided it nonetheless. I was kind of dreading having to be social for a whole evening, but even the reception turned out to be a good time (though I had to go hide in order to eat my slice of cake).

Overall, being married doesn’t feel too much different from not being married. I’ve known I’d marry Julia since second week we dated, and I knew she’d make a beautiful bride (she did).

It was great seeing family and merging my 7-person family with her 100-something-person family. I also really loved how many of our friends from college were able to make it, including one of my professors. It was great seeing everyone again.

My friend Dennis was a fantastic Best Man, and Julia’s sister Olivia did an wonderful job as Maid of Honor. It was a truly unforgettable time.


I’m Your Moon (Wedding Remix)

After I proposed to my girlfriend, I was tasked with finding a good first dance song for us. This was not easy, as I typically listen to hard rock, with occasional smatterings of other genres.

I got quite lucky, however, when I went through my one of these other genres, my Jonathan Coulton folder, and found a song called I’m Your Moon.

It’s a super geeky little song about Pluto and it’s moon, Charon. The chorus goes:

I’m your moon,
you’re my moon,
we go round and round.

Anyway, we both really liked this song because it was kind of nerdy and had a double meaning (to someone unaware of the science, it just sounds like a nice love song).

Continue reading ‘I’m Your Moon (Wedding Remix)’ »


5 Ways To Hose Your Estimates

Everyone knows the drill with Agile – the developers put estimates on stories, and those estimates are used to plan releases. Of course, some releases go awry, with the engineering team delivering far fewer completed stories than expected. When this happens, people start to ask “why are our estimates so inaccurate?” What many don’t realize, however, is that this is the wrong question. Estimates don’t need to be accurate, they need to be consistent.

Our velocity corrects for the inaccuracy of our estimates. As long as engineers are consistently wrong, the velocity will ensure that releases can be planned and executed smoothly. If releases start to become increasingly bumpy and unpredictable over time, there’s a good chance that your consistency is to blame. Poor consistency means rocky, less predictable releases, which in turn can lead to incurring technical debt in order to meet release commitments.

Here are five things to watch out for within your team. These are inhibitors to consistency that typically will plague a seasoned agile team, not a new team transitioning into agile.

1. Lack of Analogy

Estimates on stories are meant to be relative to other estimates on stories. Whenever engineers think of an estimate, they need to be comparing the story to other stories in the past in order to evaluate consistently.

Whenever the engineers put, say, 3 points on a story, find a 1-point story nearby and ask “Alright, is this 3 times as complicated as this earlier story?” This is called triangulation, and it’s vital to ensuring consistency. Do this periodically throughout planning, particularly for stories that generate a lot of discussion and disagreement.

It’s helpful to have a set of stories close at hand to make this comparison. I recommend creating a point board in whatever room is used for planning. On that board, list a handful of previous stories and the estimates that were given to them. Try to have a story or two for each possible estimate. When you are planning, you can say “we gave this a 5, which is the same thing we gave story so-and-so according to the point board. Are these approximately the same in complexity?”

How To Fix: Triangulate regularly, keep a point board.

2. False Precision

It takes a certain level of arrogance to decide you can make a computer do anything you want, so it’s no surprise that engineers have a natural hubris. Unfortunately, this hubris can often mean that engineers think they have a better understanding of things than they really do. A good indicator that your team is overconfident in its analytical skills is that their estimates are unnaturally precise.

Do discussions like “well, I don’t think that story is really a 3. Maybe a 2.5,” sound familiar? If so, your team is working with a fictional level of precision. The difference between a 3 and a 2.5 is smaller than the natural margin of error for something so inexact as story point estimates. If an engineer starts arguing that story should have a 4 since a 3 is too low and a 5 is too high, there’s a good chance the engineer is treating points with a level of a precision that they simply do not carry. If that engineer were in a sour mood, they may have said a 5 instead.

You want your points to be forced into buckets – buckets with wide enough gaps between them that there is no need to quibble over insignificant differences. Mike Cohn recommends either a fibonacci-like sequence (0, 1, 2, 3, 5, 8, 13, 20, 40, 100) or a base-2 sequence (0, 1, 2, 4, 8, 16, 32, 64, 128) to ensure that the buckets are reasonable.

Seasoned agile teams may find themselves eventually estimating within a very small range of possible values and making up for the loss of range with an increase in precision. Don’t buy into it, the precision is imaginary.

How To Fix: Pick a sequence and stick to it.

3. Groupthink

Over time, your agile team will increasingly gel together. This is normally a good thing, but it can have negative consequences if the team eventually starts to act and think as a pack.

For a group to be smarter than the individuals in the group, it has to be diverse, independent, and decentralized. The longer an agile team works together, the more trust they build together (that’s good). The more trust they build, however, the more they start to think that a subset of the group is a good representative of the entire group (that’s bad). When only part of the group is consulted for estimates, you lose the diversity and decentralization properties of the team, and as a result your estimates become less consistent.

Planning Poker Deck

Planning Poker Deck

It’s also very common for seasoned teams to tire of using planning poker cards, instead electing to hold fingers in the air. This simply makes it too easy to move as a group. If you hold your fist in the air with the intention of throwing a 3, but notice that another team member throws a 2, it may affect your vote, even if you are not aware of it. Furthermore, using fingers instead of cards makes it difficult to estimate at an 8 or above (as soon as someone lifts both arms, you know they aren’t going to throw a 5).

How To Fix: Always estimate with the entire team, even if it’s a little inconvenient. Use planning poker cards instead of fingers (or at least have everyone close their eyes when throwing their estimates).

4. Mental Drain

The longer a meeting goes, the less able people are to maintain a consistent level of focus. A story estimated towards the end of planning simply will not get the same consideration as one towards the beginning. You want to limit the amount of time planning takes without limiting the amount of good, useful discussion that takes place during the meeting.

Handily, there are diminishing returns on conversations. The first few minutes of a discussion will reveal the bulk of information that engineers should consider. A few more minutes reveal more still. But after that, people are just going to be repeating what they said earlier. Cut the conversation off and move on.

I recommend playing planning poker twice, separated by a sand timer. After the first set of estimates are made on a story, start the timer (you can find these at hobby shops and game stores). When the time is up, play another round of planning poker. If the team still can’t agree, it should shake confidence in the team’s estimate as a whole, and you should take the largest estimate (not the average).

How To Fix: Play two rounds of planning poker and use a timer to limit discussion.

5. Large Stories

It’s easier to wrap your brain around smaller things than larger things. Next time you have a large story and decide to break it into smaller stories, see how many points fall out? Does your 13-pointer break into two 3’s, two 2’s, and three 1’s? Probably not. In all likelihood, your 13-point story, when broken into smaller stories, becomes something closer to a 20-pointer. Not only is this a problem for scheduling, it is a good indicator that the engineers aren’t considering all of the angles when estimating the larger story.

It’s not the 20 points worth of small stories that are wrong, it’s the 13 points initially. It’s simply too hard to consider everything when estimating large stories. Product Owners need to understand this and break stories up as small as possible BEFORE engineers estimate them. Once the number has been assigned to a large story, it affects the points given to the smaller stories that break out of it. This is called anchoring, and it’s a well-known cognitive bias.

Encourage your product owners to break stories up before they are estimated.

How To Fix: When a story ‘feels’ too large, break it into smaller stories before any estimates are given.

Wrapping Up

These problems are symptomatic of a seasoned agile team that is falling into a rut. Being a successful agile team requires work; it requires a continuous level of self-assessment and self-improvement.

Engineers have a duty to be as consistent as possible with their estimates so that their estimates can be used to create predictable, smooth releases. If your team starts slipping in this regard, try to figure out if you are making any of the mistakes listed here.


Star Trek 1-10: An Outsider’s Perspective

For whatever reason, Star Trek was just not something I got into as a kid. Sure, I was a geek, and lots of geek friends loved Star Trek, but it never appealed to me. I liked Star Wars, but I was generally into a different kind of sci-fi (anything dealing with robots, time travel, and superheroes). Space just wasn’t typically my thing.

I had managed to avoid Star Trek almost my entire life. Some friends wanted to see Star Trek Insurrection when it came out in the theater, so I went with them, but I didn’t really know what was going on or who anyone was, and I found the movie to be pretty damn boring, so I wrote Star Trek off.

When the new movie came out, some coworkers organized a trip to see it and I joined. I thought it was pretty good, but I won’t really be discussing it much. After seeing it, however, I felt like I was missing something due to having never seen the other movies, so I decided to go back and watch the first ten movies. A Trek-a-thon.

In this post, I’d like to discuss my thoughts on Star Trek, speaking as an outsider that never watched the television show, and had avoided nearly all things Trek my entire life.

Continue reading ‘Star Trek 1-10: An Outsider’s Perspective’ »