Code Coverage Has a Blind Spot
Here’s the coverage report on a recent PR of mine:
All modified and coverable lines are covered by tests.
Comparison is base (a765aef) 85.95% compared to head (fe02e1b) 85.95%.

@@            Coverage Diff             @@
##              master     #546   +/-   ##
==========================================
  Coverage      85.95%   85.95%
  Complexity      3480     3480
==========================================
  Files            230      230
  Lines           8225     8225
  Branches         960      960
==========================================
  Hits            7070     7070
  Misses           865      865
  Partials         290      290
Precisely identical. What happened? Did I change a comment? Well, no. In fact I added tests for situations that were not previously covered, so why didn't coverage increase?
The two new tests covered exceptional cases that Codecov doesn't see, because they test different arguments passed to a public method. Code coverage percentages check line and sometimes branch coverage, but they have a blind spot for the much larger number of permutations of input values.
In this case those arguments are null, and the tests cover null pointer exceptions, but they could have been any arguments to the method not already exercised. Real coverage increased, but the metrics don't show it.
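To make the blind spot concrete, here is a minimal sketch (the method and class names are hypothetical, not from the actual PR). The null-argument test executes the very same line the happy-path test already covered, so line coverage cannot move even though a previously untested behavior is now verified:

```java
public class CoverageBlindSpot {
    // Hypothetical public method under test: a single coverable line.
    static int length(String s) {
        return s.length(); // the same line runs for null and non-null inputs
    }

    public static void main(String[] args) {
        // Existing test: the happy path already covers the one line.
        if (length("abc") != 3) throw new AssertionError();

        // New test: a null argument. The same line executes, so line and
        // branch coverage are unchanged, but the NullPointerException
        // behavior is now pinned down by a test.
        boolean threw = false;
        try {
            length(null);
        } catch (NullPointerException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError("expected NullPointerException");
        System.out.println("both tests pass; reported coverage is identical");
    }
}
```

Run before and after adding the null test, any line-based coverage tool reports the same percentage for this class.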
This is another example of why it’s important to write the tests first. Relying on tools to identify cases that aren’t covered misses a lot. Without delving too deep into the commit history, I’d bet that what happened here is that someone changed the behavior of an existing method. The existing tests failed, and they commented them out rather than understanding the change in behavior and deciding whether it made sense.
What they should have done is written a test for the desired new behavior, or changed the existing tests to test for it. Of course, if the new behavior wasn't desired, they should instead have reverted it, or fixed their new code so the tests didn't break. Again, without delving into commit history, I'm not sure whether this change was intended. For now, I'm sticking with the current behavior since it seems reasonable. But test-first changes, or at the very least not commenting out the failing tests, would have alleviated this uncertainty.
There are opportunities for code coverage tools to do better, though I don't yet know of any that do. For instance, if a method is declared to throw a certain exception, or has an @throws Javadoc tag indicating a certain runtime exception, the tool could check that the exception was indeed thrown at some point during the tests.
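A sketch of what such a check could verify (the method is hypothetical; no existing tool performs this check, as far as I know). If only the happy-path test existed, a smarter tool could warn that both @throws clauses are documented but never exercised:

```java
import java.util.Objects;

public class ThrowsCheck {
    /**
     * Parses a non-negative id.
     *
     * @throws NumberFormatException if {@code id} is not a number
     * @throws NullPointerException  if {@code id} is null
     */
    static int parseId(String id) {
        Objects.requireNonNull(id, "id");
        return Integer.parseInt(id);
    }

    public static void main(String[] args) {
        // With only this test, every line is "covered" -- yet neither
        // documented exception has ever been thrown during the suite.
        if (parseId("42") != 42) throw new AssertionError();

        // These two tests are what the hypothetical tool would demand:
        // each @throws clause actually triggered at least once.
        expectThrows(() -> parseId("oops"), NumberFormatException.class);
        expectThrows(() -> parseId(null), NullPointerException.class);
        System.out.println("all documented exceptions exercised");
    }

    static void expectThrows(Runnable r, Class<? extends RuntimeException> type) {
        try {
            r.run();
        } catch (RuntimeException e) {
            if (type.isInstance(e)) return;
            throw new AssertionError("wrong exception: " + e);
        }
        throw new AssertionError("expected " + type.getSimpleName());
    }
}
```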
Similarly, it could inspect the arguments passed to a public method during a test and warn if the method only received positive numbers, or non-null objects, or non-empty strings, or whatever. It could even check that common edge values like 0, 1, -1, and Integer.MAX_VALUE, or Infinity and NaN for floating point, are covered.
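A rough sketch of that idea, with the instrumentation hand-coded here for illustration (a real tool would record arguments via bytecode instrumentation, and the edge-value set is my own assumption):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class ArgumentWatcher {
    // Edge values a hypothetical tool might expect an int parameter to see.
    static final Set<Integer> EDGES =
            Set.of(0, 1, -1, Integer.MAX_VALUE, Integer.MIN_VALUE);

    // Values actually observed flowing into the method under test.
    static final Set<Integer> observed = new HashSet<>();

    static int methodUnderTest(int x) {
        observed.add(x); // a real tool would instrument this automatically
        return x * x;
    }

    public static void main(String[] args) {
        // Suppose the test suite only ever passes small positive numbers:
        methodUnderTest(5);
        methodUnderTest(42);

        // The tool could then report which edge values were never tried.
        Set<Integer> missing = new TreeSet<>(EDGES);
        missing.removeAll(observed);
        System.out.println("edge values never tested: " + missing);
    }
}
```

Here every line of methodUnderTest is covered, yet the report would show that 0, 1, -1, and the integer extremes were never passed in, which is exactly the information today's coverage percentages throw away.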