• Unit tests (produced in TDD style) too often concentrate on testing classes. They should instead target the public interfaces/behavior of your software's modules.
  • Because you target implementation details (read: some internal classes), you couple your test code heavily to the implementation.
  • The above problems result in a higher cost of refactoring because each time you do even minor refactoring, you have to fix tests.

And his recommendation is to test only behavior/the module's public API. If you had to write some tests around some (internal) functionality, you should delete them afterward.

The rest of this article will be an act of me spitting against the wind. At least, it definitely feels like it, trying to argue against somebody who has spent a decade researching and talking about this topic and has read all the foundational books by Kent Beck and Martin Fowler back and forth multiple times.

My overall feeling about the talk: I think the arguments are pretty strong.

  • Yes, if you implement unit tests for some internals, then these tests will be coupled to the internals.
  • Yes, if you refactor the code, you need to change your tests.
  • Yes, it often leads to (a lot of) mocking and overly complex tests. (That being said, this also is probably a good indicator that the design in this area is not great.)

He is heavily (really heavily) concentrated on just two aspects:

  • spending time on fixing tests while refactoring
  • comparison to people who ship code without tests (that they can ship way faster than people who write tests).

However, it also feels like he is trying to put the cart before the horse. Somehow, things that are truly crucial (like quality, or where we spend most of our time while developing) are mostly left out.

Photo by Clark Van Der Beken on Unsplash

Before rebutting his arguments directly, let me tell you two stories (people who know me well have probably heard them many times before).

Story #1. Paranoia

It's probably not a story but rather an overarching theme.

I touched network programming for the first time about six years ago, and I learned fast that literally each and every line of networking code is out to get ya.

You probably think that I am joking or paranoid.

Not even a little bit. Each error, each crazy edge case, and each condition that can happen WILL happen. Your smallest mistake or misstep (in handling them) will find you and will bring production down at 3 a.m. And you and your team will be fixing and fixing and fixing things (if you don't lock it all down very early).

Sure, if you build casual games, nobody cares if the game crashes once in a blue moon.

However, if you are building networking code where, let's say, a million people can't do THEIR job when your code fails, you somehow become paranoid really fast.

Story #2. Buy cheap, buy twice.

At one of the places where I worked, there was literally a function with two lines of code without tests. Some code (that was using it) was pushed to QA, and QA went down like a ton of bricks. We went through a rollback procedure and spent half a day figuring out what exactly had happened. The function was fixed (still no tests). It was deployed to QA, and QA went down again. This time, the investigation was shorter. Finally, it was fixed a second time, and this time, it worked.

In total, two people pretty much spent the whole working day fighting with these two lines of code. If there had been tests covering it, we would have spent zero time on it. It would have been fixed way before it reached QA (or even the main branch).

Side note: It takes me (on average) about 10–15 minutes to write a test. So, in the time we spent, we could have easily written 70–90 tests, which would have covered way-way-way more code than these mere two lines of production code.

BTW. I do realize that Ian is not arguing for not writing tests. However, he is arguing for deleting tests to save some small amount of time (on updating them in the future). And this leads me to a question: will we have to pay twice?

Direct arguments

Will all execution paths be covered?

It's all fun and games (to rely only on module public API testing) until you get to a place where these tests cover only 80% of the code and 60% of the branches. And if you get there, I will just sit back, light a cigar, and wait until your production looks like this:


I should admit it's possible to have good test coverage only via testing the public module API/behavior. However, it's possible mostly for "shallow" modules. I call code "shallow" when it has very limited call depth (maybe 1–2 calls) and very few conditional statements.

As soon as code becomes "deep," it will get exponentially harder to test things purely based on module public APIs.

BTW. I can see the counter-argument: well, if the depth of the code is the problem, let's make all modules shallow. (And if you squint a bit, you can read this as "go build microservices".) I totally agree you can do that. However, the further you go in this direction, the closer you get back to square one (testing each and every thing via unit tests), because things that were implementation details before are now separate modules, and you test them because they are now part of each module's public API.

Zero play in the system

Following up on the previous theme: I really like systems with zero play (meaning that they behave exactly the way you planned, from top to bottom).

You have seen the two stories above. I am worried about allowing any play in the system's guts, and removing the tests for the implementation opens the door to future changes that will introduce exactly this play.

Additionally, how the hell does anybody know that you had tests for an implementation detail in the first place? Somebody sends a PR without tests for implementation details, and now you wonder: did they truly make sure that everything in the guts works correctly, or did they just wing it?

So, we arrive really fast at the question of whether you can have great test coverage only via the module's public API.

BTW. I wrote about this before in the article 100% code coverage is not enough. That article has tons of arguments applicable here: the complexity of testing permutations, writing high-level (vs. low-level) tests, and so on.

I have a litmus test. If you can break something in the code in a way that one particular test catches, and without that test nothing would catch it, that's a really good indicator that the test is needed there.
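As a sketch of that litmus test (everything here is a made-up example), take a trivial function and imagine a refactor mutating it:

```python
def clamp(value: int, low: int, high: int) -> int:
    """Keep value within [low, high]."""
    return max(low, min(value, high))

# The litmus test in action: if a refactor accidentally swapped min and max,
# the assertions below would start failing. A test that catches a plausible
# break that nothing else would catch is a test worth keeping.
assert clamp(15, 0, 10) == 10  # upper bound enforced
assert clamp(-5, 0, 10) == 0   # lower bound enforced
assert clamp(5, 0, 10) == 5    # in-range value passes through
```

Mutation-testing tools formalize exactly this idea: they apply small mechanical mutations to the code and report which ones no test notices.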

Don't confuse hard, time-consuming, and annoying.

Photo by Егор Камелев on Unsplash

I think we engineers tend to confuse these three things.

Proving Fermat's Last Theorem is hard (extra-extra hard with a cherry on top). Moving from one house to another is time-consuming. And having a pebble in your shoe is annoying.

So, let me ask you a question. If you do some refactoring (especially a small one) and it breaks, let's say, five tests, is that a hard, time-consuming, or annoying problem?

Pause here for a second.

How long does it actually take you to go into these five tests, look them over, and update one assertion in each? Compare it with the time you spend researching how to build some new functionality, or the time you spend debugging some crazy issue happening in production.

Don't get me wrong. I am all for getting annoyances out of the way. However, I wish we had the same rigor to try to figure out how to make hard problems easier or time-consuming problems less time-consuming.

I think you should fight annoyances to make your work smoother. However, I am not sure I want to throw the baby out with the bathwater here.

Tests are more than just tests.

Tests help explain how to use a specific piece of code: what is expected, and what is unexpected. Yes, the most important thing to define is the external behavior of the system.

However, when you look at some code in the system's guts, you still need to understand it. Do I need to call this once or twice? Is it OK to call it twice? It accepts a string; what the hell should be in this string?

Having unit tests around the code makes it easy (almost trivial) to answer these questions (even after the original engineer has long left the company). Removing them eliminates this source of information.

I am not saying that it's the only value of unit tests, and they should be used instead of documentation. However, they often can give you a way more nuanced view.
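For example (a hypothetical helper; the name and the expected format are invented), a couple of tiny tests answer the "what should be in this string?" question more precisely than most doc comments:

```python
def parse_endpoint(addr: str) -> tuple[str, int]:
    """Internal helper; expects a "host:port" string."""
    host, _, port = addr.rpartition(":")
    return host, int(port)

# The tests double as documentation: the expected input shape,
# a valid call, and what happens on misuse.
assert parse_endpoint("db.internal:5432") == ("db.internal", 5432)

try:
    parse_endpoint("db.internal")  # no port given
except ValueError:
    pass  # misuse is loudly rejected, not silently mangled
else:
    raise AssertionError("expected ValueError for a missing port")
```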

Intermediate summary

Yes, I agree with the argument that tests are coupled to the implementation, and that this slows things down when you change the implementation.

However, I completely disagree with the approach. You are solving some mid-size annoyance by increasing the chance of quality going down.

What is the business requirement, and what is the implementation?

I think the actual problem is one he barely touched upon in the video: how do you distinguish tests concentrating on business requirements/public interfaces from tests covering implementation details?

The former (tests for business requirements or public interfaces) should be changed only after careful discussion, explicit acknowledgment, and some plan around the change.

The latter (tests covering implementation details) can be changed on a whim along with the implementation details themselves. If it's easier for you, you can delete old tests and write new ones, or modify existing ones. They are way more malleable.

Summary

I think we (in engineering) have a tendency to overconcentrate on some aspects, so much that we forget the bigger picture.

I think the true North Star is

  • How to have a high-quality product
  • How to be able to deliver new functionality (with high quality) fast
  • How to do these two things in the long term.

Each of these things has tons and tons of subproblems/complexities and questions involved.

It feels to me that "I refactored the code, and it broke tests" barely registers in the overall scheme of things.

I think "Which code and tests can I change without breaking a contract?" is way-way higher on my list.

P.S. Here is a great article by Clay Shentrup about Tests and dependencies touching on pretty much the same subject.
