Testing code

1. References and quotes

1.1. UnitTest (Martin Fowler)

https://martinfowler.com/bliki/UnitTest.html

A more succinct explanation (but lacking examples) of the "mockist" versus "classic" styles of unit testing. The examples (and a lot more elaboration) are covered in "Mocks Aren’t Stubs".

1.2. Mocks Aren’t Stubs

Mocks Aren’t Stubs, by Martin Fowler.

1.3. Test-induced design damage

https://dhh.dk/2014/test-induced-design-damage.html

One conclusion of this is that I think it’s a mistake to try to unit test controllers in Rails (or similar MVC setups). The purpose of the controller is to integrate the requests from the user with the response from the models within the context of session.

Mocking all that away to test just whether it’ll send this redirect or that notice doesn’t make the least bit of sense to me. Controllers are meant to be integration tested, not unit tested. But the testing pyramid prescribes that the unit level is where the focus should be, so people are sucked into that by default.

This blog post is probably one of the first to use the phrase "TDD is dead", and it’s the one that spurred the video chats between DHH, Martin Fowler and Kent Beck.

1.4. Is TDD dead?

https://martinfowler.com/articles/is-tdd-dead/

1.5. Test Isolation Is About Avoiding Mocks

https://www.destroyallsoftware.com/blog/2014/test-isolation-is-about-avoiding-mocks

This is a very interesting article by Gary Bernhardt. Gary taught me the notion of "functional core, imperative shell", which, after much frustration with unit testing, has been the most enlightening concept I have come across.

Gary has explained how it’s possible to write testable code without mocks and stubs, so when I read the article I assumed that he would firmly take one side. His position turned out to be much more nuanced than I imagined, which I find interesting.

It’s bad test design. Those nested stubs are telling us something about the method under test: it reaches deep into its user argument. The code under test can only traverse data that the test creates for it, so deep traversal of objects in the production code leads to deeply nested mocks in the tests. […]

[…] That class is deeply, but invisibly, coupled to its collaborators. A glance at the isolated test tells us this, but getting that information from the code would require a slow, careful reading.

In addition to avoiding nested mocks, I’ve been using fewer over time, even when I’m writing isolated tests. The old Python system I mentioned had multiple mocks per test on average. Early DAS code written under time pressure averages around one mock per test. Later DAS code is under half a mock per test. Moving into late 2013, all of Selecta’s logic is tested in isolation with no test doubles at all. (That’s Functional Core, Imperative Shell again.)

This post was triggered by Kent’s comment about triply-nested mocks. I doubt that he intended to claim that mocking three levels deep is inherent to, or even common in, isolated testing. However, many others have proposed exactly that straw man argument. That argument misrepresents isolated testing to discredit it; it presents deep mocks, which are to be avoided in isolated testing, as being fundamental to it; it’s fallacious. It’s at the root of the claim that mocking inherently makes tests fragile and refactoring difficult. That’s very true of deep mocks, but not very true of mock-based isolation done well, and certainly isn’t true of isolation done without mocks.

In a very real sense, isolated testing done test-first exposes design mistakes before they’re made. It translates coupling distributed throughout the module into mock setup centralized in the test, and it does that before the coupling is even written down. With practice, that preemptive design feedback becomes internalized in the programmer, granting some of the benefit even when not writing tests. There may be other paths to that skill, but I’m still learning from my tests after seven years of isolating around 50% of the time. This path also happens to produce a trail of sub-millisecond tests fully covering every component designed using it, which is alright with me.
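
As a reminder of what "Functional Core, Imperative Shell" can look like, here is a minimal C++ sketch. It is not taken from Gary’s code: filterMatches(), runShell() and the substring matching are hypothetical, chosen only for illustration. The point is that all the decisions live in a pure function that can be tested with plain values, while the shell only moves data in and out.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Functional core (hypothetical example): pure logic, no I/O, no collaborators.
// Returns the candidates that contain `query` as a substring, in input order.
std::vector<std::string> filterMatches(const std::vector<std::string>& candidates,
                                       const std::string& query) {
    std::vector<std::string> matches;
    std::copy_if(candidates.begin(), candidates.end(), std::back_inserter(matches),
                 [&query](const std::string& candidate) {
                     return candidate.find(query) != std::string::npos;
                 });
    return matches;
}

// Imperative shell: reads lines, delegates every decision to the core, writes output.
void runShell(std::istream& in, std::ostream& out, const std::string& query) {
    std::vector<std::string> candidates;
    for (std::string line; std::getline(in, line);) {
        candidates.push_back(line);
    }
    for (const std::string& match : filterMatches(candidates, query)) {
        out << match << '\n';
    }
}

// The core is tested with plain values: no mocks, no stubs.
void testCore() {
    const std::vector<std::string> candidates{"apple", "banana", "grape"};
    const std::vector<std::string> expected{"apple", "grape"};
    assert(filterMatches(candidates, "ap") == expected);
}

// The shell has no logic of its own, so one small integration-style test covers it.
void testShell() {
    std::istringstream in("apple\nbanana\ngrape\n");
    std::ostringstream out;
    runShell(in, out, "an");
    assert(out.str() == "banana\n");
}
```

In this sketch the shell takes streams as parameters, so even its single test needs no test double: a string stream is a real stream.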

1.6. TDD, Straw Men, and Rhetoric

https://web.archive.org/web/20160318065335/https://www.destroyallsoftware.com/blog/2014/tdd-straw-men-and-rhetoric

This post was deleted, but it contains many useful links and resources, along with a bit of history that I find quite interesting. It is a bit of a "flame" between Gary and DHH, which may be the reason why it was removed.

1.7. Do not mock everything

http://jmock.org/oopsla2004.pdf

Don’t use mocks to test boundary objects

If an object has no relationships to other objects in the system, it does not need to be tested with mock objects. A test for such an object only needs to make assertions about values returned from its methods. Typically, these objects store data, perform independent calculations or represent atomic values. While this may seem an obvious thing to say, we have encountered people trying to use mock objects where they don’t actually need to.

From 'Stop Mocking, Start Testing':

When choosing between zero and one mock, try zero!

You don’t even necessarily have to mock every interface that’s used. We believe that if an object type is very cheap, if it’s just a stack allocation, a few storage elements, it’s not anything expensive. A data container that doesn’t have any behavior… You should not be mocking it at all. You should never mock a tuple. And we’ve seen that.

— Augie Fackler and Nathaniel Manista
minute 10:06
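
A minimal sketch of what both quotes describe, in C++. Money is a hypothetical value object invented for this example: it has no relationships to other objects, so the test simply builds real values and asserts on what the methods return.

```cpp
#include <cassert>

// Hypothetical value object: stores data and performs an independent calculation.
class Money {
public:
    explicit Money(long cents) : cents_(cents) {}

    long cents() const { return cents_; }

    // Returns a new value; neither operand is modified.
    Money plus(const Money& other) const { return Money(cents_ + other.cents_); }

    bool operator==(const Money& other) const { return cents_ == other.cents_; }

private:
    long cents_;
};

// The test uses real instances and checks return values; there is nothing to mock.
void testMoneyAddition() {
    const Money price(1250);
    const Money shipping(499);
    assert(price.plus(shipping) == Money(1749));
    assert(price.cents() == 1250);  // the original value is unchanged
}
```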

1.8. To Kill a Mockingtest

https://www.rea-group.com/about-us/news-and-insights/blog/to-kill-a-mockingtest/

1.9. What does not need to be tested

First, a few quotes from the JUnit FAQ/Best Practices.

Do I have to write a test for everything?

No, just test everything that could reasonably break.

Be practical and maximize your testing investment. […]

How simple is "too simple to break"?

The general philosophy is this: if it can’t break on its own, it’s too simple to break.

First example is the getX() method. Suppose the getX() method only answers the value of an instance variable. In that case, getX() cannot break unless either the compiler or the interpreter is also broken. For that reason, don’t test getX() there is no benefit. […]

And here is the best quote, in the form of pseudocode, about how much or how little one should test:

becomeTimidAndTestEverything
while writingTheSameThingOverAndOverAgain
    becomeMoreAggressive
    writeFewerTests
    writeTestsForMoreInterestingCases
    if getBurnedByStupidDefect
        feelStupid
        becomeTimidAndTestEverything
    end
end
// The loop, as you can see, never terminates.

Then, some opinions from Software Engineering Stack Exchange on "When is it appropriate to not unit test?":

So? You don’t have to test everything. Just the relevant things.

Test the limits and border cases. Test the risky code. Lots of code is simple enough to verify by inspection, although inspecting your code is more error prone than inspecting someone else’s code.

1.10. Unit tests versus other tests

What is an integration test exactly?

It’s not important what you call it, but what it does

One of the answers defines an "integration test" as what others would call an "end to end test". But some comments say:

[…] integration tests can also test just part of the system, but more than one piece at a time.

Integration tests certainly do not (and probably should not) test "the system as a whole". Anywhere that two or more components are tested together, in particular when testing against external components (database, network etc.) you are doing integration testing.

2. A simple thought about a TDD problem with C++

Say that you want to write Diceroll::sigma(). Diceroll is a class that calculates things from an input like "2d4+1", and sigma() calculates the standard deviation of that dice roll.

The standard deviation can probably be calculated in a few ways, but say that you reason that you first need a helper function called permutations() that returns all the possible permutations of the dice, so you can loop over them. That function can in turn be written in terms of another helper function called cartesianProduct().

The TDD approach, as explained by Gary Bernhardt, works "outside in", or as I would prefer to call it, "top down". This is good from the POV of the "customer": you start with what matters to the business logic, so you can start writing assertions that are valuable to the final product.

The problem is that in Gary’s presentations, he mocks/stubs the helper functions right away. He shows how it’s doable in Ruby, where the helper classes don’t even need to exist: they can just be mocked in the unit tests. But we can hardly do that in C++.

I realized this while reading:

https://www.reddit.com/r/cpp/duplicates/xas3d2/the_little_things_my_radical_opinions_about_unit/
https://codingnest.com/the-little-things-my-radical-opinions-about-unit-tests/
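
One possible way to get a little closer to the Ruby workflow in C++, sketched here as an assumption rather than something the linked articles propose, is to pass the helper to sigma() as a callable. The test for sigma() can then supply a canned stub before the real permutations() exists. The Diceroll below is a simplified, hypothetical stand-in (no parsing of "2d4+1"), and the signatures of sigma() and permutations() are invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified Diceroll: "2d4+1" -> count = 2, sides = 4, modifier = 1.
struct Diceroll {
    // Each inner vector is one way the dice can land, e.g. {1, 3} for two dice.
    using Outcomes = std::vector<std::vector<int>>;
    // The helper is injected so that a test can replace it (an assumption of this sketch).
    using OutcomeSource = std::function<Outcomes(int count, int sides)>;

    int count;
    int sides;
    int modifier;

    // Population standard deviation of the roll total over all equally likely outcomes.
    // Assumes `outcomes` returns at least one outcome.
    double sigma(const OutcomeSource& outcomes) const {
        std::vector<double> totals;
        for (const auto& outcome : outcomes(count, sides)) {
            double total = modifier;
            for (int face : outcome) total += face;
            totals.push_back(total);
        }
        double mean = 0.0;
        for (double total : totals) mean += total / totals.size();
        double variance = 0.0;
        for (double total : totals) {
            variance += (total - mean) * (total - mean) / totals.size();
        }
        return std::sqrt(variance);
    }
};

// Real helper, written later: every combination of faces for `count` dice.
Diceroll::Outcomes permutations(int count, int sides) {
    Diceroll::Outcomes result(1);  // start with a single empty outcome
    for (int die = 0; die < count; ++die) {
        Diceroll::Outcomes extended;
        for (const auto& partial : result) {
            for (int face = 1; face <= sides; ++face) {
                auto next = partial;
                next.push_back(face);
                extended.push_back(std::move(next));
            }
        }
        result = std::move(extended);
    }
    return result;
}

// Top-down test: sigma() is exercised with a stub instead of the real helper.
void testSigmaWithStub() {
    const Diceroll roll{1, 2, 1};
    const auto stub = [](int, int) { return Diceroll::Outcomes{{1}, {3}}; };
    // Totals are 2 and 4 (modifier included): mean 3, deviation 1 each.
    assert(std::abs(roll.sigma(stub) - 1.0) < 1e-9);
}
```

With the real helper, `Diceroll{2, 4, 1}.sigma(permutations)` evaluates to the square root of 2.5, about 1.58. The price is that the production signature now carries a parameter that exists mostly for the tests; a template parameter instead of std::function would avoid the runtime indirection but not that design cost.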

3. A simple thought about test order

The suggestion is to run all the tests in the same program, so that the tests are deliberately "not isolated that well" and some possible accidental interactions ARE found by the tests. Since an interaction can also have a positive effect and make tests pass when they should not, the order in which the tests run is randomized.

It seems a good thing to have in theory, but I wonder how it works in practice when you send a change for code review and happen to be the unlucky person for whom a certain randomized order fails, with a failure that is entirely unrelated to your changes.
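
Part of the answer is probably that the random order is derived from a seed that gets reported, so whoever hits the failure can at least replay the exact same order. A minimal sketch of that idea follows; TestCase, RunReport and runShuffled() are hypothetical names, not the API of any framework.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical test record: a name plus a body that returns true on success.
struct TestCase {
    std::string name;
    std::function<bool()> body;
};

// What a run reports back: the seed is what makes a failure reproducible.
struct RunReport {
    unsigned seed;
    std::vector<std::string> order;   // names in the order they were executed
    std::vector<std::string> failed;  // names of the tests that returned false
};

// Runs all tests in one process, in an order derived only from `seed`.
// The same seed gives the same order when built with the same standard library.
RunReport runShuffled(std::vector<TestCase> tests, unsigned seed) {
    std::mt19937 rng(seed);
    std::shuffle(tests.begin(), tests.end(), rng);

    RunReport report{seed, {}, {}};
    for (const TestCase& test : tests) {
        report.order.push_back(test.name);
        if (!test.body()) report.failed.push_back(test.name);
    }
    return report;
}

void testSameSeedSameOrder() {
    const std::vector<TestCase> tests{
        {"parses input", [] { return true; }},
        {"computes sigma", [] { return true; }},
        {"formats output", [] { return false; }},  // deliberately failing
    };
    const RunReport first = runShuffled(tests, 1234u);
    const RunReport second = runShuffled(tests, 1234u);
    assert(first.order == second.order);  // the failure can be replayed from the seed
    assert(first.failed.size() == 1 && first.failed.front() == "formats output");
}
```

Real frameworks offer the same mechanism: Catch2 has `--order rand` together with `--rng-seed`, and GoogleTest has `--gtest_shuffle` together with `--gtest_random_seed`. This does not remove the annoyance of an unrelated failure landing on your review, but it turns "flaky" into "fails with this seed".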

4. A simple example of extracting logic to a helper object

Various simple examples to elaborate on some time:

The GameBrowserResourceFilter from Moebius Toolkit.
The "case insensitive function" from my experiment.