TDD and Hard-To-Test Areas
I want to talk about the issues people hit when they begin working with TDD, the same issues that tend to make them abandon TDD after an initial experiment. Those are the ‘hard-to-test’ areas: the things production code needs to do that presentations and introductory books just don’t seem to explain well. In this post we will start with a quick review of TDD, and then look at why people fail when they start trying to use it. Next time around we will look more closely at solutions.
Clean Code Now
TDD is an approach to development in which we write our
tests before writing production code. The benefits of this are:
- Help us improve quality: Tests give us prompt feedback. We receive
  immediate confirmation that our code behaves as expected. The cheapest
  point to fix a defect is the point at which you create it.
- Help us spend less time in the debugger: When something breaks, our tests
  are often granular enough to show us what has gone wrong without
  requiring us to debug. If they don’t, then our tests are probably not
  granular or well-authored enough. Debugging eats time, so anything that
  helps us stay out of the debugger helps us deliver at a lower cost.
- Help us produce clean code: We don’t add speculative functionality, only
  code for which we have a test.
- Help us deliver good design: Our tests prove not just our code but our
  design, because the act of writing a test forces us to make decisions
  about the design of the SUT.
- Help us keep a good design: Our tests allow us to refactor, changing the
  implementation to remove code smells while confirming that our code
  continues to work. This allows us to do incremental re-architecture,
  keeping the design lean and fit while we add new features.
- Help to document our system: If you want to know how the SUT should
  behave, examples are an effective means of communicating that
  information. Tests provide those examples.
Automating our tests lowers the cost of repeating them. We
pay a cost once, but because we can then re-run our tests at marginal cost
they preserve those benefits throughout the system’s lifetime. Automated
tests are ‘the gift that keeps on giving’. Software spends more of its life in
maintenance than in development, so reducing the cost of maintenance lowers the
cost of software.
The steps in TDD are often described as Red-Green-Refactor:
- Red: Write a failing test (there are no tests-for-tests, so
  this checks your test for you).
- Green: Make it pass.
- Refactor: Clear up any smells in the implementation
  resulting from the code we just added.
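The cycle above can be sketched in xUnit style. This is a hypothetical example (the `Basket` class is invented for illustration): the test was written first and failed because `Basket` did not exist (red), the minimal implementation below makes it pass (green), and with the test green we are free to restructure (refactor).

```python
import unittest

# Red: this test was written first, and failed because Basket did not
# yet exist -- that initial failure is what checks the test itself.
# Green: the minimal Basket below makes it pass; no speculative features.
# Refactor: with the test green, we can rename or restructure safely.

class Basket:
    def __init__(self):
        self.prices = []

    def add(self, price):
        self.prices.append(price)

    def total(self):
        # Just enough code to pass the test, nothing more.
        return sum(self.prices)

class BasketTests(unittest.TestCase):
    def test_total_sums_item_prices(self):
        basket = Basket()
        basket.add(10)
        basket.add(25)
        self.assertEqual(basket.total(), 35)
```

Note that the test drives the design: deciding to call `add` and `total`, and what their signatures are, happened while writing the test, before any production code existed.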
Where to find out more
Kent Beck’s book Test-Driven Development: By Example remains
the classic text for learning the basics of TDD.
System Under Test (SUT) – Whatever we are testing; this may differ depending on the
level of the test. For a unit test this might be a class or a method on that
class. For acceptance tests this may be a slice of the application.
Depended-On Component (DOC) – Something that the SUT depends on, such as a class
or other component that the SUT calls.
What do we mean by hard-to-test?
When we start using TDD we rapidly hit a wall of hard-to-test
areas. Perhaps the simple red-green-refactor cycle begins to get bogged
down when we start working with infrastructure layer code that talks to the Db
or an external web service. Perhaps we don’t know how to drive our UI through an
xUnit framework. Or perhaps we have a legacy codebase, and putting even the
smallest part under test quickly becomes a marathon instead of short sprints.
TDD newbies often find that it all gets a bit sticky and,
faced with schedule pressure, drop TDD. Having dropped it, they lose faith in
its ability to deliver for them while still meeting the schedule. We are all
the same: under pressure we fall back on what we know. Hit a few difficulties
with TDD and developers stop writing tests.
The common thread among hard-to-test areas is that they
break the rhythm of development, our rapid test-and-check-in cycle, and are
expensive and time-consuming to write. The tests are often fragile, failing
erratically, and difficult to maintain.
- Slow Tests: Database
tests run slowly, up to 50 times more slowly than normal tests. This
breaks the cycle of TDD. Developers tend to skip running all the tests
because it takes too long.
- Shared Fixture Bugs: A
  database is an example of a shared
  fixture. A shared fixture shares state across multiple tests. The
  danger here is that Test A and Test B pass in isolation, but running Test
  B before Test A changes the state of the fixture so that Test A
  fails unexpectedly. These kinds of bugs are expensive to track down and
  fix. You end up with a binary-search pattern to resolve shared
  fixture issues: trying out combinations of tests to see which combinations
  fail. Because that is so time consuming, developers tend to ignore or
  delete these tests when they fail.
- Obscure Tests: To
  avoid shared fixture issues people sometimes try to start with a clean
  database. In the setup for their test they populate the Db with any values
  they need, and in the teardown clean them out. These tests become obscure,
  because the setup and teardown code adds a lot of noise, distracting from
  what is really under test. This makes tests hard to read, as they are less
  granular, and thereby harder to find the cause of failure in. The Db setup
  and teardown code is another point of failure. Remember that the only test
  we have for our tests themselves is to write a failing test; once there is
  too much complexity in the test itself it can become difficult to know
  whether the test is functioning correctly. It also makes tests harder to
  write. You spend a lot of time writing setup and teardown code, which
  shifts your focus away from the code you are trying to bring under test,
  breaking the TDD rhythm.
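Here is a hypothetical illustration of an Obscure Test, using an in-memory SQLite database as a stand-in for a real Db (the table and customer data are invented). Almost all of the code is setup and teardown noise; the behaviour actually under test, finding a customer by name, is buried in the last few lines:

```python
import sqlite3
import unittest

class CustomerQueryTests(unittest.TestCase):
    def setUp(self):
        # Noise: build a clean database and populate it with test data.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE customers (id INTEGER, name TEXT, region TEXT)")
        rows = [(1, "Alice", "EU"), (2, "Bob", "US"), (3, "Carol", "EU")]
        self.db.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
        self.db.commit()

    def tearDown(self):
        # More noise: clean our values out again.
        self.db.execute("DELETE FROM customers")
        self.db.close()

    def test_find_customer_by_name(self):
        # The one behaviour under test, easy to lose in the surrounding code.
        row = self.db.execute(
            "SELECT id FROM customers WHERE name = ?", ("Bob",)).fetchone()
        self.assertEqual(row[0], 2)
```

A reader has to wade through the fixture code to discover what the test is actually checking, and a bug in the setup or teardown will fail the test even when the query logic is correct.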
- Conditional Logic:
  Database tests also tend to end up with conditional logic: we are not
  really sure what we are going to get back, so we insert a
  conditional check on the result. Our tests should not contain
  conditional logic; we should be able to predict the behavior of our tests.
  Among other issues, we test our tests by making them fail first.
  Introducing too many paths creates the risk that the errors are in our
  test, not in the SUT.
- Not an xUnit
  strength: xUnit tools are great at driving an API, but are less good at
  driving a UI. This tends to be because a UI runs in a framework that the
  test runner would need to emulate or interact with. Testing a WinForms
  app needs the message pump; testing a Web Forms app needs the ASP.NET
  pipeline. Solutions like NUnitAsp
  have proved less effective at testing UIs than scripting tools like Watir
  or Selenium.
- Slow Tests: UI tests tend to be slow tests because they are end-to-end,
  touching the entire stack down to the Db.
- Fragile Tests: UI tests tend to be fragile, because they often fall foul of
  attempts to refactor our UI. Changing the order and position of fields
  on the UI, or the type of control used, will often break our tests. This
  makes UI tests expensive to maintain.
The Usual Suspects
We can identify a list of the usual suspects that cause issues for
successful unit testing:
- Calls across a network
- Touching the file system
- Needing the environment to be configured
- Any out-of-process call (which includes talking to the Db)
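To make the list concrete, here is a hypothetical class that gathers several of the suspects in one place; the URL and environment variable name are invented for illustration:

```python
import os
import urllib.request

class ExchangeRateReport:
    def build(self):
        # Environment suspect: the test needs REPORT_CURRENCY configured.
        currency = os.environ["REPORT_CURRENCY"]
        # Out-of-process / network suspect: a unit test of build() would
        # have to hit a live service, making it slow and fragile.
        raw = urllib.request.urlopen("https://example.com/rates").read()
        return f"{currency}: {raw.decode()}"
```

Any test of `build()` must either configure the environment and tolerate a slow, unreliable network call, or find a way to substitute those dependencies, which is where the solutions in the next post come in.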
Where to find out more
xUnit Test Patterns: Gerard Meszaros’ site and book are essential reading if you want to understand the patterns involved in test-driven development.
Working Effectively with Legacy Code: Michael Feathers’ book is the definitive guide to test-first development when you are working with legacy code that has no tests.
Next time around we will look at how we solve these issues.