Ian Cooper

TDD and Hard to Test Areas, Part1

 

TDD and Hard-To-Test Areas

I want to talk about the problems people hit when they begin working with TDD, the same problems that tend to make them abandon TDD after an initial experiment. These are the 'hard-to-test' areas: the things production code needs to do that presentations and introductory books just don't seem to explain well. In this post we will start with a quick review of TDD, and then get into why people fail when they start trying to use it. Next time around we will look more closely at solutions.

Review

Clean Code Now

TDD is an approach to development in which we write our tests before writing production code. The benefits of this are:

  • Tests help us improve quality: Tests give us prompt feedback. We receive immediate confirmation that our code behaves as expected. The cheapest point to fix a defect is at the point you create it.
  • Tests help us spend less time in the debugger. When something breaks our tests are often granular enough to show us what has gone wrong, without requiring us to debug. If they don’t then we probably don’t have granular enough or well-authored tests. Debugging eats time, so anything that helps us stay out of the debugger helps us deliver for a lower cost.
  • Tests help us produce clean code: We don’t add speculative functionality, only code for which we have a test.
  • Tests help us deliver good design: Our test proves not just our code, but our design, because the act of writing a test forces us to make decisions about the design of the SUT.
  • Tests help us keep a good design: Our tests allow us to refactor – changing the implementation to remove code smells, while confirming that our code continues to work. This allows us to do incremental re-architecture, keeping the design lean and fit while we add new features.
  • Tests help to document our system: If you want to know how the SUT should behave, examples are an effective means of communicating that information. Tests provide those examples.

Automating our tests lowers the cost of running them. We pay the cost of writing a test once, but because we can then re-run it at marginal cost, our tests help us keep those benefits throughout the system's lifetime. Automated tests are ‘the gift that keeps on giving’. Software spends more of its life in maintenance than in development, so reducing the cost of maintenance lowers the cost of software.

The Steps

The steps in TDD are often described as Red-Green-Refactor:

Red: Write a failing test (there are no tests-for-tests, so this checks your test for you)

Green: Make it pass

Refactor: Clear up any smells in the implementation resulting from the code we just added.
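
As a minimal sketch of one trip around that cycle, here is a small example in Python using the standard unittest module. The Basket class and its methods are hypothetical names invented for illustration, and the comments describe the order in which the code would have been written rather than anything you can see in the finished file.

```python
import unittest


class Basket:
    """Hypothetical SUT: a shopping basket that totals item prices."""

    def __init__(self):
        self._prices = []

    def add(self, price):
        self._prices.append(price)

    def total(self):
        # Green: a first version might loop and accumulate by hand,
        # doing just enough to make the failing tests pass.
        # Refactor: with the tests green, it is safe to simplify to sum().
        return sum(self._prices)


class BasketTests(unittest.TestCase):
    # Red: these tests are written first and seen to fail
    # before Basket.total exists.
    def test_empty_basket_totals_zero(self):
        self.assertEqual(Basket().total(), 0)

    def test_total_is_sum_of_item_prices(self):
        basket = Basket()
        basket.add(5)
        basket.add(7)
        self.assertEqual(basket.total(), 12)
```

Saved as a module, this can be run with python -m unittest. The point is the ordering: test first, simplest passing code second, clean-up third.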

Where to find out more

Kent Beck’s book Test-Driven Development: By Example remains the classic text for learning the basics of TDD.

Quick Definitions

System Under Test (SUT) – Whatever we are testing. This may differ depending on the level of the test: for a unit test it might be a class or a method on that class; for an acceptance test it may be a slice of the application.

Depended Upon Component (DOC) – Something that the SUT depends on, a class or component.
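
To make the two terms concrete, here is a small Python sketch. Every name in it (InvoiceService, TaxRateProvider, FixedTaxRates) is hypothetical and exists only to label the roles; it is not taken from any real codebase.

```python
import unittest


class TaxRateProvider:
    """Hypothetical DOC: in production this might call a Db or web service."""

    def rate_for(self, region):
        raise NotImplementedError("would go out of process in a real system")


class InvoiceService:
    """Hypothetical SUT: the class whose behavior the unit test checks."""

    def __init__(self, tax_rates):
        self._tax_rates = tax_rates  # the SUT depends on the DOC

    def gross(self, net, region):
        return net + net * self._tax_rates.rate_for(region)


class FixedTaxRates(TaxRateProvider):
    """Hand-written stand-in for the DOC, so the test stays in memory."""

    def rate_for(self, region):
        return 0.25


class InvoiceServiceTests(unittest.TestCase):
    def test_gross_adds_tax_to_net(self):
        sut = InvoiceService(FixedTaxRates())
        self.assertEqual(sut.gross(100, "UK"), 125.0)
```

The test exercises only InvoiceService. How hard that test is to write depends almost entirely on what the DOC does, which is where the hard-to-test areas below come from.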

What do we mean by hard-to-test?

The Wall

When we start using TDD we rapidly hit a wall of hard-to-test areas. Perhaps the simple red-green-refactor cycle begins to get bogged down when we start working with infrastructure layer code that talks to the Db or an external web service. Perhaps we don’t know how to drive our UI through an xUnit framework. Or perhaps we have a legacy codebase, and putting even the smallest part under test quickly becomes a marathon instead of a series of short sprints.

TDD newbies often find that it all gets a bit sticky and, faced with schedule pressure, drop TDD. Having dropped it, they lose faith in its ability to deliver for them and still meet the schedule. We are all the same: under pressure we fall back on what we know. Hit a few difficulties in TDD and developers stop writing tests.

The common thread among hard-to-test areas is that they break the rhythm of our rapid test and check-in cycle: the tests are expensive and time-consuming to write, and they are often fragile, failing erratically and proving difficult to maintain.

The Database

  • Slow Tests: Database tests run slowly, up to 50 times more slowly than normal tests. This breaks the cycle of TDD. Developers tend to skip running all the tests because it takes too long.
  • Shared Fixture Bugs: A database is an example of a shared fixture. A shared fixture shares state across multiple tests. The danger here is that Test A and Test B pass in isolation, but running Test A after Test B fails unexpectedly, because Test B changed the state of the fixture. These kinds of bugs are expensive to track down and fix. You end up with a binary search pattern to try and resolve shared fixture issues: trying out combinations of tests to see which combinations fail. Because that is so time-consuming, developers tend to ignore or delete these tests when they fail.
  • Obscure Tests: To avoid shared fixture issues people sometimes try to start with a clean database. In the setup for their test they populate the Db with any values they need, and in the teardown they clean them out. These tests become obscure, because the setup and teardown code adds a lot of noise, distracting from what is really under test. This makes the tests hard to read and, because they are less granular, makes the cause of a failure harder to find. The Db setup and teardown code is another point of failure. Remember that the only test we have for our tests themselves is to write a failing test. Once you get too much complexity in your test itself it can become difficult to know if your test is functioning correctly. It also makes the tests harder to write. You spend a lot of time writing setup and teardown code, which shifts your focus away from the code you are trying to bring under test, breaking the TDD rhythm.
  • Conditional Logic: Database tests also tend to end up with conditional logic – we are not really sure what we are going to get back, so we have to insert a conditional check to see what we got back. Our tests should not contain conditional logic. We should be able to predict the behavior of our tests. Among other issues, we test our tests by making them fail first. Introducing too many paths creates the risk that the errors are in our test not in the SUT.
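
The shared fixture problem is easy to reproduce. The sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a real Db; the customers table and the helper names are hypothetical. failures_when_run_as is a small harness added only so the effect of test ordering can be observed; it clears the table before each experiment.

```python
import sqlite3
import unittest

# Shared fixture: one database reused by every test in the module.
shared_db = sqlite3.connect(":memory:")
shared_db.execute("CREATE TABLE customers (name TEXT)")


def add_customer(name):
    shared_db.execute("INSERT INTO customers (name) VALUES (?)", (name,))


def count_customers():
    return shared_db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


class CustomerTests(unittest.TestCase):
    def test_a_database_starts_empty(self):
        # Passes in isolation; fails once any earlier test has inserted a row.
        self.assertEqual(count_customers(), 0)

    def test_b_adding_a_customer_gives_count_of_one(self):
        add_customer("Ada")
        # Also passes in isolation, but leaves a row behind for later tests.
        self.assertEqual(count_customers(), 1)


def failures_when_run_as(*test_names):
    """Run the named tests in the given order and count assertion failures."""
    shared_db.execute("DELETE FROM customers")  # start each experiment clean
    suite = unittest.TestSuite(CustomerTests(name) for name in test_names)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.failures)
```

Running test A then test B gives no failures, and each test passes on its own. Running test B then test A gives one failure: test A has not changed, but the row test B left behind breaks it. With two tests the cause is obvious; with several hundred it turns into the binary search described above.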

The UI

  • Not xUnit strength: xUnit tools are great at driving an API, but are less good at driving a UI. This tends to be because a UI runs in a framework that the test runner would need to emulate, or interact with. Testing a WinForms app needs the message pump, testing a Web Forms app needs the ASP.NET pipeline. Solutions like NUnitAsp have proved less effective at testing UIs than scripting tools like Watir or Selenium, often lacking support for features like JavaScript on pages.
  • Slow Tests: UI tests tend to be slow tests because they are end-to-end, touching the entire stack down to the Db.
  • Fragile Tests: UI tests tend to be fragile, because they often fall foul of attempts to refactor our UI. So changing the order and position of fields on the UI, or the type of control used will often break our tests. This makes UI tests expensive to maintain.
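
The fragility is a coupling problem, and a toy model is enough to show it. In this Python sketch a form is represented as nothing more than a list of controls in on-screen order; the forms and the helper functions are hypothetical and no real UI framework is involved.

```python
# Hypothetical stand-in for a rendered form: (name, kind) in on-screen order.
ORIGINAL_FORM = [
    ("username", "textbox"),
    ("password", "textbox"),
    ("login", "button"),
]

# A cosmetic refactoring: the same controls, with the button moved to the top.
REORDERED_FORM = [
    ("login", "button"),
    ("username", "textbox"),
    ("password", "textbox"),
]


def control_at(form, position):
    """Locate a control the way a position-based UI script would."""
    return form[position]


def first_control_is_username(form):
    """The check a layout-coupled UI test ends up making."""
    return control_at(form, 0) == ("username", "textbox")
```

first_control_is_username(ORIGINAL_FORM) is True, while first_control_is_username(REORDERED_FORM) is False, even though the form still offers exactly the same controls. Nothing the user cares about has changed, yet the test has to be rewritten.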

The Usual Suspects

We can identify a list of the usual suspects that cause issues for successful unit testing.

  • Communicating Across a Network
  • Touching the File System
  • Requiring the Environment to be Configured
  • An out-of-process call (includes talking to Db)
  • UI
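
To see how quickly these suspects inflate a test, consider a function that hits two of them at once: it needs an environment setting and it reads from the file system. This is a sketch in Python; load_greeting and the GREETING_FILE variable are hypothetical names made up for the example.

```python
import os
import tempfile
import unittest


def load_greeting():
    """Hypothetical SUT: needs an environment variable and a file on disk."""
    path = os.environ["GREETING_FILE"]  # requires a configured environment
    with open(path, encoding="utf-8") as handle:  # touches the file system
        return handle.read().strip()


class LoadGreetingTests(unittest.TestCase):
    def setUp(self):
        # Arrange the outside world before the one line we care about can run.
        self._dir = tempfile.TemporaryDirectory()
        path = os.path.join(self._dir.name, "greeting.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("hello\n")
        self._previous = os.environ.get("GREETING_FILE")
        os.environ["GREETING_FILE"] = path

    def tearDown(self):
        # Undo every change, or the next test inherits our environment.
        if self._previous is None:
            del os.environ["GREETING_FILE"]
        else:
            os.environ["GREETING_FILE"] = self._previous
        self._dir.cleanup()

    def test_returns_file_contents(self):
        self.assertEqual(load_greeting(), "hello")
```

The test itself is a single assertion. Everything else is setup and teardown for the environment, and each of those lines is a further place for the test to go wrong.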

Where to find out more

xUnit Test Patterns: Gerard Meszaros' site and book are essential reading if you want to understand the patterns involved in test-driven development.

Working Effectively with Legacy Code: Michael Feathers' book is the definitive guide to test-first development in scenarios where you are working with legacy code that has no tests.

Next time around we will look at how we solve these issues.

 


Posted Mon, Jul 7 2008 4:10 PM by Ian Cooper

Comments

Martin Laufer wrote re: TDD and Hard to Test Areas, Part1
on Mon, Jul 7 2008 12:30 PM

Seems to me as if all non-pure functional areas are hard to test?

Ian Cooper wrote re: TDD and Hard to Test Areas, Part1
on Mon, Jul 7 2008 12:52 PM

@Martin

Once we step outside the domain and start to deal with infrastructure code, then yes things start to get harder. That is why separation of concerns is so valuable, because we can reduce the pain areas around the domain.

Rolf Eleveld wrote re: TDD and Hard to Test Areas, Part1
on Mon, Jul 7 2008 2:21 PM

Martin,

I see exactly where you're going with this, and I'm on the same wagon. When you’re developing Web Parts for WSS or for Office, BizTalk, Dynamics, third-party products, etc., you end up wrapping their code in a layer for the product you're building just so you can actually test your code, effectively adding one more layer of abstraction that adds an extra piece of effort and possible faults. I have not found a structural way to effectively test software that uses this kind of server software and is hosted therein. If you've thought up a clean way to prove that your code does work as designed and it's not an idiosyncrasy of the hosted software, you could make me a happier man!

Regards,

Rolf

Ian Cooper wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 4:02 AM

@Rolf, @Martin

You're both hitting the biggest problem with frameworks, in that they are too often intrusive into your application. They cloud your domain and often require you to spin up the framework itself to test your domain model.

I'm not going to give you an answer that does not involve abstracting yourself from the pain, but be aware that the reason why the alt.net community pushes back against tool sets like the Entity Framework is exactly this problem of lack of a good separation of concerns.

Niki wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 4:07 AM

I think we have to accept that some things can't be unit-tested. Imagine for example you had to find tanks in hi-res satellite images. A requirement would be that the detection rate must be >90%, and the false detection rate <1%. How do you test this requirement? To get any statistically meaningful results, you'd have to test at least hundreds, better thousands of sample images, so your unit-tests would take literally hours to run. Even worse, your algorithm would probably need some parameters like thresholds, regions of interest, that the end-user has to enter, depending on what her image quality is like and what she is looking for. So every time you improve the algorithm (e.g. switching from an absolute threshold to an adaptive threshold), you'd have to find the optimal parameter set for your test-images again and change your unit-tests.

Of course you could do some basic smoke tests (e.g. does your algorithm crash for random images/random parameters?), and you might test the functionality of parts of the algorithm (e.g. a thresholding function), but I don't think you can automate the tests for the actual functionality the user is interested in.

Ian Cooper wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 9:44 AM

@Niki

Be aware though that there is a semantic difference between a unit test, which tests a small unit of code, and an acceptance test, which confirms the software meets the user's requirements. While I expect you could unit test that your algorithm was correctly implemented, you would still want some sort of data-driven acceptance tests to confirm the quality of that algorithm.

Which agrees with what you are saying but introduces differentiating terminology.

Kevin Kerr wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 12:48 PM

I'm glad you touched upon the 'shared fixture' concept. I had never heard of that, yet it is purposely designed in to my test GUI/framework.

Niki wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 1:22 PM

@Ian Cooper:

Interesting thought. So what would a unit-test for an image-processing example like that look like? At some point in a software like that, you'd have a function that takes an image and says "yes" or "no" (or "tank" or "no tank"). A unit-test for this function would have to call it with some input and assert some kind of output. But what would that assertion be? What this function does "under the hood" is defined nowhere, and will probably change every time the algorithm is improved, so testing that doesn't really help a lot. (At least that's the way I see it: The point of a unit-test is that it tells you if your code is still working after a change. So a unit-test that will fail after almost every change is not a great help.)

Another problem with this kind of software is that you don't really know the optimal algorithm from the start - you have to test different approaches and compare them statistically to see which solves your problem best. That's quite contrary to the TDD-approach, where you have to know what results you expect from your code before you write it.

Ian Cooper wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 2:43 PM

@Niki

I'm not sure I know enough about image processing to answer this :-)

But I would assume that the image is a set of data and our algorithm searches it for patterns that match, and presumably tries to remove false positives by looking at the nearby data at the same time. Our unit test is there to confirm that we have coded the algorithm correctly: given test data that should produce a hit, based upon the heuristics used in this algorithm, have we implemented it correctly?

Now I am assuming you are not trying to discover an algorithm for this (which seems to be  more  of a maths problem). Could you discover an algorithm using TDD. Sure. Would it be any good. Only acceptance testing would tell you over a large enough data set. Would it be a good way to uncover such an algorithm. I don't know, because I don't know enough about how researchers in that field usually uncover their algorithms. But could you. For sure. Should you. I suspect there might be a more effective technique. But that's why agile has the domain expert. But would I code up the implementation via TDD, For sure.

Niki wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 3:32 PM

@Ian Cooper:

My problem is that the usual development process in this area is completely different: You usually can't derive an optimal image processing algorithm on a whiteboard using only maths and then implement it. Also, you don't have a "domain expert" that can tell you what the algorithm should do just by thinking hard. (I'm just guessing, but I can imagine this might be similar for domains like natural language processing or speech recognition.) The more common approach is that you start with a simple algorithm that does more or less "what you think your eyes are doing", then see where it doesn't work so well, and see if you can improve it e.g. by using pre-filters, or by modifying parts of the algorithm, or by trying a completely different approach. Once you're at a point where the recognition results of your algorithm are good enough, you're done. There is no "implementation" step after that, you already had to do the whole implementation to see _if_ the algorithm is good enough. You had to test it on thousands of images, or maybe even test it in production before you can be really sure about that.

Imagine for example that you start by applying a contrast filter, then searching for the brightest pixel. Of course you could write tests for this before you implement it, but those tests would probably be harder to write and more error-prone than the actual code (you're just calling two library functions FilterContrast and FindBrightestPixel or something, but for the tests you would actually have to calculate the correct results by some different means). And as soon as you see that this first algorithm isn't good enough, you'll have to throw that test away. So, yeah, you could probably do it that way. It just doesn't seem to be very useful.

ALB wrote re: TDD and Hard to Test Areas, Part1
on Tue, Jul 8 2008 5:33 PM

Mock frameworks such as EasyMock are a good starting point to a general solution

Ian Cooper wrote re: TDD and Hard to Test Areas, Part1
on Wed, Jul 9 2008 4:44 AM

@Niki

The key to agile approaches is short feedback loops. Design a bit, test a bit, design a bit, test a bit. Putting the test first comes from the idea of eliminating waste by doing some design first, based on hard user requirements driven from stories by tests.

But if you can't do any design first, then you can't. In that case the important thing would be to keep your feedback loop short, which it sounds as though you are doing anyway. So I think this is an agile approach to algorithm discovery, even if not a TDD one.

Make sense?

Barry Dahlberg wrote re: TDD and Hard to Test Areas, Part1
on Tue, Aug 5 2008 5:39 PM

I had a rant about exactly these issues from a web perspective a few days ago:

www.genericerror.com/.../unit-tests-vs-productivity-right-vs.html

Separation of concerns is fine, but when the parts you can't test are bigger than the parts you can, there is a problem.

Jeremy Gray wrote re: TDD and Hard to Test Areas, Part1
on Tue, Aug 19 2008 12:35 AM

@Niki - I think the situation you just described is exactly _why_ you want test automation, whether in the form of a strict unit test or an integration or acceptance test. If you have a known set of inputs, each with their known desired result, automated execution of your evolving algorithm against that large set of input is exactly what is going to tell you whether or not any one or series of algorithm changes are heading in the right direction. True, you may run your more frequent during-active-development test cycle using a smaller number of inputs to speed things up, but who cares how long the formal run takes: you want to run as many known inputs through it as possible, and you want to do so on an automated basis.

Tony Morris wrote re: TDD and Hard to Test Areas, Part1
on Thu, Sep 11 2008 4:21 AM

Just control your side-effects already. Your discussion is only relevant when using languages for people who are afraid of abstraction and high-level programming (and so perpetuate - almost by mandate - their poor technique, which is the core problem, not the language). Not trying to be mean or anything, but I strongly recommend you stop explaining away the symptoms and providing snake oil solutions and solve the *actual problem*.

Colin Jack wrote re: TDD and Hard to Test Areas, Part1
on Thu, Sep 11 2008 7:55 AM

@Tony

What exactly are you suggesting, Ruby?

Colin Jack wrote re: TDD and Hard to Test Areas, Part1
on Thu, Sep 11 2008 8:15 AM

@Tony

Sorry, just re-read your point, you're not suggesting Ruby at all :)
