Unit testing, playing tennis, and the lack of absolutes in TDD

I’m not a particularly good tennis player.  Mostly that comes from not having played in years, but even back then I suffered because I can’t hit a decent backhand to save my life.  I have a big serve and decent control with my forehand, but all you’ve got to do to beat me is put the ball where I have to attempt a backhand.  The best competitive tennis players will choose either the backhand or the forehand depending on where on the court they play the ball.

Test Driven Developers need to be the same way when choosing between a state based style and an interaction based style on every single unit test.  I’ve been disappointed with the conversations I’ve seen lately on the subject.  Many people are proudly describing themselves as Classicist or Mockist TDD’ers because they favor either state or interaction based unit tests and see the other style as something to avoid.  Specifically, I’m agitated at the people who are trumpeting themselves as “Classicists” because I feel like they’re doing real harm to the state of TDD practice.  I’ve never met a self-described mockist and I’m tempted to say that the very term is a strawman.  On the other hand, self-described Classicists may be just like me on the tennis court: they’re taking bad shots and bad angles just so they can hit a forehand when a backhand is more appropriate.


Over Specified Test

Many people are concerned about the “over specified test” anti-pattern and rightly blame mocks as a primary source.  So mock objects are evil, right?  Survey says:  bzzzt.  An over specified interaction test is generally caused by a fine-grained, chatty API or a lack of encapsulation between a class and its dependencies.  In other words, you may be staring at a code smell.  I ran into this lately at work.  I had a Supervising Controller that was issuing a lot of fine-grained commands to its View:  make the addresses editable, make the addresses read only, show the Canadian address fields, show the stored address dropdown for certain customers, and hide the stored address dropdown at other times.  The code became nasty and the unit tests got ugly with runaway expectations.  I refactored the Supervising Controller into a bit of a state machine that could correctly create a “ScreenState” object at any point, and rigorously tested that logic.  I then reduced the interaction testing between View and Presenter to just ensuring that the Presenter was correctly passing a new ScreenState to the View upon certain View events.
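The refactoring described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the post names nothing beyond “ScreenState”, so IAddressView, AddressPresenter, RecordingView, the three flags, and the rules that set them are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value object describing everything the View needs to render.
public sealed class ScreenState
{
    public bool AddressEditable { get; }
    public bool ShowCanadianFields { get; }
    public bool ShowStoredAddressDropdown { get; }

    public ScreenState(bool addressEditable, bool showCanadianFields, bool showStoredAddressDropdown)
    {
        AddressEditable = addressEditable;
        ShowCanadianFields = showCanadianFields;
        ShowStoredAddressDropdown = showStoredAddressDropdown;
    }
}

// One coarse-grained call replaces the chatty "enable this, hide that" commands.
public interface IAddressView
{
    void Apply(ScreenState state);
}

public sealed class AddressPresenter
{
    private readonly IAddressView _view;

    public AddressPresenter(IAddressView view)
    {
        _view = view;
    }

    // Pure decision logic (assumed rules): easy to cover with state based tests.
    public static ScreenState DetermineState(bool isEditing, string country, bool hasStoredAddresses)
    {
        return new ScreenState(
            addressEditable: isEditing,
            showCanadianFields: country == "CA",
            showStoredAddressDropdown: hasStoredAddresses);
    }

    // The only interaction left to verify: a ScreenState is handed to the View.
    public void OnCustomerSelected(bool isEditing, string country, bool hasStoredAddresses)
    {
        _view.Apply(DetermineState(isEditing, country, hasStoredAddresses));
    }
}

// Hand-rolled test double that records what the presenter sent.
public sealed class RecordingView : IAddressView
{
    public List<ScreenState> Received { get; } = new List<ScreenState>();

    public void Apply(ScreenState state)
    {
        Received.Add(state);
    }
}
```

With this shape, the many combinations of flags are checked by calling DetermineState directly and asserting on the returned object, while a single interaction test per View event confirms that exactly one ScreenState reached the View.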

Some points about over specified interaction tests:

  • Do not try to mock chatty interfaces.  Favor coarse grained APIs that hide more details of the internals of a class’s dependencies.  I wrote a post a couple of years back called Best and Worst Practices for Mock Objects.  I read over it this morning and I think it still holds up.
  • When you spot an over specified test brewing in your code, treat it as a code smell and reevaluate your class structure.  Remember that TDD is a DESIGN PROCESS.  If the unit test is going badly, your first assumption should be that the design needs refinement.  Listen to what the unit test is telling you about your code.
  • Be aggressive with dynamic and partial mocks to write smaller, more focused interaction based tests.
  • You don’t always have to use ReplayAll() and VerifyAll().  Sometimes you might want to call Verify on only one mock object to create a smaller, more focused unit test.


State Based Testing isn’t all that and a bag of chips

I generally tell people that State Based Testing is easier than Interaction Based Testing on the whole — except when it’s not.  Let’s consider this state based test:

[Test]
public void The_presenter_saves_the_whatsit_if_the_whatsit_is_valid()
{
	Whatsit model = ObjectMother.ValidWhatsit();
	WhatsitPresenter presenter = new WhatsitPresenter(new StubWhatsitView(), model, new StubWhatsitRepository());
	bool returnValue = presenter.Save();

	// Look ma!  I should have saved the Whatsit in this scenario
	Assert.IsTrue(returnValue);
}

[Test]
public void The_presenter_does_NOT_save_the_whatsit_if_the_whatsit_is_InValid()
{
	Whatsit model = ObjectMother.InvalidWhatsit();
	WhatsitPresenter presenter = new WhatsitPresenter(new StubWhatsitView(), model, new StubWhatsitRepository());
	bool returnValue = presenter.Save();

	// The only thing checked is the boolean that Save() returns
	Assert.IsFalse(returnValue);
}

When the Save() method of my WhatsitPresenter is called, the WhatsitPresenter should validate its Whatsit member and either use the WhatsitRepository to save the Whatsit or have the WhatsitView display the validation errors.  That’s the intent of the unit tests above, but how much of that intent is really coming through in these unit tests?  Very little, right?  We’re doing a nice simple state based test to check that the Save() method correctly returns a boolean value saying that it did or did not save the Whatsit.  That’s nice, but the real intent behind the test is that the Whatsit was really saved.  This is an example of a bad unit test.  We’re not even testing the real functionality, we’re just testing a side effect of the code by checking the return value.  What we really need to do is to verify that our WhatsitPresenter did or did not send a Save() message to the WhatsitRepository, i.e. we should use some sort of mock object here.
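As a minimal sketch of what such an interaction based test could look like, the example below uses hand-rolled mocks instead of a mocking framework so that it stays self-contained. Everything in it is an assumption made for illustration: the post never shows Whatsit, the presenter, or the interfaces, so IsValid(), ShowValidationErrors(), IWhatsitRepository, IWhatsitView, and the Check helper are hypothetical. A mocking tool such as Rhino Mocks would express the same expectations with less code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the types named in the post.
public class Whatsit
{
    public string Name { get; set; }

    // Assumed validation rule, for illustration only.
    public bool IsValid() { return !string.IsNullOrEmpty(Name); }
}

public interface IWhatsitRepository { void Save(Whatsit whatsit); }
public interface IWhatsitView { void ShowValidationErrors(Whatsit whatsit); }

public class WhatsitPresenter
{
    private readonly IWhatsitView _view;
    private readonly Whatsit _model;
    private readonly IWhatsitRepository _repository;

    public WhatsitPresenter(IWhatsitView view, Whatsit model, IWhatsitRepository repository)
    {
        _view = view;
        _model = model;
        _repository = repository;
    }

    public bool Save()
    {
        if (!_model.IsValid())
        {
            _view.ShowValidationErrors(_model);
            return false;
        }

        _repository.Save(_model);
        return true;
    }
}

// Hand-rolled mocks: they do nothing except record the messages they receive.
public class MockWhatsitRepository : IWhatsitRepository
{
    public List<Whatsit> Saved { get; } = new List<Whatsit>();
    public void Save(Whatsit whatsit) { Saved.Add(whatsit); }
}

public class MockWhatsitView : IWhatsitView
{
    public int ErrorsShown { get; private set; }
    public void ShowValidationErrors(Whatsit whatsit) { ErrorsShown++; }
}

public static class WhatsitPresenterInteractionTests
{
    public static void Saves_a_valid_whatsit_through_the_repository()
    {
        var model = new Whatsit { Name = "widget" };
        var repository = new MockWhatsitRepository();
        var view = new MockWhatsitView();

        new WhatsitPresenter(view, model, repository).Save();

        // The assertions are about the messages sent, not the return value.
        Check(repository.Saved.Count == 1, "Save() should reach the repository once");
        Check(object.ReferenceEquals(repository.Saved[0], model), "the presenter's own Whatsit should be saved");
        Check(view.ErrorsShown == 0, "no validation errors should be shown");
    }

    public static void Does_not_save_an_invalid_whatsit()
    {
        var model = new Whatsit { Name = "" };
        var repository = new MockWhatsitRepository();
        var view = new MockWhatsitView();

        new WhatsitPresenter(view, model, repository).Save();

        Check(repository.Saved.Count == 0, "nothing should reach the repository");
        Check(view.ErrorsShown == 1, "the view should be told to show the errors");
    }

    private static void Check(bool condition, string message)
    {
        if (!condition) throw new InvalidOperationException(message);
    }
}
```

Here each test states its intent directly: a valid Whatsit results in a Save() message to the repository, and an invalid one results in a message to the view and nothing else.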

I use a lot of contrived examples in my blog, but this example was adapted from another blogger.  Sometimes, favoring a state based testing philosophy leads you to useless tests that test through side effects.  Other times, a state based test will cause you to relax encapsulation in a harmful way to make the state based assertions where an interaction based test would maintain encapsulation.  When you play a ball on the backhand side of the court, you’d better use your backhand.


Bottom Up versus Top Down

Do you start from the “top” and code that against mock objects for the lower level concerns, or do you write the lower level pieces first, then assemble them together to create the aggregate structure?  Which is best?  Well, it depends.  Here’s an easy rule of thumb.  Start with whatever you do know how to do. 

If you know exactly how one or more steps of a complex algorithm should work, build those steps first in isolation.  Building out those steps will often suggest the structure of the coordinating code.  That’s bottom up development. 

At other times, you’ll know the general workflow of the code, but not necessarily know how some of the lower level tasks will be performed.  In this scenario I write the controller type unit tests first and just drop off mock objects as a placeholder for concrete classes to be added later.  This is frequently a great way to work with any variant of the Model View Presenter pattern.  Building the overarching workflow will help define the API to the underlying steps in the algorithm and often give you some insight into how those tasks should be implemented.
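A top down step of this kind might look like the sketch below. The names CheckoutController, IShippingCostCalculator, and PlaceholderShippingCalculator are hypothetical and do not come from the post; the point is only that the coordinating class is written and tested against an interface before any concrete implementation of that interface exists.

```csharp
using System;

// Hypothetical lower level task whose real implementation is not known yet.
public interface IShippingCostCalculator
{
    decimal CostFor(decimal weightInKg);
}

// The coordinating code is written first, against the interface alone.
public class CheckoutController
{
    private readonly IShippingCostCalculator _shipping;

    public CheckoutController(IShippingCostCalculator shipping)
    {
        _shipping = shipping;
    }

    public decimal Total(decimal subtotal, decimal weightInKg)
    {
        return subtotal + _shipping.CostFor(weightInKg);
    }
}

// Placeholder double: returns a canned answer and remembers what it was asked.
public class PlaceholderShippingCalculator : IShippingCostCalculator
{
    private readonly decimal _cannedCost;

    public decimal? LastWeight { get; private set; }

    public PlaceholderShippingCalculator(decimal cannedCost)
    {
        _cannedCost = cannedCost;
    }

    public decimal CostFor(decimal weightInKg)
    {
        LastWeight = weightInKg;
        return _cannedCost;
    }
}
```

Writing the controller this way pins down the signature that the eventual concrete calculator has to satisfy, which is the sense in which the workflow helps define the API of the underlying steps.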

I don’t see the use of acceptance tests as having any impact whatsoever on my choice of bottom up or top down.  Either way, I’m going to break down the feature into a finite number of coding tasks and write fine grained unit tests for those tasks until I feel like it’s appropriate to start running the coarse grained acceptance test.

The beauty of TDD as a design process is the ability it grants you to work on one issue at a time without leaving yourself in the hole to create the other pieces.



You might have a mighty forehand with your state based tests, but there are going to be times when it’s better to swing a backhand stroke and use an interaction based test with a mock object.  My advice is to focus on the goal of each unit test and make sure the unit test is mostly concerned with that goal.  If the goal of a unit test is to ensure that a change of state or a return value is correct, it’s a state based test.  If the goal of a test is to ensure that a class is passing the correct messages to other classes during its internal functioning, you should be writing an interaction test.  Don’t pick sides on the Classicist versus Mockist argument.  It’s a false dichotomy and a harmful line of thought.  You aren’t going into your toolbox at home and throwing out all the close-ended wrenches because you only want to use open-ended wrenches, are you?  The same thing applies to mocks.


About Jeremy Miller

Jeremy is the Chief Software Architect at Dovetail Software, the coolest ISV in Austin. Jeremy began his IT career writing "Shadow IT" applications to automate his engineering documentation, then wandered into software development because it looked like more fun. Jeremy is the author of the open source StructureMap tool for Dependency Injection with .Net, StoryTeller for supercharged acceptance testing in .Net, and one of the principal developers behind FubuMVC. Jeremy's thoughts on all things software can be found at The Shade Tree Developer at http://codebetter.com/jeremymiller.
This entry was posted in Test Driven Development.
  • http://www.clariusconsulting.net/kzu Daniel Cazzulino

    Hi Jeremy.

    I just re-read your entry (dunno if it was updated). I’d appreciate it if you didn’t refer to my blog as the inspiration for the bad example. If you read my entry, you’ll see that it’s an example I took from the Rhino Mocks documentation at http://www.ayende.com/Wiki/%28S%28gtsaaq45luwfjj55wlskhgjg%29%29/Default.aspx?Page=Rhino+Mocks+Introduction.

    And the point was to show the awkward record/replay syntax, not the state vs interaction nature of it.

    Please re-read Fowler’s anecdote on working with an interaction testing-biased developer and see what I mean about thinking about internal implementation details. I’ve seen the same thing.

    And you can go see for yourself whether in practice I really try to force everything into state testing by checking out the code for GeoChat at http://geochat.googlecode.com.

    You might be surprised at the mixture. I DO use the right tool for the job, but I’m VERY aware of the drawbacks of each mode. And *in general* those are much worse in interaction testing than state testing.

    You seem to be asking to never generalize any kind of advice, and do everything on a case by case basis. Guess what, developers NEED guidance which contains SOME genericity. Telling them “it depends” is not helpful. And in that vein, suggesting to them the default mindset that is less likely to cause harm is not a bad thing.

  • http://colinjack.blogspot.com Colin Jack

    It’s perfectly possible to come up with an OO design without using the mockist approach; if that weren’t the case, TDD pre-mocking wouldn’t have caught on.

    As you say objects collaborate, but the point is whether you spend your time thinking about those collaborations upfront. I’ve tried both approaches and like you I enjoy both.

    However in the past there have certainly been more people pushing for a purely mockist approach than there were for a purely state based approach, and I think the push back on mockist style is about:

    a) convincing people not to over-use mocking.
    b) convincing people to use mocking well (which I admit I haven’t done in the past).
    c) working out where to use the mocking approach as a design aid

    Anyway I’m looking forward to the mockobjects guys book, I think it will clear a lot of things up.

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller


    Just so you know, the “mockist” style you’re describing is called “Object Oriented Programming”

    “A mockist will almost always start by thinking about the interactions and the internal implementation details of the object under test, so that he can make “assertions” about those interactions.”

    Objects collaborate with other objects (again, that’s called OOP). Part of the responsibility of many classes is to decide when and what messages to send to other objects. You can call yourself a “classicist” all you want, but the design should be made on a case by case basis. I think you’re doing real harm to your design by trying too hard to make most of your tests be state based.

  • WardBell

    You win. Passed on my back hand side. I came to the same conclusion after I hit the “submit” button. Somebody has to play the fool. Clearly I would have spent too much time building the fake and, more importantly, not made plain my intent which was to expect that calling Presenter.Save() would tell the repository to save. I’m getting old and slow I guess.

  • http://clariusconsulting.net/kzu Daniel Cazzulino

    I agree with Colin Jack that the distinction is useful to categorize in a very broad way which is your preference “by default” when you start thinking about the unit tests you’ll write.

    A mockist will almost always start by thinking about the interactions and the internal implementation details of the object under test, so that he can make “assertions” about those interactions. (And I can say this with true experience, as I’ve been through this process myself with my team, and came back from it with an enlightening experience and a more cautious approach to interaction testing)

    A classicist will not care much about the interaction in principle, and only switch to using a “true” mock when necessary.

    You can hardly be 100% on one or the other side. You’ll probably need both techniques to achieve full TDD in any complicated codebase. But defaulting to one or the other style will certainly influence your tests and your designs. In my experience, too much interaction testing will sooner or later be painful.

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller


    You just wrote a recording stub to effectively write an interaction based test. The “weird” mock call (sans AutoMocker) would be:

    Whatsit whatsit = ObjectMother.ValidWhatsit();

    using (mocks.Record())

    using (mocks.Playback())

    Not that scary, and I’ve made the specification that WhatsitPresenter sends the “Save(Whatsit)” message to the repository if the Whatsit is valid. Which test is really more clear? The one that says “this message should be sent to the collaborator,” or the one where you have to write a little static fake, and check the state of that dohickey later. What’s the goal of the test? Hint, it’s not to push the Whatsit into a collection of a stubbed repository.

    Mock objects are NOT that scary. If you can look into the abyss of the WebForms event lifecycle and come out with your sanity intact, RhinoMocks or an equivalent is a cake walk 😉

  • WardBell

    Love the analogy, Jeremy. I’m always running around my backhand. Of course my forehand sucks too … as will become apparent in the next few sentences.

    I think your example of an ill-conceived state based test is a bit contrived. To recap:

    bool returnValue = presenter.Save();

    // Look ma! I should have saved the Whatsit in this scenario

    I agree that testing the presenter’s ability to “say” that it saved (or did not) is lame. But the problem may be with the test itself rather than with state based testing per se. Suppose I had written the following instead:


    Of course this means I had to write a fake repository that suits my purpose. That’s what someone who doesn’t know the weird API of a mocking framework would do.

    In so doing, our intrepid player runs around the backhand and hits a successful cross-court forehand to keep the point alive. Feeling like Roger Federer (while looking like Roger Ebert).

  • http://colinjack.blogspot.com Colin Jack

    Yeah I very rarely have my domain classes calling out to the database, not even indirectly. Not all DDD folk design like that though, but I definitely think that keeping your constructors lightweight is a good idea.

    I’d also recommend having a read of DDD (the book) and/or looking at the yahoo group for it, lots of people discussing this sort of thing.

    Get you on the interfaces. We sometimes, though I guess rarely, design from the GUI down and stub out at the service layer (or at the highest point in the domain layer). We don’t use interfaces though, but it sounds like a similar approach. Mind you we do try to focus on the domain design quite early, which we find works well.

  • Lucas Goodwin


    The best example would be from my latest project. I have a warehouse entity which contains a bunch of part entities. These part entities can have screening sheet entities attached to them. In one of my CTORs for a part I’m loading the part from the DB (Data Adapter) and, if appropriate, the Screening Sheet attached to that part is being loaded in as well. So when I’m testing the warehouse loading I don’t care if a part is loading screening sheets or not or even if the part is loading correctly (already covered in tests for the part and screening sheet) so I mock the parts in the warehouse.

    One thing is, this should be a stub, but I was mocking because I didn’t know how to use Rhino Mocks to fake the part, but still return certain values on certain property calls. Seems to be more of a Dynamic Mock in this circumstance.

    I feel all of this results from the fact that my part is creating a screening sheet in its CTOR and thus, I’ve got a dependency between these two objects I question.

    On the front of designing entity interaction via interfaces. I usually design my domain layer in two steps. First I go from the top down (driven by the application UI/Feature needs) and then I refactor from the bottom up. In order to do the top down, I usually end up creating interfaces for everything so I can fake it until I need it. I’m in agreement, this isn’t an ideal situation and I end up doing a lot more full rearchitecting of my domain layer during my “refactoring” sessions as a result.

    I’m thinking this is my C++ training coming out as I don’t do any sort of repository design and tend to have heavy CTORs, which I’m finding causes some real issues in the implementation classes…

  • http://colinjack.blogspot.com/ Colin Jack

    Sorry I understand now what you mean about it being an interaction test.

    When you say you’re mocking entities I’m guessing that means either through interfaces or virtual members (unless you are using TypeMock). If you are doing it through interfaces and are using it as a technique to develop role interfaces for your domain objects then great. I’ve not seen a lot written about this, and when I’ve tried it I’ve thought it could result in rather poor abstractions, but the topic still interests me and I’m sure it can result in interesting designs. On the other hand I think what people seem to do is just extract IOrder/ICustomer interfaces where the interface contains most of the members of the only type implementing it. I’m not sure that approach is useful at all to be honest.

    On the last bit, are you saying that one entity ends up creating other entities which means you want to mock them?

  • Lucas Goodwin


    Thanks for the reply and the link to the great article.

    Using the DDD terminology I mock entity objects… a lot.

    I never mock value objects. These always end up being Transfer Objects in my designs and are basically just structs, sometimes with a minimal amount of logic for “Is” properties, etc. They never have a hierarchy in our designs either and are primarily used for data transfer from Data Services to Domain Entities.

    I actually started playing around with DynamicMocks and Stubs in Rhino last night. I already see some of the mistakes/head-aches I was creating for myself.

    I usually consider integration testing to be using the actual instances of repositories, data adapters, etc. The “interaction” tests (as I call ’em at the moment) are domain objects interacting together. I suppose this difference is caused as a result of my mocking domain objects more than an actual difference in test fixture goals of “integration” and “interaction” testing.

    I wonder if a lot of my testing habits are driven by the fact that I’m not using any kind of dependency injection in our projects other than CTOR injection… I do notice my designs have a bit of interdependencies in the domain entities (calling implementation CTORs of an entity inside a different entity’s CTOR for instance).

  • http://colinjack.blogspot.com/ Colin Jack

    I’ll take a stab at it, I’m going to use DDD terms because those are the ones I am most comfortable with.

    I don’t think anyone can say what you are doing is wrong, if you want to mock domain objects that’s fine.

    Personally I don’t mock my domain entities/value objects (http://jbrains.ca/permalink/90) but I sometimes mock services/repositories. This works for me partially because our domain entities/value objects do not contact any services/repositories so it should be pretty easy to setup/configure them for use in a test (using a Builder or Object Mother).

    Anyway as you say sometimes you don’t care about the domain objects in the higher level tests, so you could stub them but you could also just use them. So rather than mocking a Customer and checking we’re setting its Name I’ll verify that the Customer.Name is correct at the end.

    One thing, you called it a “warehouse interaction fixture” but that actually sounds more like a little integration test, especially if it’s saving.

  • Lucas Goodwin

    It seems I’m not understanding the point of mocking. I use ’em quite a lot in our tests, but for very specific reasons.

    1.) Don’t want to do resource management in the test/fixture (DB access, file access, etc)
    This seems to be the standard use I read about.

    2.) I don’t really care what the dependent Domain object is doing in my tests, only that the right methods/properties are being called/set

    For example, if I have a warehouse and a collection of parts in that warehouse I usually split my testing of the warehouse into two test fixtures:

    -A warehouse test fixture that is just testing the warehouse and all the dependent objects are mocks. This is mainly for code coverage and isolated logic/state verification. I find it very handy for testing error paths through the code as well.

    -A warehouse interaction fixture that is testing the warehouse and part objects together. These are usually heavy tests and capture a lot of obscure bugs and behaviors, but aren’t really testing a specific trait of the warehouse, but rather testing a long path through the business processes. (Move part to holding, save the part, move the part to returning, save the part, move the part to installed, move the part back to holding, save the part – all the steps in one test)

    Then again, I rarely do true TDD, but kinda slide back and forth between TDD and Unit Testing after the fact. Maybe this is my problem.

    Does this seem a sensible way of testing to anyone else?

  • http://colinjack.blogspot.com Colin Jack

    I think when people say classicist they are using it in the terms that Fowler describes:

    “The classical TDD style is to use real objects if possible and a double if it’s awkward to use the real thing. So a classical TDDer would use a real warehouse and a double for the mail service. The kind of double doesn’t really matter that much.

    A mockist TDD practitioner, however, will always use a mock for any object with interesting behavior. In this case for both the warehouse and the mail service.”

    So I’m not really sure that this post has hit the mark in that regard. I also think the distinction between classicist/mockist is important because people do quite often use one approach over the other and it results in different designs as well as different tests.

    On the over-specified software question. I think the discussion may be partly related to Meszaros’ discussion of the term where he does relate it to mocking. In particular I would say coupling yourself to implementation and/or using mocks when a stub would do are indicators of an issue. Those are not necessarily indicators of bad design, they are indicators of people misusing mocking. I would also argue that quite often when people do use mocking it’s not as a design technique, but I guess that’s a different discussion.

    On the existence of mockists, I’ve definitely met them and they are out there on the Web. I’ve definitely met people who would happily mock every domain object (except maybe value objects) and that does make them mockists (based on Fowler’s narrow definition).

    Personally I am a classicist, definitely for most of my domain entities/value objects, but that doesn’t mean I never use mocks. I just prefer to use real domain objects in tests, especially for other domain objects and quite often use test spies/stubs as well as mocks. Also for higher layers, and for services, I am more likely to stub or mock.

    I personally don’t think there’s been anything wrong with your blog entries, far from it. In fact having read your blog entries/ALT.NET posts I think you’re probably as reasonable a voice on this as anyone is likely to find.

    Just read this bit: “In business logic it’s bottom up all the way”

    That’s interesting, I’ve always felt part of the reason these sorts of discussions go off is because different styles of development/testing can be better suited to different layers.

  • http://blog.troyd.net/Test%2bSupported%2bDevelopment%2bTSD%2bIs%2bNot%2bTest%2bDriven%2bDevelopment%2bTDD.aspx Troy DeMonbreun


    I believe that’s a fair answer. It’s good to hear that you are doing some Bottom-Up as well as Top-Down TDD. Honestly, from my own experience and from my impression of the blogosphere, many developers looking to understand, implement, or improve their TDD design seem to see TDD as almost entirely Top-Down.

    Do you feel this misunderstanding could be because the quintessential examples introducing TDD are Top-Down, and that most documented examples of TDD in blog articles are also Top-Down?

    – Troy

  • Eric Bergemann

    “My advice is to focus on the goal of each unit test and make sure the unit test is mostly concerned with that goal”

    Right on, I like to use both styles in my testing but I prefer to use classic testing as much as possible. If I find that mocking will simplify the test then I will use it.

    While I really appreciate the mock frameworks out there (we currently use RhinoMocks) I have come to the realization that extensive use of mocking is a code smell. I have seen many tests written using RhinoMocks that are way more complex than they need to be because the developer was in a groove with using RhinoMocks and the test could have easily been written in the classic manner with much less code.

    If you start setting up your unit test class and immediately type “using *.Mocks” without knowing why you need the mocking framework then you may need to rethink your use of mocking frameworks.

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller


    Huh, I had to think about this a minute. Inside of MVP screen coding it’s almost 100% top down. In business logic it’s bottom up all the way. I’d say about 50/50 over all. How’s that for a wishy washy answer?

  • http://blog.troyd.net/Test%2bSupported%2bDevelopment%2bTSD%2bIs%2bNot%2bTest%2bDriven%2bDevelopment%2bTDD.aspx Troy DeMonbreun


    Roughly, what percentage of your TDD-based development is Top-Down and what percentage is Bottom-Up? (among all your TDD development to date)


  • http://agilology.blogspot.com Jeff Tucker

    Nice post. I personally only use one “style” of testing in that I just write tests that actually test my code in useful ways. How that happens is not relevant to the tests themselves as long as they’re easy to understand. People who talk to me about different “styles” of testing by just randomly quoting Fowler get slapped.

    As to where to start testing, I absolutely agree that once someone truly groks unit testing and good design, you can start anywhere with any requirement and just work on it. I’ve seen people’s jaws drop when I tell them that I can build the UI without having a database or even a business layer. I can develop tested, working code for any layer of the application with any amount of information about the application. I call this “development by hand waving” and I’ll post a blog about this soon. It’s by far my coolest development trick.

  • http://andrewmyhre.wordpress.com Andrew Myhre

    Great post. I’ve yet to try out mocking in my tests but I’m much more clear about the purpose of it now. Thanks!

  • Ian Cooper

    Not sure if I am one of the people who you are in dispute with here Jeremy, but I would like to note that despite the headlines of the blogs I hope I have made it clear that my goal is the same as yours here, to see TDD as having a range of techniques, which have trade-offs. If I didn’t make it clear then shame on me, and I want to publicly say that I am all for inclusiveness.

    Some of the reason I recently identified myself as a classicist was because I was worried by people telling me that you had to mock every dependency all the time or it was not good TDD. The classicist movement, as I understood what Fowler was saying, were happy to use behavior over state, when appropriate, not just the one approach.

    Part of the reason I worried about over-specified software was because people came to me saying that we had to do behavior based testing or we were not doing good TDD and that they had read that on this or that blog.

    Part of the reason I talked about bottom-up, was because everyone was telling me we were not doing good TDD if we went top down.

    The reason it has become an issue of late is that it turned out that a number of us felt that there was a monotone as to what was good TDD practice. And so I wanted to speak out about the fact there were other ways. But agreed it is always a question of trade-offs, that ‘there is no silver bullet’, and you always need to be guided by context as to what techniques work in this situation.

    I’m just as much against a binary feel to TDD as I am towards a mono feel to TDD. I want polychrome TDD every time. Reacting against the mono feel I was getting may have unwittingly thrust me into the binary camp, but that is not intended to be my final destination.

    I’m sorry if anyone has been using recent blogs by me to justify an entrenched position. However if I have made a few more people ask questions about what alternative TDD techniques there are, instead of embracing one true way, then that is an outcome I’m not unhappy with.

  • http://penmanscratch.blogspot.com Gregory

    Nice post. The state/interaction (stubs/mocks) question came up in the Austin DDD book club meeting on Thursday. It sparked a nice discussion and I’m not certain everyone agreed in the end.

    This is an area I’ve been trying to refine in my own testing lately. I’m not satisfied I always make the best choice and too often (for my tastes) I discover I’ve missed a test when I get to acceptance testing. That’s a price I believe you pay when you stick too rigidly to a ‘style’ of testing. TDD is just too new a practice to have ‘classical’ techniques.