Achieve Better Results by following Jeremy’s Third Law of TDD: Test Small Before Testing Big

Getting back on track with TDD content.  One of the most important lessons the software development community has learned is that productivity follows from more frequent feedback cycles.  How fast can you go from writing a piece of code to knowing that code does what it’s supposed to do?  Do the symbols on the UML diagram you’ve worked on all week translate to a design that works?  Does the code you just wrote work?  How often can you get working screens in front of the end users to get usability feedback?  How quickly can you get code into testing, and how fast can the testers validate that code against the desired functionality?  How fast can you identify and correct problems with your code/design/architecture?  All of these “how fast?” questions are directly related to the granularity of your testing and coding.

Test Driven Development is an important and valuable tool to achieve rapid feedback cycles, but I’ve found that the benefits of TDD are only achieved if you religiously write and test software in small pieces.  That lesson is my third (but arguably the most important) law of TDD: “Test Small Before Testing Big.”  Putting it another way, “Code from the Bottom Up.”  Don’t be fooled by the name Test Driven Development; the benefits of TDD are felt throughout the software lifecycle.  It’s not just extra overhead from writing more unit tests.  Truly adopting TDD is about:

  • Coding in an efficient manner, and sustaining that efficiency over time

  • Enabling continuous and adaptive design techniques

  • Designing an application that is resilient in the face of change

  • Faster removal of defects from code

  • Making an application easier and quicker to test

TDD felt like more work to me at first, but over the last three or four years I’ve learned how to design and build code as a series of small steps that are validated every stop along the way and my results have improved considerably.  To soften the learning curve for people new to TDD, and to gain some lucidity with my own thinking on TDD, I am distilling the lessons (lumps) I’ve learned about TDD into Jeremy’s Laws of TDD (not that the list is particularly original or new).

  1. Isolate the Ugly Stuff

  2. Push, Don’t Pull

  3. Test Small before Testing Big (this post)

  4. Avoid a long tail

  5. Favor composition over inheritance

  6. Go declarative whenever possible

  7. Don’t treat testing code like a second class citizen

  8. Isolate your unit tests, or suffer the consequences!

  9. The unit tests will break someday

  10. Unit tests shall be easy to set up

And the overriding “Zeroth Law” is “If code is hard to test, change it.”

Case Study #1:  Code from the Bottom Up

Last week my colleague and I coded a couple of related user stories that illustrate the advantages of following the “Code from the Bottom Up” law, both in the negative and positive. 

In the first story we were taking a “manifest” message that contained a summary of an invoice in progress.  We needed to run a series of validations against the manifest, compare the manifest to existing data, then run the manifest data through a subset of our existing rules engine.  I noticed my colleague had that furrowed brow look we all get when the code isn’t flowing out of our fingertips.  When I asked what was up he told me he was having trouble getting started because he couldn’t see how all the pieces fit together.  I gave the best advice I know for that situation: “Don’t care [about the whole].  What do you know how to do?”  In this case we could start by validating the requested sender and recipient of the invoice message.  We first built a class that would take in the sender and recipient ids and return a list of possible problems (users don’t exist, sender doesn’t have a relationship to the recipient, etc.).  That class would take in an IDomainSource object that acts as a repository for User objects:

        /// <summary>
        /// Checks the validity of a sender/recipient id pair.
        /// Uses the IDomainSource repository to locate User objects
        /// and their relationships to each other
        /// </summary>
        public AddressBookValidator(IDomainSource source)
        {
            _source = source;
        }


with a method for the validation like this:

public EventMessage[] ValidateAll(string senderId, string receiverId)

We went on to unit test this class by mocking the IDomainSource dependency and exercising all the permutations we could think of.  With that task completed, we moved onto the next validation against an existing domain object (a “Bundle”).  We started by assuming we had already fetched the existing Bundle object and were comparing it to the Manifest message.  That leads to a simple method that can be tested with pure state-based testing (we just needed to check the values of the EventMessage objects being returned):

public static EventMessage[] ValidateManifestAgainstBundle(Manifest manifest, Bundle bundle)
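As a minimal sketch of what “pure state-based testing” looks like for a method of this shape: the Manifest, Bundle, and EventMessage classes below are hypothetical stand-ins (the real ones are not shown in this post), and the InvoiceCount comparison is an invented rule used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; InvoiceCount is an assumed field, not from the real system
public class Manifest { public string ControlNumber; public int InvoiceCount; }
public class Bundle { public string ControlNumber; public int InvoiceCount; }

public class EventMessage
{
    public readonly string Text;
    public EventMessage(string text) { Text = text; }
}

public static class ManifestRules
{
    // Pure function: no repository, no database, nothing to mock
    public static EventMessage[] ValidateManifestAgainstBundle(Manifest manifest, Bundle bundle)
    {
        List<EventMessage> messages = new List<EventMessage>();
        if (manifest.InvoiceCount != bundle.InvoiceCount)
        {
            messages.Add(new EventMessage("Invoice count does not match the bundle"));
        }
        return messages.ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        Manifest manifest = new Manifest { ControlNumber = "A1", InvoiceCount = 3 };
        Bundle matching = new Bundle { ControlNumber = "A1", InvoiceCount = 3 };
        Bundle mismatched = new Bundle { ControlNumber = "A1", InvoiceCount = 5 };

        // State-based checks: only the returned values are inspected
        Console.WriteLine(ManifestRules.ValidateManifestAgainstBundle(manifest, matching).Length);   // 0
        Console.WriteLine(ManifestRules.ValidateManifestAgainstBundle(manifest, mismatched).Length); // 1
    }
}
```

Because the method takes both objects as arguments, each test is just “build two objects, call, inspect the result.”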

Once we have this method completely tested, we do still have the issue of fetching the correct Bundle object to validate against.  We move onto this method and mock the IBundleRepository class in the unit tests.

        public string[] Validate(Manifest manifest)
        {
            Bundle bundle = _bundleRepository.FindBundle(manifest.ControlNumber);
            EventMessage[] messages = ValidateManifestAgainstBundle(manifest, bundle);

            return ResourceMessageProcessor.ToMessageArray(messages);
        }
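To make the “mock the repository” step concrete, here is a rough sketch that uses a hand-rolled stub instead of a mocking library.  Only FindBundle and ControlNumber come from the code above; the BundleManifestValidator and StubBundleRepository names and the “no bundle found” rule are assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal types; only ControlNumber and FindBundle appear in the post
public class Manifest { public string ControlNumber; }
public class Bundle { public string ControlNumber; }

public interface IBundleRepository
{
    Bundle FindBundle(string controlNumber);
}

// Hand-rolled stub standing in for a mock object
public class StubBundleRepository : IBundleRepository
{
    private readonly Bundle _bundle;
    public string RequestedControlNumber;

    public StubBundleRepository(Bundle bundle) { _bundle = bundle; }

    public Bundle FindBundle(string controlNumber)
    {
        RequestedControlNumber = controlNumber; // record the interaction
        return _bundle;
    }
}

// Hypothetical validator class wrapping the Validate() method
public class BundleManifestValidator
{
    private readonly IBundleRepository _bundleRepository;

    public BundleManifestValidator(IBundleRepository bundleRepository)
    {
        _bundleRepository = bundleRepository;
    }

    public string[] Validate(Manifest manifest)
    {
        Bundle bundle = _bundleRepository.FindBundle(manifest.ControlNumber);
        List<string> messages = new List<string>();
        if (bundle == null)
        {
            // Assumed rule, for illustration only
            messages.Add("No bundle found for control number " + manifest.ControlNumber);
        }
        return messages.ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        StubBundleRepository repository = new StubBundleRepository(null);
        BundleManifestValidator validator = new BundleManifestValidator(repository);

        string[] messages = validator.Validate(new Manifest { ControlNumber = "CN-42" });

        Console.WriteLine(repository.RequestedControlNumber); // CN-42
        Console.WriteLine(messages.Length);                   // 1
    }
}
```

The test never touches a database; it only checks that the right control number was requested and that the result reflects what the stub returned.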


Now, we’ve dodged around the rules engine enough.  To activate the core of the rules engine we need to pass it data in the form of its own canonical data structure and an array of rule objects.  First we need to translate the Manifest object into the rules data structure.  That led us to build and test the method below:

public InvoiceDataSet BuildInvoiceDataSet(Manifest manifest)

Next we built a class that used the normal rules engine configuration service to retrieve all the rules for a specific receiver and select the subset of the rules that are useful in the new context.  Once we had those two pieces completed, activating the rules engine was simple.

At this point we have little classes that execute all of the various validation rules, but we’re still lacking the actual service endpoint.  Looking at our validation classes we noticed a basic pattern, so we lifted a common interface for all of the validation classes:

    public interface IManifestValidator
    {
        string[] Validate(Manifest manifest);
    }


We knew the “contract” of the service entry point all along.  Now that we’ve built all of the underlying validation pieces, the rest of the service is simply an exercise in connecting the dots:

        public ManifestValidationResponse ValidateManifest(Manifest manifest)
        {
            ManifestValidationResponse response = new ManifestValidationResponse();
            response.ControlNumber = manifest.ControlNumber;

            // Simply loop through all of the manifest validators
            // and collect all of the validation messages
            foreach (IManifestValidator manifestValidator in _validators)
            {
                string[] messages = manifestValidator.Validate(manifest);
            }

            return response;
        }
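The “connect the dots” nature of this method shows up when you test it: every validator can be replaced with a canned fake.  The sketch below is self-contained and makes some assumptions: the Messages list on the response, the ManifestValidationService class name, and the CannedValidator fake are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class Manifest { public string ControlNumber; }

public interface IManifestValidator
{
    string[] Validate(Manifest manifest);
}

// Hypothetical response shape; only ControlNumber appears in the post
public class ManifestValidationResponse
{
    public string ControlNumber;
    public readonly List<string> Messages = new List<string>();
}

// Fake validator that returns whatever messages it was constructed with
public class CannedValidator : IManifestValidator
{
    private readonly string[] _messages;
    public CannedValidator(params string[] messages) { _messages = messages; }
    public string[] Validate(Manifest manifest) { return _messages; }
}

public class ManifestValidationService
{
    private readonly IManifestValidator[] _validators;

    public ManifestValidationService(params IManifestValidator[] validators)
    {
        _validators = validators;
    }

    public ManifestValidationResponse ValidateManifest(Manifest manifest)
    {
        ManifestValidationResponse response = new ManifestValidationResponse();
        response.ControlNumber = manifest.ControlNumber;

        // The coordinator only aggregates; the rules live in the validators
        foreach (IManifestValidator validator in _validators)
        {
            response.Messages.AddRange(validator.Validate(manifest));
        }

        return response;
    }
}

public static class Program
{
    public static void Main()
    {
        ManifestValidationService service = new ManifestValidationService(
            new CannedValidator("Unknown sender"),
            new CannedValidator(),
            new CannedValidator("Bundle is closed", "Invoice count mismatch"));

        ManifestValidationResponse response =
            service.ValidateManifest(new Manifest { ControlNumber = "CN-42" });

        Console.WriteLine(response.ControlNumber);  // CN-42
        Console.WriteLine(response.Messages.Count); // 3
    }
}
```

Since the individual validators are already covered by their own tests, the coordinator test only has to prove that the messages are gathered in order.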


We didn’t completely understand what all of the pieces of the service method were going to be at the beginning of the coding session, yet we still managed to create working code in short order.  By focusing on little “worker” classes that performed small tasks, the aggregate structure fell into place.

In the second story we had to accept a new message representing an invoice submission.  We had to take this message and run a series of transformations and validations against the invoice data to create a human readable report of validation problems to correct the invoice, or to accept the invoice.  Most of the functionality already existed, but the validation and translation code was bound up into coarse-grained workflow classes.  There wasn’t any way to exercise the logic we wanted without causing side effects.  Most of the effort in that story was a series of refactorings to extract the smaller pieces of code into separate classes that could be called independent of the larger workflow.  We had a safety net of coarse grained integration and regression tests, but no unit tests.  Because of the risk, we had to make the changes in very small steps while constantly running the sluggish end to end tests to make sure we didn’t break anything.

So what’s the point of the second user story, and how does this experience relate to “test small?”  In the first story the rules engine had originally been coded with Test Driven Development, and pursuing testability directly led to the rules engine component being composed of little loosely coupled, cohesive classes.  We didn’t have to change the rules engine code at all, even though we were using it in a completely new way.  All we had to do was recombine some of the existing objects within a different coordinator class.  The code in the second story had not been written with TDD or testability in mind, and it showed.  The TDD code followed the Open/Closed Principle as a byproduct of working “test first,” while the non-TDD code required a lot of change to the existing code to create new functionality.  The TDD code was healthier code than the non-TDD code.

Purposely Designing with Test Driven Development

I know this is controversial and I didn’t believe this initially either, but TDD is designing while coding.  Coding one task at a time helps me to discover the structure of the larger whole.  My vision of the whole is informed by the creation of the small pieces.  Using some terminology from Responsibility Driven Design, consciously divide up responsibilities by class stereotypes:

  • Service providers – classes that perform specific operations

  • Information Holders – classes that have, or provide, data to other classes

  • Coordinators – classes that coordinate the activities of other classes

  • Controllers – classes that control the application flow

The key to doing emergent design is to focus first on creating the service provider and information holder classes that perform small tasks.  After these classes are built and tested, the construction of the coordinator and controller classes often turns into a simple game of “connect the dots.”  Get the business logic and the workflow decisions complete first.  Push off tasks like configuration and even data access closer to the end.  Let the needs of the business logic and workflow dictate the interface and design of the ancillary services.

To make this approach work you need to consciously maximize “reversibility” throughout the system.  Notice how we started the user story above by working on the validation of a Manifest object against a Bundle object.  The initial class method isn’t coupled to any particular message handler, data access mechanism, or configuration subsystem.  It simply takes in two objects and returns an array of EventMessage objects.  The loose coupling is just a side effect of coding for testability, but it enables us to use that class in a multitude of ways.  It isn’t bound to a particular workflow.  If the requirements of the overall workflow change (and they will), or we determine a better overall design, we can change the workflow controllers and coordinators by rearranging the service providers and information holders.  You’ve also maximized the potential for reuse at a later time.  In the first user story we were able to take the smaller classes of the rules engine and combine them differently for all new functionality.  In the second story we had to spend a lot of energy extracting code out of the large workflow classes to create the new functionality.  Guess which story went faster?

When I was in school I worked with my father building houses in the summer.  On one memorable occasion I watched a pair of plumbers looking glumly at a series of pipes sticking out of the slab concrete form.  It turned out that the plumbers had put the pipes in the wrong place before the concrete was poured around the pipes.  They ended up renting a giant two-man concrete saw and cutting up the foundation to move the pipes on a blistering summer day (they missed the second time around too!).  Placing pipes that will have concrete poured around them is an irreversible decision.  You simply can’t get that decision wrong because there isn’t going to be a second chance.

Software doesn’t have to be that way.  So what can you do to maximize reversibility in your systems?  You guessed it, write small cohesive classes that can be moved around and used in different contexts.  Somebody will comment that evolutionary design techniques are inefficient and you should just be doing more research upfront.  Maybe, but designing fluid code that can change covers a lot of potential scenarios.  Building code from an upfront design, especially a design that happens from the top down, can easily result in code that only works for the cases known upfront — and the requirements will change, maybe not tomorrow or the next quarter, but they will change.

Case Study #2:  Permutations

The heart of the first big system I designed was a complex supply chain routing engine* that determined the best way to route requests from the factory lines for parts.  The engine first queried and correlated (with a full outer join no less) data from three different sets of database tables, then applied a complex series of business rules against the data to select the best part channel (part, inventory source, and factory line destination).  Adding to the complexity was a set of business rules that varied by region.  All told, there was a mountainous stack of possible permutations of input to check.  Needless to say, the testers struggled mightily with the engine because they only tested manually and with end to end blackbox tests.  The engine worked great and we found very few defects with it in testing, but because it sucked down such a disproportionate amount of testing resources other parts of the system didn’t receive nearly the same level of testing and serious defects made it into production.  We worked inefficiently.

So with the wisdom conferred upon me by hindsight, let’s take a look at a cleaner way to test this functionality that won’t drown the team in endless permutations.  The first and foremost thing to do is to solve the routing in two steps: first correlate the data from the database tables into an object structure, then select the best routing from this object structure.  I described the full model of the part sourcing as a bookshelf.  On the top shelf are all the books you know you’ll enjoy.  The second and third shelves are books that aren’t as good.  The full model works by dividing the PartSourceChannel objects into several shelves of decreasing ability to fulfill the request for the part.

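    public class PurchaseOrder{}
    public class SupplyChainSource{}

    /// <summary>
    /// Represents a valid source for a part – the "Books"
    /// </summary>
    public class PartSourceChannel
    {
        public string PartNumber;
        public string ChannelNumber;
        public PurchaseOrder PurchaseOrder;
        public long Inventory;
        public SupplyChainSource SupplyChainSource;
        public double AllocationPercentage;

        public PartSourceChannel() {}

        // Convenience constructor used by the unit tests
        public PartSourceChannel(long inventory)
        {
            Inventory = inventory;
        }
    }

    /// <summary>
    /// Collection of related PartSourceChannel objects – the "Shelf"
    /// </summary>
    public class PartSourcing
    {
        public PartSourceChannel[] Channels = new PartSourceChannel[0];

        public PartSourceChannel SelectChannelWithMostInventory()
        {
            // Selection logic not shown
            throw new System.NotImplementedException();
        }

        public PartSourceChannel SelectChannelByAllocation()
        {
            // Selection logic not shown
            throw new System.NotImplementedException();
        }

        public bool HasChannels()
        {
            return Channels != null && Channels.Length > 0;
        }
    }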



And the bookshelf itself is:

    /// <summary>
    /// Represents all of the possible PartSourceChannel objects
    /// for a given part, factory line, and region – the "Bookshelf"
    /// </summary>
    public class PartSourceRouting
    {
        // Start with empty shelves so Route() can safely check each one
        public PartSourcing Shelf1 = new PartSourcing();
        public PartSourcing Shelf2 = new PartSourcing();
        public PartSourcing Shelf3 = new PartSourcing();

        // Select the PartSourceChannel
        public PartSourceChannel Route()
        {
            if (Shelf1.HasChannels())
            {
                return Shelf1.SelectChannelByAllocation();
            }

            if (Shelf2.HasChannels())
            {
                return Shelf2.SelectChannelWithMostInventory();
            }

            if (Shelf3.HasChannels())
            {
                return Shelf3.SelectChannelByAllocation();
            }

            return null;
        }
    }
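For readers who want something they can run, here is one possible filled-in version of the shelf and bookshelf.  It is only a sketch under stated assumptions: the real selection rules are not given in this post, so “by allocation” is read here as “highest AllocationPercentage,” and the three-argument PartSourceChannel constructor is invented for brevity.

```csharp
using System;

public class PartSourceChannel
{
    public string ChannelNumber;
    public long Inventory;
    public double AllocationPercentage;

    // Hypothetical convenience constructor
    public PartSourceChannel(string channelNumber, long inventory, double allocationPercentage)
    {
        ChannelNumber = channelNumber;
        Inventory = inventory;
        AllocationPercentage = allocationPercentage;
    }
}

public class PartSourcing
{
    public PartSourceChannel[] Channels = new PartSourceChannel[0];

    public bool HasChannels()
    {
        return Channels.Length > 0;
    }

    public PartSourceChannel SelectChannelWithMostInventory()
    {
        PartSourceChannel best = null;
        foreach (PartSourceChannel channel in Channels)
        {
            if (best == null || channel.Inventory > best.Inventory) best = channel;
        }
        return best;
    }

    // ASSUMPTION: pick the channel with the highest allocation percentage
    public PartSourceChannel SelectChannelByAllocation()
    {
        PartSourceChannel best = null;
        foreach (PartSourceChannel channel in Channels)
        {
            if (best == null || channel.AllocationPercentage > best.AllocationPercentage) best = channel;
        }
        return best;
    }
}

public class PartSourceRouting
{
    public PartSourcing Shelf1 = new PartSourcing();
    public PartSourcing Shelf2 = new PartSourcing();
    public PartSourcing Shelf3 = new PartSourcing();

    public PartSourceChannel Route()
    {
        if (Shelf1.HasChannels()) return Shelf1.SelectChannelByAllocation();
        if (Shelf2.HasChannels()) return Shelf2.SelectChannelWithMostInventory();
        if (Shelf3.HasChannels()) return Shelf3.SelectChannelByAllocation();
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        PartSourceRouting routing = new PartSourceRouting();
        routing.Shelf2.Channels = new PartSourceChannel[]
        {
            new PartSourceChannel("A", 100, 0.2),
            new PartSourceChannel("B", 400, 0.3),
            new PartSourceChannel("C", 200, 0.5)
        };

        // Only shelf 2 is populated, so the most-inventory rule applies
        Console.WriteLine(routing.Route().ChannelNumber); // B

        // Once shelf 1 has channels, it wins and the allocation rule applies
        routing.Shelf1.Channels = routing.Shelf2.Channels;
        Console.WriteLine(routing.Route().ChannelNumber); // C
    }
}
```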



My strong recommendation is to drive design from the behavior of the business objects and then work forward to the service point and backwards to the data store.  Starting small, assume you already have a PartSourcing shelf.  The algorithm to select a part source from a shelf came in two basic flavors: choose by allocation to a supply chain partner, or choose the PartSourceChannel with the most available inventory.  The first set of unit tests could start with the PartSourcing shelf class and exercise the SelectChannelWithMostInventory() and SelectChannelByAllocation() methods.  It’s an easy place to start because you can simply create an array of PartSourceChannel objects, pass them into a PartSourcing object, and verify that PartSourcing returns the correct PartSourceChannel.



    [TestFixture]
    public class PartSourcingTester
    {
        [Test]
        public void SelectChannelWithMostInventoryWithMoreThanOneChannel()
        {
            PartSourceChannel channel1 = new PartSourceChannel(100);
            PartSourceChannel channel2 = new PartSourceChannel(400);
            PartSourceChannel channel3 = new PartSourceChannel(200);
            PartSourceChannel channel4 = new PartSourceChannel(300);

            PartSourcing sourcing = new PartSourcing();
            sourcing.Channels = new PartSourceChannel[]
                {
                    channel1, channel2, channel3, channel4
                };

            PartSourceChannel channel = sourcing.SelectChannelWithMostInventory();

            // channel2 was created with the largest inventory (400)
            Assert.AreSame(channel2, channel, "Channel 2 has the most inventory");
        }
    }



That was simple enough.  Once the unit tests for PartSourcing are complete you could move onto the PartSourceRouting.Route() method.  Following the “Push, Don’t Pull” law, PartSourceRouting is completely ignorant of how or where the PartSourceChannel data is stored; it just processes the data it’s given.  Now that we trust the PartSourcing class, we can start building the PartSourcing members of a PartSourceRouting class and then call Route() to verify the expected outcome.


    [TestFixture]
    public class PartSourceRoutingTester
    {
        [Test]
        public void RouteWithOnlyPartSourceChannelsOnShelf2()
        {
            // Build a PartSourceRouting in memory
            PartSourceChannel channel1 = new PartSourceChannel(100);
            PartSourceChannel channel2 = new PartSourceChannel(400);
            PartSourceChannel channel3 = new PartSourceChannel(200);

            PartSourcing sourcing = new PartSourcing();
            sourcing.Channels = new PartSourceChannel[]
                {
                    channel1, channel2, channel3
                };

            PartSourceRouting routing = new PartSourceRouting();
            routing.Shelf2 = sourcing;

            PartSourceChannel channel = routing.Route();
            Assert.AreSame(channel2, channel, "Channel 2 has the most inventory");
        }
    }



So now that the entire “bookshelf” routing is working we can move onto the next task — actually building the bookshelf.  We’re still not ready to touch the database though.  Create a new PartSourceRoutingBuilder class that takes in an existing array of PartSourceChannel objects and creates a new PartSourceRouting with all of the PartSourceChannel objects on the proper shelves.

    public class PartSourceRoutingBuilder
    {
        private readonly PartSourceChannel[] _channels;
        private readonly MaterialRequest _request;

        public PartSourceRoutingBuilder(PartSourceChannel[] channels, MaterialRequest request)
        {
            _channels = channels;
            _request = request;
        }

        public PartSourceRouting Build()
        {
            // Shelving logic not shown
            throw new System.NotImplementedException();
        }
    }




Just to write simpler tests first, I would try to first test how to shelve a single PartSourceChannel before trying out the larger Build() method.

        // I'm using a static method here so there's no issue
        // with prior state of a PartSourceRoutingBuilder
        // instance
        public static void ShelvePartSourceChannel(
            PartSourceChannel channel,
            PartSourceRouting routing,
            MaterialRequest request)
        {
            // analyze the channel against the request and
            // put the channel on the proper shelf
        }


Once the ShelvePartSourceChannel() is unit tested with all the scenarios we can think of, then move onto the Build() method that will delegate to ShelvePartSourceChannel().  If we unit test ShelvePartSourceChannel() thoroughly, we don’t need to write a unit test for nearly as many permutations of PartSourceChannel arrays through the larger Build() method, reducing the complexity of testing. 
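A rough sketch of how Build() might delegate to ShelvePartSourceChannel() follows.  The shelving rule here (enough inventory goes to shelf 1, some inventory to shelf 2, none to shelf 3) and the Quantity field on MaterialRequest are purely hypothetical placeholders for the real business rules, and the shelves are simplified to plain lists.

```csharp
using System;
using System.Collections.Generic;

public class PartSourceChannel { public string ChannelNumber; public long Inventory; }

// Hypothetical: the real MaterialRequest is not shown in the post
public class MaterialRequest { public long Quantity; }

// Simplified bookshelf: each shelf is just a list of channels
public class PartSourceRouting
{
    public readonly List<PartSourceChannel> Shelf1 = new List<PartSourceChannel>();
    public readonly List<PartSourceChannel> Shelf2 = new List<PartSourceChannel>();
    public readonly List<PartSourceChannel> Shelf3 = new List<PartSourceChannel>();
}

public class PartSourceRoutingBuilder
{
    private readonly PartSourceChannel[] _channels;
    private readonly MaterialRequest _request;

    public PartSourceRoutingBuilder(PartSourceChannel[] channels, MaterialRequest request)
    {
        _channels = channels;
        _request = request;
    }

    // Build() only loops and delegates; the decisions live in the static method
    public PartSourceRouting Build()
    {
        PartSourceRouting routing = new PartSourceRouting();
        foreach (PartSourceChannel channel in _channels)
        {
            ShelvePartSourceChannel(channel, routing, _request);
        }
        return routing;
    }

    // ASSUMED rule, for illustration only
    public static void ShelvePartSourceChannel(
        PartSourceChannel channel, PartSourceRouting routing, MaterialRequest request)
    {
        if (channel.Inventory >= request.Quantity) routing.Shelf1.Add(channel);
        else if (channel.Inventory > 0) routing.Shelf2.Add(channel);
        else routing.Shelf3.Add(channel);
    }
}

public static class Program
{
    public static void Main()
    {
        MaterialRequest request = new MaterialRequest { Quantity = 150 };

        // Small test: shelve one channel at a time
        PartSourceRouting single = new PartSourceRouting();
        PartSourceRoutingBuilder.ShelvePartSourceChannel(
            new PartSourceChannel { ChannelNumber = "A", Inventory = 400 }, single, request);
        Console.WriteLine(single.Shelf1.Count); // 1

        // Bigger test: one pass through Build() is enough to prove the delegation
        PartSourceChannel[] channels =
        {
            new PartSourceChannel { ChannelNumber = "A", Inventory = 400 },
            new PartSourceChannel { ChannelNumber = "B", Inventory = 100 },
            new PartSourceChannel { ChannelNumber = "C", Inventory = 0 }
        };
        PartSourceRouting built = new PartSourceRoutingBuilder(channels, request).Build();
        Console.WriteLine(built.Shelf1.Count + "/" + built.Shelf2.Count + "/" + built.Shelf3.Count); // 1/1/1
    }
}
```

The many shelving scenarios get tested against the small static method, while Build() needs only a handful of tests to show that every channel is passed along.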

Back to my team’s struggle with testing the original routing engine.  In no small part due to the experience with the routing engine, I’m a big, big believer in white-box testing.  Because of the way we’ve built the pieces of the routing engine here, we can easily write a series of FitNesse fixtures that allow the testers to quickly define a list of PartSourceChannel objects and check the routing selection for a given MaterialRequest without the database or any kind of web service or user interface being involved.  That should make it easier to write automated tests that validate the business rules of the routing engine in isolation before we try to test the engine from service invocation to database.  We can go into the black box testing with confidence that the engine itself already works.

So the business rules are verified, but we still need a service entry point and the database access to correlate the data from the database into the PartSourceChannel objects.  Because we know it’s easier to test by building from the ground up, we’ll code and test the correlation from the database first.  Following the Dependency Inversion Principle, we’ll put this functionality behind an interface.

    public interface IPartSourcingDataService
    {
        PartSourceChannel[] FindRoutingOptions(MaterialRequest request);
    }


Finally, we can move on to the service class that will be called by the rest of the application to access the routing logic.

    public class PartSourceRoutingEngine
    {
        private readonly IPartSourcingDataService _service;

        public PartSourceRoutingEngine(IPartSourcingDataService service)
        {
            _service = service;
        }

        public PartSourceChannel Route(MaterialRequest request)
        {
            PartSourceChannel[] channels = _service.FindRoutingOptions(request);
            // The builder needs the request as well as the channels
            PartSourceRoutingBuilder builder = new PartSourceRoutingBuilder(channels, request);
            PartSourceRouting routing = builder.Build();

            return routing.Route();
        }
    }



That class was easy, and that’s a big point in favor of following the “Code from the Bottom Up” rule.  The flow of the PartSourceRoutingEngine controller class falls out because all of the little pieces are already defined and built.  PartSourceRoutingEngine simply has to coordinate the actions of the existing IPartSourcingDataService, PartSourceRoutingBuilder, and PartSourceRouting classes.
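Because the data access sits behind an interface, the engine can be exercised entirely in memory.  The sketch below assumes a hand-written InMemoryDataService stub and collapses the builder and bookshelf into a simplified PartSourceRouting that just picks the channel with the most inventory; neither is the real implementation.

```csharp
using System;

public class PartSourceChannel { public string ChannelNumber; public long Inventory; }

// Hypothetical: PartNumber is an assumed field
public class MaterialRequest { public string PartNumber; }

public interface IPartSourcingDataService
{
    PartSourceChannel[] FindRoutingOptions(MaterialRequest request);
}

// Stub that replaces the database correlation with canned data
public class InMemoryDataService : IPartSourcingDataService
{
    private readonly PartSourceChannel[] _channels;
    public MaterialRequest LastRequest;

    public InMemoryDataService(params PartSourceChannel[] channels) { _channels = channels; }

    public PartSourceChannel[] FindRoutingOptions(MaterialRequest request)
    {
        LastRequest = request;
        return _channels;
    }
}

// Simplified stand-in for PartSourceRoutingBuilder plus PartSourceRouting
public class PartSourceRouting
{
    private readonly PartSourceChannel[] _channels;
    public PartSourceRouting(PartSourceChannel[] channels) { _channels = channels; }

    public PartSourceChannel Route()
    {
        PartSourceChannel best = null;
        foreach (PartSourceChannel channel in _channels)
        {
            if (best == null || channel.Inventory > best.Inventory) best = channel;
        }
        return best;
    }
}

public class PartSourceRoutingEngine
{
    private readonly IPartSourcingDataService _service;

    public PartSourceRoutingEngine(IPartSourcingDataService service) { _service = service; }

    public PartSourceChannel Route(MaterialRequest request)
    {
        PartSourceChannel[] channels = _service.FindRoutingOptions(request);
        return new PartSourceRouting(channels).Route();
    }
}

public static class Program
{
    public static void Main()
    {
        InMemoryDataService service = new InMemoryDataService(
            new PartSourceChannel { ChannelNumber = "A", Inventory = 100 },
            new PartSourceChannel { ChannelNumber = "B", Inventory = 400 });
        PartSourceRoutingEngine engine = new PartSourceRoutingEngine(service);

        MaterialRequest request = new MaterialRequest { PartNumber = "P-1" };
        PartSourceChannel selected = engine.Route(request);

        Console.WriteLine(selected.ChannelNumber);                       // B
        Console.WriteLine(ReferenceEquals(service.LastRequest, request)); // True
    }
}
```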

Controlling Testing Permutations

I started this case study as an exercise in controlling permutations.  In the real world project the possible pathways through the routing looked something like this (feel free to mock my math here):

  1. 12 different combinations of table joins across the three tables * 3 sets of region specific rules for a total of 36 permutations

  2. 5 valid shelves * 0, 1, 2, or 3 channels per shelf = 4 ^ 5 = 1024

I’m not sure how the 36 related to the 1024, but suffice it to say the final answer is >10000.  Covering every permutation simply isn’t feasible, but by focusing on completely testing the smaller steps first we might get something like:

  1. The 36 permutations of the data correlation

  2. Test the selection process of each shelf individually with 0, 1, 2, or 3 channels — 4 * 5 = 20

  3. Maybe test the selection of the five shelves — has channels or not = 2 ^ 5 = 32 (but I think you could get by with many fewer)

That math gives you 88 tests.  Say you add half again as many tests that run end to end to prove that the pieces work together.  That adds up to roughly 130 tests, far fewer than the 10,000+ combinations from end-to-end.
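The arithmetic can be checked with a few lines of code.  One assumption is labeled in the comments: the end-to-end figure treats the 36 correlation cases and the 1024 shelf combinations as independent and multiplies them, which the post itself does not claim.

```csharp
using System;

public static class PermutationMath
{
    // Integer power, to avoid floating point rounding from Math.Pow
    public static int Power(int value, int exponent)
    {
        int result = 1;
        for (int i = 0; i < exponent; i++) result *= value;
        return result;
    }

    // 12 table join combinations * 3 sets of regional rules
    public static int CorrelationCases() { return 12 * 3; }

    // 5 shelves, each holding 0, 1, 2, or 3 channels
    public static int ShelfCombinations() { return Power(4, 5); }

    // ASSUMPTION: the two dimensions are independent, so they multiply
    public static int EndToEndCases() { return CorrelationCases() * ShelfCombinations(); }

    // Correlation cases + each shelf alone (4 * 5) + shelf has channels or not (2 ^ 5)
    public static int BottomUpCases() { return CorrelationCases() + 4 * 5 + Power(2, 5); }
}

public static class Program
{
    public static void Main()
    {
        int bottomUp = PermutationMath.BottomUpCases();

        Console.WriteLine(PermutationMath.EndToEndCases()); // 36864
        Console.WriteLine(bottomUp);                        // 88
        Console.WriteLine(bottomUp + bottomUp / 2);         // 132
    }
}
```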

Easing the Testing Burden

Quoting my esteemed tester colleague Jim Matthews —

Testing takes a lot longer if all you can do is write end-to-end tests.  It’s also easier to find and remove problems from smaller tests.

When I wrote the routing engine I wasn’t using Test Driven Development, much less Acceptance Test Driven Development in conjunction with the testers and analysts.  The testers had no other way to test than to load up the database with data, run the routing engine, and see if the results matched up with expectations.  Three years of development with Agile processes has given me a much greater appreciation for a holistic approach to software development.  Much of the coding process is simply removing defects: compile time checks, code reviews, unit tests, acceptance tests, etc.  The faster you can purify the code by removing defects, the better the real productivity of the team.  My productivity on the routing engine in terms of coding alone was actually pretty good, but the testers’ productivity was awful.  In terms of Lean Programming I made a point optimization that didn’t help the whole process.  The lesson I’ve taken away from the routing engine is to write code that is easier for the testers to test, and to make sure that the testers are aware of the potential for smaller tests, but that’s a post for a different day…

Looking back, we could have made the testing of the routing engine much smoother if we had focused on testing the business rules in isolation from the database and the interface.  The testers bogged down in the database setup and flat file creation just to set up a business rules scenario.  If they could have just started by saying “I have these PartSourceChannel objects and this MaterialRequest, I expect this PartSourceChannel to be selected” we could have nailed down the business rules much faster with tests that were human readable.  Once we were confident in the implementation of the business rules we could have moved onto integration with the front end and the database.

To write acceptance tests in a white box fashion, the system has to have seams, places where we can exercise the business rules in isolation without running the full application stack.  The good news is that you don’t have to spend countless hours in front of a whiteboard devising seams for your testers.  Simply proceed in a “test small, before testing big” manner of constructing code and those seams will already be there in your code.


Wrapping Up

We’ve always known that software development works best when we can divide larger problems into smaller, more easily manageable pieces; it’s just the mechanism for determining the smaller pieces that’s always difficult.  To get the full benefits of TDD it really must be used in combination with other practices like Continuous Integration and true Iterative Development.

You might ask, why can’t I achieve all of this with upfront design and just plain old unit testing? Maybe you can, at least some of the time, if your design skills are good and the requirements are stable.  I’d argue that over time the odds are in favor of adaptive techniques that give you more opportunities to make corrections along the way.  As far as just plain unit testing goes, TDD gets unit testing into play much earlier and more consistently.  If you build code without thinking about how you will test the code, you’ll often find yourself with code that is hard to test.  Writing unit tests first forces you to write testable code and goes a long way toward better unit test coverage.

Hopefully this post addressed the usage of TDD to solve bigger problems in pieces. I did cut some things out of this post for the sake of length, so there will be some follow up posts this week on TDD and Debugging, TDD and Flow, and using Mock objects to create smaller tests and avoid context switching.

About Jeremy Miller

Jeremy is the Chief Software Architect at Dovetail Software, the coolest ISV in Austin. Jeremy began his IT career writing "Shadow IT" applications to automate his engineering documentation, then wandered into software development because it looked like more fun. Jeremy is the author of the open source StructureMap tool for Dependency Injection with .Net, StoryTeller for supercharged acceptance testing in .Net, and one of the principal developers behind FubuMVC. Jeremy's thoughts on all things software can be found at The Shade Tree Developer at
  • Amandeep Kaur

    Can we test services efficiently with Visual Studio Team System 2008 Test Edition? If yes, what would be the best approach?


  • jmiller


    That’s a pretty good question/concern. I think I want to write a short follow up for your question early next week.


  • Liang

    It is a great post! But I still have a question for Jeremy. Could you let me know if you have a basic application architecture design in mind before you write the first line of code? Or do you just let the test/code lead you to a design? For example, if you find a common method in related classes, you create an interface, just as you mentioned in the validation example. If that is correct, I think TDD is pretty hard for a beginner, since developers need very good knowledge of OO, design patterns, and refactoring before they begin practicing TDD. Otherwise, how would she/he know how and when to refactor?

  • Aaron Robson

    Nice article.
    I especially liked the bit about the class responsibilities – service providers, information holders etc. I haven’t ever explicitly considered my classes in that way, and I can see how the combination of IOC and test driving the service providers first can simplify things.

    I think there is still some top down work needed (for myself at least) in order to determine which service providers are actually needed in the first place. While this is potentially straightforward with a good story in place, I find it’s often harder to do when specifically developing a framework and not an application itself – this was always mentioned as a difficulty with eXtreme Programming too.