Levels of automated testing within a single application

We need a common language for the different types of automated testing.  We’re partially there, but the term “unit test” is still very confusing.  Here, I’ll lay out the different types of automated tests I find helpful with a single application:

  • Unit testing – testing a single class or possibly a small group of collaborating classes (absolutely does not call out of process and is the fastest-running of all automated tests).  Running 1000 unit tests in 3 or 4 seconds is common.
  • Full system tests – through the UI integrated with the full application including the database.  May or may not use real system dependencies such as external web services.  (These are the slowest of all tests.)
  • Integration testing.  Here, there are some categories.
    • Data access tests.  Used to test repositories, data access classes, etc.  These tests validate the translation from entities to data.  These tests run all SQL and test the structure of the database schema as well.  A real database must be involved.
    • General scenario testing.  Any time it’s appropriate to pull a section of the application in and run a lot of classes together, this is an integration test.  It involves several parts of the system, not just one.  It can run fast if completely in process, or it can be slow if it requires an out-of-process call such as leveraging the file system.

This is not an exhaustive list, but it includes most of the automated testing on a typical enterprise application.  Feel free to comment with any type I may have left out.

Scott Bellware reasoned that the database needs to be left out for unit testing.  I completely agree.  Unit testing, by common definition, excludes external dependencies.  It’s not a unit test if we reach out and touch things.  When you have the right number of unit tests (for example, I’ve worked on a smart client system with 80,000 lines of code and 1300 unit tests and another 700 integration tests), you can’t afford to take more than a few milliseconds to run each one.  You need your unit tests to run very quickly.  Otherwise, you won’t run them very often.

That said, this doesn’t mean that the database should be ignored when testing a system.  There are plenty of ways the database can cause a bug in the system: SQL, stored procedures, triggers (shudder), views, etc.  I insist on writing an automated integration test for every database operation.  How else can we verify that the database operation works correctly?  We can’t.  It is important, however, for communication’s sake, to understand that these database-inclusive tests are integration tests, as are any tests that exercise an external dependency.

Automated testing with the database REQUIRES the following:

  1. Every developer has a dedicated instance of the database that can be dropped and created at will.
  2. Tests must be responsible for their own data setup.  An empty database should be all that is required to run the test.  The test must be responsible for adding data for the appropriate scenario before testing the scenario.
  3. You will want to generalize test data setup because it isn’t feasible to expect EVERY test to set up all the data.  A general data set that establishes a baseline of data is very useful and can be invoked with a data helper class.  Then each test can just add the specific data necessary for its test case.
  4. Data setup, database creation, etc. should be automated.  If it’s manual, it costs more, and you won’t run the tests as often.
  5. Database schema must be in source control with the code.  Without that, you never know what the correct version of the schema is.
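As a rough illustration of points 2 through 4, here is a minimal sketch of a data access test that starts from an empty database, loads a shared baseline through a helper, and then adds only the rows its own scenario needs.  Everything named here is an assumption made for the example: the `customer` and `invoice` tables, the helper functions, and the use of an in-memory SQLite database.  SQLite stands in only so the sketch is self-contained; a real data access test would point at the developer’s dedicated database instance and build the schema from the scripts in source control.

```python
import sqlite3

# Hypothetical schema for illustration. In a real project this would come
# from the schema scripts kept in source control with the code.
SCHEMA = """
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    total       REAL NOT NULL
);
"""


def create_empty_database():
    """Create a fresh, empty database so no test depends on leftover data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def load_baseline_data(conn):
    """Shared data helper: the general data set that most tests build on."""
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Baseline Customer')")
    conn.commit()


def test_invoice_total_for_customer():
    conn = create_empty_database()
    load_baseline_data(conn)
    # Scenario-specific data: only what this one test needs.
    conn.execute("INSERT INTO invoice (customer_id, total) VALUES (1, 25.0)")
    conn.execute("INSERT INTO invoice (customer_id, total) VALUES (1, 75.0)")
    (total,) = conn.execute(
        "SELECT SUM(total) FROM invoice WHERE customer_id = 1"
    ).fetchone()
    conn.close()
    assert total == 100.0
    return total
```

Because the test creates and fills its own database, it can run in any order and on any developer’s machine without manual preparation.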

Another of Scott’s points: “As a side effect of doing the necessary dependency injection, you often get a cleaner and more explicit separation of concerns – which makes software easier to change and maintain.”

He’s right.  If you can’t unit test your domain classes because everything you do with them requires a real database to be online, you have an indication that you aren’t separating concerns.  Data access should be independent of domain object behavior in most cases.  I should be able to verify that a Customer object can Sort() itself without invoking a database query, but if constructing a Customer initiates a database call, my domain model is then materially coupled to the database and needs to be separated.
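A small sketch of what that separation might look like, written in Python for brevity.  The `Customer` and `Order` classes and the `sort_orders` method are hypothetical stand-ins for the Sort() example above; the point is only that constructing and exercising the domain object never touches a database, so the test stays a true unit test.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    number: int
    total: float


@dataclass
class Customer:
    """Domain object: holds behavior only and knows nothing about data access."""
    name: str
    orders: list = field(default_factory=list)

    def sort_orders(self):
        # Pure in-memory behavior; no query is issued here.
        self.orders.sort(key=lambda order: order.total)


def test_customer_sorts_orders_by_total():
    # The test builds the object graph directly. Loading a Customer from the
    # database would be the job of a separate data access class, which gets
    # its own integration tests.
    customer = Customer(
        "Acme", [Order(1, 30.0), Order(2, 10.0), Order(3, 20.0)]
    )
    customer.sort_orders()
    assert [order.number for order in customer.orders] == [2, 3, 1]
```

If `Customer` could only be obtained by running a query, this test could not be written without a live database, which is exactly the coupling described above.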

Jeremy Miller is of the same mind in his comment: “Referential integrity, non null checks, and sundry other data constraints.  All good things.  All a pain in the ass when your unit test only needs a single property set on the InvoiceItem class.”

To help clear up some confusion with the term “unit test”, I propose a simple constraint in our dialog:  If the test calls out-of-process, it is then disqualified from “unit test” status and falls into “integration test”.  Feel free to argue in the comments. :)


15 Responses to Levels of automated testing within a single application

  1. Don says:

    @Ross

    We handle this issue in a way similar to what you describe. We created a batch file that runs a series of osql commands. It crawls the directory tree and runs all of the scripts in the default order (alpha), but we set the order of the folders to run. We keep our scripts in the folders:

    \table
    \pk_constraint
    \fk_constraint
    \index
    \seed
    \trigger
    \view
    \sproc

    In the root folder is the db_create script. This sets up the mdf and log files, creates accounts and roles and sets permissions globally. This is run first. Then the table folder is crawled and all tables created. Tables are created with the rule that only default, IDENTITY and NULL constraints can be defined.

    The pk_constraint folder contains one file. This is a simple list of alter table commands to define the pk for each table. We’ve debated many times whether we should collapse this into the table definitions, but it always gets religious and turns into an old timer vs new timer debate. It is what it is.

    The fk_constraint folder contains only one file too. This is an ordered list of foreign keys. At the head of the file is a comment list of all tables affected and the target table. It’s a little work to maintain for a lot of gain in the long run.

    The seed folder has one script for each table that defines its seed data. Used mostly for filling in lookup tables.

    Indexes (almost all covering) are defined and run next. Followed by triggers, views and finally sprocs.

    Another rule we have is that if your script includes a CREATE command, it has to assign permissions in the same script file.

    The batch file takes about five minutes to run. The LOC all told is around 75K. Any errors are echoed to the window along with the point where they occurred. These have to be fixed before checkin (unfortunately unenforceable except by ridicule when a bad checkin is made).

    I’m sure a more elegant solution could be built and this may seem very Mort, but it has worked for us for 7 years now with very few issues, including four major schema changes.
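
The ordering described in this comment could be sketched roughly as follows.  This is not the batch file itself: the `.sql` extension, the `db_create.sql` file name, and the `ordered_scripts` function are assumptions for illustration, and the folder sequence follows the order the comment describes running them in (seed before index).  The sketch only computes the order; actually executing each script against the server is left out.

```python
from pathlib import Path

# Fixed folder order, taken from the run order described in the comment.
FOLDER_ORDER = [
    "table",
    "pk_constraint",
    "fk_constraint",
    "seed",
    "index",
    "trigger",
    "view",
    "sproc",
]


def ordered_scripts(root):
    """Return script paths in run order: the create script first, then each
    folder in FOLDER_ORDER, with scripts inside a folder sorted alphabetically."""
    root = Path(root)
    scripts = []
    create_script = root / "db_create.sql"  # assumed file name
    if create_script.exists():
        scripts.append(create_script)
    for folder in FOLDER_ORDER:
        directory = root / folder
        if directory.is_dir():
            scripts.extend(sorted(directory.rglob("*.sql")))
    return scripts
```

Keeping the folder order in one list means a new script only has to be dropped into the right folder; nobody has to edit a master file to register it.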

  2. Ross says:

    @Carlton

    Thanks for your suggestion. In the past I have done something similar to the “big script” idea where I maintained a bat file for running all of the scripts via the command line. This does work, but tends to fall down in multi-developer teams when colleagues forget to maintain the file, or do so incorrectly (i.e. wrong order). I don’t think it is a practical solution.

    It is also possible to do a “topological sort” of the tables in order to make sure they get created in the right order. I believe other objects such as views are late-bound so can be created without their dependencies.

    All of this seems far too clunky to me – at least if you want to roll it out to a multi-developer team.
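
For reference, the “topological sort” idea mentioned here can be sketched in a few lines with Python’s standard `graphlib` module (Python 3.9 or later).  The table names and the foreign key map are made up for the example; in practice the map would have to be derived from the scripts or from the database catalog.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables it references
# through foreign keys, so those tables must be created first.
FOREIGN_KEYS = {
    "customer": set(),
    "product": set(),
    "invoice": {"customer"},
    "invoice_item": {"invoice", "product"},
}


def creation_order(foreign_keys):
    """Return table names ordered so every table comes after the tables it references."""
    return list(TopologicalSorter(foreign_keys).static_order())


if __name__ == "__main__":
    for table in creation_order(FOREIGN_KEYS):
        print(table)
```

`TopologicalSorter` raises `graphlib.CycleError` when tables reference each other in a cycle, which is the case where foreign keys would need to be added in a separate step after all tables exist.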

  3. Carlton says:

    @Ross

    I have solved this issue by having EVERY piece of T-SQL needed to make the DB under source control – that includes referential integrity checks. Then I create a “big script” that knows the “right order” – established mostly through trial and error – and execute the “big script” using the osql.exe command line tool. It is time consuming and painful if you create the “big script” at the end of a project, but not too bad if you add a little at each step of the way.

  4. @Jimmy,
    I’ve seen that usage. Shops that don’t do any automated testing at all have used the term to mean testing a small portion of the code, but testing it manually. Automated testing has proven to be far superior to manual testing for this.

    @Ross,
    I understand the problem, and I have had to tackle the same issues. I’ve solved the problem by using a mix of source control and automation for database building/updating. I’ve found it useful to mark the database with the same version number as the code gets. As the code gets a new build number, the database schema should also.

    I’ll see about distilling that into a future post.

  5. Ross says:

    I agree with the five points regarding automated testing with the database. The trouble is I’ve never found a practical way of implementing this.

    If you wish to keep all of your database objects under source control, then it seems to me that the best way of doing this is to have one script for each object, perhaps putting them in folders such as Tables, Views, Functions, etc.

    That’s all fine and dandy, but when doing a get latest how the heck does one know which order to execute the scripts in? With constraints such as foreign keys this order is crucial.

    I appreciate this is a bit OT but if you could share some of your experiences in this area – perhaps in a future post – I’d be interested.

  6. Jimmy Bogard says:

    I tried to define test types for my team a while back with a post:

    http://grabbagoft.blogspot.com/2007/06/classifying-tests.html

    “Unit testing” was originally defined on my team as “manually going through the web application to verify one feature under the happiest of happy paths”.

  7. Wayne M says:

    Hi Jeffery, my “don’t get hung up” comment was intended to be general advice to the world and I hope no one interpreted it as being directed at you. I’ve gotta watch how I say things when I am philosophizing!

    I agree with you on the confusion with the term “unit tests”. As I recall there was a recommendation several years back to replace unit tests with the term programmer tests to cover everything that is needed from the programmer’s viewpoint. I guess I have had too many run-ins with strict definitionalists who feel unit testing is only individual class level tests and no further tests should be considered or required.

    Anyway, I just wanted to clarify my opinion that TDD needs to use a broad range of automated tests and emphasize I was largely in agreement with what you said and targeting the world at large with my comments.

  8. joeyDotNet says:

    Yeah, I can understand that. I’m just always puzzled when the practice of TDD, as it was intended, is seen by many as “pure” or “elitist”. And why TAD is somehow accepted as the “practical” application of TDD?

    Oh well, guess that keeps us busy in the consulting world. :P

  9. @Joey,
    I don’t think it would benefit the community to retire TDD. We’d then be starting from scratch with a different term. We should use the term properly and assert its meaning.

    I’m reminded of a court case where the defendant called the judge a “pimp” and then backpedaled trying to convince the judge that it was a compliment and not an insult. The term already has a meaning, so confusion ensues when folks try to alter the meaning.

  10. joeyDotNet says:

    Sure, thanks for the clarification. Acceptance testing seems so beneficial, I’m surprised it’s not used more on projects.

    Oh, and while we’re at it, I don’t know how many times I’ve seen regular “unit testing” (the TAD variety) mistakenly get put under the umbrella of TDD… Once again I’m having to clarify this at my new company. The TDD term seems to be getting just as vague as “service” or SOA (which seems to have 38,000 different meanings).

    I much prefer at the very least Test-FIRST Development or even better, Behavior-Driven Development. Is it possible for us to retire the TDD term once and for all? Or should we just keep clarifying what the English word “driven” means…

    Sorry, this kinda got a little off-topic, but I think it’s a discussion that needs to take place.

  11. @Joey,
    I’m glad you brought that up. I didn’t mention those, did I? I was only thinking about developer tests at the time.

    Automated acceptance tests could be at any level, actually. I think they are their own category. You wouldn’t call an acceptance test a “unit test” even if it was used to validate behavior handled by only one class. Acceptance tests need not be full system tests. They can be at any level and should verify behavior from the point of view of the product owner.

    What normally comes to mind is FIT, which I’ve used for projects at two different companies. These are table based, don’t look much like code, but can exercise the entire system or just a small part of it if necessary. The UI can be involved, but it doesn’t need to be for other cases. Either way, thank you for bringing it up.

  12. @Wayne,
    Thanks for the comment. Your point is correct. Different types of tests are required, and each has trade-offs.

    I don’t believe I’m “hung up” on the definitions, but I’ve found that many folks only use the term “unit tests”, and what they mean are “automated tests”. I’d like to build a richer vocabulary around testing, and terms that have various meanings are essential to a rich vocabulary. We can’t just have one term.

    @Ralf,
    Michael Feathers is a genius. I have read his book cover-to-cover. If a test doesn’t meet the criteria, it’s not a unit test.

  13. joeyDotNet says:

    So would you classify automated “acceptance” tests under “full system tests” or “integration tests” or both? Even though acceptance tests probably do fall into one or both of those categories, in my mind, acceptance tests convey a different meaning.

    So, perhaps they need their own “classification” in your list of automated testing types? Your thoughts?


    (Even though I haven’t yet been fortunate enough to fully utilize acceptance testing on a project, I can definitely see the benefits of using them and can’t wait until I get the opportunity.)

  14. Wayne M says:

    The point of the tests is to define what correct operation means. To do effective test driven design, tests at many different levels are required and there are trade-offs in doing various levels of isolation or integration. Don’t get hung up on precise definitions of “unit” test and “integration” test; write the most effective tests to get the job done.

  15. I totally agree!
    I always get the creeps when I talk to other devs who are claiming to do unit tests and using the db because “it’s there anyway”.

    Michael Feathers has a good list of things a unit test must not do in “Working Effectively with Legacy Code”
    “A test is not a unit test if:
    * It talks to a database.
    * It communicates across a network.
    * It touches the file system.
    * You have to do special things to your environment (such as editing configuration files) to run it.”
    That puts it nicely.

    Maybe having your own database on your dev machine backfires when it comes to unit testing :-) . Nevertheless I really can’t understand that there are still shops that insist on having one or two central development dbs that are managed by someone in another department.
