A fairly common topic with TDD practitioners, both newbie and experienced, is how the heck to unit test business logic with a database hanging around. I’ve had several conversations lately about this so I thought I’d write a post about it. Before I try to talk about some strategies for this, here are a few good reasons to avoid data access calls inside unit tests for the business logic:
- Tests with database calls will execute significantly slower than tests that stay within an AppDomain. Fast feedback cycles are an absolute necessity for effective TDD. Slow tests directly drag down a development team’s velocity. When I was first out of college I was doing large engineering calculations on an old 486. Each calculation run could take 5 minutes plus. When we moved up to P5-100’s our productivity jumped significantly. Slow unit tests are like working with 486’s. I do miss napping in my cube though.
- Tests with database calls are more labor intensive to write. You have to create SQL statements to put the database into a known state before you run your tests. Checking the test results probably involves scraping data back out of the database and coercing it into .Net types just to do the assertions. If you can stay inside the world of strongly typed .Net or Java code then “Intellisense” and code completion can speed you along.
- Damned if you do, damned if you don’t.
  - Embedding a bunch of SQL statements into the test fixture classes can make the tests harder to read and understand (trust me on this one).
  - Putting the database setup in another location can make the unit tests harder to understand and troubleshoot because the test data is external to the test fixture class. “ALT-TAB Hell.”
- The intent of the unit tests can be unclear. Verifying a successful unit test by checking a status column in the database isn’t always the height of clarity.
You might not agree with the above, but if you do, the rest of the post is about the various ways I’ve used or observed to isolate the testing of the business logic from persistence infrastructure.
Mock the Database, but Where?
One of the best ways to get the database out of your way is to hide data access behind abstracted interfaces that can be mocked in business logic testing. All you’re doing here is treating data access as yet another service that is invoked from your application. Simply take all of your data access and put it into some sort of Gateway pattern class. Use some sort of Dependency Injection with the business class to substitute the data access implementation with a mock. In the case below, you would create a mock object for IDataAccessGateway.
using System.Data;

public class BusinessClass
{
    private IDataAccessGateway _gateway;

    // The gateway is handed in, so a test can pass a mock instead
    public BusinessClass(IDataAccessGateway gateway)
    {
        _gateway = gateway;
    }

    public void PerformSomeSortOfAction(DataSet dataSet)
    {
        // Manipulate the DataSet in some way
        _gateway.SaveSomething(dataSet);
    }
}

public interface IDataAccessGateway
{
    void SaveSomething(DataSet dataSet);
}
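To make the mocking idea concrete, here is a minimal sketch that uses a hand-rolled mock rather than a mocking library. RecordingGateway, BusinessClassExample, and the "Processed" flag are hypothetical names I made up for illustration. BusinessClass and IDataAccessGateway are repeated from the listing above only so the sketch compiles on its own.

```csharp
using System;
using System.Data;

// Repeated from the listing above so the sketch compiles on its own.
public interface IDataAccessGateway
{
    void SaveSomething(DataSet dataSet);
}

public class BusinessClass
{
    private readonly IDataAccessGateway _gateway;

    public BusinessClass(IDataAccessGateway gateway)
    {
        _gateway = gateway;
    }

    public void PerformSomeSortOfAction(DataSet dataSet)
    {
        // Hypothetical manipulation standing in for real business rules.
        dataSet.ExtendedProperties["Processed"] = true;
        _gateway.SaveSomething(dataSet);
    }
}

// Hypothetical hand-rolled mock: records calls instead of touching a database.
public class RecordingGateway : IDataAccessGateway
{
    public int SaveCount { get; private set; }
    public DataSet LastSaved { get; private set; }

    public void SaveSomething(DataSet dataSet)
    {
        SaveCount++;
        LastSaved = dataSet;
    }
}

public static class BusinessClassExample
{
    // Exercises BusinessClass against the mock and hands the mock back
    // so a test can inspect what was "saved".
    public static RecordingGateway Run()
    {
        var gateway = new RecordingGateway();
        var business = new BusinessClass(gateway);
        business.PerformSomeSortOfAction(new DataSet("Orders"));
        return gateway;
    }
}
```

A test fixture would then assert that SaveCount is 1 and that the DataSet handed to the gateway carries the expected changes, all without opening a connection.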
Do not mock happy, fun ADO.Net. By necessity ADO.Net is a low-level API. As a general rule I would advise anyone to avoid mocking a low-level API under almost any circumstances. Mocking even a simple ADO.Net call would involve several steps and objects for getting a connection, creating a command object, attaching said command to connection, creating a bunch of parameters, etc. Just don’t go there. I noticed a junior-junior pair having some trouble with a coding task last year. When I finally looked over their shoulder I discovered that they were trying to write a unit test by using NMock to create dynamic mocks for the IDb* interfaces. The unit test code was about 9 parts NMock.Expect() calls and 1 part performing the actual test. They changed testing strategies and their work started to move again.
This same stricture applies to any version of Microsoft’s Data Access Application Block (it’s all static methods anyway) or Enterprise Library. These tools are still just thin veneers over ADO.Net and suffer from the same sort of mocking overhead that raw ADO.Net does. Our internal analogue to EntLib has a dedicated static mock mechanism for unit testing low-level data access code. We’ve barely used it because it’s still not that convenient.
To put this bluntly in a rule of thumb, a business or even service layer class should never have any reference to any of the System.Data.* namespaces. I would allow an obvious exception for DataSets, with the caveat that DataSets aren’t the best choice for business entities and not particularly suitable for Data Transfer Objects either (go ahead and argue, I’ll just sic Bellware on you). I’d be really uncomfortable about referencing an IDataReader in business or service code too. That smells awfully wrong to me.
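As a rough illustration of that rule of thumb, here is a sketch of a business class that only ever sees plain objects. Every name in it (Customer, ICustomerGateway, CreditReport, InMemoryCustomerGateway) is hypothetical. The assumption is that the real gateway implementation would do the ADO.Net work and hand back plain objects, so nothing on the business side needs a using for System.Data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity: a plain class with no System.Data dependency.
public class Customer
{
    public string Name { get; }
    public decimal Balance { get; }

    public Customer(string name, decimal balance)
    {
        Name = name;
        Balance = balance;
    }
}

// Hypothetical gateway: the ADO.Net plumbing would live behind this interface.
public interface ICustomerGateway
{
    IList<Customer> FindAll();
}

// Business logic that never sees a DataSet or an IDataReader.
public class CreditReport
{
    private readonly ICustomerGateway _gateway;

    public CreditReport(ICustomerGateway gateway)
    {
        _gateway = gateway;
    }

    // Sums only the positive balances, ignoring customers who are in credit.
    public decimal TotalOutstanding()
    {
        decimal total = 0m;
        foreach (Customer customer in _gateway.FindAll())
        {
            if (customer.Balance > 0m)
            {
                total += customer.Balance;
            }
        }
        return total;
    }
}

// In-memory stand-in for the gateway, intended only for tests.
public class InMemoryCustomerGateway : ICustomerGateway
{
    private readonly List<Customer> _customers = new List<Customer>();

    public void Add(Customer customer)
    {
        _customers.Add(customer);
    }

    public IList<Customer> FindAll()
    {
        return _customers;
    }
}
```

Setting up test data becomes a couple of Add() calls in the test fixture itself, which also sidesteps the “ALT-TAB Hell” problem from earlier.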
Invert the Control
The section above talks about mocking data access when your business logic follows a Transaction Script pattern. If you’re starting from scratch on a system that is business logic intensive you’re probably better off organizing your business logic as a Domain Model anyway. In this case your domain classes (business logic layer) are completely independent of any kind of persistence mechanism. For persistence I like to use the Data Mapper pattern to load and persist the business objects (this is what tools like NHibernate do behind the covers). The mapper classes are aware of the business domain classes instead of the business classes calling the data access classes (Inversion of Control). A Service Layer class of some sort would be responsible for calling both the mapper and the domain objects. Here’s a sample of what I mean:
public class OrderServiceClass
{
    public void SendPendingOrder(SendOrderMessage sendOrderMessage)
    {
        OrderMapper mapper = new OrderMapper();

        // Find the correct Order object and call Send(Destination)
        Order order = mapper.FindOrder(sendOrderMessage.OrderId);
        order.Send(sendOrderMessage.Destination);

        // Persist the new Order state
        mapper.Save(order);
    }
}
The reason this is advantageous for unit testing is that you can test the business logic behind sending an order without any database interaction. The business logic can be verified by checking the state of the Order object and its children before and after the call to the Send(Destination) method.
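Here is a small sketch of what that state-based check might look like. The post never shows the inside of Order, so the OrderStatus enum, the Status and Destination properties, and the string destination are all assumptions made for illustration.

```csharp
using System;

// Hypothetical status values; the real Order may track state differently.
public enum OrderStatus
{
    Pending,
    Sent
}

// Hypothetical domain class with no persistence code at all.
public class Order
{
    public OrderStatus Status { get; private set; } = OrderStatus.Pending;
    public string Destination { get; private set; }

    public void Send(string destination)
    {
        // Assumed business rule: only a pending order can be sent.
        if (Status != OrderStatus.Pending)
        {
            throw new InvalidOperationException("Only pending orders can be sent.");
        }

        Destination = destination;
        Status = OrderStatus.Sent;
    }
}

public static class OrderStateCheck
{
    // Checks the Order's state before and after Send(), with no mapper
    // and no database in sight.
    public static bool SendMovesPendingOrderToSent()
    {
        var order = new Order();
        bool pendingBefore = order.Status == OrderStatus.Pending;

        order.Send("Austin");

        return pendingBefore
            && order.Status == OrderStatus.Sent
            && order.Destination == "Austin";
    }
}
```

In a real fixture each of those comparisons would be its own assertion. The point is only that the whole test runs in memory.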
I Do Not Like the Active Record Approach
Another way to handle persistence of business domain classes is the Active Record pattern. In this case each domain class is responsible for its own persistence. Each domain class will have a signature like this:
public class Order
{
    public void Save(){}
    public void Delete(){}
    public void Insert(){}
    public void Load(long id){}
}
This came up recently at the Austin Agile lunches because one of our members was evaluating Rocky Lhotka’s CSLA.Net framework for one of his projects. I’ll admit that I’m very biased against CSLA because the single worst codebase I’ve ever seen was a VB6 monster that used CSLA. I could be wrong, but I still would not recommend something like CSLA.Net for TDD projects because I think the Active Record style of domain objects binds the business logic too tightly to the database. I think that this style of data access makes unit testing without the database harder.
Several of the O/R tools for .Net are really Active Record patterns (usually with code generation). Testability has to be a major concern when you choose a persistence strategy. I definitely prefer a Domain Model approach with external mapping, but I’ve spoken with people who swear by Active Record classes with internal mapping.
Point of View – the Database is the Application vs. the Database is Merely the Persistence Mechanism
How you personally answer this question largely determines how you lay out software systems. At one extreme, database-centric folks obsess over relational models and treat the middle tier code as merely a conduit to get data in and out of the database. This point of view seems to be much more prevalent in the Microsoft world and definitely among older developers and requirement analysts (flame away but you know it’s true). I’ve often seen requirement specifications from business analysts that amounted to “get data from this table and go insert it over here in a different way.” This kind of thing leads to some nasty design and architecture smells like gross duplication of code, zero encapsulation, scalability issues, and nightly batch jobs that run for 30 hours at a time locking transactional tables left and right (true story). In one instance it also led to a PM making a waterfall schedule that devoted 75% of the development man-hours to logical and physical data modeling on a system that probably had only 4-5 database entities but a complex user interface and a bunch of integrations to legacy systems (thar be dragons).
The other issue with a database-centric development philosophy is that automated unit tests are less effective. Toss in largish stored procedures and you’ve got a mess. Procedural code can be more difficult to unit test than well-factored Object Oriented code because it’s more difficult to isolate pieces of functionality. Embedding this procedural logic within a database just compounds the problem. Yes, there are xUnit toolsets for database access now but they’ll never be as easy to use as NUnit or JUnit.
No Business Logic in Stored Procedures!
I make exceptions for cases where set-based logic can be done more easily with declarative SQL code than with procedural middle tier code. Otherwise I think business logic in stored procedures is evil.
Posted Wed, Oct 12 2005 1:57 PM by Jeremy D. Miller