OK I’ve decided to jump into this discussion of using stored procedures for data access and manipulation.
First I will start out with some background. I have been involved in building multi tier applications for nearly 10 years. In this time we have adopted a rule for our projects:
Stored Procedures are the ONLY mechanism for accessing data.
In this post I will discuss some of the points made by Jermey Miller in his post:
http://codebetter.com/blogs/jeremy.miller/archive/2006/05/25/145450.aspx
One thing I will compliment Jeremy on is his disclosure of the types of applications he builds. Jeremy discusses using the database as mechanism for object persistence. Jermemy states his apps are “are primarily about business rules, not reporting”. Fair enough.
Generally speaking the applications I work on deal with large amounts of data and an equally large number of users. The largest application I work on has over a Terabyte of live OLTP data and upwards of 1,000 concurrent users. This application started with 70GB of data in 1999 and had grown to that size in 7 years. This application was created with 100% stored procedures.
In the interest of disclosure: I am not a TDD expert, nor do we implement a TDD style of development here. However, we do use concepts that have been adopted by the TDD world. Further down this blog post I will discuss TDD concepts so keep this in mind.
So here are some of our realities:
Stored Procedures are a Requirement for Batch Processing
In systems of any size there are always batch processes: Credit card applications need to be scored, letter files need to be created for export to mail houses, customer statements need to be created. How much of this data will you transfer to a middle tier server? If your data is of any significant size the answer should be none or very little. These types of processes need to be run on the server. Transferring data back and forth to a middle tier is not a wise idea on a number of levels the primary one being network performance. These types of operations are best suited to running in a stored procedure.
Stored Procedures as API
Another benefit of stored procedures is that they are an API. Stored procedures separate implementation from interface. Stored procedures are a contract between the developer layer and the database layer.
The physical storage mechanism for data may differ greatly from the presented view of the data. DBA’s must be able to physically change the underlying data without the permission of the developers. There job is to conform to the contract they provided the developer. How they implement that contract is up to them.
Also… stored procedures can be testing in an automated way. You can incorporate stored procedure calls into your unit testing framework.
Stored Procedures and TDD
One of Jeremy’s criticisms about stored procedures is that they: “make TDD a slower, less productive process.” He goes on to state that referential integrity (or other database constraints) requires you to set up a lot of other data in order to run your tests. From his words he never says it’s not possible, just more difficult. It seems to me that having ordered sets of tests or branched sets of tests might solve this issue.
DBA’s Do Work Separately from Developers
In a lot of companies this is the reality: DBA’s work separately from developers. A lot of organizations separate operational and developmental (R&D) duties. DBA’s responsibility is to keep a database operating within acceptable performance parameters. The function of DBA is not just making sure there’s enough disk space and that the data is backed up.
DBA’s do lots of query analysis and tuning. One thing that may not be obvious in a database is that a query that runs fine today might not run the same way tomorrow. Why not ? Well as data is added to a database the statistics surrounding that data may change. This means queries that ran fine last night might not run the same way today. It is the job of the DBA to fix these problems ASAP when they are found.
So if the queries are all being fired from a client, what is a DBA to do? Are they going to ask a developer to make a change to production code and slip stream a version to 1000 desktops during production? Yah right! What they will do in a production environment is: change the offending stored procedure, test it and put it into production on demand. This is a much less intrusive mechanism for making a change during production. Is this recommended? Not necessarily but it is a reality in a production world.
Another reality is that in a lot of cases DBA’s own their databases. Developers are required to live in the world of the DBAs and they dictate just how data is access from there database and who can access that data. In a lot of cases DBAs will provide data to developers in the form of stored procedures, no questions asked. If this is the case what do you do with your TDD process? As Clint Eastwood said in Heartbreak Ridge: “You adapt, you overcome”. Just because it’s hard doesn’t mean it can be done. TDD developers need to adapt there processes to this reality.
Without saying: Let me know whatcha think!
Rodman
Technorati Tags: SQL Server, SQL, TDD, Stored Procedures