Why I do not use Stored Procedures

I promised myself that I wouldn’t ever make another post about stored procedures, but Eric’s post on sproc’s hit a few of my hot buttons on the subject.  Four years ago the pre-Agile, VB6/ASP coding me would have fervently agreed with Eric’s pro-sproc stance and I wrote PL/SQL by the bushel full, but today my answer to the sproc question is a firm “no thank you” or at least a “guilty until proven innocent.”  Besides, I’m an Oracle guy in a Sql Server shop and I despise T-SQL.


First, some common ground.  I think we can all agree that ad hoc SQL in code a la ASP circa 1998 is an abomination.  I don’t have that much trouble with using sproc’s for CRUD, but then again I think a domain model approach backed by O/R mapping is more efficient in terms of the all-important developer time, and the domain model approach leads to better code.  The newer parts of our applications can accommodate database changes by simply changing the NHibernate mappings.  No sproc’s necessary.


So to Eric’s post, first this:


So when I see people in the TDD crowd slamming stored procedures and views it greatly confuses me.

That’s an easy answer Eric, stored procedures make TDD a slower, less productive process.  Business logic in stored procedures is more work to test than the corresponding logic in a domain model class.  Referential integrity will often force you to set up a lot of other data just to be able to insert the data you need for a test (unless you’re working in a legacy database like ours without any foreign keys;)).  Stored procedures are inherently procedural in nature, making it harder to create isolated tests and leaving them prone to code duplication (a duplicated “where” clause is duplicated code, and duplication is bad).  Another consideration, and this matters a great deal in a sizable application, is that any automated test that hits the database is slower than a test that runs inside of the AppDomain.  Slow tests lead to longer feedback cycles.  Trust me on this one, a team will be much slower with a 20 minute CI build versus a 5 minute build time.
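To make the testing point concrete, here’s a minimal sketch (the Invoice class and its late-fee rule are invented for illustration, not from our codebase): the same business rule that would need referential-integrity setup and a database round trip to test as a sproc is testable in-process when it lives in a domain class.

```csharp
using System;

// Hypothetical domain class: the business rule lives in plain C#, not in a sproc.
public class Invoice
{
    public decimal Amount;
    public DateTime DueDate;

    // One percent of the invoice amount per day late, rounded to cents.
    public decimal LateFeeAsOf(DateTime today)
    {
        if (today <= DueDate) return 0m;
        int daysLate = (today - DueDate).Days;
        return Math.Round(Amount * 0.01m * daysLate, 2);
    }
}

public static class InvoiceTests
{
    // No connection string, no test data setup; runs in milliseconds inside the AppDomain.
    public static void LateFeeAccruesOnePercentPerDay()
    {
        Invoice invoice = new Invoice();
        invoice.Amount = 200m;
        invoice.DueDate = new DateTime(2006, 1, 1);

        decimal fee = invoice.LateFeeAsOf(new DateTime(2006, 1, 11));
        if (fee != 20m) throw new Exception("expected 20.00, got " + fee);
    }
}
```

The equivalent rule in T-SQL would need at least one round trip and whatever parent rows the schema demands before the test could even start.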


I’ll make an admission.  I bet I haven’t written a full 100 lines of direct ADO.Net manipulation code in the last six months, yet we’ve rolled out a multitude of new functionality in that time frame.  I also write very little non-trivial SQL these days.  Between a little OO query engine we’ve developed and NHibernate, there’s simply no need to spend time in the muck of data access.  Like Jeff Atwood said, stored procedures are the assembly language of the database.  I’ll use sproc’s when I have to have a performance gain, but until then they’re nothing but premature optimization.
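As a sketch of what “no sproc’s necessary” looks like in practice, here is the general shape of an NHibernate mapping file (the class, table, and column names are hypothetical): a schema change like a renamed column is absorbed in the mapping rather than in a stack of CRUD sprocs.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<!-- Hypothetical mapping for illustration only. -->
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.0">
  <class name="MyApp.Domain.Customer, MyApp.Domain" table="CUSTOMER">
    <id name="Id" column="CUSTOMER_ID" type="Int32">
      <generator class="native" />
    </id>
    <!-- If the DBA renames this column, only this mapping changes;
         the domain class and its tests are untouched. -->
    <property name="Name" column="CUSTOMER_NAME" type="String" length="100" />
  </class>
</hibernate-mapping>
```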


I think the gap in thinking about sproc’s is about how you see the role of the database in an application — do you talk about the database in passing as a persistence mechanism, or is the .Net code merely a way to get data back and forth between the UI and database?  The applications I build are primarily about business rules, not reporting.  The database to me is simply a persistence mechanism for our domain objects and messages en route.  Maintainability, which is almost synonymous with testability, is king.  A large, comprehensive body of automated tests at both the acceptance and unit test level is, in my opinion, the best way to accommodate changes to the code.


Again from Eric’s post, and this is where I got hot under the collar (it’s not directed at you Eric) –


Your DBA team can work completely independently of your programming team and do pretty much whatever they want and your application doesn’t care!

Expletive.  NO, NO, NO!!!!!  The DBA should most certainly NOT work independently of the programming team.  Stored procedures are code, and potentially destructive code at that.  If you change a stored procedure you *MUST* integrate and test the stored procedure against the application before it gets anywhere near production.  I’ve been burned by this one too many times.  The stored procedure code absolutely has to be built and versioned within your Continuous Integration build.  I’m sick of bugs caused by a sproc getting out of sync with the code.  The idea of a DBA, or a maintenance developer, putting a different version of a sproc directly into the database is a bad, bad, dangerous practice.


The extra configuration management burden is one of the reasons most Agile teams stay away from stored procedures.  If I’m relying on NHibernate mappings or parameterized SQL in the code, that SQL is going to be built and versioned with the C# code automatically.  My risk of having mismatched sproc and C# code is greatly minimized.  Our C# assemblies are signed with the CruiseControl.Net build number and it’s simple to spot mismatches.  The stored procedures are, err, well, uh, I don’t know if that’s the same version that the code was tested against to be quite honest :(
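For illustration, a sketch of the parameterized-SQL alternative (the table, columns, and class here are hypothetical): because the query text is compiled into the signed assembly, it cannot drift out of sync with the surrounding C# the way a separately-deployed sproc can.

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical finder class: the SQL ships inside the versioned, signed assembly,
// so there is no separately-deployed database artifact to get out of sync.
public static class CustomerFinder
{
    public static SqlCommand BuildFindByState(string state)
    {
        SqlCommand command = new SqlCommand(
            "select CUSTOMER_ID, CUSTOMER_NAME from CUSTOMER where STATE = @state");
        // Parameterized, so the query plan is reusable and the input is never
        // concatenated into the SQL text.
        command.Parameters.Add("@state", SqlDbType.VarChar, 2).Value = state;
        return command;
    }
}
```

When the CI build stamps the assembly version, the SQL is stamped with it; there is nothing extra to script, deploy, or diff against the database.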


The absolute worst thing you can do, and it’s horrifyingly common in the Microsoft development world, is to split related functionality between sproc’s and middle tier code.  Grrrrrrrr.  You just make the code brittle and you increase the intellectual overhead of understanding a system.


About Jeremy Miller

Jeremy is the Chief Software Architect at Dovetail Software, the coolest ISV in Austin. Jeremy began his IT career writing "Shadow IT" applications to automate his engineering documentation, then wandered into software development because it looked like more fun. Jeremy is the author of the open source StructureMap tool for Dependency Injection with .Net, StoryTeller for supercharged acceptance testing in .Net, and one of the principal developers behind FubuMVC. Jeremy's thoughts on all things software can be found at The Shade Tree Developer at http://codebetter.com/jeremymiller.
This entry was posted in Database and Persistence, Ranting.
  • http://mickyd.wordpress.com/ MickyD

    “The DBA should most certainly NOT work independently of the programming team.  Stored procedures are code, and potentially destructive code at that.  If you change a stored procedure you *MUST* integrate and test the stored procedure against the application before it gets anywhere near production”  – you make it sound as if the rest of your application is using just one other language compared to SPs; not only that, but as if the rest of your app is tightly coupled. More than likely the scripting language in your web site is not the same as the language in your service tier; are you just as picky with your web developers? No! So why pick on SPs.

    Using your logic, you would try to use C++, .NET or VB in your web front end, which is just silly.

    Use the best language for each part of your entire enterprise app.

  • DotNetGuy

    You make strong valid points regarding testing, but to have all of your application sitting within the app domain bloats your codebase to the point where it is borderline unmanageable. Also, if you are going to use this technique (i.e. a minimalist approach on the database) you had better pray to god that your developers are good, because if they are not very good (or think they are but are not) then you get spider web code and extreme inefficiency because they don’t think.
    For example: You have a method that goes to the database to get a product from a product catalog for an e-commerce store. Your “not so good” developers will usually 1) call a function to return which products should be viewed, 2) iterate through those product IDs and make separate calls to go get each product, 3) possibly grab metadata for each product returned within the iteration. I have seen this type of stuff done (not specifically on just e-commerce applications) many times. Of course this is not just done when using an ORM, but it happens much, much more because of them.
    In my opinion, Eric’s post regarding the DBA might have been a little quick to draw a line in the sand between developers and DBAs, but the idea of separating them (to some degree) I think is a good idea.
    Great post though.
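The anti-pattern DotNetGuy describes above is the classic “N+1 select” problem. A sketch (the catalog interface and names are invented for illustration) makes the extra round trips visible:

```csharp
using System.Collections.Generic;

// Hypothetical data-access interface: each method call is one database round trip.
public interface IProductCatalog
{
    IList<int> GetVisibleProductIds();   // one round trip
    string GetProductById(int id);       // one round trip per product
}

public static class NaiveCatalogReader
{
    // The "not so good" version: 1 query for the ids, then N more queries,
    // for N+1 round trips where one joined query (or an ORM fetch strategy) would do.
    public static IList<string> LoadVisibleProducts(IProductCatalog catalog)
    {
        IList<string> products = new List<string>();
        foreach (int id in catalog.GetVisibleProductIds())
            products.Add(catalog.GetProductById(id));   // N separate calls
        return products;
    }
}
```

Whether the N calls are sprocs or ORM fetches makes little difference; the round trips themselves are the cost, which is why this is a developer-skill problem rather than a sproc-vs-ORM problem.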

  • jmiller

    All, that’s enough for now. Comments closed, and let’s move on to something more interesting.

  • http://talideon.com/ Keith Gaughan

    There’s another argument against them that people have been omitting, that they, like the database schema itself, are a pain to keep under version control.

  • http://www.chadmyers.com Chad Myers

    Everyone: Please read Frans Bouma (of LLBLGen and flaming-every-blogs-comments-section fame) post on this subject:

    http://weblogs.asp.net/fbouma/archive/2003/11/18/38178.aspx

    He pretty much establishes that all the performance and security arguments for sprocs are bunk.

    Also, there’s a historical argument here: SPROCS were not *MEANT* to be your DAL. They were a performance trick in SQL 6.5 that grew into that role. Now that the main arguments have been yanked out from under them, can anyone seriously defend their use? The extra complexity, the code-in-many-places problem, the two-or-multiple language maintenance problem, the DBA’s-doing-application-development problem, etc, etc.

    It just seems silly to be using Sprocs today when you have things like NHibernate.

    Sprocs have their place and they are very useful *WHEN YOU NEED TO USE THEM FOR THEIR INTENDED PURPOSES*.

  • Jacob Eggleston

    @jace

    The MSDN help item you linked to is a leftover from the days of SQL Server 6.5. A dead giveaway is the reference to ODBC – when was the last time you used that? Every single point made on that page is either out of date or, as in the case of “Greater consistency/accuracy,” was at best questionable when that was originally written. The performance argument was a valid one on 6.5, but is not anymore. Starting with version 7 MS made huge changes to the way sprocs and queries were compiled and optimized. It has been proven by many people since then that, when used correctly, there is rarely any significant difference in performance between stored procs and the parameterized queries that OR mappers like NHibernate use. I’ve done my own tests and come to the same conclusions. Besides, even if there _IS_ a slight difference in performance, using sprocs for this reason is committing one of the most common (and most bone-headed) mistakes that developers can make: premature optimization. Few things get me more worked up than developers who will gladly waste thousands of dollars in development and maintenance of sprocs to milk that extra 6ms out of a round trip to the server that takes 1600ms before they’ve even done any testing to identify where the bottleneck really is!

    The security argument that sproc advocates trumpet is also bogus in all but a handful of edge cases. Anyone making this claim doesn’t fully understand SQL Server’s security model or application security in general – or maybe they’ve just never really tried to do it any other way. Any security hole that allows a user (malicious or not) to manipulate your data willy-nilly could be exploited just as easily with sprocs as without them.

    I’ve rarely heard an OR mapping proponent argue that sprocs don’t have their uses, only that you should use the right tool for every task (and in my book anyone arguing that you should never use sprocs is no smarter than one who argues you should always use them). Sometimes sprocs are the best tool for a task. Reporting is often one of those cases, while CRUD functionality rarely is.

    One interesting observation is that I frequently see developers convert from a pro-sprocs-for-everything position to pro-OR-mapping, but seldom, if ever, do I see the reverse. Why is that?

  • http://substantiality.net Sam Smoot

    Jace: I’d recommend you read SQL Books Online a little more thoroughly. Pay close attention to execute plan caching, and you might even want to give a little thought to dynamic sprocs, the WITH RECOMPILE keyword, and how branching in sprocs can actually negatively impact performance if you don’t know the ins and outs.

    You might also want to think about all the great sproc testing frameworks you’ve run across compared to all the great C#/Java/Ruby/etc. testing frameworks available.

    You might also try to imagine how many applications you’ve ever worked on that actually shared sprocs between them, hitting the same databases… were they in the majority or the minority?

    You might also look into your network packet size, and ponder how big a query has to be before it requires multiple data packets, and what impact that might have on performance, especially in a switched environment.

    Once you’ve thought about all this, and corrected some of your more glaring assumptions, you might understand how it can be a little frustrating debating this subject with people who are obviously just parroting “common knowledge”.

  • http://weblogs.asp.net Jeff Gonzalez

    @Jace

    “Higher performance-SQL Statements are parsed and compiled when procedures are created. This overhead is then saved when the procedures are executed.”

    This is incorrect. As far as I know SQL still operates under an interpreted model. You don’t “compile” code. I realize this is a common misconception. What actually happens is that when you write a SQL statement and execute it, the statement gets stored in a cache. This is typically referred to as the procedure cache.

    Whether you use a stored procedure or a parameterized query here, it doesn’t really matter. The point is that Sql Server holds onto this statement and some reference to its execution plan so that it takes less time to run on subsequent queries. Most, if not all, OR/M tools use parameterized queries. So as long as the statement that is being executed matches exactly each time, the performance for stored procedures and queries would be identical.

    “Reduced network overhead”
    I don’t know how true this really is. I mean, I can see how you would send less data by using a stored procedure name instead of batching the whole SQL, but in ADO.NET all parameterized queries (at least where SqlClient is concerned) actually get turned into sp_execute calls. I noticed you mentioned something about T-SQL in one statement and something else about ODBC in another. I am sticking to Sql Server for the scope of this discussion.

    I actually wrote a much larger post earlier in response to Jeremy’s original post and it seems to have gotten lost.

    I still think of Stored Procedures as an interface or contract. They are at their most basic nature a remote or distributed method call. If this interface changes, your client code changes.

    I am of the opinion that unit tests that hit the database are no longer unit tests, they are integration tests. You can test your code without the data (and you probably should using some sort of Mock framework). You can also test your stored procedures in isolation, I will concede the tool support isn’t as nice for this as we don’t really have a SqlUnit tool =)

    Last but not least, most of the DBAs I have seen or worked closely with have an intimate knowledge of object oriented programming. If you rely solely on an OR/Mapper for all of your sql needs, it becomes a bit more difficult for the DBA to analyze and tune the performance problems.

    This is why I feel the seperation (in some cases) is good.

  • http://dotnetjunkies.com/weblog/johnwood johnwood

    >> Because stored procedures are usually developed by experienced programmers… << That’s quite funny actually… how the heck do they know that? Stored procs are actually usually developed by inexperienced programmers – experienced DBAs perhaps, but not programmers. At least not that I’ve seen.

  • jace

    @ William Dieterich:

    Sorry, but your point number 1 is dead wrong, simply look into the documentation:

    “When executing statements, calling a stored procedure on the data source (instead of directly executing or preparing a statement in the client application) can provide:

    Higher performance
    SQL statements are parsed and compiled when procedures are created. This overhead is then saved when the procedures are executed.

    Reduced network overhead
    Executing a procedure instead of sending complex queries across the network can reduce network traffic. If an ODBC application uses the ODBC { CALL } syntax to execute a stored procedure, the ODBC driver makes additional optimizations that eliminate the need to convert parameter data.

    Greater consistency
    If an organization’s rules are implemented in a central resource, such as a stored procedure, they can be coded, tested, and debugged once. Individual programmers can then use the tested stored procedures instead of developing their own implementations.

    Greater accuracy
    Because stored procedures are usually developed by experienced programmers, they tend to be more efficient and have fewer errors than code developed multiple times by programmers of varying skill levels.

    Added functionality
    Extended stored procedures can use C and C++ features not available in Transact-SQL statements.”

    from:

    http://msdn2.microsoft.com/de-de/library/ms131440.aspx

    Pretty clear isn’t it?

  • william dieterich

    Speed wise…

    This is about CRUD-based sprocs; if you have to do a bunch of looping or multiple selects to get a single piece of data, then a sproc is the way to go.

    Dynamic code is as fast as or faster than sprocs for CRUD.

    1) People claim that sprocs are compiled and stored in the database differently than dynamic SQL; this is completely false but still found in a lot of places. The use of parameterized dynamic SQL removes any speed boost that sprocs may have from query reuse.

    2) And what most people ignore is that all those COALESCE statements and OR statements in sprocs are slow. As has been shown in multiple cases, and is easy to duplicate, even with a sproc with just 3-5 optional parameters you will get a 15% to 20% slowdown using COALESCE, or 30% to 40% using OR statements, compared to dynamic SQL.

  • David Kemp

    “The database to me is simply a persistence mechanism for our domain objects and messages.”

    That’s where I’m at too. I think that until someone has used a tool like NHibernate, it’s very difficult to see the database as simply somewhere you store data. Your code turns the data into something meaningful. Of course, I still use all the tools in the database to make sure that the data isn’t corrupt, but unless I’m automatically generating sprocs and code to call them, they seem like a bit of a hassle.

    Eric: If this is a real concern, and something that you can’t enforce otherwise, then you can always use process impersonation and restrict access to methods/classes using security policy.

  • http://weblogs.asp.net/jgonzalez Jeff Gonzalez

    Sorry, I forgot 1 more thing I wanted to say.

    I think that Eric is right when he says that calling a stored procedure is a contract or an interface. It is like consuming a third party library, which has an established API. A stored procedure has a stored execution plan and a very specific list of parameters it needs to execute that plan.

    How is that not a contract or interface?

    Any talk favoring stored procedures for performance or security at this point is FUD. To that I say QFT. Parameterized queries defend against injection just as well as a stored procedure does. While a sproc does allow you to operate without privileges on a table, views would allow you to accomplish the same thing in most cases.

  • http://weblogs.asp.net/jgonzalez Jeff Gonzalez

    I made a comment about DBAs and their separation.

    I think O\R Mappers are a great technology and provide developers with a huge productivity advantage. I am operating under my experience only here, and I am sure things are different for each and every company, but not all developers are SQL gods. In fact, some developers are horrible at SQL. Coding in an object language and understanding set-based logic are two totally different mindsets. Some developers do both very well, but I would say most do not.

    This gets me back to the original discussion. I don’t believe that anyone said you cannot test stored procedures. I would agree with you that a unit test shouldn’t hit the database. It crosses boundaries that to me signify integration testing, not unit testing. Unit testing should be done in isolation. Clearly having a unit test touch the database isn’t isolating the code. This is where mock frameworks come in handy.

    If you buy an O\R Mapper, and we can say for arguments sake that you purchased a relatively open orm tool that allows you to fully control the sql that is generated by the orm engine, most DBAs who tune for performance will not know or understand object code as intimately as developers. As I said before, it is a different mindset. DBAs think about data, developers think about objects.

    I will not debate the matter of storing business logic in the database, everyone has a different opinion on this and I believe that there is no hard and fast rule that can decide this.

    By encapsulating code in stored procedures, it allows DBAs to tune performance in the area of expertise they are comfortable in. I am pretty sure that JoeBlow developer could write code in an ORM that performed badly, especially if the ORM developer was very lenient in allowing Joe to design his own queries. He could still create inefficient joins or horrible where clauses that cause the sql server to waste cycles. The difference is that now you have a DBA trying to performance tune code that they probably will not be familiar with right off the bat.

  • karl

    Relational databases are too good at what they do to be treated as nothing more than a persistence mechanism. I’ve seen catastrophic failures as a result of this belief, and not a single success. Don’t get me wrong, I’m 100% sure that’s only because of my limited exposure, but it’s enough to tell me that there’s a learning curve and the stakes are awfully high.

    I can’t be the only one who likes to map relational data to objects manually (codesmith helps)? Sure it’s a lot more grunt work, but it just seems simpler.

  • http://substantiality.net Sam Smoot

    Funny… I would attribute defacing attacks to application flaws. I’m not sure I’ve ever seen the claim that sprocs would have prevented one before… but your experience is your experience.

    The billing argument is more than a bit silly. Sprocs are obviously not the only solution. A view does just as well, but we’re still not out of options yet. Put your cards in a different table (preferably behind a different service altogether, on a different system altogether if you _must_ keep them at all).

    Even if you decided to go with sprocs in this (oft-repeated as the last bastion of sproc supporters everywhere) scenario, nobody’s going to make fun of your application. It’s _one simple case_. Don’t get so tied up in the purity of anything that some edge case keeps you from delivering…

    WRT: “hijacks your .dll”: If someone is able to start executing arbitrary code in-process on your application server, I wouldn’t take much solace in having sprocs as a line of defense.

  • http://www.codebetter.com/blogs/eric.wise ewise

    I’ve witnessed insert/update attacks to put vulgarities into forum sites. I’ve seen them used to hack a record into an order system so that the hacker could get access to legitimate product keys.

    What about security? How about not using an ORM with access to a database table with billing information? Now anyone that hijacks your .dll can select * from your credit card table. I tend to keep tables with billing and other sensitive personal information on lock-down.

  • http://substantiality.net Sam Smoot

    What about security? Giving UPDATE/INSERT Access doesn’t mean you give the user the SA account with no password and call it a day.

    If you have a robust fault-tolerance solution (including backups), then what’s the danger really anyways? Or phrased another way, when was the last time you heard about a system intrusion whose goal was to INSERT/UPDATE a bunch of random records in a few tables. I’ve _never_ heard of it frankly. And as hard as I try, I’m not naive enough to believe the only security fault (or the most dangerous, or the simplest) in my application would be an Object Relational Mapper vetted by (at least) thousands of developers in tens of thousands of applications.

    The “sprocs are an interface” argument just doesn’t jibe with the real world IME, but to each his/her own I suppose. Mostly I see sproc proponents claim “sprocs are great, I can change my tables!”, but how often does it actually happen? During development sproc maintenance is just another layer to slow you down. During production operation, most developers would much rather not touch existing sprocs IME, preferring a little CYA and copy&paste action instead. Which only leads to an even bigger maintenance problem.

  • http://little.xmtp.net/blog/ Jay R. Wren

    What about security? With the stored proc approach you can handle security with different DB user accounts for different queries and allow your users to access the database with the various accounts through your application.

    With NHibernate you have to implement this entire security model yourself in your application. Maybe that is OK, but it does seem like a lot of work. How do you handle it?

  • http://www.developernotes.com Nick Parker

    Jeremy,

    Care to share more about your OO query engine and how it integrates with NHibernate?

    Nick Parker

  • Jaosn

    A very interesting post. I have been a highly vocal proponent of SPs (I too have a PL/SQL background) but as I embrace TDD I am starting to change my mind.

    Your observations about DBAs are all true, but I think the essence of that problem is improper utilization of that resource. If the DBA (or any other person) is not an integrated member of the team, then the chance of whatever they produce doing what it’s supposed to is slim (at best). In my experience most developers are better at SQL because they have more experience (we don’t spend all of our time worrying about backups etc.!) Where a DBA can be invaluable is when you have to do something new, and leveraging a feature of the RDBMS (I don’t really buy into database independence as a concept) can be a huge time saver or performance boon.

    I do have questions about performance and scalability (especially with SQL Server) but most applications do not need to be perfect in either respect. So next time perhaps I will try to use SPs only for the slow bits and see where this gets me.

  • http://weblogs.asp.net/jgalloway Jon Galloway

    I think you bring up a good point when you talk about how the database is used. If it’s a persistence medium, there’s not much point in writing a ton of stored procedures. If you’re doing complex queries, though, a tuned query can be significantly better than generated SQL. I’m not going to worry about a second here or there – I’ve seen cases (and I’m sure you have as well) where tuning a query took the response time from tens of minutes to a few seconds.

    It all depends on the application. The best approach seems to be a system that assumes simplicity (generated SQL), but allows you to use a query when you need it – inlined assembly, to continue with Jeff’s analogy.

  • Sachin Rao

    Jeremy

    One of the things I have heard in favour of stored procs (as compared to parameterized sql) is performance.

    Another point usually made is reporting (where you might have to use stored procs in spite of having an o/r mapper).

    What’s your take on that?

    Sachin

  • http://spaces.msn.com/dscheidt David Scheidt

    Hi Jeremy,
    I like the article / thoughts. Quick question, though… How does it scale? Are there any example projects that use this approach? I’d love to see a working web site or code to understand how this can be implemented.

    Thanks!
    - David Scheidt