Idempotency vs Distributed Transactions

This post comes out of a discussion I had with Clemens Vasters at NDC. The context there was Azure specific, but the discussion is much bigger than that. Another example of where it comes into play is the REST crowd, who also want all operations to be idempotent.


Sure, distributed transactions are evil … until, well, they aren’t. There is a trade off between capital investment and *insert whatever reason you don’t like distributed transactions*. First let’s look at the alternative.


To briefly cover idempotency, from Gregor:


“In the world of distributed systems, idempotency translates to the fact that an operation can be invoked repeatedly without changing the result.”


This is a huge benefit because under failure conditions I can just send you the same message multiple times and it won’t affect you in a bad way. As an example, say I peek a queue and process the message, writing something. Then my power goes out before I have actually removed the item from the queue. When I start back up, I can just re-process the message (it won’t cause a destructive change).
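
As a rough sketch of that scenario: the in-memory queue, the `Crash` exception and the `store` dictionary below are all made up for illustration and do not stand for any particular queueing product. The point is only that a naturally idempotent write leaves the same state after the message is processed a second time.

```python
from collections import deque


class Crash(Exception):
    """Stands in for the power going out mid-processing (hypothetical)."""


def consume(queue, store, crash_before_ack=False):
    # Peek: read the head of the queue without removing it.
    message = queue[0]
    # Process: setting a key to a value is naturally idempotent.
    store[message["key"]] = message["value"]
    if crash_before_ack:
        raise Crash("power lost before the message was removed")
    # Only after processing is the message removed from the queue.
    queue.popleft()


queue = deque([{"key": "customer-42/email", "value": "a@example.com"}])
store = {}

try:
    consume(queue, store, crash_before_ack=True)
except Crash:
    pass  # the message is still sitting on the queue

state_after_first_attempt = dict(store)

# After the restart the same message is processed again.
consume(queue, store)
```

The second call rewrites the same value, so the store ends up exactly as it was after the first attempt, and only then does the message leave the queue.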


Many operations are idempotent to start with. Consider a read, or the setting of a property. Other operations are not idempotent and need code written to make them so. As an example, take charging someone’s credit card: what happens if you get that message twice? Ouch. The trick is in the operations that are not naturally idempotent, because they require us to write code to make them idempotent. That code has a cost associated with it in terms of creation, maintenance, testing, and conceptual overhead.
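
To give a feel for what that extra code looks like, here is a minimal sketch of one common approach: remember the id of every message already handled and return the earlier result on a repeat. The `PaymentProcessor` class, the `message_id` field and the in-memory dictionaries are assumptions for illustration only. A real system would also have to persist the "already processed" record in the same local transaction as the charge itself, which is part of the cost.

```python
class PaymentProcessor:
    """Hypothetical processor; amounts are integer cents."""

    def __init__(self):
        self.balances = {}   # card -> total charged so far
        self.processed = {}  # message id -> receipt from the first attempt

    def charge(self, message_id, card, amount):
        # Detection code: if we have seen this message, do not charge again.
        if message_id in self.processed:
            return self.processed[message_id]
        # The operation itself, which is NOT naturally idempotent.
        self.balances[card] = self.balances.get(card, 0) + amount
        receipt = {"message_id": message_id, "card": card, "amount": amount}
        # In a real system this record and the charge must commit together.
        self.processed[message_id] = receipt
        return receipt


processor = PaymentProcessor()
first = processor.charge("msg-1", "card-1", 1000)
second = processor.charge("msg-1", "card-1", 1000)  # the same message, redelivered
```

Notice that the detection code here is about the same size as the operation it protects, which is exactly the ratio the rest of this post is worried about.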


Clemens gave an example of a system he worked on to illustrate his point: a credit card processing system.


A credit card processing system has a very limited number of possible transaction types. For the sake of discussion let’s say there are 15. They are called very often and each has a large amount of logic associated with it (a similar example might be a trading system: not a lot of command types, but a lot of logic behind each). In these cases idempotency is not a big deal because the code added for it is very small as a percentage of the total code.


This is very different from a stereotypical business application, where there will be orders of magnitude more transaction types, many of them called rarely and with small amounts of logic behind them. For the sake of discussion let’s say closer to 1500 types, with the detection code being vastly larger as a percentage (I have seen business systems where there was more code to ensure idempotency than to perform the actual operation).


We can make everything idempotent, but the cost will be very different in the stereotypical business system than in the first system. Out of our 1500 operations, how many are not naturally idempotent? How much code will we have to write to make them idempotent? Quite likely a lot! Looked at as a percentage of the overall code, it will be much higher than in the first example, as there is much less code per transaction type on average.


Now what are we gaining from this overhead of making all operations idempotent? Scalability and availability are two things that come immediately to mind. It is very hard to scale distributed transactions to an extremely high level (think Azure scale), and many other trade offs start to come in at that level of scaling.
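
For comparison, here is a toy sketch of two-phase commit, one common form of distributed transaction. The `Participant` class and `two_phase_commit` function are invented for illustration and leave out everything that makes the real thing hard (durable logs, timeouts, coordinator failure). It does show the shape of the protocol: every participant has to vote before anyone commits, and each one sits in a prepared state holding its resources until the coordinator decides.

```python
class Participant:
    """Hypothetical resource manager, e.g. a queue or a database."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit if asked, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all voted yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


queue_tx, db_tx = Participant("queue"), Participant("database")
succeeded = two_phase_commit([queue_tx, db_tx])

failing_db = Participant("database", can_commit=False)
other_queue = Participant("queue")
failed = two_phase_commit([other_queue, failing_db])
```

With the queue removal and the database write committing or rolling back together, the handler needs no duplicate detection code. The price is the coordination between participants, which is the part that gets hard at very large scale.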


Are you scaling to Azure level? What is the ROI of the code you wrote to make things idempotent for the business system? This is not to say you should use distributed transactions haphazardly (you always want to limit their usage as much as possible), but to point out that there are certainly situations where they may be favorable. There is a trade off involved. Again, this is not to pick on Azure alone; REST also does this.


It is important to consider this trade off when you decide to, say, put an application onto Azure. How well do their non-functional requirements as a platform meet yours as an application? What decisions like this have they forced upon you, and what will the long-term cost of those decisions be?


11 Responses to Idempotency vs Distributed Transactions

  1. Greg says:

    Sure it can be a release issue but for many apps this is not that big of a deal.

    I am not saying DTC should be used for everything. I am saying there is a trade off between the work that has to go on to avoid it through, say, idempotency and the work of maintaining it in production. Very often the former will be less.

  2. DaRage says:

    OK, give me a specific scenario, since you want to see concrete code for it. Also, if you read my first post you’ll see that I said “unit testing or else” in order to avoid this straw man argument.

    I did not argue that everything is unit testable, but that, unlike DTC, idempotency lies within a system’s boundaries and thus is controlled at development time just like any other functionality provided by that system. DTC, on the other hand, cannot be addressed at development time and can be a production issue.

  3. Greg says:

    And seriously, why would you be unit testing someone’s DTC? You start a transaction and it either fails or succeeds. Do you test that transactions work in SQL?

  4. Greg says:

    Please put your unit test into a comment.

  5. DaRage says:

    Greg, testing idempotency is easy because it’s done within the context of one system, the system that provides the idempotent service, so it can be done with plain old unit testing.

    Testing DTCs, on the other hand, is super hard because it involves systems that you don’t necessarily control.

  6. Greg says:

    DaRage, I would love it if you could provide me a sample test testing idempotency … that would be a really neat trick. How does a unit test prove that there are no side effects on a second call? Do you have every boundary of your system hooked?

    It could reasonably be done with a database but what about any other interactions you may have?

    Testing that something did not happen is difficult in most systems.

  7. DaRage says:


    “Ayende the same can be said for idempotent…”

    No, they’re not. Idempotent operations that are not really idempotent are a development issue that can be controlled at development time using unit testing or else. DTCs, as Ayende said, can become a production issue that cannot be controlled at development time. Totally different.

  8. Sean Thomson says:

    I honestly don’t see how the two concepts correlate here. Distributed transactions are about ensuring a consistent state by being able to roll back previous steps in the event of a later one failing.

    You can argue that if step 1 works and step 2 fails, an idempotent architecture will let you repeatedly try step 2 until you get a successful result, but distributed transactions are about trying to solve a business problem rather than a messaging problem. If step 2 failed once, in all likelihood it will fail every time, and you will need to perform compensating logic to reverse step 1 in a non-transactional architecture.

    If I pay for a flight on my credit card in step 1 and in step 2 the flight is already full when it goes to be booked, nothing about an idempotent architecture will change the fact that I need to get a refund to compensate for step 1.

  9. Frank says:

    Ayende, what do you propose to use when not implementing idempotency and not using MSDTC?

  10. Greg says:

    Ayende, the same can be said for idempotent code. I have run into many people who thought their operations were idempotent when they really weren’t, and were possibly doing really bad things.

    I have also found many administrators etc. who are completely clueless about getting DTC set up and running well. That said, maybe it’s just a crap product… wouldn’t surprise me :)

    That said, I would prefer to not get technology specific as I was more talking about the concept of a distributed transaction (2pc or 3pc) vs idempotency …

  11. Ayende says:

    Greg,
    I have had several production issues with DTC, and I am not talking about a massively distributed application.
    I am talking about an app that was completely local to the machine, where DTC was used to coordinate queue & database work (both on the local machine!).

    DTC went into a funk a few times and refused to process any transactions; we required a restart to get things working.

    From some informal polls that I ran, just about everyone who has used DTC has run into problems with it.
