Why Messaging #1 – Intro

So, why messaging?

For me it started when I was tasked with integrating several systems at my current employer. Our goal was to build a data warehousing solution that would take the stress of historical analysis off of our transactional systems. With that goal in mind I set about asking several of my friends how I should go about doing this. The overwhelming response at the time was to read Enterprise Integration Patterns (EIP), and it was there that I discovered messaging as a solution to my problems.

So, again, why messaging? Well, I had some goals: integrate the bank systems, do it in a way that would not turn the overall system into what Arnon calls a Knot (actually so does EIP, pg 52), and keep it easy to use. Ophhhhhda! It is my hope that this series of posts will help you understand the use of messaging and whether it’s a fit for you or not.

To begin with, let’s summarize the four basic types of integration according to EIP: file transfer, shared databases, remote procedure invocation, and messaging.

First we have file transfers, the world’s workhorse of integration. We write some data out to a file and then put it someplace the other application can get to. Everything can read files and everything can write them, which makes them simple and effective. However, as your environment gets more complex you can start to run into issues: enforcing data standards, deciding when to produce the files, archiving them, and knowing which applications need them. Thankfully, .Net at least makes it easy to read/write XML (well, sort of) and we have the wonderful FileHelpers project for other formats.
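To make the file drop concrete, here is a minimal sketch in Python (not the .Net/FileHelpers approach mentioned above). The file name accounts.csv, the column names, and both function names are made up for illustration. It shows the one thing both applications have to agree on, the column layout, and writes to a temporary name first so the reader never picks up a half-written file.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column layout; both applications must agree on it out of band.
FIELDS = ["account_id", "balance"]


def export_accounts(rows, drop_dir):
    """Producer: write the file under a temporary name, then rename it
    so the consumer never sees a partially written file."""
    final_path = Path(drop_dir) / "accounts.csv"
    tmp_path = final_path.with_suffix(".tmp")
    with tmp_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    tmp_path.replace(final_path)  # swap the finished file into place
    return final_path


def import_accounts(path):
    """Consumer: read the file back, converting balances to numbers."""
    with Path(path).open(newline="") as f:
        return [
            {"account_id": row["account_id"], "balance": float(row["balance"])}
            for row in csv.DictReader(f)
        ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as drop_dir:
        path = export_accounts([{"account_id": "A1", "balance": 10.5}], drop_dir)
        print(import_accounts(path))
```

Notice that nothing here answers when the file should be produced, how long it is kept, or who else is reading it. Those are exactly the questions that get harder as the environment grows.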

Second, we have the good ol’ shared database. It’s faster than file transfers, and it has a schema around the data that helps keep the data consistent. The updates are instant versus the batch orientation of file transfers, which is a plus as well. Another boon is that, like file transfers, shared databases are pretty simple, and your application is probably already using SQL anyway. Of course we start to get issues as applications abuse the schema (multiple applications want to use the schema in different ways), or start to put load on the database itself (maybe on a contentious table). But I think the biggest thing is that we have broken encapsulation at a pretty low level. Application A now knows about the internal state of Application B. I think my feelings about this, though, are best summarized by Jeremy Miller:

I mean, it’s like drug addicts sharing needles. It’s the most insidious form of coupling you can possibly come up with

 

Third, we have remote procedure invocation. This includes things like REST, WS-*, .Net Remoting, WCF and more. This method allows us to share behavior as well as state, something we don’t have with files and databases; it better encapsulates the data and, for most programmers, is easy to grok. The key issue for me, though, is that the network isn’t reliable. It also opens us up to Arnon’s Knot mentioned earlier. Think of a system where you have a web service calling a web service calling a … and you will get the picture. So how can we share some state and some behavior without getting all tangled up?

Enter messaging. Its data format is really as simple as the file transfer (in fact EIP refers to it as being like micro file transfers), and it has the behavior aspect of RPI, but you gain the offline, async nature of messaging systems. Both systems don’t have to be up at the same time (as with file transfers or a shared database), there are ways to intercept and transform messages that you don’t get with RPI (but you can with file transfers), and most importantly it starts to make you think about non-optimal scenarios. So this is great and all, but what am I sacrificing? You lose the call stack, debugging across multiple services / machines is a pain (welcome back to the world of logging), and well, it warps your brain…
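Here is a rough sketch of that "both systems don’t have to be up" idea, in Python. Everything in it is an assumption for illustration: the in-process queue.Queue stands in for a real, durable message broker, and the message type AccountOpened, the send and receive_all functions, and the handler are all made up. The point is only that the sender returns immediately and the receiver picks up whatever has accumulated whenever it gets around to running.

```python
import json
import queue


def send(channel, message_type, body):
    """Sender: serialize a small, self-describing message and drop it on
    the channel. Returns right away, whether or not anyone is listening."""
    channel.put(json.dumps({"type": message_type, "body": body}))


def receive_all(channel, handlers):
    """Receiver: drain whatever has accumulated and dispatch by message type."""
    results = []
    while True:
        try:
            raw = channel.get_nowait()
        except queue.Empty:
            return results
        message = json.loads(raw)
        handler = handlers.get(message["type"])
        if handler is None:
            # Unknown message type: skipped here; a real system would
            # more likely log it or park it somewhere for inspection.
            continue
        results.append(handler(message["body"]))


if __name__ == "__main__":
    # In-process stand-in for a broker queue (assumption: a real setup
    # would use a durable broker so messages survive restarts).
    channel = queue.Queue()

    # The receiver is "down": nothing is reading, yet sending still works.
    send(channel, "AccountOpened", {"account_id": "A1"})
    send(channel, "AccountOpened", {"account_id": "B2"})

    # Later, the receiver comes up and processes the backlog.
    handlers = {"AccountOpened": lambda body: "loaded " + body["account_id"]}
    print(receive_all(channel, handlers))
```

The sacrifice shows up here too: the sender never learns what the handler returned, and there is no call stack tying the two sides together.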

This concludes my very short, very contrived overview of the integration options.

Next time, we will discuss what a message is.

-d

About Dru Sellers

Sr. Software Engineer at Dovetail Software.
  • Ahmed

    ...and I think the issue of data concurrency and locking comes in when it comes to shared databases too.
    Nice Article!

  • Andrew

    Dru,

    Ahh, I see. I’m interested to see your future articles on the topic, although I will say I definitely disagree with your stance on replication. 😉

    Andrew

  • drusellers

    Hello Andrew and Victor,

    Sorry for the delay in my response.

    @Andrew: Sorry the blog wasn’t very clear, but I was tasked with BOTH integration and reporting. :) If it was just reporting, then replication would be a very good idea. I do have a basic issue with replication in that I think it violates encapsulation, but then everything is a trade off. :)

    Hope that clears things up, I will try to in the next post as well. :)

    -d

  • Andrew

    Victor,

    I don’t really understand your question.

    There’s probably a lot more to the situation than what is listed in this post, but the problem of reporting slowing down a transactional database is a common one. In all but rare situations this problem can easily be solved by pointing reports to a second database.

    I’m not saying that messaging is not a valid solution to the problem, but I am asking why because I just don’t see it. Because changing a few connection strings (which should be all that would be needed) is definitely a lot easier and less time consuming than building a custom Messaging system.

  • Victor Kornov

    @Andrew
    To quote you: “simply move the reporting to a copy of your production database solves that problem”. Isn’t that integration with shared DB, even if just a replica of DB?

  • Andrew

    @Victor?

    How so? In the initial post Dru mentions that certain “historical analysis” (I can only assume reporting) is hitting the transaction database. Now, maybe your definition of disparate is different than mine, but wouldn’t a simple, cheap and efficient solution just be to point those systems (i.e. reports) at a copy of the production database?

    DBAs get to do what they are paid to do, none of your applications need to change beyond a connection string and you even now have a failover database as a bonus. Win/Win/Win.

  • Victor Kornov

    @Andrew,
    the problem here is integration of several disparate systems.

  • Andrew

    Granted, I don’t know very much about what you are trying to do, but isn’t messaging (or any other development) just extra work at this point since SQL or Oracle already have built in replication?

    If your main issue is that your transactional database is getting hammered by something like reporting, simply move the reporting to a copy of your production database solves that problem. For example, the bank I worked for used Oracle, and we had Dataguard write to two separate locations (a failover site and a reporting database), so fixing the issue was as simple as changing a connection string to point at the reporting database. Problem solved. Any sort of infrastructure on top of that seems unnecessary.

    I’m interested in reading further articles as to why you decided that messaging was the right solution for the problem.

  • Sean Gough

    Looking forward to this series! I’ve recently reached the same conclusion as you (to use messaging) and I’m now at the “Okay, what now?” stage.

  • Miko Meltzer

    You can use REST/SOAP integration as a service like onlinemq.com or Amazon SQS