Use of Abstractions in Dynamic & Static Languages

The more I code in .NET, the more I notice that we like to layer our own abstractions on top of 3rd party abstractions. In general (and I don’t have enough experience to really say) I feel that dynamic languages don’t do this as often.

Examples of this concept in .Net (C#)

You go into a .NET code base and you see an interface that is a shallow wrapper over the third party’s interface. A well-known example is an IRepository abstraction over the NHibernate ISession. I have seen this done for SmtpServer, file system IO and many other things as well.

Examples of this concept in Python and Ruby

I am not sure if it’s because the projects I have observed are small or because I just haven’t seen the right projects, but I see Ruby projects taking (in my mind) a hard dependency on things like ActiveRecord, and Python projects doing the same thing with a hard dependency on Django’s ORM. Often they seem to pull in the lib they want to use and just get to work. To be clear, ORMs are probably a bad example, but in general I don’t hear about my friends who program in dynamic languages adopting this pattern.

Why do static language programmers (more specifically me) do this?

Now that we have discussed what it is we are talking about, what causes this behavior? Is it bad? Is it good?

Isolation

Often I do this to isolate a library so that it doesn’t have to be referenced in each project. Instead I can add the reference once, and then it’s just a project reference elsewhere. I think this is me doing something that made sense at one point in my career, but now I just don’t get it.

Fear

I also think that underneath the covers is a fear that the 3rd party lib will change (or that we will want to switch out 3rd party dlls), and we want to protect ourselves from it. I feel this is one of my sillier reasons. Because we get compile-time errors and test failures (and we are writing tests, right?), we should have absolutely no reason to fear a 3rd party API change.

It’s too bad that this fear exists in my .NET code, along with the thought that I can just magically change out the provider behind the interface and everything will still work. So much effort to give myself a warm fuzzy feeling.

Testing (in .net)

In .NET, since we have static typing, we almost always have to create a ‘shim’ interface to make testing simpler. We do this because we can’t just shove any object into any parameter, unlike our dynamic language friends, who already have duck typing (so they don’t need to build an abstraction on top of the other code to mock it out). This is by far my best reason for practicing this wrapping behavior, especially in areas that need to access external systems. But I still need to be careful not to overdo it.
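
To make the duck typing point concrete, here is a minimal Python sketch. Every name in it (WelcomeService, FakeMailer, the send method) is hypothetical and invented for illustration; none of it comes from a real mail library. The code under test never declares an interface, so the test double only has to respond to the same method.

```python
# Hypothetical example: none of these names come from a real library.

class WelcomeService:
    """Application code that depends on 'something with a send method'."""

    def __init__(self, mailer):
        # No interface declaration needed: any object with a compatible
        # send(to, subject, body) method will do.
        self.mailer = mailer

    def welcome(self, address):
        self.mailer.send(address, "Welcome!", "Thanks for signing up.")


class FakeMailer:
    """Test double that records messages instead of talking to SMTP."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject, body):
        self.sent.append((to, subject, body))


fake = FakeMailer()
WelcomeService(fake).welcome("dru@example.com")
print(fake.sent)
```

In C# the equivalent test would usually need an interface (or a mocking tool) before the fake could be passed in, which is the ‘shim’ described above.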

So, what can we take away from this?

I think the biggest thing that I plan to take away from this is a new level of awareness of my code. Most importantly I should NOT do this out of fear. If I am afraid something is going to break, then I should do the only thing that WILL protect me: write a test to pin down the behavior. No amount of clever abstraction is going to protect me as much as the time I spend writing the 2-3 tests that pin down the correct behavior.
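
As a rough sketch of what “pinning down the behavior” can look like, the tests below record how a library function behaves today, so an upgrade that changes it fails loudly. Python’s standard fnmatch module is used purely as a stand-in for “some 3rd party lib”; that choice, and the test names, are assumptions made for illustration.

```python
import fnmatch
import unittest


class PinDownPatternMatching(unittest.TestCase):
    """Characterization tests: record what the library does today."""

    def test_star_matches_names_without_a_dot(self):
        self.assertTrue(fnmatch.fnmatch("README", "*"))

    def test_star_dot_star_requires_a_dot(self):
        self.assertFalse(fnmatch.fnmatch("README", "*.*"))

    def test_extension_pattern_is_case_sensitive(self):
        # fnmatchcase never normalizes case, on any platform.
        self.assertTrue(fnmatch.fnmatchcase("notes.txt", "*.txt"))
        self.assertFalse(fnmatch.fnmatchcase("notes.TXT", "*.txt"))


def run_pin_down_tests():
    """Run the three tests and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(PinDownPatternMatching)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_pin_down_tests()
```

Three small tests like these cost far less than a wrapper layer, and they state exactly which behavior the code relies on.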

Another takeaway, for me, is the question: do dynamic languages have it easier / better? Inherently, I don’t think that they do, but as dynamic languages often tout, they sure do have a lot less ceremony to go through. That is an important thing: a low-ceremony code base is a happy code base.

Well, that’s my Sunday morning thought.

Please share your thoughts on the subject in the comments. I look forward to our conversations there.

:)

-d

About Dru Sellers

Sr. Software Engineer at Dovetail Software.
  • Anonymous

    I hear this a lot, and I believe it to be true. But I haven’t seen any examples. Would you be able to provide me with one? Super curious.

  • Steve Py

    I use abstractions to avoid needing to mock 3rd party dependencies, and also to simplify the use of those dependencies for consistency. Many libraries out there are designed with a lot of flexibility in mind: It’s how they can justify releasing new versions. Some people will benefit from those new features or alternative ways of approaching problems, however it can lead to a bit more guesswork and investment in “mastery” of another toolset to ensure you’re using it to its full benefit. A good example is I use a simple IOC wrapper for Autofac to support a lazy-load property-based model to handle the general register/resolve model. For smaller, non-ORM apps I’ll use simple repositories to translate data objects into business-aware containers.

    Amir’s comments are interesting. While I do opt for delegates (Actions) frequently, I still avoid the dynamic keyword like the plague. I got a good dose of poisoning with VB’s “variant” and really don’t want to tread back in that remote direction.

    I haven’t had much exposure to dynamic languages, but from what I’ve read I’d have to say that comparing static and dynamic languages is like comparing music with painting. Working with a static language is like music, where a single wrong note or slip in the tempo can really stand out and ruin a piece. Code must be practiced and refined to be performed to perfection. Following SOLID principles and patterns takes time and practice, and deviations from them can stick out like a sore thumb. Dynamic languages seem to be more like painting. A single brush stroke does not make or break the painting; each stroke tries to complement or improve on the last, and if it doesn’t, the artist is encouraged to apply strokes liberally, completely changing direction or intention at any time until they have something that feels good.

  • Ben Biddington

    I wrestled with the Dependency Inversion Principle and ruby for a while until I accepted that it doesn’t apply. It’s liberating not needing to introduce abstractions purely as testability seams for things that are unlikely to change.

    Many ruby projects _do_ seem to get bullied by their gemfile into a mess of third-party constants, without thought for domain abstraction. Is this just the open-source way?

  • Vadim Kazakov

    This is spot on.  I used to always abstract NH in my projects. Last project I decided to use NH directly and use a custom QueryOver stub. It worked really well and I didn’t really feel like I was writing untestable code as I had both unit tests and integration tests around all my uses of NH.

    Changing the store type is like you said wishful thinking. It will never happen without majorly refactoring all over the place.

  • Anonymous

    :)

  • Anonymous

    I too am a fan of the ports-and-adapters model, and have been using it successfully for a good while now. Isolating the domain is key, and the way NH allows you to map your entities without infecting them with NH goop is also quite nice (it does, in a subtle way, but for the most part it stays out of the way). 

    I would say that switching from NH -> document db or ATOM feed – would almost certainly change your entire Repository/Data Access patterns, which would require you to change the adapter to accommodate the new style of access. But I don’t think that is a bad thing, either.

    To be clear, I don’t think of the Rails projects as the pinnacle of good design, just wanted to record the insight for us all to discuss. :)

    -d

  • Chris Bristol

    I would say that the biggest reason to want to mock the database isn’t speed, but control.  Being able to get between the application and the database allows me to define indirect inputs and is very powerful for writing tests that are deterministic and precise.  Hiding the database behind a repository abstracts away a lot of the complexities of a database like foreign keys, constraints, identity columns, etc… and saves time trying to satisfy the database for a simple test.  The speed that comes along with being decoupled from the db is an awesome bonus though!

  • Robert McCall

    Personally, I’m both a .NET programmer and a lover of wrapper abstractions, for what I believe are a set of related reasons.

    The first reason is that I like to have a specific set of known-good use cases.  I have wrappers for System.IO.Directory.GetFiles(…) that give me particular subsets of the functionality that the single function gives: getting a single file, getting all files, and getting a set of patterned files.  This gives me the ability to test that specific subset of functionality and to encapsulate the knowledge needed to make the successful call (e.g. what are the pattern characters that the function will accept, how is ‘*’ different from ‘*.*’, etc.).  If I just use GetFiles all over the place then I either have to keep a separate index of ‘how-to-use’ docs or I have to hunt down another place where I called it and then re-grok how I used it (often falling into the same pitfalls on tricky corner cases).

    Secondly, it gives me a clear set of access points that I can instrument.  If I want to know how many GetFiles calls are being made, how fast they are or how many total files are being returned, this is the only way to be able to pull that information out (and it’s dead-simple quick, too).
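
    A rough Python sketch of such an instrumented access point follows. The FileLister class and its method names are hypothetical, and pathlib stands in for System.IO.Directory.GetFiles; it only shows how a single wrapper makes call counts, file counts and timings easy to collect.

```python
import tempfile
import time
from pathlib import Path


# Hypothetical wrapper; the names are made up for illustration.
class FileLister:
    """Single access point for directory listings, with simple counters."""

    def __init__(self):
        self.calls = 0
        self.files_returned = 0
        self.seconds = 0.0

    def _list(self, directory, pattern):
        # Every listing goes through here, so this is the one place
        # that needs instrumenting.
        start = time.perf_counter()
        found = sorted(p for p in Path(directory).glob(pattern) if p.is_file())
        self.seconds += time.perf_counter() - start
        self.calls += 1
        self.files_returned += len(found)
        return found

    def all_files(self, directory):
        return self._list(directory, "*")

    def files_matching(self, directory, pattern):
        return self._list(directory, pattern)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "README"):
        (Path(tmp) / name).write_text("x")
    lister = FileLister()
    everything = [p.name for p in lister.all_files(tmp)]
    text_only = [p.name for p in lister.files_matching(tmp, "*.txt")]

print(everything, text_only, lister.calls, lister.files_returned)
```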

    Thirdly, it gives me the ability to choose how I want each specific interface function to appear to the calling code.  For the past few years, I have been enamored of F# (for many, many reasons) and its notion of curried argument patterns gives me more options as to how to structure my own function definitions which then gives me more ways to wrap external functions (which as a rule do not allow currying).  I do this even for such mundane (and pervasive) functions as the String member functions to facilitate my favorite F# language feature: Pipelining.

    As an example, I define:

    let strLen (pStr:string) = pStr.Length

    which, to many, will appear pointless and useless.  However, given an array of tuples such as (int id, string name, string address), I now have the ability to write:

    let tup1of3 (p,_,_) = p
    let tup2of3 (_,p,_) = p

    let maxIdWd = peopleArray |> Array.map (tup1of3 >> string >> strLen) |> Array.max
    let maxNameWd = peopleArray |> Array.map (tup2of3 >> string >> strLen) |> Array.max

    which is much cleaner than

    let maxIdWd = peopleArray |> Array.map (tup1of3 >> string >> (fun s -> s.Length)) |> Array.max
    let maxNameWd = peopleArray |> Array.map (tup2of3 >> string >> (fun s -> s.Length)) |> Array.max

    especially because utilizing a string’s length is truly pervasive.

    Ultimately, I would generalize this reason as:  because the wrapper functions of my own creation can be meaningfully terse, they shrink the size of my overall codebase.  This is *huge* for me because those of us who are prodigious coders have a lot of code to *read*, and saving any time on reading our own code (because we are still the traversal algorithm, even if we have VS10 Ultimate) is a lot of time saved.  Being able to be completely consistent in our function naming and argument conventions and, thus, our calling conventions (with the exception of the innards of our wrapper functions) also goes a long way toward speeding code creation.

    Lastly, in the same vein as many others have mentioned, it gives me the ability to transparently choose how my function gets the job done.  Sometimes this is, as the others have mentioned, simply swapping out the back-end implementation, but for me, this is to unify subtly different back-end providers.

    My best example of this is that I am currently dealing with both DB/2 and Sql Server 2008 (“SS8”), with the hope of running directly against DB/2 for now while being able to seamlessly transition the database over to SS8 with minimal changes to the codebase.  By wrapping the ADO.NET db providers’ implementations of DB2Connection/SQLConnection and DB2Command/SQLCommand with identically-appearing classes, I have hidden the subtle differences between the two databases as well as their providers.  Now, transitioning a set of tables from DB/2 to SS8 is much more painless.  Here my aforementioned list of benefits also comes into play, as there is no more important place (in my experience) to have a single point of access than db access: having consistent error reporting and general logging of db statements is crucial to both unit testing and performance analysis.

    And, BTW, I developed my common db wrapper classes in F# and then, once tested, ported them to C#.  This was simply because developing code using F#’s REPL is so much faster than having the huge structure of a VS10 C# project to deal with.  The problem is that F# is not ready for production yet, mostly because nobody has any experience with it yet but also because most programmers are reliant on VS10’s designers, of which F# has zero so far.  That said, while C# will solve its REPL-less problem in .NET 4.5, it will never be as clean as F# and its native treatment of functions as first class values.

    Now, as to if/why such practices are different for Ruby/Python programmers, I can only hazard a guess that perhaps many of those folks simply haven’t had to deal with the pain of bugs in dependent library code that we semi-grey-beards have. Those experiences and having dealt with very large codebases have given me the understanding that such wrapper classes are a necessary part of every project I undertake.

    Robert W. McCall
    Lasting peace & happiness for *ALL* human beings.

  • Anonymous

    I guess that I am just not comfortable with the dynamic keyword in .NET.  But that is true. The problem I have with delegates (and I have used them a lot in MassTransit 2.0) is that you (or I guess I) don’t get the same level of R#’ability. That feature is critical to me when I have to explore a larger code base. That said, the delegate model has been of great value to my framework code in my MassTransit and Topshelf projects to make sure that I don’t infect my customers’ code. :)

  • Anonymous

    Palm to Face! Yes, I do this as well, and it has proved quite helpful. Thank you for reminding me of this one. I will have to update the post!

  • Anonymous

    “Don’t mock code you don’t own” hmm, going to have to chew on that one. Thanks!

  • Anonymous

    Hi James, 

    Let’s be careful about trying to understand why humans do anything. :) I would agree that’s one reason, but I have heard another one that is even more terrifying: “Because a blog post said so”. No thinking, just cargo cult. I would agree that thinking you can swap them out is a bit silly.

    While I agree that the current norm in Django and Rails land seems to be testing against the database, I am starting to hear the leaders in that community (Jim Weirich at CodeMash) preach against it. As those code bases continue to grow, their test runs will get slower and slower. Avoiding that slowdown has to be the largest benefit of mocking the database. If I look at even my fast-running tests, which number upwards of 3500, and imagine adding 100-200ms of latency to each test, that would kill my cycle time (an additional 8 minutes to the build!)
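
    For what it’s worth, the arithmetic behind that estimate can be checked with a few lines of Python. This assumes the tests run one after another with no parallelism; the function name is made up for the example. The 8 minute figure corresponds to roughly 137ms of added latency per test, which sits inside the 100-200ms range.

```python
def added_minutes(test_count, latency_ms):
    """Extra wall-clock minutes if every test gains latency_ms, run serially."""
    return test_count * latency_ms / 1000 / 60


low = added_minutes(3500, 100)   # 350 seconds
high = added_minutes(3500, 200)  # 700 seconds
print(f"{low:.1f} to {high:.1f} extra minutes")  # prints "5.8 to 11.7 extra minutes"
```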

    I agree that Java and .Net do indeed have many Architecture Astronauts that often fail to provide value, but this may be one area where the people are right.

    Also, and you probably don’t mean it this way (damn internet), but trying to claim that something is good because ‘It Just Works’ or ‘I am being pragmatic’ is a good way to get me to flip the “bozo bit” on you.

    Thank you again James for sharing your opinion on the matter. I look forward to more conversations!!

    -d

  • Anonymous

    Bitten me. Never! Nagged? Maybe. 😉

  • Anonymous

    Oh. I would much rather write the abstraction than go to the hurdles of Typemock type tools. But this is just my preference. :)

  • Anonymous

    I do think that dependencies in dyn lang. are not as hard. For one thing you don’t have to have the ‘using’ type statements, so your code really doesn’t know. 

    I am not sure that it’s easier to ‘fix up’ a project. But surely the looser typing is helpful.

  • http://twitter.com/christensena Alan Christensen

    Well put! I tend to follow the same approach as do the authors of the GOOS book http://www.growing-object-oriented-software.com/ “don’t mock code you don’t own”

  • Anonymous

    I think abstracting out an ORM leads to a lot of performance and optimization problems.

    If you’re using (let’s say) NHibernate you want to use everything that NHibernate gives you. Let’s say you want to use lazy loading. Once you abstract it out, you lose all the benefits you had over something like EF. You also risk creating other problems, such as enumerating things twice without knowing, etc.

    If you don’t abstract it, it’ll be much easier to actually *use* the ORM for everything it’s got.

    Changing to a document oriented db is wishful thinking, as you *use* it differently than you use a relational DB. Use map/reduce or other features that you get from Mongo or Raven (say, Raven’s query statistics, or suggest) and your classes *will* change, because your information is not stored in a relational manner, and that directly changes how you create objects.

    Aside, I’ve been hurt a few times when I tested against some in-memory collection for tests using LINQ, and then got burned when it was not supported by the DB’s LINQ provider.

    However, as always, imo *it depends*.  I feel abstracting out a logger would be much easier, if you want to. However, again, it might also become difficult if one logger offers deferred execution and other things… it can get messy.

    So basically yeah, I feel the same way as Dru 😉

  • http://eglasius.blogspot.com/ eglasius

    I couldn’t agree more with this.

    imho it improves all the code around it, and I take that as a large hint it is the right direction.

  • http://jamesmckay.net/ James McKay

    The main reason why people do this in .net is so that you can, in theory, swap out these components for a different provider. For example, you “might” want to be able to swap out NHibernate for Entity Framework. This is usually completely YAGNI, and makes all sorts of performance optimisations difficult.

    As far as testing is concerned, the norm in frameworks such as Django is to test your code directly against a “real” database. This approach tends to offend the purists, who object that it isn’t true unit testing, and it can be slower, and that your tests shouldn’t hit the database, but in practice, it makes your tests easier to write, and it tests your code against a more accurate representation of your database. (It’s all too easy to come up with mocks, fakes and stubs that don’t behave in the same way as your database, especially when factors such as referential constraints, triggers or limitations of your SQL Server data types have to be taken into account.)
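
    A small Python sketch of the kind of mismatch described above: an in-memory fake happily accepts data that the real schema would reject. SQLite (from the standard library) stands in for the real database here, and the store classes and table are hypothetical names invented for the example.

```python
import sqlite3


class FakeUserStore:
    """In-memory stand-in that forgets the schema's UNIQUE constraint."""

    def __init__(self):
        self.rows = []

    def add(self, email):
        self.rows.append(email)


class SqliteUserStore:
    """Same operation against a real (in-memory) SQLite database."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

    def add(self, email):
        self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


def accepts_duplicate(store):
    """Return True if the store lets the same email in twice."""
    store.add("a@example.com")
    try:
        store.add("a@example.com")
    except sqlite3.IntegrityError:
        return False
    return True


fake_result = accepts_duplicate(FakeUserStore())
real_result = accepts_duplicate(SqliteUserStore(sqlite3.connect(":memory:")))
print(fake_result, real_result)  # prints "True False"
```

    A test written against the fake would pass while the same code fails in production, which is the argument for running at least some tests against the real thing.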

    For what it’s worth, as far as I can see, the advice to mock your database in particular seems to be one of many so-called “best practices” that are popular in .net and Java that were drawn up by Architecture Astronauts to fit scenarios that are either highly unlikely to happen or else highly impractical to prepare for — swapping out one O/R mapper or database engine for another being a prime example. On the other hand, in Python and Ruby, the prevailing attitude is to do things that Just Work.

  • http://twitter.com/ICooper ICooper

    @Dru I am a big subscriber to the ports-and-adapters or hexagonal architecture. Well, in the case where the most valuable part of any software development project is the domain, isolating yourself from technology frameworks as adapters that work with your ports is a valuable part of ensuring the modifiability of the software. This is not an ivory tower concern. Consider how many sites would find transitioning from WebForms to MVC far easier, if they had abstracted themselves from that domain layer. Taking a dependency on NH is, I think, problematic. What happens if you later decide to use a document Db or an ATOM feed to source the data hidden by your repository? Again this is not an idle concern as it often comes up – changes to data frameworks have been legion in MS’s life.

    I would not necessarily consider Rails projects to be an example of good design. I think there can be a little bit too much “Rails does it this way” in the .NET community. Active Record seems to be an unwieldy beast, and I would be cautious about baking in the dependency on it. The support for migrations, the dynamic nature of building queries etc. all have power, but AR has for me become unwieldy and bloated.

  • Александр Сидоров

    Hasn’t Ayende Rahien bitten you? (especially regarding the repositories point) :)

  • http://twitter.com/amirrajan Amir Rajan

    A lot of these constraints go away if you are using C# 4.0. When you have a dependency that needs to be tested, change the type of the dependency to dynamic and then just create an anonymous type/fake that responds to the same methods.  Also, I don’t get why .NET devs forget about delegates… dependency inversion doesn’t have to happen at the interface/class level; you can inject a delegate and stub that out when it comes time to test.
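
    The delegate idea carries over to any language with first-class functions. Here is a minimal sketch in Python rather than C#, with hypothetical names (register_user, send_mail): the code under test receives the one function it needs, and the test passes a stub that records the calls.

```python
from typing import Callable, List, Tuple


def register_user(email: str, send_mail: Callable[[str, str], None]) -> str:
    """Business logic takes the one function it needs, not a whole interface."""
    normalized = email.strip().lower()
    send_mail(normalized, "Welcome!")
    return normalized


# In a test, a stub callable records the calls instead of sending mail.
calls: List[Tuple[str, str]] = []
registered = register_user(
    "  Amir@Example.com ",
    lambda to, subject: calls.append((to, subject)),
)
print(registered, calls)
```

    In C# the same shape would be an Action<string, string> parameter, with a lambda supplied by the test.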

  • http://twitter.com/davidadsit David Adsit

    I introduce abstractions like this as well, and for similar reasons, but I think one important idea that you missed is that these abstractions provide an opportunity to redefine the interface of the 3rd party tool in a way that is more in line with how you intend to use it in your domain/application.
    When I use an interface in this way, it makes my intention more revealing when I have to come back and maintain my code later. It is a pleasant side-effect that this has made it easier to replace a library with a new one or our own code when we have bumped into limitations of the libraries we have chosen.
    I don’t generally create an interface over objects like NHibernate’s Session. Rather, I create a facade over the whole Repository and then write integration tests to ensure that calls into my interface on the facade result in the expected behavior in the database. Then I can use that interface throughout my application with confidence.

  • Aaron

    I find this necessary for unit testing.  But I am starting to wonder if it’s worth the time.   It might be more pragmatic to use Typemock or JustMock 

  • http://openid.claimid.com/anbsmith Alastair Smith

    Dru, nice post.  I think the difference may come down to the languages themselves.  In statically-typed languages it is considered good design to build layers of abstraction over your dependencies to isolate your code from them.  At my place of work, we’re starting to feel the pain of  a hard dependency on an ORM that is ~10 years old now.  

    Perhaps in dynamic languages, dependencies are naturally somewhat more abstract anyway?  You can change stuff on the fly in dynamic languages, so perhaps the dependency is never as hard as it is in static languages.  Is it just easier to fix up a project in a dynamic language?

  • http://openid.claimid.com/anbsmith Alastair Smith

    > The best solution to every programming problem but one is “write another abstraction layer”. The one exception? “I have too many abstraction layers.”

    No, a Façade will solve that for you: an abstraction over your many abstraction layers 😉

  • http://blog.zoolutions.se Mikael Henriksson

    Like the way you put this, the main reason I love ruby so much is because I got away from building abstractions for everything. This is something I bring back with me to the .NET programming and that is truly for the better.

    The best solution to every programming problem but one is “write another abstraction layer”. The one exception? “I have too many abstraction layers.”