Just some little fundamental things to help you CodeBetter

I’m a big believer in learning the fundamentals of coding and design. To me, things like design patterns, design principles, and code smells are more important to learn than specific technologies. To paraphrase a conversation I had yesterday, I think it’s vastly more important for a newbie developer to understand OO principles before memorizing the inner workings of ADO.Net. It can seem like a bunch of academic hot air, but over the last couple of weeks my group has come across some cases where knowledge (or ignorance) of these little fundamental things has had a real impact on our success, so I thought I’d share them. Here you’ll find one principle, one code smell, and a dash of ancient wisdom that we occasionally need to relearn.


I think I’d be a ringer at a software design version of Trivial Pursuit, but rattling off the names for things like the Liskov Substitution Principle isn’t as important as internalizing that principle to the point that it’s just something you do.


 


Liskov Substitution Principle


As interpreted by me, the Liskov Substitution Principle states that if a method or class has a dependency on an abstract type it should be able to accept and work with any implementation of that abstraction without changing the original code.  The official wording is:



Let q(x) be a property provable about objects x of type T. Then q(y) should be true for objects y of type S where S is a subtype of T.


That doesn’t do that much for me, so how about “downcasting from an abstraction is a code smell.” 


To explain this principle let’s look at a violation from my current project that’s just begging for a fix.  We use a third party grid control (not Telerik) in an unbound mode for maximum flexibility of display and rendering.  Before the grid is displayed, we create an “ICell” object for each cell in the grid that models the value and formatting of each cell.  When the grid starts to draw any cell for the first time it raises an event requesting information for the row and column coordinate.  One of the arguments to the event handler is an object that contains all of the attributes of the cell display like the following:


 


    // The event argument the grid hands us; our handler fills these in
    // to control how the cell is drawn
    public class CellFormatArgs : EventArgs
    {
        public string Contents;
        public Color BackgroundColor;
        public Color FontColor;
    }

Our event handler simply has to find the correct ICell for the requested row and column, then set the properties on the CellFormatArgs object appropriately.  Fine and dandy, but the code that does that work looks like this:


 


    // The ICell implementations are just data holders in
    // this first version
    public interface ICell
    {
    }

    public class CellDouble : ICell
    {
        public double Value;
    }

    public class CellRate : CellDouble
    {
    }

    public class CellPrice : CellDouble
    {
    }

    public void RenderCell(int rowIndex, int columnIndex, CellFormatArgs args)
    {
        ICell cell = fetchCell(rowIndex, columnIndex);
        if (cell is CellPrice)
        {
            // render the price cell
        }
        else if (cell is CellRate)
        {
            // render the rate cell
        }
        else if (cell is CellDouble)
        {
            // render the double cell
        }
    }

Pay particular attention to the RenderCell method. It can’t just accept any old ICell and go to work; it has to know exactly which type of ICell it has and act accordingly. If you need to add a new type of ICell, you’ll also have to change RenderCell(), violating the Open/Closed Principle. When we modify our code to comply with the Liskov Substitution Principle, we get this:


 


    public interface ICell
    {
        void Format(CellFormatArgs args);
    }
 
    public class CellDouble : ICell
    {
        public double Value;
 
        public void Format(CellFormatArgs args)
        {
            args.Contents = Value.ToString();
            if (Value < 0)
            {
                args.FontColor = Color.Red;
            }
        }
    }
 
    public void RenderCell(int rowIndex, int columnIndex, CellFormatArgs args)
    {
        ICell cell = fetchCell(rowIndex, columnIndex);
        cell.Format(args);
    }
 

We’ve made the ICell implementations responsible for knowing how to render and format themselves. All RenderCell() has to do is find the right ICell and call its Format() method. RenderCell() no longer needs any knowledge whatsoever of what exactly each ICell is. We can now add new ICell types without touching the RenderCell() method, putting us back into conformance with the Open/Closed Principle. I’d also argue that RenderCell() is easier to understand, and that the formatting rules for each ICell type are easier to understand because the logic lives in the context of each ICell instead of sitting in the middle of one long method containing all of the formatting rules. The formatting rules also got easier to test by pulling the population of CellFormatArgs away from the third party grid’s event handler. We’ve loosened the coupling between RenderCell() and the ICell implementations, and we’ve improved the cohesion of our classes by putting the formatting responsibilities into the ICell implementations. Now all of our rules and logic for formatting rates are completely contained inside the CellRate class instead of being scattered over the code.
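To make the Open/Closed payoff concrete, here’s a minimal sketch of a brand new cell type plugging in. CellPercent is a hypothetical type I’m making up purely for illustration, but notice that it formats itself and RenderCell() never has to change:

    // Hypothetical new cell type, for illustration only. It implements ICell
    // directly, formats itself, and RenderCell() never needs to know it exists.
    public class CellPercent : ICell
    {
        public double Value;

        public void Format(CellFormatArgs args)
        {
            args.Contents = Value.ToString("P2"); // e.g. 0.125 -> "12.50 %"
            args.FontColor = Value < 0 ? Color.Red : Color.Black;
        }
    }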


In general, I’d say that any downcasting from an abstract type to a subclass or specific implementation is a code smell. There may be a perfectly justifiable reason to downcast in your code, but when you see it creeping in you should assume that it’s “guilty until proven innocent.” As a general rule, you want the interfaces and contracts of your classes and services to be taken completely at face value. Everything you need to use the class or service should be evident or visible through the contract. If you need to take into account the internal workings behind the interface or contract, you’ve got a nasty form of tight coupling.


Why is there a fancy name for this? Isn’t this just common sense engineering and polymorphism? Well, yes, I guess it should be common sense, except that it’s commonly violated to negative effect. Besides, if you had a fundamental principle of software design named after you, wouldn’t you want people to fuss over it just a bit instead of blowing it off?


 


Shotgun Surgery


Shotgun Surgery is one of the Code Smells from Martin Fowler’s Refactoring book. Have you ever found yourself in a situation where changing one piece of code always forces changes to one or more other pieces of code? That’s Shotgun Surgery. What the smell tells you is that all of those pieces that have to change together should probably be unified into a single entity so that changes can be streamlined. It’s really just the DRY Principle at work: pull the duplication together so you can make changes with fewer mechanical steps.


My group hit something like this last week. We have an enumeration that identifies every possible type of financial number in the system. In another namespace altogether is a class that keeps a string description for each unique value of the enumeration for screen display. Every time we add a new value to the enumeration we have to remember to go over to this other class and add a matching entry to its string array, or bad things happen when the client runs. Our eventual fix is to create a small new class to represent the enumeration values that will also carry the description, so new values will be completely defined in one place.
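Here’s a rough sketch of the shape we have in mind. The names below are made up for illustration (our real enumeration is much bigger), but the idea is the classic typesafe enumeration pattern: the value and its display text live together, so adding a new value is a change in exactly one place.

    // Rough sketch (illustrative names only): each value carries its own
    // description, so there is no parallel string array to forget to update.
    public class FinancialNumberType
    {
        public static readonly FinancialNumberType Price = new FinancialNumberType("Price");
        public static readonly FinancialNumberType Rate = new FinancialNumberType("Rate");
        public static readonly FinancialNumberType Yield = new FinancialNumberType("Yield");

        public readonly string Description;

        private FinancialNumberType(string description)
        {
            Description = description;
        }
    }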


 


Optimize by Measurement


So you know how for your entire career you’ve heard that the best way to optimize performance is to measure the performance first, because the hotspots may not be where you think they are? And that you’ll be better off in terms of effort spent to deal with the hotspots before anything else? I lived that little exhortation the other day. I had a screen that was far too slow a couple of days ago and knew I was going to spend some time optimizing it. While I was stuck in a useless meeting I started jotting down ideas to trim some fat out of my calculation and aggregation code that would have required some significant changes. Once I got back to my desk I ran the JetBrains profiler on the screen just to get some benchmarks. Unsurprisingly to the experienced folks out there, the biggest performance problem was something completely unexpected: the entire set of calculations and expensive grid rendering was being performed twice inside the twisty event rippling of the screen. Thirty minutes later I had the double rendering solved and the screen was much more responsive, without so much as touching the calculation code.


Lesson learned.  Pay attention to the older folks.


 

About Jeremy Miller

Jeremy is the Chief Software Architect at Dovetail Software, the coolest ISV in Austin. Jeremy began his IT career writing "Shadow IT" applications to automate his engineering documentation, then wandered into software development because it looked like more fun. Jeremy is the author of the open source StructureMap tool for Dependency Injection with .Net, StoryTeller for supercharged acceptance testing in .Net, and one of the principal developers behind FubuMVC. Jeremy's thoughts on all things software can be found at The Shade Tree Developer at http://codebetter.com/jeremymiller.
  • si

    I’m curious about your last paragraph “Optimize by measurement”…why did you have to fire up the profiler? If something is being processed twice, then shouldn’t that get picked up in your log?

    Perhaps I’m just lazy, but I find that for most cases logging solves the first part of that paragraph: measurement. If you place a log entry at the start of a process, then compare the timestamp against the start of the next process, it’s easy to narrow down the culprit when required. At that point you may want to fire up the profiler. Or am I missing something?

    The other benefit to logging rather than profiling is that it can be more easily applied in a production environment as well as in development.

    Just my AU$0.02 cents, worth what you paid for them ;-)

  • http://gleichmann.blog.com Mario Gleichmann

    Well spoken, i fully agree with your mentioned priorities (knowledge of common design principles over specific technologies).

    You’ll find more of these basic design principles in Robert C. Martin’s (‘Uncle Bob’) fantastic book ‘Agile Software Development’ (Single Responsibility Principle, Open Closed Principle, Dependency Inversion Principle, …) that will guide you towards common design goals (loose coupling, high cohesion, …).
    (but i think you already know about that book ;o))

    Obeying the contract of a base abstraction is good, but as long as you can only specify a syntactic contract, LSP remains only a kind of advice a developer might follow or not! What we need is a contract specification not only on a syntactic level but on a semantic one that’s checked at runtime (forcing a base abstraction and all of its subtypes to obey that behaviour oriented contract). Preconditions, invariants and postconditions should specify the semantic behaviour of a ‘component’, so if a client wants to interact with that component the contract is clear for both sides (clients have to obey the preconditions, servers / subtypes have to obey the postconditions).
    That said, i think Design by Contract is the kind of ‘tool’ that’s missing for enforcing LSP on the right level: the semantic level.
    Even the fact that subclasses can only weaken preconditions (that clients have to fulfill) and strengthen postconditions (that the server has to fulfill) goes pretty much hand in hand with LSP!

    Greetings

    Mario

  • Greg M

    Responding to Max: Languages without algebraic datatypes _are_ an anti-pattern. Deal with it. :)

    OTOH, you only use ADs when you don’t expect to add more cases later on an ad hoc basis, so the main objection from the article vanishes. It still _looks_ ugly because without ADs there is nothing to make you realise that IMaybe is meant to be a transparent, closed (as in no additions) type.

  • http://blog.ashmind.com Andrey Shchekin

    I do not agree that the enum sample is good. Display names may change way more often than enumeration values, may require localization, and belong to display logic anyway.

    That said, I often put the name in the same class as the value. But it is not something I would recommend as a good practice. Resources are a much better place for such things.

  • http://www.rgoarchitects.com/nblog Arnon Rotem-Gal-Oz

    “Someday we’ve gotta have a talk about your stance on the Architect position though”
    Maybe this would help you understand my position on the subject
    http://www.rgoarchitects.com/nblog/2007/09/29/TheNETJavaArchitectTrainingProgram.aspx

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    Arnon,

    I generally read just about anything you put up on your blog or DDJ. Someday we’ve gotta have a talk about your stance on the Architect position though

  • http://www.rgoarchitects.com/nblog Arnon Rotem-Gal-Oz

    @ Jeremy
    If you liked it, then you might also want to take a look at a paper I wrote which aggregates this and a few other posts I made on a few other OO principles. There’s a paper http://www.rgoarchitects.com/files/ooprimer.pdf
    and a presentation http://www.rgoarchitects.com/Files/ooprimer.ppt

    Arnon

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    @Max:

    I like to use “Double Dispatch” in cases like that to handle the variance. I’ve been wanting to do a post on that forever that I guess I could get going.

    @Chad:

    Forget the DTO idea. Think in terms of a Presentation Model (like Acropolis). You may create a class that represents the screen behavior and data that completely wraps the inner domain. It is a lot of potential duplication and a wormhole effect between Presentation Model and Domain Model, but it completely decouples the Domain from the screens.

    In some cases I’d say that decoupling the domain from the view does more good than setting up the wormhole and shotgun surgery effect. For example, driving the domain model from behavior and business logic will often create a hierarchical structure where the dadgumn data binding crap needs a flattened structure. Going to a Presentation Model allows the domain model to be built in a way to make the behavior easier to express while the data binding goo gets an object that conforms to its view of the world.

  • http://community.hdri.net/blogs/chads_blog cmyers

    I’m having a hard time reconciling Shotgun Surgery and DRY with what J.P. Boodhoo said in his recent post about DTOs.

    Both seem like good design, but the problem with having a bunch of DTOs for every user interface situation is that it would require me to change a billion classes every time I, say, add a new field/property/column to the UI, Domain, or Database.

    It seems like there is no good way to avoid Shotgun Surgery when changing the domain/schema.

  • Brad Mead

    Thanks a bunch. I’ll return to this often in the near future as I am surely a perpetrator of the if… then violation :)

  • Max

    I’ve sometimes found it’s unavoidable to do this kind of instance testing when working with a class hierarchy that would be implemented in a functional programming language as an algebraic sum type. A trivial example:

    interface IMaybe
    { }

    class Some : IMaybe
    {
        public int Value { get { … } }
    }

    class None : IMaybe
    { }

    Where I’ve hit this in the real world is where I’ve been sending data holding message objects between disconnected applications: I have to switch on the concrete type of message to unpack and act on it.

    I’ve always felt a bit bad about this antipattern, but I haven’t come up with a better solution.. in particular it seemed inappropriate to move the actual doing-work methods onto the messages because e.g. they may be shared between many applications where the work to be done varies…

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    @Brendan,

    Sure, but in this case there’s more than just the description. I want to move it to a strongly typed enumeration instead ala Java.

  • Brendan Tompkins

    Jeremy, this is completely an aside, but be sure you check out the description attribute for enum values.. It works well for on-screen representations of enumerations.

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    @Kim,

    I’m just interpreting it a bit more narrowly than you are, or at least emphasizing the part about working only from the contract of MyBase. If each subclass truly honors the contract of MyBase, it should satisfy your interpretation.

    In the real world, I’d be doing an integration test of a new subclass of MyBase with the original MyFunc regardless of LSP ;-)

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    @Yex,

    You’re correct, it was my bad in the sample code. In that example CellDouble should be implementing ICell, I missed that wrinkle while coding the sample.

  • http://blogs.microsoft.co.il/blogs/kim Kim Major

    I have a question regarding your interpretation of LSP. It was my understanding that LSP in layman terms means: If the method MyFunc(MyBase B) is passed a derivative of MyBase e.g a MyDerivative and that instance causes MyFunc to malfunction, then MyDerivative does not follow the LSP principle. To go with your example. After making the RenderCell() work against the abstraction, if one of those derivatives causes RenderCell() to do the wrong thing then that derivative does not conform to LSP. I may be misreading your sample, but it looks to me more like “Replace conditional with polymorphism” than LSP. Am I missing something?

  • http://bob.yexley.net Yex

    One quick comment on something that I noticed. Your code sample for the substitution principle seems a little bit confusing and I believe it’s because you missed something in it. Please correct me if I’m wrong here.

    In the original code sample, in the RenderCell method, you have IF statements checking to see if cell is of type X…that will *never* return true in the case of the code sample you’ve provided, because I’m guessing that you intended for CellDouble to implement the ICell interface, correct? As it stands now, without it implementing that interface, all of those IF statements in the sample would return false, and no cell rendering would occur at all.

    Am I missing something?

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    @Arnon,

    Thanks for the link.

    @Everyone else,

    I like Arnon’s definition better than mine:

    “A sub class should Honor the contracts made by its parent classes”

    + the caller should work through the base class contract.

  • http://www.rgoarchitects.com/nblog Arnon Rotem-Gal-Oz

    LSP is an important principle and as you noted it is a prime reason why sub-classing is hard.
    You may want to read what I wrote about it in DDJ (http://www.ddj.com/blog/architectblog/archives/2006/08/liskov_substitu.html) The example I give there is how Microsoft violates this principle with ServicedComponents

    Arnon

  • http://codebetter.com/blogs/jeremy.miller Jeremy D. Miller

    I’ve got more little things coming soonish, but my son threw a cup of water on my keyboard and I haven’t gotten all the keys working again yet. Try coding without the “(” and “)” keys sometime for a challenge.