CodeBetter.Com

Matthew Podwysocki

Life of a Functional Programmer
  • Your API Fails, Who is at Fault?

    I decided to stay on the Design by Contract side for just a little bit.  Recently, Raymond Chen posted "If you pass invalid parameters, then all bets are off" in which he goes into parameter validation and basic defensive programming.  Many of the conversations had on the blog take me back to my C++ and early Java days of checking for null pointers, buffer lengths, etc.  This brings me back to some recent conversations I've had about how to make what I expect explicit.  Typical defensive behavior looks something like this:

    public static void Foreach<T>(this IEnumerable<T> items, Action<T> action)
    {
        if (action == null)
            throw new ArgumentNullException("action");

        foreach (var item in items)
            action(item);
    }

    After all, how many times have you had no idea what the preconditions are for a given method because of non-intuitive method naming?  It gets worse when the authors don't provide much documentation, XML comments or otherwise.  At that point, it's time to break out .NET Reflector and dig deep.  Believe me, I've done it quite a bit lately.

    The Erlang Way

    The Erlang crowd takes an interesting approach to the issue that I've really been intrigued by.  Joe Armstrong calls this approach "Let it crash" in which you only code to the sunny day scenario, and if the call to it does not conform to the spec, just let it crash.  You can read more about that on the Erlang mailing list here.

    Some paragraphs stuck out in my mind.

    Check inputs where they are "untrusted"
        - at a human interface
        - a foreign language program

    What this basically states is that the only time you should do such checks is at the boundaries where you may receive untrusted input, guarding against things such as out-of-bounds values and unexpected nulls.  He goes on to say about letting it crash:

    specifications always say what to do if everything works - but never what to do if the input conditions are not met - the usual answer is something sensible - but what you're the programmer - In C etc. you have to write *something* if you detect an error - in Erlang it's easy - don't even bother to write code that checks for errors - "just let it crash".

    So, what Joe advocates is not checking at all, and if they don't conform to the spec, just let it crash, no need for null checks, etc.  But, how would you recover from such a thing?  Joe goes on to say:

    Then write a *independent* process that observes the crashes (a linked process) - the independent process should try to correct the error, if it can't correct the error it should crash (same principle) - each monitor should try a simpler error recovery strategy - until finally the error is fixed (this is the principle behind the error recovery tree behaviour).

    It's an interesting approach, and it proves to be a valuable one for parallel processing systems.  As I dig further into more functional programming languages, I'm finding such constructs useful.
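
    The observer idea can be roughly approximated in C#.  This is only a minimal sketch and not Erlang's process model: Supervisor, RunWithRestarts, the recover callback and maxRestarts are hypothetical names made up for illustration.  The worker contains no defensive checks at all, and the recovery logic lives in a separate place.

```csharp
using System;

// Hypothetical sketch: the worker is written for the sunny day only and
// simply throws when its input violates the spec.
public static class Supervisor
{
    // Runs the worker. When it crashes, applies the recovery step and tries
    // again. Once the restart budget is used up, the supervisor itself
    // crashes by rethrowing, so a supervisor further up could take over.
    public static T RunWithRestarts<T>(Func<T> worker, Action<Exception> recover, int maxRestarts)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return worker();
            }
            catch (Exception ex)
            {
                if (attempt >= maxRestarts)
                    throw;
                recover(ex);
            }
        }
    }
}

public static class SupervisorDemo
{
    public static int Run()
    {
        string input = null;               // violates the worker's spec
        return Supervisor.RunWithRestarts(
            () => input.Length,            // no null check: crashes on null
            ex => input = "fallback",      // simplest recovery: use a default
            1);
    }
}
```

    Calling SupervisorDemo.Run() crashes once on the null input, recovers, and then succeeds on the second attempt.  With a restart budget of zero, the exception would propagate to whoever called the supervisor.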

    Design by Contract Again and DDD

    Defensive programming is a key part of Design by Contract.  But, in a way it differs.  With defensive programming, the callee is responsible for determining whether the parameters are valid and if not, throws an exception or otherwise handles it.   DbC with the help of the language helps the caller better understand how to cope with the exception if it can.

    Bertrand Meyer wrote a bit about this in the Eiffel documentation here.  But, let's go back to basics. DbC asserts that the contracts (what we expect, what we guarantee, what we maintain) are such a crucial piece of the software, that it's part of the design process.  What that means is that we should write these contract assertions FIRST. 

    What do these contract assertions contain?  They normally contain the following:
    • Acceptable/Unacceptable input values and the related meaning
    • Return values and their meaning
    • Exception conditions and why
    • Preconditions (may be weakened by subclasses)
    • Postconditions (may be strengthened by subclasses)
    • Invariants (may be strengthened by subclasses)
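
    Without language support, the three kinds of assertions in the list above can be approximated with hand-written checks.  The following is a minimal sketch, assuming a hypothetical BoundedStack class that is not from any library; Spec# would express the same ideas as requires, ensures and invariant clauses instead.

```csharp
using System;

// Hypothetical example: a fixed-capacity stack with hand-checked contracts.
public class BoundedStack
{
    private readonly int[] items;
    private int count;

    public BoundedStack(int capacity)
    {
        // Precondition: the caller must supply a non-negative capacity.
        if (capacity < 0)
            throw new ArgumentOutOfRangeException("capacity");
        items = new int[capacity];
        CheckInvariant();
    }

    public int Count { get { return count; } }

    public void Push(int value)
    {
        // Precondition: there must be room left.
        if (count == items.Length)
            throw new InvalidOperationException("Stack is full.");

        int oldCount = count;
        items[count++] = value;

        // Postcondition: exactly one more item than before.
        if (count != oldCount + 1)
            throw new InvalidOperationException("Postcondition violated.");
        CheckInvariant();
    }

    // Invariant: 0 <= count <= capacity, checked after every public mutation.
    private void CheckInvariant()
    {
        if (count < 0 || count > items.Length)
            throw new InvalidOperationException("Invariant violated.");
    }
}
```

    Note that the invariant refers only to private state, which is exactly the kind of thing a spec written against the public interface has trouble reaching.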

    So, in effect, I'm still doing TDD/BDD, but an important part of this is identifying my preconditions, postconditions and invariants.  These ideas mesh pretty well with my understanding of BDD and we should be testing those behaviors in our specs.  Some people were afraid, after reading my previous posts, that I was de-emphasizing TDD/BDD, and that couldn't be further from the truth.  I'm just using another tool in the toolkit to express my intent for my classes, methods, etc.  I'll explain further in a bit down below.

    Also, my heavy use of Domain Driven Design patterns helps as well.  I mentioned those previously when I talked about Side Effects being Code Smells.  I combine intention revealing interfaces, which express to the caller what I am intending to do, with assertions not only in the code but in the documentation as well.  This usually includes using the <exception> XML tag in my code comments.  Something like this is usually pretty effective:

    /// <exception cref="T:System.ArgumentNullException"><paramref name="action"/> is null.</exception>
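
    Attached to the extension method from the start of this post, the comment might look like the sketch below.  The class name EnumerableExtensions, the <summary> text and the extra null check on items are additions that were not in the original snippet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical host class for the extension method shown earlier.
public static class EnumerableExtensions
{
    /// <summary>Invokes <paramref name="action"/> once for each element of <paramref name="items"/>.</summary>
    /// <exception cref="T:System.ArgumentNullException"><paramref name="items"/> or <paramref name="action"/> is null.</exception>
    public static void Foreach<T>(this IEnumerable<T> items, Action<T> action)
    {
        // Both checks match what the <exception> tag promises to the caller.
        if (items == null)
            throw new ArgumentNullException("items");
        if (action == null)
            throw new ArgumentNullException("action");

        foreach (var item in items)
            action(item);
    }
}
```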

    If you haven't read Eric's book, I suggest you take my advice and Peter's advice and do so.

    Making It Explicit

    Once again, the use of Spec# to enforce these as part of the method signature makes sense to me.  It puts the burden back on the client to conform to the contract, or else they cannot continue.  And to have static checking to enforce that is pretty powerful as well. 

    But, what are we testing here?  Remember that DbC and Spec# can ensure your preconditions, your postconditions and your invariants hold, but they cannot determine whether your code is correct and conforms to the specs.  That's why I think that BDD plays a pretty good role with my use of Spec#. 

    DbC and Spec# can also play a role in enforcing things that are harder with BDD, such as enforcing invariants.  BDD does great things by emphasizing behaviors which I'm really on board with.  But, what I mean by being harder is that your invariants may be only private member variables which you are not going to expose to the outside world.  If you are not going to expose them, it makes it harder for your specs to control such behavior.  DbC and Spec# can fill that role.  Let's look at the example of an ArrayList written in Spec#.

    public class ArrayList
    {
        invariant 0 <= _size && _size <= _items.Length;
        invariant forall { int i in (_size : _items.Length); _items[i] == null };  // all unused slots are null

        [NotDelayed]
        public ArrayList (int capacity)
          requires 0 <= capacity otherwise ArgumentOutOfRangeException;
          ensures _size/*Count*/ == 0;
          ensures _items.Length/*Capacity*/ == capacity;
        {
          _items = new object[capacity];
          base();
        }

        public virtual void Clear ()
          ensures Count == 0;
        {
          expose (this) {
            Array.Clear(_items, 0, _size); // Don't need to doc this but we clear the elements so that the gc can reclaim the references.
            assume forall{int i in (0: _size); _items[i] == null};  // postcondition of Array.Clear
            _size = 0;
          }
        }

    // Rest of code omitted

    What I've been able to do is set the inner array to the new capacity, but also ensure that when I do that, my count doesn't go up, only my capacity.  When I call the Clear method, I need to make sure the object stays consistent: every slot of the inner array at or beyond _size must be null, and the size is reset to zero.  The expose block marks the region in which the object's invariants may be temporarily broken so that the verifier can analyze the code; by the end of the block the invariants must hold again, else we have issues.  How would we test some of these scenarios in BDD?  Since they are not exposed to the outside world, it's pretty difficult.  I would be left with black box artifacts that are harder to prove.  If I were to expose them instead, I would break encapsulation, which is not necessarily something I want to do.  Spec# gives me the opportunity to enforce this through the DbC constructs afforded in the language. 

    The Dangers of Checked Exceptions

    But with this comes a cost, of course.  I recently spoke with a colleague about Spec# and the instant thoughts of checked exceptions in Java came to mind.  Earlier in my career, I was a Java guy who had to deal with those who put large try/catch blocks around methods with checked exceptions and were guilty of just catching and swallowing or catching and rethrowing RuntimeExceptions.  Worse yet, I saw this as a way of breaking encapsulation by throwing exceptions that I didn't think the outside world needed to know about.  I was kind of glad that this feature wasn't brought to C# due to the fact I saw rampant abuse for little benefit.  What people forgot about during the early days of Java is that exceptions are meant to be exceptional and not control flow.

    How I see Spec# being different is that we have a static verification tool, Boogie, to verify whether those exceptional conditions are valid.  The green squigglies give warnings about possible null values or arguments in ranges, etc.  This gives me further insight into what I can control and what I cannot.  Resharper also has some of those nice features as well, but I've found Boogie to be a bit more helpful with more advanced static verification.

    Conclusion

    Explicit DbC constructs give us a pretty powerful tool in terms of expressing our domain and our behaviors of our components.  Unfortunately, in C# there are no real valuable implementations that enforce DbC constructs to both the caller and the callee.  And hence Spec# is an important project to come out of Microsoft Research.

    Scott Hanselman just posted his interview with the Spec# team on his blog, so if you haven't heard it yet, go ahead and download it now.  It's a great show and it's important that if you find Spec# to be useful, that you press Microsoft to give it to us as a full feature.
  • Command-Query Separation and Immutable Builders

    In one of my previous posts about Command-Query Separation (CQS) and side effecting functions being code smells, immutable builders were pointed out to me again.  For the most part, this has been one area of CQS that I've been willing to let break.  I've been following Martin Fowler's advice on method chaining and it has worked quite well.  But, revisiting an item like this never hurts.  Immutability is something you'll see me harping on time and time again now and in the future.  The standard rule I usually follow is: immutable and side effect free when you can, mutable state where you must.  I like the opt-in mutability of functional languages such as F#, which I'll cover at some point in the near future, over the opt-out mutability of imperative/OO languages such as C#.

    Typical Builders

    The idea of the standard builder is pretty prevalent in most applications we see today with fluent interfaces.  Take for example most Inversion of Control (IoC) containers when registering types and so on:

    UnityContainer container = new UnityContainer();
    container
        .RegisterType<ILogger, DebugLogger>("logger.Debug")
        .RegisterType<ICustomerRepository, CustomerRepository>();

    Let's take a naive medical claims processing system and build up an aggregate root of a claim.  This claim contains such things as the claim information, the lines, the provider, recipient and so on.  This is a brief sample and not meant to be the real thing, but just a quick example.  After all, I'm missing things such as eligibility and so on.

        public class Claim
        {
            public string ClaimId { get; set; }  
            public DateTime ClaimDate { get; set; }
            public List<ClaimLine> ClaimLines { get; set; }
            public Recipient ClaimRecipient { get; set; }
            public Provider ClaimProvider { get; set; }
        }

        public class ClaimLine
        {
            public int ClaimLineId { get; set; }
            public string ClaimCode { get; set; }
            public double Quantity { get; set; }
        }

        public class Recipient
        {
            public string RecipientId { get; set; }
            public string FirstName { get; set; }
            public string LastName { get; set; }
        }

        public class Provider
        {
            public string ProviderId { get; set; }
            public string FirstName { get; set; }
            public string LastName { get; set; }
        }

    Now our standard builders use method chaining as shown below.  As you'll note, we return the same instance each and every time. 

    public class ClaimBuilder
    {
        private string claimId;
        private DateTime claimDate;
        private readonly List<ClaimLine> claimLines = new List<ClaimLine>();
        private Provider claimProvider;
        private Recipient claimRecipient;

        public ClaimBuilder() {}

        public ClaimBuilder WithClaimId(string claimId)
        {
            this.claimId = claimId;
            return this;
        }

        public ClaimBuilder WithClaimDate(DateTime claimDate)
        {
            this.claimDate = claimDate;
            return this;
        }

        public ClaimBuilder WithClaimLine(ClaimLine claimLine)
        {
            claimLines.Add(claimLine);
            return this;
        }

        public ClaimBuilder WithProvider(Provider claimProvider)
        {
            this.claimProvider = claimProvider;
            return this;
        }

        public ClaimBuilder WithRecipient(Recipient claimRecipient)
        {
            this.claimRecipient = claimRecipient;
            return this;
        }

        public Claim Build()
        {
            return new Claim
           {
               ClaimId = claimId,
               ClaimDate = claimDate,
               ClaimLines = claimLines,
               ClaimProvider = claimProvider,
               ClaimRecipient = claimRecipient
           };
        }

        public static implicit operator Claim(ClaimBuilder builder)
        {
            return new Claim
            {
                ClaimId = builder.claimId,
                ClaimDate = builder.claimDate,
                ClaimLines = builder.claimLines,
                ClaimProvider = builder.claimProvider,
                ClaimRecipient = builder.claimRecipient
            };
        }
    }

    What we have above is a violation of the CQS because we're mutating the current instance as well as returning a value.  Remember, that CQS states:
    • Commands - Methods that perform an action or change the state of the system should not return a value.
    • Queries - Return a result and do not change the state of the system (aka side effect free)
    But, we're violating that because we're returning a value as well as mutating the state.  For the most part, that hasn't been a problem.  But what about sharing said builders?  The last thing we'd want to do is have our shared builders mutated by others when we're trying to build up our aggregate roots.
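
    To make the sharing problem concrete, here is a minimal sketch using a hypothetical, trimmed-down builder (MutableRecipientBuilder is not part of the claims example above, and Build returns a plain string just to keep the sketch short).

```csharp
using System;

// Hypothetical, trimmed-down mutable builder in the style shown above.
public class MutableRecipientBuilder
{
    private string firstName;
    private string lastName;

    public MutableRecipientBuilder WithFirstName(string firstName)
    {
        this.firstName = firstName;
        return this;                // same instance every time
    }

    public MutableRecipientBuilder WithLastName(string lastName)
    {
        this.lastName = lastName;
        return this;
    }

    public string Build()
    {
        return firstName + " " + lastName;
    }
}

public static class SharedBuilderDemo
{
    public static string Run()
    {
        MutableRecipientBuilder shared = new MutableRecipientBuilder()
            .WithFirstName("Robert")
            .WithLastName("Smith");

        // Someone else "customizes" the shared builder for their own test...
        shared.WithLastName("Jones");

        // ...and our build now silently picks up their change.
        return shared.Build();      // "Robert Jones", not "Robert Smith"
    }
}
```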

    Immutable Builders or ObjectMother or Cloning?

    When we're looking to reuse our builders, the last thing we'd want to do is allow mutation of the state.  So, if I'm working on the same provider and somehow change his eligibility, then that would be reflected against all using the same built up instance.  That would be bad.  We have a couple options here really.  One would be to follow an ObjectMother approach to build up shared ones and request a new one each time, or the other would be to enforce that we're not returning this each and every time we add something to our builder.  Or perhaps we can take one at a given state and just clone it.  Let's look at each.

    public static class RecipientObjectMother
    {
        public static RecipientBuilder RecipientWithLimitedEligibility()
        {
            RecipientBuilder builder = new RecipientBuilder()
                .WithRecipientId("xx-xxxx-xxx")
                .WithFirstName("Robert")
                .WithLastName("Smith");
            // More built in stuff here for setting up eligibility
     
            return builder;
        }
    }

    This allows me to share my state through pre-built builders and then when I've finalized them, I'll just call the Build method or assign them to the appropriate type.  Or, I could just make them immutable instead and not have to worry about such things.  Let's modify the above example to take a look at that.

    public class ClaimBuilder
    {
        private string claimId;
        private DateTime claimDate;
        private readonly List<ClaimLine> claimLines = new List<ClaimLine>();
        private Provider claimProvider;
        private Recipient claimRecipient;

        public ClaimBuilder() {}

        public ClaimBuilder(ClaimBuilder builder)
        {
            claimId = builder.claimId;
            claimDate = builder.claimDate;
            claimLines.AddRange(builder.claimLines);
            claimProvider = builder.claimProvider;
            claimRecipient = builder.claimRecipient;
        }

        public ClaimBuilder WithClaimId(string claimId)
        {
            ClaimBuilder builder = new ClaimBuilder(this) {claimId = claimId};
            return builder;
        }

        public ClaimBuilder WithClaimDate(DateTime claimDate)
        {
            ClaimBuilder builder = new ClaimBuilder(this) { claimDate = claimDate };
            return builder;
        }

        public ClaimBuilder WithClaimLine(ClaimLine claimLine)
        {
            ClaimBuilder builder = new ClaimBuilder(this);
            builder.claimLines.Add(claimLine);
            return builder;
        }

        public ClaimBuilder WithProvider(Provider claimProvider)
        {
            ClaimBuilder builder = new ClaimBuilder(this) { claimProvider = claimProvider };
            return builder;
        }

        public ClaimBuilder WithRecipient(Recipient claimRecipient)
        {
            ClaimBuilder builder = new ClaimBuilder(this) { claimRecipient = claimRecipient };
            return builder;
        }

        // More code here for building
    }

    So, what we've had to do is provide a copy-constructor to initialize the object in the right state.  And here I thought I had left those behind with my C++ days.  For each With method, I create a new ClaimBuilder, passing in the current instance to copy over the old state, and then assign the new value on the copy.  The original instance is never modified, which makes my class suitable for sharing.  Side effect free programming is the way to do it if you can.  Of course, this allocates a few extra short-lived objects as you're initializing your aggregate root, but for testing purposes, I haven't really much cared. 
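
    Here is a minimal sketch of what that buys us when a builder is shared.  ImmutableRecipientBuilder is a hypothetical, trimmed-down class; unlike the copy-constructor version above, it uses readonly fields and a private constructor, which is a variation on the same idea rather than the code from this post.

```csharp
using System;

// Hypothetical, trimmed-down immutable builder: every With* call copies.
public class ImmutableRecipientBuilder
{
    private readonly string firstName;
    private readonly string lastName;

    public ImmutableRecipientBuilder() : this(null, null) { }

    private ImmutableRecipientBuilder(string firstName, string lastName)
    {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public ImmutableRecipientBuilder WithFirstName(string value)
    {
        // Carry the other field over unchanged and return a fresh instance.
        return new ImmutableRecipientBuilder(value, lastName);
    }

    public ImmutableRecipientBuilder WithLastName(string value)
    {
        return new ImmutableRecipientBuilder(firstName, value);
    }

    public string Build()
    {
        return firstName + " " + lastName;
    }
}

public static class ImmutableBuilderDemo
{
    public static string[] Run()
    {
        ImmutableRecipientBuilder shared = new ImmutableRecipientBuilder()
            .WithFirstName("Robert")
            .WithLastName("Smith");

        // Deriving a variation leaves the shared builder untouched.
        ImmutableRecipientBuilder mine = shared.WithLastName("Jones");
        return new[] { shared.Build(), mine.Build() };
    }
}
```

    Run() returns "Robert Smith" for the shared builder and "Robert Jones" for the derived one, so nobody using the shared builder is side effected.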

    Of course I could throw Spec# into the picture once again as enforcing immutability on said builders.  To be able to mark methods as being Pure makes it apparent to both the caller and the callee what the intent of the method is.  Another would be using NDepend as Patrick Smacchia talked about here.

    The other way is just to provide a clone method which would just copy the current object so that you can go ahead and feel free to modify a new copy.  This is a pretty easy approach as well.

    public ClaimBuilder(ClaimBuilder builder)
    {
        claimId = builder.claimId;
        claimDate = builder.claimDate;
        claimLines.AddRange(builder.claimLines);
        claimProvider = builder.claimProvider;
        claimRecipient = builder.claimRecipient;
    }

    public ClaimBuilder Clone()
    {
        return new ClaimBuilder(this);
    }

    Conclusion

    Obeying the CQS is always an admirable thing to do, especially when managing side effects.  It isn't always required, such as with builders, but if you plan on sharing these builders, it might be a good idea to really think hard about the side effects you are creating.  As we move more towards multi-threaded, multi-machine processing, we need to be aware of our side effecting a bit more.  But, at the end of the day, I'm not entirely convinced that this violates the true intent of CQS since we're not really querying, so I'm not sure how much this is buying me.  What are your thoughts?
  • Adventures in F# - F# 101 Part 9 (Control Flow)

    Taking a break from the Design by Contract stuff for just a bit while I step back into the F# and functional programming world.  If you followed me at my old blog, you'll know I'm pretty passionate about functional programming and looking for new ways to solve problems and express data.

    Where We Are

    Before we begin today, let's catch up to where we are today:
    Today's topic will be covering more imperative code dealing with control flow.  But first, the requisite side material before I begin today's topic.

    A Survey of .NET Languages And Paradigms

    Joel Pobar just contributed an article to the latest MSDN Magazine (May 2008) called "Alphabet Soup: A Survey of .NET Languages And Paradigms". This article introduces not only the different languages that are supported in the .NET space, but the actual paradigms that they operate in.  For example, you have C#, VB.NET, C++, F# and others in the static languages space and IronRuby, IronPython among others in the dynamic space.  But what's more interesting is the way that each one tackles a particular problem.  The article covers a little bit about functional programming and its uses as well as dynamic languages.  Of course the mention is made that C# and VB.NET are slowly adopting more functional programming aspects over time.  One thing I've lamented is the fact that VB.NET and C# are too similar for my tastes so I'm hoping for more true differentiation come the next spin.  Instead, VB would be really interesting as a more dynamic language and not just one that many people just look down their noses at.  Ok, enough of the sidetracking and let's get back to the subject at hand.

    Control Flow

    Since F# is a general purpose language in the .NET space, it supports all imperative ways of approaching problems.  This of course includes control flow.  F# takes a different approach than lazily evaluated functional languages such as Haskell, where expressions are not necessarily evaluated in the order they are written.  Instead, F# evaluates eagerly, and it gives us a very succinct way of branching with the if, elif, else expressions.  Below is a quick example of that:

    #light

    let IsInBounds (x:int) (y:int) =
      if x < 0 then false
      elif x > 50 then false
      elif y < 0 then false
      elif y > 50 then false
      else true

    What I was able to do is to check the bounds of the given integer inputs.  Pretty simple example.  As opposed to many imperative languages, when you are returning a value from the if, all subsequent elif or elses must also return values.  This makes for balanced equations.  Also, if you return a value from an if, then you are also forced to have an else which returns a value.

    Although F# is using type inference to determine what my IsInBounds function returns, I cannot go ahead and return one type in an if and another different type in the elif or else.  F# will complain violently, as it should because that's really not a good design of a function.  Below is some code that will definitely produce a compile error.

    #light

    let IsInBounds (x:int) (y:int) =
      if x < 0 then "Foo"
      elif x > 50 then false
      elif y < 0 then false
      elif y > 50 then false
      else true

    As I said before, the equations must be balanced.  But of course if your if expression returns a unit (void type for those imperative folks), then you aren't forced to have an else statement.  Pretty self explanatory there. 

    Let's move onto the for loops.  The standard for loop is to start at a particular index value, check for the terminate condition and then increment or decrement the index.  F# supports this of course in a pretty standard way, but by default, the index is incremented by 1.  You must note though that the body of the for loop is a unit type (void once again) so, if you return a value, F# won't like it.  Below is a simple for loop to iterate through all lowercase letters.

    #light

    let chars = [|'a'..'z'|]

    let PrintChars (c:array<char>) =
      for index = 0 to c.Length - 1 do
        print_any c.[index]
       
    PrintChars chars

    But, if I tried to return c from the body of the for loop, F# will complain with a warning, but it will still allow it.  It's just a friendly reminder that it's not going to do anything with that value you specified.  I could also specify the for loop with a decrementer, so let's reverse our letters this time.

    #light

    let chars = [|'a'..'z'|]

    let PrintChars (c:array<char>) =
      for index = c.Length - 1 downto 0 do
        print_any c.[index]
       
    PrintChars chars

    F# also supports the while construct as well.  This of course is the exact same as any imperative construct, but with the caveat of once again, the while loop should not return a value because it is of the unit type.

    #light

    let chars = ref ['a'..'z']

    while (List.nonempty !chars) do
      print_any (List.hd !chars)
      chars := List.tl !chars

    This time we're just printing out a char and then removing it from the list collection.  Note that we're using the ref keyword and reference cells as we talked about before.  Lastly, let's cover one last construct, the foreach statement.  This is much like we have in most other languages, just the wording is a bit different.  As always, the foreach statement has the unit type, so returning values is a warning.

    #light

    let nums = [0..99]

    for n in nums do
      print_any n

    Wrapping It Up

    Just a quick walkthrough of just some of the imperative control statements allowed by F#.  As you can see, it's not a huge leap here from one language to the next.  I have a couple of upcoming talks on F#, so if you're in the Northern VA area on May 17th, come check it out at the NoVA Code Camp.
  • Side Effecting Functions Are Code Smells Revisited

    After talking with Greg Young for a little this morning, I realized I missed a few points that I think need to be covered as well when it comes to side effecting functions are code smells.  In the previous post, I talked about side effect free functions and Design by Contract (DbC) features in regards to Domain Driven Design.  Of course I had to throw the requisite Spec# plug as well for how it handles DbC features in C#.

    Intention Revealing Interfaces

    Let's step back a little bit from the discussion we had earlier.  Let's talk about good design for a second.  How many times have you seen a method and had no idea what it did or that it went ahead and called 15 other things that you didn't expect?  At that point, most people would take out .NET Reflector (a Godsend BTW) and dig through the code to see the internals.  One of the examples of the violators was the ASP.NET Page lifecycle when I first started learning it.  Init versus Load versus PreLoad wasn't really exact about what happened where, and most people have learned to hate it.

    In the Domain Driven Design world, we have the Intention Revealing Interface.  What this means is that we need to name our classes, methods, properties, events, etc to describe their effect and purpose.  And as well, we should use the ubiquitous language of the domain to name them appropriately.  This allows other team members to be able to infer what that method is doing without having to dig in with such tools as Reflector to see what it is actually doing.  In our public interfaces, abstract classes and so on, we need to specify the rules and the relationships.  To me, this comes back again to DbC.  This allows us to not only specify the name in the ubiquitous language, but the behaviors as well.

    Command-Query Separation (CQS)

    Dr. Bertrand Meyer, the man behind Eiffel and the author of Object-oriented Software Construction, introduced a concept called Command-Query Separation.  It states that we should break our functionality into two categories:

    • Commands - Methods that perform an action or change the state of the system should not return a value.
    • Queries - Return a result and do not change the state of the system (aka side effect free)

    Of course this isn't a 100% rule, but it's still a good one to follow.  Let's look at a simple code example of a good command.  This is simplified of course.  But what we're doing is side effecting the number of items in the cart. 

    public class ShoppingCart
    {
        public void AddItemToCart(Item item)
        {
            // Add item to cart
        }
    }

    Should we use Spec# to do this, we could also check our invariants as well, but also to ensure that the number of items in our cart has increased by 1.

    public class ShoppingCart
    {
        public void AddItemToCart(Item item)
            ensures ItemsInCart == old(ItemsInCart) + 1;
        {
            // Add item to cart
        }
    }

    So, once again, it's very intention revealing at this point that I'm going to side effect the system and add more items to the cart.  Like I said before, it's a simplified example, but it's a very powerful concept.  And then we could talk about queries.  Let's have a simple method on a cost calculation service that takes in a customer and the item and calculates.

    public class CostCalculatorService
    {
        public double CalculateCost(Customer c, Item i)
        {
            double cost = 0.0d;
           
            // Calculate cost
           
            return cost;
        }
    }

    What I'm not going to be doing in this example is modifying the customer, nor the item.  Therefore, if I'm using Spec#, then I could mark this method as being [Pure].  And that's a good thing.
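
    In plain C#, the closest analogue I'm aware of is the [Pure] attribute in System.Diagnostics.Contracts, which was added to .NET later than this post.  The C# compiler does not enforce it, so it documents intent rather than proving anything.  Below is a minimal sketch; the Customer and Item shapes and the discount formula are hypothetical and exist only to give the query something to compute.

```csharp
using System;
using System.Diagnostics.Contracts;

// Hypothetical shapes for the types used above.
public class Customer
{
    public double DiscountRate { get; set; }   // e.g. 0.25 for 25% off
}

public class Item
{
    public double Price { get; set; }
}

public class CostCalculatorService
{
    // Query: reads its arguments, returns a result, changes nothing.
    [Pure]
    public double CalculateCost(Customer c, Item i)
    {
        return i.Price * (1.0 - c.DiscountRate);
    }
}
```

    Because the method only reads from its arguments, calling it twice with the same customer and item gives the same answer and leaves both objects exactly as they were.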

    The one thing that I would hold an exception for is fluent builders.  Martin Fowler lays out an excellent case for them here.  Not only would we be side effecting the system, but we're also returning a value (the builder itself).  So, the rule is not a hard and fast one, but always good to observe.  Let's take a look at a builder which violates this rule.

    public class CustomerBuilder
    {
        private string firstName;

        public static CustomerBuilder New { get { return new CustomerBuilder(); } }
       
        public CustomerBuilder WithFirstName(string firstName)
        {
            this.firstName = firstName;
            return this;
        }

        // More code goes here
    }

    To wrap things up, things are not always fast rules and always come with the "It Depends", but the usual rule is that you can't go wrong with CQS.

    Wrapping It Up

    These rules are quite simple for revealing the true intent of your application while using the domain's ubiquitous language.  As with anything in our field, it always comes with a big fat "It Depends", but applying the rules as much as you can is definitely to your advantage.  These are simple, yet often overlooked scenarios when we design our applications, yet are the fundamentals.
  • Side Effecting Functions are Code Smells

    I know the title might catch a few people off guard, but let me explain.  Side effecting functions, for the most part, are code smells.  This is a very important concept in Domain Driven Design (DDD) that's often overlooked.  For those who are deep in DDD, this should sound rather familiar.  And in the end, I think Spec# and some Design by Contract (DbC) constructs can mitigate this, or you can go the functional route as well.

    What Is A Side Effect?

    When you think of the word side effect in most languages, you tend to think of any unintended consequence.  Instead, what we mean by it is any observable change a function makes to the system beyond returning its result.  What do I mean by that?  Well, think of these scenarios: reading and writing to a database, reading or writing to the console, or even modifying the state of your current object.  Haskell and other pure functional languages take a pretty dim view of side effects, which is why they are not allowed except through monads.  F# leans the same way, though less strictly: "variables" are immutable unless otherwise specified.

    Why Is It A Smell?

    Well, let's look at it this way.  Most of our operations call other operations which call even more operations, and this creates deep nesting.  From this deep nesting, it becomes quite difficult to predict the behaviors and consequences of calling all of those nested operations.  You, the developer, might not have intended for all of those operations to occur because A modified B modified C modified D.  Without any safe form of abstraction, it's pretty hard to test as well.  Imagine that any mock object you create would suddenly have to know that it gets modified in some function five levels deep.  Not necessarily the best thing to do.

    Also, when it comes to multi-threaded processing, this becomes even more of an issue.  If multiple threads have a reference to the same mutable object, and one thread changes something on the reference, then all other threads were just side effected.  This may not be something that you'd want to do.  Then again, if working on shared memory applications, that might be.  But, for the most part, the unpredictability of it can be a bad thing.

    Let's take a quick example of a side-effecting implementation of a 2 dimensional Point.  We're going to go ahead and allow ourselves to add a Size to the Point in place.

    public class Point2D
    {
        public double X { get; set; }

        public double Y { get; set; }

        public void Add(Size2D other)
        {
            // X is the horizontal axis, so it moves by the width; Y moves by the height.
            X += other.Width;
            Y += other.Height;
        }
    }

    public class Size2D
    {
        public double Height { get; set; }

        public double Width { get; set; }
    }

    What's wrong with the above sample is that I just side effected the X, and Y.  Why is this bad?  Well, like I said, most objects like these are fire and forget.  Anyone who had a reference to this Point now has a side effected one, that they might not have wanted.  Instead, I should probably focus on retrieving a new one at this point, since this is pretty much a value object.
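
    To make the "anyone who had a reference" problem concrete, here is a minimal sketch.  The MutablePoint and AliasingDemo names are made up for illustration and are not part of the sample above; the point is only that two variables can refer to one instance.

```csharp
using System;

// Hypothetical type for illustration; it mirrors the mutable Point2D above.
public class MutablePoint
{
    public double X { get; set; }
    public double Y { get; set; }

    // Command that mutates the receiver in place.
    public void Add(double dx, double dy)
    {
        X += dx;
        Y += dy;
    }
}

public static class AliasingDemo
{
    public static void Main()
    {
        var origin = new MutablePoint { X = 0, Y = 0 };
        var cursor = origin;          // second reference to the same object

        cursor.Add(10, 5);            // intended to move only the cursor

        // origin moved too, because both names refer to one instance.
        Console.WriteLine(origin.X);  // 10
        Console.WriteLine(origin.Y);  // 5
    }
}
```

    Nothing in the signature of Add warns the holder of origin that this can happen, which is exactly the smell.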

    What Can You Do About It?

    Operations that return results without side effects are considered to be pure functions.   These pure functions when called any number of times will return the same result given the same parameters time and time again.  Pure functions are much easier to unit test and overall a pretty low risk.

    There are several approaches to being able to fix the above samples.  First, you can keep your modifiers and queries separated.  Make sure you keep the methods that make changes to your object separate from those that return your domain data.  Perform those queries and associated calculations in methods that don't change your object state in any way.  So, think of a method that calculates price and then another method that actually sets the price on the particular object. 
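
    As a rough sketch of that split, assuming a made-up OrderLine class (none of these names come from the earlier samples), the calculation and the state change might be kept apart like this:

```csharp
using System;

// Hypothetical order line, used only to illustrate the query/command split.
public class OrderLine
{
    private readonly decimal unitPrice;
    private readonly int quantity;

    public OrderLine(decimal unitPrice, int quantity)
    {
        this.unitPrice = unitPrice;
        this.quantity = quantity;
    }

    public decimal Price { get; private set; }

    // Query: computes and returns a value, changes nothing.
    public decimal CalculatePrice(decimal discountRate)
    {
        return unitPrice * quantity * (1m - discountRate);
    }

    // Command: changes state, returns nothing.
    public void SetPrice(decimal price)
    {
        Price = price;
    }
}

public static class CqsDemo
{
    public static void Main()
    {
        var line = new OrderLine(20m, 3);
        decimal quoted = line.CalculatePrice(0.10m); // safe to call repeatedly
        line.SetPrice(quoted);                       // the one place state changes
        Console.WriteLine(line.Price);
    }
}
```

    The query can be called from a test, a report or a UI preview as often as you like, because only SetPrice ever touches the object.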

    Secondly, you could also just not modify the object at all.  Instead, you could return a value object that is created as an answer to a calculation or a query.  Since value objects are immutable, you can feel free to hand them off and forget about them, unlike entities which are entirely mutable.  Think of the DateTime structure.  When you want to add x number of minutes, do you side effect the DateTime, or do you get a new one?  The answer is, you get a new one.  Why?  Well, because DateTime is designed as an immutable value type, and that solves a lot of those side effecting problems.  Let's take the above example of the Point and switch it around the same way.

    public class Point2D
    {      
        private readonly double x;
        private readonly double y;
       
        public Point2D() {}
       
        public Point2D(double x, double y)
        {
            this.x = x;
            this.y = y;
        }

        public double X { get { return x; } }
       
        public double Y { get { return y; } }
       
        public Point2D Add(Size2D other)
        {
            // X is the horizontal axis, so it moves by the width; Y moves by the height.
            double newX = x + other.Width;
            double newY = y + other.Height;
           
            return new Point2D(newX, newY);
        }
    }
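
    The DateTime behavior mentioned above is easy to check for yourself.  This is just a small sketch; the date is arbitrary and DateTimeDemo is a made-up name.

```csharp
using System;

public static class DateTimeDemo
{
    public static void Main()
    {
        var start = new DateTime(2008, 5, 17, 9, 0, 0);
        DateTime later = start.AddMinutes(90);   // returns a new value

        Console.WriteLine(start.Hour);    // 9: the original is unchanged
        Console.WriteLine(later.Hour);    // 10
        Console.WriteLine(later.Minute);  // 30
    }
}
```

    The immutable Point2D follows the same shape: Add hands back a new point and leaves the receiver alone.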

    Spec# is a tool that can help in this matter.  Previously I stated why Spec# matters, well, let's get into more detail why.  We can mark our side effect free methods as being pure with the [Pure] attribute.  This allows the system to verify that indeed we are not side-effecting the system, and any time I call that with the same parameters, I will get the same result.  It's an extra insurance policy that makes it well known to the caller that I'm not going to side effect myself when you call me.  So, let's go ahead and add some Spec# goodness to the equation. 

    [Pure]
    public Point2D Add(Size2D other)
    {
        // X is the horizontal axis, so it moves by the width; Y moves by the height.
        double newX = x + other.Width;
        double newY = y + other.Height;
           
        return new Point2D(newX, newY);
    }

    But, now Spec# will warn us that our other might be null, and that could be bad....  So, let's fix that to add some real constraints for the preconditions.

    [Pure]
    public Point2D Add(Size2D other)
        requires other != null;
    {
        // X is the horizontal axis, so it moves by the width; Y moves by the height.
        double newX = x + other.Width;
        double newY = y + other.Height;
           
        return new Point2D(newX, newY);
    }

    Of course I could have put some ensures as well to ensure the result will be the addition, but you get the point.

    Turning To Design by Contract

    Now of course we have to be pragmatists about things.  At no point did I say that we can't have side effects ever.  That would be Haskell, which put itself into a nasty corner with that; the only way around it was monads, which can be a bit clumsy.  Instead, I want to refocus where we do them and be more aware of what we're modifying.

    In our previous examples, we cut down on the number of places where we had our side effects.  But this does not eliminate them; instead it gathers them in the appropriate places.  Now when we deal with entities, they are very much mutable, and so we need to be aware when and how side effects get introduced.  To really get to the heart of the matter, we need to verify the preconditions, the postconditions and mostly our invariants.  In a traditional application written in C#, we could throw all sorts of assertions into our code to make sure that we are in fact conforming to our contract.  Or we can write our unit tests to ensure that they conform to them.  This is an important point in Eric Evans' book when talking about assertions in the Supple Design chapter.

    Once again, Spec# enters as a possible savior.  It allows us to model our preconditions and postconditions as part of our method signature.  Invariants can be modeled in our code as well.  These ideas came from Eiffel and are very powerful when used for good.

    Let's make a quick example to show how invariants and preconditions and postconditions work.  Let's create an inventory class, and keep in mind it's just a sample and not anything I'd ever use, but it proves a point.  So let's lay out the inventory class and we'll set some constraints.  First, we'll have the number of items remaining.  That number of course can never go below zero.  Therefore, we need an invariant that enforces that.  Also, when we remove items from the inventory, we need to make sure that we're not going to dip below zero.  Very important things to keep in mind.

    public class Inventory
    {
        private int itemsRemaining;
        private int reorderPoint;
       
        invariant itemsRemaining >= 0;
       
        public Inventory()
        {
            itemsRemaining = 200;
            reorderPoint = 50;
            base();
        }
       
        public void RemoveItems(int items)
            requires items <= ItemsRemaining;
            ensures ItemsRemaining == old(ItemsRemaining) - items;
        {
            expose(this)
            {
                itemsRemaining -= items;
            }

            // Check reorder point
        }
       
        public int ItemsRemaining { get { return itemsRemaining; } }

        // More stuff here in class
    }

    What I was able to express is that I set up my invariants in the constructor.  You cannot continue in a Spec# program unless you set the member variable that's included in the invariant.  Also, look at the RemoveItems method.  We set one precondition that states that number of items requested must be less than or equal to the number left.  And we set the postcondition which states that the items remaining must be the difference between the old items remaining and the items requested.  Pretty simple, yet powerful.  We had to expose our invariant while modifying it so that it could be verified, however.  But, doesn't it feel good to get rid of unit tests that prove what I already did in my method signature?
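
    For comparison, here is roughly what the same contract looks like when you have to check it by hand in plain C#.  This is only a sketch: CheckedInventory and the exception choices are my own assumptions, and everything is verified at run time rather than by a static verifier.

```csharp
using System;

// Plain C# approximation of the Spec# Inventory contract, checked at run time.
public class CheckedInventory
{
    private int itemsRemaining = 200;

    public int ItemsRemaining { get { return itemsRemaining; } }

    public void RemoveItems(int items)
    {
        // Precondition: requires items <= ItemsRemaining
        if (items > itemsRemaining)
            throw new ArgumentOutOfRangeException("items", "Cannot remove more items than remain.");

        int old = itemsRemaining;
        itemsRemaining -= items;

        // Postcondition: ensures ItemsRemaining == old(ItemsRemaining) - items
        if (itemsRemaining != old - items)
            throw new InvalidOperationException("Postcondition violated.");

        CheckInvariant();
    }

    // Invariant: itemsRemaining >= 0
    private void CheckInvariant()
    {
        if (itemsRemaining < 0)
            throw new InvalidOperationException("Invariant violated: itemsRemaining < 0.");
    }
}

public static class InventoryDemo
{
    public static void Main()
    {
        var inventory = new CheckedInventory();
        inventory.RemoveItems(50);
        Console.WriteLine(inventory.ItemsRemaining);  // 150

        try
        {
            inventory.RemoveItems(500);               // breaks the precondition
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Precondition rejected the call.");
        }
    }
}
```

    It works, but the contract is buried in the method body, the caller only finds out at run time, and you still want tests around it.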

    Wrapping Things Up

    So, I hope after reading this, you've thought more about your design, and where you are modifying state and that you have intention revealing interfaces to tell the coder what exactly you are going to do.  The Design by Contract features of Spec# also play a role in this to state in no uncertain terms what exactly the method can do with the preconditions and postconditions and through my class with my invariants.  Of course you can use your regular C#, or language of choice to model the same kind of things, yet not as intention revealing.

    So, where to go from here?  Well, if you've found Spec# interesting, let Microsoft know about it.  Join the campaign that Greg and I are harping on and say, "I Want Spec#!"
  • Upcoming Functional Programming/F# Talks

    Well, I certainly have an ambitious May schedule ahead of me.  Most of course will be revolving around functional programming and F# as it seems to be finally catching on.  I've been noticing a bunch from the Java and Ruby communities becoming interested in such things as Scala, Haskell, OCaml, Erlang and F#.  I was rather heartened by this, as some in the Ruby world (like here and here) are coming back to the static world for ways of representing data and functions in different ways.  Of course Lisp and Scheme (IronLisp and IronScheme) still manage to eke their way into the rebirth, but still remain on the outside.

    I will be speaking at the Northern Virginia Code Camp on May 17th for a total of two topics:

    • Improve your C# with Functional Programming and F# concepts
      Learn how .NET 3.5 takes ideas from Functional Programming and how you can apply lessons learned from it and F# respectively.

    • Introduction to Functional Programming and F#
      Come learn about functional programming, an older paradigm than Object Oriented Programming, and the ideas around it.  This talk will cover the basics including higher-order functions, functions as values, immutability, currying, pattern matching and more.  Learn how to mesh ideas from functional programming with imperative programming in F# and .NET.

    So, if you're in the DC area, go ahead and register here and show your support for the community.

    Also, I will be taking some time to spend up in Philadelphia this month at the next Philly ALT.NET meeting to also talk about F#.  Still ironing out the details on that one in regards to the DC ALT.NET meeting in May.  Either way, should be a good time!
  • Making Spec# a Priority

    During ALT.NET Open Spaces, Seattle, I spent a bit of time with Rustan Leino and Mike Barnett from the Spec# team at Microsoft Research.  This was to help introduce Design by Contract (DbC) and Spec# to the ALT.NET audience who may not have seen it before through me or Greg Young.  I covered it in detail on my old blog here.

    Spec# at ALT.NET Open Spaces, Seattle

    As I said before I took a bit of time during Saturday to spend some time with the Spec# guys.  I spent much of the morning with them in the IronRuby session explaining dynamic languages versus static ones.  They had the session at 11:30, the second session of the day, in direct competition with the Functional Programming talk I had planned with Dustin Campbell.  Greg was nice enough to record much of the session on a handheld camera and you can find that here.  It's not the best quality, but you can understand most of it, so I'm pretty happy. 

    The things that were covered in this session were:
    • Spec# overview
    • Non-null Types
    • Preconditions
    • Postconditions
    • Invariants
    • Compile-Time checking versus Runtime checking
    • Visual Studio Integration
    All in all, I thought it was one of the best sessions and I'm glad they came out.  Hopefully we'll see more from them in the future.

    Scott Hanselman also recorded a session with the Spec# guys for Episode 128.  This is a much better interview than on DotNetRocks Episode 237 that Rustan did last year. This actually gets into the technical guts of the matter in a much better way, so go ahead and give it a listen.  I was fortunate enough to be in the room at the time to listen.

    The New Release

    Mike and Rustan recently released a new version of Spec# back on April 11th so now Visual Studio 2008 is supported.  You must remember though, this is still using the Spec# compiler that's only C# 2.0 compliant.  So, anything using lambdas, extension methods, LINQ or anything like that is not supported.  As always, you can find the installs here.

    As with before, both the Spec# mode (stripped down mode) and C# mode are supported.  What's really interesting is the inferred contracts.  From an algorithm that Mike and Rustan worked on, they have the ability to scan a method to determine its preconditions and postconditions.  It's not perfect, but to have that kind of Intellisense is really powerful.



    What you can see is that the GetEnumerator method ensures that the result is new.  Keep in mind, result is a keyword which states what the return value is for a method.  It also says that the owner of IEnumerator will be the same as before.  Object ownership is one of the more difficult things to comprehend with Spec# but equally powerful.

    Another concept that's pretty interesting is the ability to make all reference types non-null by default in C# or in Spec# modes.  Instead of having to mark your non-null types with an exclamation mark (!), you mark your nullable types with a question mark (?), much as you would with the System.Nullable<T> generic class.  All it takes is the flip of a switch in Spec#:



    Or in the C# mode:



    And then you have all the Spec# goodness.

    Why It's Important

    So, why have I been harping on this?  To be able to express DbC as part of my method signature is extremely important to me.  To be able to express my preconditions (what I require), my postconditions (what I ensure) and my invariants (what must always hold true) is a pretty powerful concept.  Not to mention, to enforce immutability and method purity is also a pretty strong concept, especially in the times of multi-core processing.  More on that subject later.

    Focus on Behaviors

    What Spec# can bring to the table is the ability to knock out a bit of your unit tests.  Now, I don't mean all of them, but what about the ones that check for null values?  Are they valid if you already put in your method signature to require a non-null value or use the ! symbol to denote a non-null type?  Those edge cases aren't really valid anymore.  The same goes for tracking your invariants and your postconditions.  Instead, what that does is free you up to consider the behaviors of your code, which is what you should have been testing anyway.

    Immutability

    Immutability plays a big part in Spec# as well.  To some extent, I'll cover more in a Domain Driven Design post, but instead will get some things out of the way here.  Eric Lippert, C# team member, has stated that immutable data structures are the way of the future in C# going forward.  Spec# can make that move a bit more painless.  How, you might ask?  Well, the ImmutableAttribute lays that out explicitly.  Let's do a simple ReadOnlyDictionary in Spec#, taking full advantage of Spec#'s attributes, preconditions and postconditions:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using Microsoft.Contracts;

    namespace Microsoft.Samples
    {
        [Immutable]
        public class ReadOnlyDictionary<TKey, TValue> : ICollection<KeyValuePair<TKey!, TValue>> where TKey : class
        {
            private readonly IDictionary<TKey!, TValue>! dictionary;
       
            public ReadOnlyDictionary(IDictionary<TKey!, TValue>! dictionary)
            {
                this.dictionary = dictionary;
            }
           
            public TValue this[TKey! key]
            {
                get
                    requires ContainsKey(key);
                { return dictionary[key]; }
            }
           
            [Pure]
            public bool ContainsKey(TKey! key)
            {
                return dictionary.ContainsKey(key);
            }

            // Mutators are not supported on a read-only dictionary.
            void ICollection<KeyValuePair<TKey!,TValue>>.Add(KeyValuePair<TKey!, TValue> item)
            {
                throw new NotSupportedException();
            }

            void ICollection<KeyValuePair<TKey!,TValue>>.Clear()
            {
                throw new NotSupportedException();
            }

            [Pure]
            public bool Contains(KeyValuePair<TKey!, TValue> item)
            {
                return dictionary.Contains(item);
            }

            // Not marked [Pure]: CopyTo writes into the caller's array.
            public void CopyTo(KeyValuePair<TKey!, TValue>[]! array, int arrayIndex)
                requires arrayIndex >= 0 && arrayIndex + Count <= array.Length;
            {
                dictionary.CopyTo(array, arrayIndex);
            }

            [Pure]
            public int Count
            {
                get
                    ensures result >= 0;
                    { return dictionary.Count; }
            }

            [Pure]
            public bool IsReadOnly
            {
                get
                    ensures result == true;
                    { return true; }
            }

            // Not marked [Pure]: Remove is a mutator, and it is not supported here.
            bool ICollection<KeyValuePair<TKey!,TValue>>.Remove(KeyValuePair<TKey!, TValue> item)
            {
                throw new NotSupportedException();
            }

            [Pure]
            public IEnumerator<KeyValuePair<TKey!, TValue>>! GetEnumerator()
            {
                return dictionary.GetEnumerator();
            }

            [Pure]
            IEnumerator! System.Collections.IEnumerable.GetEnumerator()
            {
                return dictionary.GetEnumerator();
            }
        }
    }

    As you can see, I marked the class itself as immutable.  I also shut off anything that might change the state of our dictionary, and marked things as non-null where appropriate.  That's a little extra on top, but still very readable.  I'll be covering more in the near future as it applies to Domain Driven Design.

    Call to Action

    So, the call to action is clear, make Spec# a priority to get it in C# going forward.  Greg Young has started the campaign, so we need to get it moving!
  • xUnit.net Goes 1.0 and Unit Testing F#

    As I've said before on my previous blogs, I'm very much into F# and functional programming lately.  With that, I'm still in the mode of TDD.  Just because you enter a new programming paradigm, doesn't mean you throw away your XP and TDD roots.  Instead, I find it just as valuable if not even more so when switching to a new model.

    xUnit.net 1.0 Goes Live

    As Brad Wilson said earlier this week, xUnit.net released version 1.0.  You can read more about the announcement here.  Since the RC3 timeframe, it was pretty much feature complete, so there hasn't been much change since that time.  Instead, the focus was on polishing the existing functionality and including the integration with ASP.NET MVC, Resharper 3.1 and TestDriven.NET.  The GUI runner, such as it is, has been pretty limited, but OK for most purposes. 

    Many questions used to arise: why xUnit.net?  Why do we need yet another framework out there?  Why not just add onto the existing ones in the market?  I think the 1.0 release answers the critics by doing things a bit differently than MbUnit and NUnit have.  For example, Assert.Throws<TException>, not having to decorate the classes with [TestFixture], and a few other things come to mind, as well as being very extensible.  It's a smaller framework, but I really don't have a problem with that.  With most of the other frameworks, I don't use half of it anyways.
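
    To show why the Assert.Throws<TException> style appeals to me, here is a hand-rolled stand-in for the idea.  To be clear, Expect.Throws below is a made-up helper for illustration and not xUnit.net's implementation; the real Assert.Throws is stricter about the exact exception type, while this sketch also accepts derived types.

```csharp
using System;

// Hand-rolled stand-in for the Assert.Throws<TException> style; not xUnit.net's code.
public static class Expect
{
    public static TException Throws<TException>(Action action) where TException : Exception
    {
        try
        {
            action();
        }
        catch (TException expected)
        {
            return expected;   // hand the exception back for further checks
        }

        // Reached only when the action completed without throwing.
        throw new InvalidOperationException(
            "Expected " + typeof(TException).Name + " but nothing was thrown.");
    }
}

public static class ThrowsDemo
{
    // Hypothetical method under test.
    private static void Guard(string value)
    {
        if (value == null)
            throw new ArgumentNullException("value");
    }

    public static void Main()
    {
        ArgumentNullException ex = Expect.Throws<ArgumentNullException>(() => Guard(null));
        Console.WriteLine(ex.ParamName);
    }
}
```

    Compared with an attribute on the whole test method, the expectation is scoped to the one call that should fail, and you get the exception back to inspect.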

    So, why do I care as much as I do about this one over say others at the moment?  Well, it's great that we have such choices in the market now.  As Scott Hanselman said at ALT.NET Open Spaces, Seattle, he's a StructureMap, Moq and xUnit.net guy.   I'll get into the reason shortly enough.

    The Traditional Way

    When you think of doing your unit or behavior tests, you often need a class decorated with attributes, an initialize and a teardown and all that goodness.  Since F# is considered a multi-purpose language, this works just fine.  That's the beauty of F# and hopefully will drive its adoption into the marketplace.  So, consider the following functions with the appropriate tests in a more traditional way such as NUnit within Gallio.  I am trying out Gallio so that I can see how well it reacts to F# as well as just kicking the tires.  I highly recommend you at least check it out.

    Anyhow, back to the code.  Let's take a simple example of a naive routing table through pattern matching, to see whether a call is allowed or not.  Like I said, naive, but proves a point with pattern matching.

    #light

    #r @"D:\Program Files\Gallio\bin\NUnit\nunit.core.dll"
    #r @"D:\Program Files\Gallio\bin\NUnit\nunit.framework.dll"

    // Naive routing table: allow a call only for known protocol/port pairs.
    let FilterCall protocol port =
      match (protocol, port) with
      | "tcp", _ when port = 21 || port = 23 || port = 25 -> true
      | "http", _ when port = 80 || port = 8080 -> true
      | "https", 443 -> true
      | _ -> false
     
    open NUnit.Framework

    // The () gives the fixture a default constructor so the runner can create it.
    [<TestFixture>]
    type PatternMatchingFixture() = class
      [<Test>]
      member x.FilterCall_HttpWithPort8888_ShouldReturnFalse() =
        Assert.IsFalse(FilterCall "http" 8888)
       
      [<Test>]
      member x.FilterCall_TcpWithPort23_ShouldReturnTrue() =
        Assert.IsTrue(FilterCall "tcp" 23) 
    end

    Unfortunately, the Gallio Icarus Runner doesn't really work well for me at the moment with F# integration.  I get all sorts of issues when I try it.  This is where I get the large FAIL.



    This seems to repeat itself unfortunately for the xUnit.net and MbUnit integration as well, so it's not quite ready for primetime in the F# space.  Also, when I exit the application, there is a Gallio session runner that keeps running in memory and therefore I can't perform any builds.  So, I have to manually go into Task Manager and kill the process.  Not the best experience I've ever had...  So, for now, the limited functionality in the xUnit.net GUI Runner works for me.

    The More Functional Way

    In the class-based version, we see a lot of pomp and circumstance that we just don't need.  In the functional world, a lot of the time, we don't want or need to create these classes just to test our functions.  After all, most of what we do has no side effects, or at least should have none (and functions are mostly code smells if they do, but that's another post for another time).

    At the request of Harry Pierson, another F# aficionado and IronPython PM, I talked to Brad and Jim Newkirk about adding static function unit test capabilities to xUnit.net.  And sure enough, we now have them, so let's compress the above code into something that looks more like F#.

    #light

    #r @"D:\Tools\xunit-1.0\xunit.dll"

    open Xunit

    let FilterCall protocol port =
      match(protocol, port) with
      | "tcp", _ when port = 21 || port = 23 || port = 25 -> true
      | "http", _ when port = 80 || port = 8080 -> true
      | "https", 443 -> true
      | _ -> false
     
    [<Fact>]
    let FilterCall_TcpWithPort23_ShouldReturnTrue () =
      Assert.True(FilterCall "tcp" 23)

    [<Fact>]
    let FilterCall_HttpWithPort8888_ShouldReturnFalse () =
      Assert.False(FilterCall "http" 8888)

    So, as you can see, I compressed the code quite a bit from here.  Since I'm doing functions and nothing more with just some basic pattern matching, this approach works perfectly.  That's why I am a fan of this.  Open up the GUI runner or just the ever popular console runner, and run it through and sure enough, we get some positive results.



    The interesting thing to see upcoming is how well the TDD space will play in the functional programming space.  I don't think it should be any different of an experience, but time will tell.

    Where to Go?

    From here, where do we go?  Well, I'm sure the GUI Runner of xUnit.net will get better over time, but the Gallio folks are pushing for the Icarus Runner.  Right now, only the xUnit.net runner works for me, so that's what I'm going to stick with at the moment. 

    An interesting thought occurred to me though.  Are the unit tests we're doing mostly nowadays purely functional anyways?  Would it make sense to test some C# code in F# to produce cleaner code?  Not sure, but I like the idea of having that choice.  Or even for that matter, writing my unit tests in Ruby for my statically typed C# code.  Within the .NET framework space, the possibilities are vast.  And that's the really cool thing about it.  But will we see an IronPython or IronRuby testing framework within the .NET space?
  • let Matt = CodeBetter + 1

    Hello CodeBetter community! 

    #light

    type FullName = string * string

    let FullNameToString (name : FullName) =
      let first, last = name in
      first + " " + last
     
    let blogger = FullNameToString("Matthew", "Podwysocki")

    I'm pretty excited to be joining the CodeBetter gang after following for so many years.  I want to thank Jeremy Miller, Brendan Tompkins, Dave Laribee, Greg Young and others for welcoming me to the fold. 

    So Who Are You and Why Are You Here?

    So, just to introduce myself, I work for Microsoft in the Washington DC area.  I'm active in the developer community, whether it be in the .NET space, Ruby, or less mainstream languages (F#, Haskell, OCaml, Lisp, etc).  I have also run the DC ALT.NET group since November of last year and helped plan the latest incarnation in Seattle. 

    The number one reason I'm here is to help better myself.  Deep down, I'm a language geek with any of the aforementioned languages.  I'm one of those who strives to learn a language every year, but not just learn it, let it sink in.  That's really the key.  Sure, I can learn a certain dialect, but it's not quite being a native speaker.  That's how I can take those practices back to my other languages and apply lessons learned, such as functional programming techniques (pattern matching, currying, first-class functions, etc).

    I also have a pretty deep interest in TDD/BDD, Domain Driven Design and of course one of Greg Young's topics, messaging.  Right now in the world of messaging, I think we're in a pretty important time in the development world when messaging, multi-threaded processing and such is going to be more mainstream and hopefully less hard than it is now.

    I'm also a tinkerer at heart.  I'm looking at testing frameworks to help make my TDD experiences easier.  I'm looking at IoC containers to help make my system just a bit more configurable.  I'll look at the tests to see how each one does what it does.  That's the fun part about it.

    I'm also on the fringe with such topics as Spec# and Design by Contract.  I'd love nothing more than to see many of the things being done at Microsoft Research become a bit more mainstream and not just seen as a place where we might see something 10 years down the line.  Topics such as Spec#, F# and others have real importance now and it's best to play with them, give them our feedback and such.

    So What Do You Want From Me?

    Here's the thing, since I'm always looking to better myself, I'll need your help along the way.  I value your feedback along the way and hopefully we'll learn from each other.  Now that this is out of the way, time for more serious topics...

Disclaimer
The views expressed on this weblog are mine and do not necessarily reflect the views of my employer.

All postings are provided "AS IS" with no warranties, and confer no rights.
