Providing Safe Alternatives

When I was reading through Real World Haskell, I was struck several times by the mention of providing safe function alternatives.  The idea is to provide a function that in all cases returns a value as well as the one which is meant to accept valid input and throw exceptions should that contract be violated.  There is a real performance consideration to be taken into account as a function which repeatedly throws exceptions as logic will tend to overwhelm a system and slow it down significantly.  Imagine if you will an application which reads a large directory to check each file for an X509 certificate, whether it has one or not, and it throws an exception if one is not present.  The problem of course is there is no way to determine whether a file was signed at the time using the .NET class without resorting to P/Invoke (my favorite).

Recently on Twitter, there was talk finally of the inclusion of Guid.TryParse in .NET 4.0, yet somehow, Enum.TryParse has not been addressed yet (gee, we’re consistent here).  The idea is to provide a way of determining whether the string is a Guid format without throwing an exception, and instead returning a flag indicating success and an out parameter with the value which is set to the value if there is success, else the default value.  Before we dig any further, let’s look at some terminology.

Partial Versus Total Functions

To get some terminology straight, let’s talk about partial functions versus total functions.  Partial functions are those functions which only return values for a defined subset of valid inputs, as throwing an exception is not considered a return value.  On the other hand, functions that return valid results over the entire input are considered to be total functions.  Let’s look at a quick example of what that means in F# interactive, the first being a partial function and the latter being a total function.

> open System.Linq;;
> ([] : int list).First();;
System.InvalidOperationException: Sequence contains no elements
   at System.Linq.Enumerable.First[TSource](IEnumerable`1 source)
   at <StartupCode$FSI_0003>.$FSI_0003.main@()
stopped due to error
> ([] : int list).FirstOrDefault();;
val it : int = 0

We see by the above that the First() function is a partial function because it only has defined behavior for non-empty collections, whereas FirstOrDefault() has defined behavior for empty lists and returns the default value for the given generic type.  When we’re writing our code, we better know which one we’re dealing with as default values can cause unintended consequences.  Without proper tests around these calls, this can be hard to manage. 

Mitigating these partial functions with calls to such things as the Count() extension method is not necessarily the answer either.  When dealing with non-lazy lists such as an array or a List<T>, then such things as the number of items in our collection is well known and part of the state.  In the case of the lazy sequence, this is not the case, so any call to Count() would cause an evaluation of all items in the list, which could lead to unintended consequences.  So, if we’re chaining together 4 partial functions together, we could be doing something wrong.  All in all, total functions should be preferred usage, but of course within reason.

Can We Do Better Than Try?

Going back to one of the original points, can we do better than the standard Try (Parse/GetValue/etc) pattern that is pervasive throughout .NET code?  After all, it tries to do two things at once, and the success flag can be missed if one so chose.   F# decided to take a slightly different angle to this problem by returning the success flag and the value as a pair tuple.  Such an example would be like this memoize function below:

open System.Collections.Generic

let memoize f = 
  let t = new Dictionary<_,_>()
  fun n ->
    match t.TryGetValue(n) with
    | (true , value) -> value
    | (false, _    ) -> 
        let res = f n
        t.Add(n, res)
        res

What we’re doing above is calling the TryGetValue method which takes a key and returns a success flag as well as the out parameter for the value (set if there is success).  F# automatically creates a pair tuple for us which we can then pattern match against to take appropriate action.  Unfortunately, for this go around at least, C# does not treat tuples as first class data types, so a solution like this exclusive to F# at this time.  Overall, a much cleaner design, but maybe there’s another option?

Maybe there’s an Option…

Instead of dealing with the clumsiness of the above code, I would much rather solve this in a clean way using the F# Option type.  This allows us to definitively specify whether we have a value or not.  Oh great you say, another version of null.  Not quite, as sometimes null can be a value…  More to the point, we can have a universal solution to having a value without having to resort to the Nullable<T> or reference type null-ness nonsense.  Simply put, the Option type is no more than the following:

type Option<'a> =
  | Some of 'a
  | None

You’ll notice two constructors for the Option type, the Some which takes a value, and the None which takes no parameters.  Now, using this to extend the Dictionary class much like the F# Map class, we can add a function which returns an Option type whether we have a value or not.

type System.Collections.Generic.Dictionary<'k,'v> with
  member this.Lookup(k:'k) =
    match this.TryGetValue(k) with
    | (true , value) -> Some value
    | (false, _    ) -> None

Then we can rewrite our memoize function to utilize this in a cleaner fashion using the Option type for pattern matching purposes to show our true intentions of whether we have a value or not.

let memoize f = 
  let t = new Dictionary<_,_>()
  fun n ->
    match t.Lookup(n) with
    | Some v -> v
    | None   -> 
        let res = f n
        t.Add(n, res)
        res

Much cleaner and shows better intent on our part.

Conclusion

How far do we take this?  After all, making full functions everywhere sometimes isn’t feasible as we want to constrain our system.  Adding both partial and full implementations might just clutter up our entire API.  With this comes balance, but there are certainly places where this comes into play such as parsing values, querying values from collections and so on.  Could Code Contracts in .NET 4.0 help us here in terms of statically verifying that we’re not failing on a chained partial function?  We’re not quite finished here with this discussion as I’ve yet to cover the many values of null.

This entry was posted in F#, Functional Programming, Haskell. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • http://codebetter.com/members/Matthew.Podwysocki/default.aspx Matthew.Podwysocki

    @David,

    I agree that it should be non-null by default and languages such as Spec# tried to address such an issue which allowed you to specify by default that all reference types were non nullable.

    What’s interesting though is F#’s take on this which is that on F# objects, null is not defined as a state, unless you go well out of your way to specify that it is the unchecked default value. That’s how I think it should have been done.

    What I’m talking about here though is making it explicit about what your return type is. Could there be a failure? What happens if you find nothing? Null, an exception? Something such as Option makes such a thing explicit and that’s why I think going forward, it is a viable solution, but for languages such as F# where pattern matching is a first class citizen and not so much with C#.

    Matt

  • http://www.commongenius.com David Nelson

    @Matthew

    I don’t understand your statement. What is the semantic difference between Nullable and Option?

    And yes, having a single paradigm for nullable values would have been preferable. I think reference types should have been non-nullable by default from the beginning. But they aren’t, and we can’t change that. So the question is how do we continue to move forward in the state we are in with the least amount of confusion and the greatest productivity possible? Adding yet another paradigm for nullable values is not productive in my opinion.

  • http://codebetter.com/members/Matthew.Podwysocki/default.aspx Matthew.Podwysocki

    @David,

    The problem is that you have to explicitly say what is an invalid value type value whereas using an Option type is explicit. Sure, they added Nullable, but that then differs on how reference types are done. Why not use a single paradigm for nullable values?

    Matt

  • http://www.commongenius.com David Nelson

    “…having to differentiate between Nullable and then a reference is just broken.”

    In C# you don’t have to differentiate between them. You can compare a Nullable to null just like you can with a reference.

  • http://codebetter.com/members/Matthew.Podwysocki/default.aspx Matthew.Podwysocki

    @David

    Perhaps I’ll go further into why I think that even having to differentiate between Nullable and then a reference is just broken. Just as well, a None can never be mistaken for a real value, whereas a null can, and that’s where the real power is.

    Matt

  • http://shadowcoding.blogspot.com Erik

    @David – except for the cases where ‘null’ is a valid value. ‘None’ is never a valid value in any context, except for determining the presence of some other value.

  • http://www.commongenius.com David Nelson

    After reading your explanation of Option , I still say “Oh great, another version of null.” I can see how, if references were non-null by default and Nullable didn’t already exist, Option might be a cleaner way to express the concept. But given that those mechanisms already exist, and we can’t get rid of them, adding yet another way to express the same concept seems to do more harm than good.

    Making match work with Nullable would have made F# more consistent with the rest of the framework.