Side Effects and Functional Programming

One of my first posts at CodeBetter was about side effects and how, when unmanaged, they can be truly evil.  Today, I want to revisit that topic briefly with regard to functional programming and managing side effects.  When I was out in Redmond a couple of months ago, I had the opportunity to sit down with Erik Meijer to discuss functional programming, among other topics.  We discussed a number of issues around managing side effects and state in your code, and how neither C# nor F# intrinsically supports such a concept.  Languages like Haskell, of course, do, with the IO monad and other such monadic structures.  Whether languages such as F# and Erlang are not pure functional programming languages is another matter, due to the fact that you don’t have to declare when you are side effecting (reading a database, writing to console, spawning a process, etc).

Erik will be giving a talk at JAOO about why functional programming still matters.  I think that in a day and age where Moore’s law is changing, we need to recognize that functional programming has a unique place because of the way it handles both purity and immutability.  Today, let’s cover some of the issues that come up when functional constructs meet side effects, to see what kind of impact they can have on us.


The Problem of Purity

In some previous posts, I talked about function purity as something to keep in mind when adopting a functional mindset.  To qualify as pure, a function needs to meet the following criteria:

  • Evaluate with the same result given the same input, and perform no state change.
  • Evaluation of the given function does not cause observable side effects (write to console, write to database, etc)
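
To make those criteria concrete, here is a small sketch in C#.  The names Add, NextId and counter are made up for illustration, and the snippet is written as top-level statements, which assumes C# 9 or later.

```csharp
using System;

// Pure: the result depends only on the arguments, and nothing else changes.
int Add(int a, int b) => a + b;

// Impure: every call mutates 'counter', so the same (empty) input
// produces a different result each time.
var counter = 0;
int NextId() => ++counter;

var first = NextId();
var second = NextId();

Console.WriteLine(Add(2, 3) == Add(2, 3)); // two calls to Add always agree
Console.WriteLine(first == second);        // two calls to NextId do not
```

Nothing in the signature of NextId tells the caller that it changes state, which is exactly the problem discussed in the rest of this post.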

Things get even more clouded as we bring lazy evaluation to the table.  One of the great things that came to C# in 2.0, and especially in 3.0 with LINQ, was the idea of lazy evaluation through the yield keyword.  This gave us the ability to defer work until it was absolutely needed.  That’s one of the hallmarks of functional programming, especially in a lazy language such as Haskell.

Let’s look at a code snippet both written in F# and the equivalent in C# and let’s determine the order of side effects.  Can they be predicted here?  When do the side effects happen?  Do I just get a nice list of my integers at the end?

F#

#light

let divisible_by_two n = 
  printf "%i divisible by two?" n
  n % 2 = 0

let divisible_by_three n =
  printf "%i divisible by three?" n
  n % 3 = 0
  
let c1 = {1 .. 100} |> Seq.filter divisible_by_two
let c2 = c1 |> Seq.filter divisible_by_three

c2 |> Seq.iter (printfn "%i")

C#

static bool DivisibleByTwo(int n)
{
    Console.Write("{0} divisible by 2?", n);
    return n % 2 == 0;
}

static bool DivisibleByThree(int n)
{
    Console.Write("{0} divisible by 3?", n);
    return n % 3 == 0;
}

var c1 = from n in Enumerable.Range(1, 100)
         where DivisibleByTwo(n)
         select n;

var c2 = from n in c1
         where DivisibleByThree(n)
         select n;

foreach (var c in c2) Console.WriteLine("{0}", c);

 

Well, what do you think?  If you said that the list will print out nicely at the end, you’d be very wrong.  In fact, our output might look something like this:

[Image: screenshot of the console output]

Why is that?  Well, because the LINQ query operators used here return an IEnumerable<T>, which is lazily evaluated, and in the case of F#, I was using sequences, which, unlike List<’a>, are also lazily evaluated.  Nothing runs until the final loop asks for a value, and then each number is pulled through both filters one at a time, so the prompts and the results end up interleaved.  What can we do about it?  That’s another matter altogether.  Unfortunately, the language constructs in .NET don’t forbid such things, so there’s no way of really preventing it.  Instead, you must be vigilant on your own to make sure this doesn’t happen.  Spec# can help a little, as it can ensure that state change will not occur during a given operation, but it neither prevents you from nor warns you about doing such things as writing to the console, logs, databases, etc.  Let’s move on to another example and see what happens.
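
Before moving on, here is one hedged sketch of how to make the order predictable in C#: force each stage with ToList(), so that all of its side effects run before the next stage starts.  The log list is a stand-in for the console that I made up so the order can be inspected, and the range is shortened to 1..6.  It is written as top-level statements, which assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical log standing in for the console.
var log = new List<string>();

bool DivisibleByTwo(int n) { log.Add($"{n} by 2?"); return n % 2 == 0; }
bool DivisibleByThree(int n) { log.Add($"{n} by 3?"); return n % 3 == 0; }

// ToList() runs each stage to completion before the next one starts.
var c1 = Enumerable.Range(1, 6).Where(DivisibleByTwo).ToList();
var c2 = c1.Where(DivisibleByThree).ToList();

foreach (var c in c2) log.Add($"result {c}");

// All "by 2?" entries come first, then all "by 3?" entries, then the results.
Console.WriteLine(string.Join(Environment.NewLine, log));
```

This trades laziness for predictability: every intermediate list is fully built in memory, which may not be acceptable for large or infinite sequences.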


Exception Management and Lazy Evaluation

Exception management under lazy evaluation is another interesting topic.  When it comes to lazily evaluated functions and structures, how do we handle the exceptions that may occur?  Let’s look at an example that tries to catch a DivideByZeroException that may be thrown by our code.  What do you think will happen here?  Will it be caught and just return us an empty collection?  Let’s find out:

F#

#light

let numbers = seq[1 ; 5; 2; 3; 7; 9; 0]

let one_over n =
  try
    n |> Seq.map(fun i -> 1 / i)
  with
    | err -> seq[]
    
numbers |> one_over |> Seq.iter(fun x -> printfn "%i" x)

C#

static IEnumerable<int> OneOver(IEnumerable<int> items)
{
    try
    {
        return from i in items select 1/i;
    }
    catch
    {
        return Enumerable.Empty<int>();
    }
}

var numbers = new[] {1, 5, 2, 3, 7, 9, 0};
foreach (var number in OneOver(numbers)) 
    Console.WriteLine("{0}", number);

 

What did you expect?  Did we in fact catch the exception?  The answer is no.  Because the returned structure isn’t evaluated until later, after the function has already returned, there isn’t any way for this try/catch block around our code to possibly work.  The question to you is, how might you fix this?
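
One possible answer, offered as a sketch rather than the definitive fix: force the evaluation inside the try block, so the exception is raised while the handler is still in scope.  OneOverEager is a name I made up for this variant, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<int> OneOverEager(IEnumerable<int> items)
{
    try
    {
        // ToList() runs the query here, inside the try block,
        // instead of deferring it to whoever enumerates the result.
        return items.Select(i => 1 / i).ToList();
    }
    catch (DivideByZeroException)
    {
        return Enumerable.Empty<int>();
    }
}

var numbers = new[] { 1, 5, 2, 3, 7, 9, 0 };
var results = OneOverEager(numbers).ToList();

// The trailing 0 makes the whole computation fall back to the empty sequence.
Console.WriteLine(results.Count);
```

Note the trade-off: one bad element throws away the whole result.  If you would rather skip only the offending element, the try/catch has to move inside the projection itself.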

The logic here is fundamentally flawed.  Let’s move on to another scenario, this time dealing with resource management.


Resource Management and Lazy Evaluation

Another area of exploration is resource management under lazy evaluation.  How do you ensure that your resources will be cleaned up once your evaluation is complete?  It’s relatively easy to make minor mistakes that may come back to haunt us.  Let’s look at a quick sample that uses lazy evaluation to read a file to completion.  What happens in the following code?

F#

#light

open System.IO

let readLines = 
  use reader = File.OpenText(@"D:\Foo.txt")
  let lines() = reader.ReadToEnd()
  lines

printfn "%s" (readLines())

C#

Func<string> readLines = null;

using(var stream = File.OpenText(@"D:\foo.txt"))
    readLines = stream.ReadToEnd;

Console.WriteLine(readLines());

 

The answer, of course, is that we get an ObjectDisposedException, because the reader is long gone by the time we invoke readLines.  It was disposed as soon as control left the using block, well before the deferred read ran.  So, how do we fix it?
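
One way to fix it, as a sketch: open the reader inside the function, so the resource lives exactly as long as the evaluation that needs it.  The temporary file is only there to keep the example self-contained; it is my stand-in for D:\foo.txt.  The top-level statements assume C# 9 or later.

```csharp
using System;
using System.IO;

// Hypothetical temp file standing in for D:\foo.txt.
var path = Path.GetTempFileName();
File.WriteAllText(path, "hello");

// The reader is opened and disposed inside the deferred function,
// so it is alive for the whole read and cleaned up right after.
Func<string> readLines = () =>
{
    using (var reader = File.OpenText(path))
        return reader.ReadToEnd();
};

var text = readLines();
Console.WriteLine(text);

File.Delete(path);
```

The design choice here is to capture the path, which is just data, rather than the open reader, which is a resource with a lifetime.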


Closures and Lazy Evaluation

One last topic for this post revolves around variable capture in closures and what it means to you with regard to lazy evaluation.  Let’s look at a quick example of some of the issues you might face.

C#

var contents = new List<Func<int>>();
var s = new StringBuilder();

for (var i = 4; i < 7; i++)
    contents.Add(() => i);

for (var k = 0; k < contents.Count; k++)
    s.Append(contents[k]());

Console.WriteLine(s);

 

We might expect the result in this case to be 456.  But that’s not the case at all here.  In fact, the answer you will get is 777.  Why is that?  Well, it has to do with the way that the C# compiler creates a helper class to enable this closure.  The lambda captures the variable i itself rather than its value at the time, and all three lambdas share that one variable, which is 7 by the time the loop has finished and we invoke them.  If you’re like me and have Resharper, you’ll notice that it gives a warning about this with "Access to modified closure".  The fix is to give ourselves a local variable inside the loop, initialized from i, so that each closure captures its own copy.  If we do that, our code will now look like this:

C#

var contents = new List<Func<int>>();
var s = new StringBuilder();

for (var i = 4; i < 7; i++)
{
    var j = i;
    contents.Add(() => j);
}

for (var k = 0; k < contents.Count; k++)
    s.Append(contents[k]());

Console.WriteLine(s);
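
As a side note on newer compilers: since C# 5, foreach gives each iteration a fresh loop variable, so capturing it behaves the way most people expect.  The for loop is unchanged and still shares one variable across iterations.  A sketch, assuming C# 9 or later for the top-level statements:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

var contents = new List<Func<int>>();

// Each iteration of foreach has its own 'i' (C# 5 and later),
// so every lambda captures a different variable.
foreach (var i in Enumerable.Range(4, 3))
    contents.Add(() => i);

var s = new StringBuilder();
foreach (var f in contents)
    s.Append(f());

Console.WriteLine(s);
```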

 

Jason Olson has a pretty good explanation in his post "Lambdas – Know Your Closures".  But where does F# fall into this picture?  Well, let’s try the first code sample written in F# to see what kind of results we get.

F#

#light

open System.Text

let contents = new ResizeArray<(unit -> int)>()
for i = 4 to 6 do
  contents.Add((fun () -> i))
 
let s = new StringBuilder()
for k = 0 to (contents.Count - 1) do
  s.Append(contents.[k]()) |> ignore
 
printfn "%s" (s.ToString())

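When we run this code, we get what we were expecting all along, "456".  Why is that?  The F# compiler handles this a little differently than the C# compiler does: the loop variable i is an immutable value that is bound afresh on each iteration, so each closure captures the value for its own iteration.  This to me seems a lot less error-prone.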


Wrapping It Up

I’d like to see how you might fix some of these issues?  What would you do differently in each example to ensure the correct result with lazy evaluation?

Functional programming and side effects are important topics.  I hope this post shed some light on how, with lazy evaluation, side effects that are not managed properly can be quite evil.  The idea in functional programming is to strive towards a side-effect-free style.  When we start introducing lazy evaluation, concurrency and other issues, we need to be mindful of side effects and how we manage them.

I highly encourage you to see Erik’s presentation that I linked above as it contains ideas that may help make you a better programmer.  And while you’re at it, watch the video of Simon Peyton Jones and Erik Meijer on Channel 9 called "Towards a Programming Nirvana".  It’s definitely an eye opener!

  • http://podwysocki.codebetter.com Matthew.Podwysocki

    @Paul

    Agreed that F# and the way they do things is a lot less error prone and a lot less clumsy than C#. The items that I showed here were to show that purity does in fact matter, and mixing imperative and functional styles is not easy and should be properly managed.

    Matt

  • Paul de Vrieze

    The last example is quite interesting. The F# way is actually more functional. Pure functions must be predictable, as such they can not look up the value of a variable out of its scope. This means that the lambda’s are always instantiated with the value of the variable i, not a reference to the variable. Not to forget the fact that doing it the C# way is very much more complicated on the compiler.

  • http://podwysocki.codebetter.com Matthew Podwysocki

    @Ben

    Thanks for the link! That’s quite useful for those just coming to grips with IO, Exception Management and how to handle them in a lazy evaluated language. I’d love for some of these patterns to be picked up by the F# community to enforce some purity.

    Matt

  • Ben Ellis

    In the spirit of Tony’s comment, here’s a lovely paper by Simon Peyton Jones that explains how Haskell can handle IO, mutable state, exceptions, and concurrency.

    He goes into operational semantics of the various features, but those are really there to fully specify what you can get the general idea of rather quickly.

    I got a lot out of the paper just from seeing IORefs (cells holding mutable values) and MVars (cells holding mutable values that allow for inter-thread communication)

    http://research.microsoft.com/~simonpj/papers/marktoberdorf/

  • http://podwysocki.codebetter.com Matthew Podwysocki

    @Travis

    Oh, I just noticed my copy/paste was a little bad there and since has been fixed.

  • travis

    In the C# code under “Resource Management and Lazy Evaluation”, is there a mistake? Instead of “readContents = stream.ReadToEnd; ” should it be “readLines = stream.ReadToEnd; ”

  • http://podwysocki.codebetter.com Matthew Podwysocki

    @Bryan

    Yes, I think it’s bad to usually say that picking a language will just solve the problem. Because there are right ways to think about a problem and wrong. F# also brings that to the table as well, as there is a more correct way of doing lazy evaluation, and less than ideal scenarios. Haskell has the same things.

    Many of these problems I threw up there had everything to do with beginner mistakes when dealing with laziness. Proper planning and trying not to mix imperative and laziness are the key to solving these problems.

    I’m really looking forward to Real World Haskell, as I think it’ll explain the joys of Haskell to those who aren’t already in the functional programming mindset. Not only to learn about the language, which is great and interesting, but also to learn the patterns around lazy evaluation.

    Matt

  • http://www.serpentine.com/blog/ Bryan O’Sullivan

    Actually, Haskell doesn’t get around these issues. Mixing side effects and lazy evaluation is a common source of beginner mistakes in Haskell, too. A classic instance would be a pure computation that consumes the contents of a file via readFile. The readFile action returns almost immediately, and provides the file contents lazily. If you then try to e.g. write the file without having first ensured that you’ve read all of its contents, you’ll get an exception because the RTS still has it open for reading.

    The reason I bring these up as classic beginner mistakes is that they are just that, at least in Haskell: errors made by people new to the language. In fact, most of the problems you cite with F# and C# above can occur in Haskell, Python, and probably all other languages that support lazy evaluation in some way.

    In Haskell, we encourage people to take a different tack at these problems. For instance, if you’re concerned about lazy stream I/O, try reading files (or network data) inside a monadic left fold instead. I wrote an example of how to program in this style in Real World Haskell, and there’s an exercise somewhere that challenges the reader to use it in a different setting.

  • http://podwysocki.codebetter.com Matthew Podwysocki

    @Tony

    Absolutely agreed that Haskell gets around many of these issues that we discussed. But, how do you educate those who are still using .NET languages, Java, etc on these issues? That’s an important step on them coming to terms with Haskell and the great things it can do for you.

    But, how do you fix it in the languages I picked? Can you?

    Matt

  • http://tmorris.net/ Tony Morris

    > I’d like to see how you might fix some of these issues?

    Use Haskell.

  • http://podwysocki.codebetter.com Matthew.Podwysocki

    @Sebastian

    True, they are not, but I didn’t want to get into debates about how pure, etc.

    Matt

  • Sebastian

    “Whether languages such as F# and Erlang are not pure functional programming languages”

    What do you mean “Whether”? They’re not *by definition*. That’s what “pure” means.