Test Automation Week — Fail Fast or Just Plain Fail

Let me start with a little story from a couple of years ago.  My team inherited a less-than-desirable codebase from another team*.  There were some black box regression tests that we could use as a safety net, but we found that they were extremely fragile and unhelpful.  The system had a multitude of external dependencies on COM objects, configuration files, external systems, remoting points, virtual directories, and web services.  If there was anything wrong with any of that environment, the application wouldn’t run properly.  The main problem in testing was that some processing that relied on these external dependencies happened in a background thread.  When there was an exception, the application would log it and then swallow it.  The testing harness would just say that the test failed.  That hurt.  Debugging code when the problem could be almost anywhere isn’t an efficient exercise.  Not to worry though; once I started to sink my talons into the codebase, I made a relatively small change to make the code…

Fail Fast

Fail Fast is a long-standing piece of advice for software construction, and even more so for automated testing.  From Jim Shore:

Failing fast is a non-intuitive technique: “failing immediately and visibly” sounds like it would make your software more fragile, but it actually makes it more robust.  Bugs are easier to find and fix, so fewer go into production.

“Fail Fast” simply means this:  when something goes wrong in the code, fail immediately and directly.  Don’t pass go, don’t hide the real exception behind some sort of human-readable message, don’t log the exception and then pretend that nothing is wrong, and finally, don’t put up a friendly “System Failure, please try again in a few minutes” message box.  Some of these choices are proper, even desirable, in production mode, but hiding the gory details can make your life sheer hell when it comes time to debug a failing test.  Just throw the raw exception in a way that lets you immediately relate the exception to the test failure.
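
To make that concrete, here is a tiny before-and-after sketch.  The class and method names are made up for illustration; the “quiet” version is exactly the pattern from the story above, while the fail fast version lets the raw exception escape straight into the test failure.

    using System;

    public class MessageProcessor
    {
        // The "quiet" version: the exception is logged and swallowed, so the
        // test harness can only say "the test failed"
        public void ProcessQuietly(object message)
        {
            try
            {
                process(message);
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine(ex);
            }
        }

        // The fail fast version: no try/catch at all, so the raw exception
        // escapes and shows up directly in the test failure
        public void ProcessAndFailFast(object message)
        {
            process(message);
        }

        private void process(object message)
        {
            // the real work would go here
        }
    }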

My little solution to the problem I described above was pretty simple.  I typically make exception logging, and sometimes auditing, run through some sort of interface that’s pulled out of StructureMap.  Part of the reason for doing this is simply to make unit testing via mock objects easier when auditing is a first class requirement.  The second reason is to create a testing mode where exceptions are just thrown right back up.  Here’s an example from my current project.

The interface is just this:


    public interface IMessageLogger
    {
        void LogMessageFailure(string messageId, object request, object response);
        void LogMessageFailure(string messageId, object request, Exception ex);
        ...
        void LogApplicationFailure(string messageCaption, string messageBody, Exception ex);
    }
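
As a quick aside on that first reason, a hand-rolled fake for unit testing might look something like the class below.  This is purely my sketch, and it only covers the IMessageLogger methods shown above:

    using System;
    using System.Collections.Generic;

    // A recording fake: unit tests can assert that a failure was audited
    // without dragging log4net or any real infrastructure into the test
    public class RecordingMessageLogger : IMessageLogger
    {
        public readonly List<string> FailedMessageIds = new List<string>();

        public void LogMessageFailure(string messageId, object request, object response)
        {
            FailedMessageIds.Add(messageId);
        }

        public void LogMessageFailure(string messageId, object request, Exception ex)
        {
            FailedMessageIds.Add(messageId);
        }

        public void LogApplicationFailure(string messageCaption, string messageBody, Exception ex)
        {
            FailedMessageIds.Add(messageCaption);
        }
    }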

In production mode the real concrete class is just a simple wrapper around log4net.  In testing mode I use this:


    public class FailFastMessageLogger : IMessageLogger
    {
        public static string LastTradeId;

        public void LogMessageFailure(string messageId, object request, object response)
        {
            // Don't log the failed response; blow up with it instead
            throw new ApplicationException(response.ToString());
        }

        public void LogMessageFailure(string messageId, object request, Exception ex)
        {
            // Don't log the exception; rethrow it so the test fails with the real error
            throw ex;
        }

        ...
    }
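
The production implementation isn’t shown in this post.  For contrast, a minimal log4net wrapper might look something like the sketch below; the class name and message formats are my guesses, and it only covers the IMessageLogger methods shown above:

    using System;
    using log4net;

    public class Log4NetMessageLogger : IMessageLogger
    {
        private static readonly ILog _log = LogManager.GetLogger(typeof(Log4NetMessageLogger));

        public void LogMessageFailure(string messageId, object request, object response)
        {
            // Log and carry on: exactly what we do NOT want in testing mode
            _log.ErrorFormat("Message {0} failed with response: {1}", messageId, response);
        }

        public void LogMessageFailure(string messageId, object request, Exception ex)
        {
            _log.Error("Message " + messageId + " failed", ex);
        }

        public void LogApplicationFailure(string messageCaption, string messageBody, Exception ex)
        {
            _log.Error(messageCaption + ": " + messageBody, ex);
        }
    }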

And in a “BootstrapFixture” class that I call in my StoryTeller SetUp, I do this to inject a FailFastMessageLogger:


        public void StubTheServer()
        {
            // Reset StructureMap back to its default configuration, then switch to the stubbed profile
            ObjectFactory.ResetDefaults();
            ObjectFactory.Profile = UserInterfaceRegistry.STUBBED;

            // Replace the real logger with the fail fast version for this test run
            ObjectFactory.InjectStub<IMessageLogger>(new FailFastMessageLogger());
            MessageBoxRecorder.Start();
        }

The call to ObjectFactory.InjectStub<T>(T stub) “spring loads” StructureMap to return a FailFastMessageLogger anytime an IMessageLogger is requested.
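
To show where that pays off, here is a hypothetical call site (the method and helper names are made up, not code from the project).  In the production profile the catch block just logs; once the stub is injected, the very same line rethrows and the test fails with the real exception:

        public void ProcessMessage(string messageId, object request)
        {
            try
            {
                // stand-in for the real message handling
                handleMessage(request);
            }
            catch (Exception ex)
            {
                // Production profile: logs through log4net and carries on.
                // Testing mode: the injected FailFastMessageLogger rethrows,
                // so the failing test points straight at the real problem.
                ObjectFactory.GetInstance<IMessageLogger>().LogMessageFailure(messageId, request, ex);
            }
        }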

I really wouldn’t mind some feedback here.  Do you like this solution, think it’s ugly, or something else?  Comments are open.


Defensive Assertions

I’m not a big fan of defensive coding in general, but I think it adds value at strategic spots in your code.  Inside test automation code you can make test failures far easier to diagnose through strategic use of assertions whenever the test harness detects something strange.  Instead of allowing the test harness code to stay quiet, make the automated test blow up right there.

Here’s an example that comes to mind right off the bat.  I can happily test my WinForms client through Fit tests by manipulating all the little UI widgets (yes, I’m going to do a post in Build your own CAB showing how I do this).  There’s a danger in doing the tests this way.  Think of a button.  If you’re doing things programmatically, you can quite happily make the button’s Click event fire even when the button is disabled or hidden.  Your test could give a false positive.  Instead, do this:


        public void ClickButton(string label)
        {
            // Just don't worry about *how* we find the Button for the moment
            ButtonElement button = _binder.FindElementByLabelOrAlias<ButtonElement>(label);
            assertElementExists(button, label);
 
            // Throw an exception if the inner button is not enabled and visible
            button.AssertVisibleAndEnabled();
            button.Click();
        }
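
The body of AssertVisibleAndEnabled() isn’t shown here, and it doesn’t need to be anything fancy.  A rough sketch, assuming ButtonElement wraps a plain System.Windows.Forms.Button in a _button field (the exception message is my own wording):

        public void AssertVisibleAndEnabled()
        {
            if (!_button.Visible || !_button.Enabled)
            {
                throw new ApplicationException(string.Format(
                    "Button '{0}' cannot be clicked (Visible = {1}, Enabled = {2})",
                    _button.Text, _button.Visible, _button.Enabled));
            }
        }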

That’ll stop a false positive in its tracks.  A more pernicious problem is a silent failure.  In one of my main Presenter classes I perform a validation on the Model class before a creation or update message is passed to the server.  If the Model object is not valid, the validation messages are shown on the UI and the call to the server is cancelled.  Early on, I noticed some regression tests suddenly failing for no discernible reason.  With some debugging, I quickly realized that the issue was invalid data being set up in the test case.  Originally, the test data was perfectly fine, but as we added more and more validations, the test data silently tripped some of the new validation rules.  I fixed that issue with this:


        public Fixture UpdateTradeWithProperties(string tradeId)
        {
            Trade trade = findTrade(tradeId);
 
            ITradeFixture fixture = TradeFixtureFinder.FindTradeFixture(trade.TypeDescription);
            fixture.Trade = trade;
 
            fixture.RowsFinished += delegate
                                        {
                                            // Fail fast if the test data itself is invalid
                                            assertTradeIsValid(trade);
                                            _tradeUpdateLog = _repository.UpdateTrade(trade);
                                        };
 
            return (Fixture) fixture;
        }

Note the assertTradeIsValid() call inside the RowsFinished delegate.  If the Trade object defined in the test doesn’t pass its validation rules, the call to assertTradeIsValid() blows up and puts all of the validation messages into the exception message.  This little bitty thing has made the tests more resilient to the code changing by making it easier to fix up the test data.  There’s another lesson there about dealing with test data, but I’ll let that one slide.
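
For the record, assertTradeIsValid() isn’t shown above either.  The idea is simply to run the same validation the Presenter would run and throw with the whole list of broken rules; something along these lines, where Validator.GetValidationMessages() is a stand-in for however the real validation gets invoked:

        private void assertTradeIsValid(Trade trade)
        {
            // Validator.GetValidationMessages() stands in for the real validation call
            string[] messages = Validator.GetValidationMessages(trade);
            if (messages.Length > 0)
            {
                throw new ApplicationException(
                    "The test data for this Trade is invalid:\n" + string.Join("\n", messages));
            }
        }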


Summary Points

  • Debugging is hard.  Debugging big tests that involve lots of classes and some external dependencies can be wicked hard.  Keep that debugger turned off as much as possible.
  • Make the test failure messages reflect the real reason for the test failures by failing fast.
  • Don’t let hidden issues like clicking a hidden button or exception logging make test failures harder to diagnose.
  • Think about creating an alternative “testing” mode where exceptions aren’t trapped or logged.


* Of course, the guys who wrote the miserable code are still there and going strong, while the folks on my team who wrestled with that crappy code are all long gone.

About Jeremy Miller

Jeremy is the Chief Software Architect at Dovetail Software, the coolest ISV in Austin. Jeremy began his IT career writing "Shadow IT" applications to automate his engineering documentation, then wandered into software development because it looked like more fun. Jeremy is the author of the open source StructureMap tool for Dependency Injection with .Net, StoryTeller for supercharged acceptance testing in .Net, and one of the principal developers behind FubuMVC. Jeremy's thoughts on all things software can be found at The Shade Tree Developer at http://codebetter.com/jeremymiller.
  • Andy Alm

    Great stuff Jeremy. I like what you are doing here and do some very similar things. The only thing I do differently is the way I throw existing exceptions (like your LogMessageFailure method that takes an Exception does). I like to keep the original stack trace of the exception, so instead of throwing it inside the LogMessageFailure method, I would have the LogMessageFailure method return a boolean to indicate whether the exception should be rethrown or not. That way, I can rethrow it from its original catch block and retain the stack trace (which can be very helpful when trying to find out exactly where the failure occurred). It makes using the method a little more involved; instead of just calling it, you have to do something like this:

    catch (Exception ex)
    {
        if (logger.LogMessageFailure("myid", null, ex)) throw;
    }

    But it’s worth it to maintain the original stack trace IMHO. What do you think?

  • Gil Zilberfeld (http://gil-zilberfeld.blogspot.com)

    Jeremy,

    First – I’d choose the same way, meaning a logger that throws an exception, since it’s the easiest way I can see.

    In addition, in multi-threaded systems it is very important to fail fast: if you keep going (in this case into the logger), you lose time during which the stacks on the other threads change, and that might be relevant information.

    In the past I didn’t find a solution apart from crashing inside the code. That means removing try-catches and so on, leaving just the bare code.