A better way of handling exceptions

The problem with exception handling

Like most developers, I usually give my applications a global unhandled exception handler, so that the inevitable unhandled exception won’t crash the system. The usual way to do this is to add a hook in the Application’s startup method. The App.xaml file would have a line like this…

<Application x:Class="MyApp"
             Startup="Application_Startup"
...other stuff here...

Then, in the App.xaml.cs, we’d have code like this…

private async void Application_Startup(object sender, StartupEventArgs e) {
  Current.DispatcherUnhandledException += UnhandledExceptionCaught;
  // Other stuff here as well
}

private void UnhandledExceptionCaught(object sender,
                                      DispatcherUnhandledExceptionEventArgs e) {
  // Handle the exception. Log it, maybe send an email, etc
}

The problem with this is that by the time you get to the exception handler, you’ve completely lost the context of what was happening. You’re in an isolated, global context, away from any window.

What you really want is to get immediate feedback when there’s been a problem, so you can react appropriately for the situation. For example, failure to load a customer list could be handled by trying again, up to a maximum number of attempts, before informing the user that the list couldn’t be loaded. By contrast, if they were trying to save an individual customer (presumably from a customer details window), you’d react differently.

On top of this is a secondary problem: if you’re using WCF (which we do for most of our applications, but any other server-based technology would probably be the same), you can run into trouble if an unhandled exception occurs on the server. In theory, WCF wraps any exceptions in a FaultException, and the client should be able to pick this up and handle it. Apart from the problems of global exception handlers mentioned above, I have found WCF to be flaky in this respect, and very often the FaultException doesn’t include anything useful. Even when it works, you’re still left with ugly exception handling in the calling code.

What we want is a robust method of handling exceptions that keeps the client in control.

A better approach

Around a year ago, whilst contemplating this issue, I had an idea that turned out to be an excellent answer to the problem. It turned out that this wasn’t an original idea (it was too obvious for that), but as I hadn’t come across it before, I didn’t know that at the time.

My idea was inspired by one of the most powerful language features in F#, namely the discriminated union (DU). This allows you to create a type that can be one of a specific number of subtypes. In OO terms, this is like having a (probably abstract) base class, and several implementations. However, in F#, the code required to do this is sooooo much cleaner and neater than in any OO language.

For example, in the scenario I described above, we would want to make a service call, and get back one of two (or possibly three, see later) results. Either we would get back the requested data (say, the customer list), or we would get back some error information that told us what went wrong.

In F#, we could create a DU as follows…

type CustomerResponse =
  | Customers of Customer list
  | Error of string

This means that we would be handling a CustomerResponse type, which would be the equivalent of our abstract base class, and would deal with it as one of two specific types, Customers, which would be a list of Customer (presumably defined elsewhere), or an Error type, which for this simple example I’ve assumed to be a string. Obviously, you can create much more elaborate DUs than this, but the point is that it only takes three lines of F# to do what requires three classes in C#.

One of the massive benefits you get from using DUs in F# is that the way you handle them (almost) requires you to consider every possibility. The equivalent C# code often assumes that the call was successful, and never considers failure. This is an extremely common problem, and is the cause of a significant percentage of null reference exceptions.

However, as I was writing in C#, I had to go the class way. So, having picked the witty class name of Fallible, which describes the type of operation we are handling, I defined my base class and two implementing classes…

public abstract class Fallible<T> {
}

public class Success<T> : Fallible<T> {
  public T Value { get; set; }
}

public class Failure<T> : Fallible<T> {
  public Exception Exception { get; set; }
}

With this, I could have my request return a Fallible, and know that it would be either a Success, which would contain the data, or a Failure, which would contain the exception.

I was then able to wrap the main service call in a Func that attempted to run the service call, caught any exceptions, and either returned the data wrapped in Success or the exception wrapped in Failure.
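A minimal sketch of that wrapper, using the early versions of the classes shown above, might look like this. The FallibleHelper class and its Try method are illustrative names I've assumed for the example, not necessarily what the real helper was called…

```csharp
using System;

public abstract class Fallible<T> {
}

public class Success<T> : Fallible<T> {
  public T Value { get; set; }
}

public class Failure<T> : Fallible<T> {
  public Exception Exception { get; set; }
}

// Hypothetical helper: the class and method names are assumptions
public static class FallibleHelper {
  public static Fallible<T> Try<T>(Func<T> serviceCall) {
    try {
      // If the call completes, wrap the data in a Success
      return new Success<T> { Value = serviceCall() };
    }
    catch (Exception ex) {
      // Any exception is caught and handed back as a Failure
      return new Failure<T> { Exception = ex };
    }
  }
}
```

The caller never sees an exception from the service call itself, only one of the two Fallible types.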

It turned out that when I tried to pass this out through WCF, I got a serialisation exception, as you can’t serialise an Exception. I had to modify the Failure class to have two string properties, one for the message and one for the stack trace…

public class Failure<T> : Fallible<T> {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}

Still not quite there though

This was all fine, but required a separate helper method, and exposed the Success.Value property. This allowed a careless programmer to do something like this…

Success<float> jim = (Success<float>)res;
float value = jim.Value;

This rather defeats a large part of the point of this exercise, which was to prevent the programmer from bypassing the failure case.

A bad idea that was actually quite good

Before addressing that issue, it’s worth noting another idea I had for this. There might be cases where the calling code tries to do something that wouldn’t raise an exception, but wouldn’t be a good idea. For example, we recently had an issue where you could delete a customer that had active quotes. As our deletes are all soft, this doesn’t cause any referential integrity problems, but would certainly cause business logic issues later on. In such a case, the service call would need to return something that indicated that what the calling code wanted to do wasn’t an exception as such, but was rejected as being a bad idea. To achieve this, I added the following two classes…

public class BadIdea<T> : Fallible<T> {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}

public class BadIdeaException : Exception {
  public BadIdeaException(string message) : base(message) {
  }
}

The idea here was that if the caller tried to do something like this, we would raise a BadIdeaException, which the Fallible handler (see below) would pick up and wrap in a BadIdea.
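For the customer example, the business logic might raise the exception along these lines. This is only a sketch: the Customer class, its ActiveQuoteCount and Deleted properties, and the CustomerLogic.Delete method are all invented for illustration…

```csharp
using System;

public class BadIdeaException : Exception {
  public BadIdeaException(string message) : base(message) {
  }
}

// Hypothetical entity, with just enough on it to show the check
public class Customer {
  public string Name { get; set; }
  public int ActiveQuoteCount { get; set; }
  public bool Deleted { get; set; }
}

// Hypothetical business logic class
public static class CustomerLogic {
  // Soft-deletes the customer, unless that would be a bad idea
  public static Customer Delete(Customer customer) {
    if (customer.ActiveQuoteCount > 0) {
      // Not a bug as such, just something the caller shouldn't be doing
      throw new BadIdeaException(
        customer.Name + " has active quotes, so cannot be deleted");
    }
    customer.Deleted = true; // soft delete, the row stays in the database
    return customer;
  }
}
```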

The final result

After some refactoring, I ended up with the following. The two methods on the Fallible class will be explained below…

public abstract class Fallible<T> {
  public static Fallible<T> Do(Func<T> f) {
    Fallible<T> result;
    try {
      T fResult = f();
      result = new Success<T> { Value = fResult };
    }
    catch (BadIdeaException ex) {
      result = new BadIdea<T> { Message = ex.Message, StackTrace = ex.StackTrace };
    }
    catch (Exception ex) {
      // NOTE that in a real application, we would log the exception at this point
      result = new Failure<T> { Message = ex.Message, StackTrace = ex.StackTrace };
    }
    return result;
  }

  public void Match(Action<T> onSuccess, Action<string, string> onFailure,
                                         Action<string, string> onBadIdea = null) {
    switch (this) {
      case Success<T> success:
        onSuccess(success.Value);
        break;
      case BadIdea<T> badIdea:
        if (onBadIdea != null) {
          onBadIdea(badIdea.Message, badIdea.StackTrace);
        } else {
          onFailure(badIdea.Message, badIdea.StackTrace);
        }
        break;
      case Failure<T> failure:
        onFailure(failure.Message, failure.StackTrace);
        break;
    }
  }
}

The only change to the Success and Failure classes shown earlier was to make the get accessors internal, which prevents the naughty programmer from accessing the Value property and the exception details directly. This forces them to use the Match() method, meaning that they have to think about what to do in case of an exception.

Note that the Match() method shown above uses C# 7 features. If you’re still using C# 6, you’ll need to use this version instead…

public void Match(Action<T> onSuccess, Action<string, string> onFailure,
                                       Action<string, string> onBadIdea = null) {
  if (this is Success<T>) {
    Success<T> success = (Success<T>)this;
    onSuccess(success.Value);
  } else if (this is BadIdea<T>) {
    BadIdea<T> badIdea = (BadIdea<T>)this;
    if (onBadIdea != null) {
      onBadIdea(badIdea.Message, badIdea.StackTrace);
    } else {
      onFailure(badIdea.Message, badIdea.StackTrace);
    }
  } else if (this is Failure<T>) {
    Failure<T> failure = (Failure<T>)this;
    onFailure(failure.Message, failure.StackTrace);
  }
}

How it’s all used

So how does this all work now? As you can see, the Fallible class has a Do() method that takes a Func. That Func will be the service call. For example, in a WCF service, this would be right at the top level, ie in the .svc file…

[OperationContract]
public Fallible<Customer> GetCustomer(int id) {
  return Fallible<Customer>.Do(() => CustomerBll.GetCustomer(id));
}

The calling code would then use the Fallible object’s Match() method to decide what to do with the result…

Fallible<Customer> result = customerService.GetCustomer(613);
result.Match(
  c => Customer = c,
  (m, st) => AlertUser(m, st) // m is the message and st is the stack trace
);

I’m passing two Actions to the method, the first to say what to do in case of success, and the second for what to do in case of failure. This code assumes that there is a local property named Customer, which is where we want the data placed. AlertUser() would alert the user that the service call had failed, log the error, decide what to do next, etc.

In a case like this, the BadIdea case is unlikely to be useful, so I only passed two Actions to the Match() method, making use of the fact that if the onBadIdea Action were null, the onFailure Action would be used instead. If the service call had been to delete a customer, then I would have handled onBadIdea as well, to warn the user that this wasn’t a good idea.
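A delete call that handles all three cases could be sketched as follows. To keep the sample self-contained, it repeats a condensed copy of the Fallible classes. The CustomerDeleter class, its Status property and the deleteCustomer delegate (standing in for the service call) are made up for the example, and Do() and Match() are called in the same place here, whereas in the real application Do() would run on the server and Match() on the client…

```csharp
using System;

public class BadIdeaException : Exception {
  public BadIdeaException(string message) : base(message) { }
}

// Condensed copy of the classes from this post
public abstract class Fallible<T> {
  public static Fallible<T> Do(Func<T> f) {
    try {
      return new Success<T> { Value = f() };
    }
    catch (BadIdeaException ex) {
      return new BadIdea<T> { Message = ex.Message, StackTrace = ex.StackTrace };
    }
    catch (Exception ex) {
      return new Failure<T> { Message = ex.Message, StackTrace = ex.StackTrace };
    }
  }

  public void Match(Action<T> onSuccess, Action<string, string> onFailure,
                    Action<string, string> onBadIdea = null) {
    switch (this) {
      case Success<T> success:
        onSuccess(success.Value);
        break;
      case BadIdea<T> badIdea:
        // Fall back to onFailure if no onBadIdea handler was supplied
        (onBadIdea ?? onFailure)(badIdea.Message, badIdea.StackTrace);
        break;
      case Failure<T> failure:
        onFailure(failure.Message, failure.StackTrace);
        break;
    }
  }
}

public class Success<T> : Fallible<T> {
  public T Value { internal get; set; }
}

public class Failure<T> : Fallible<T> {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}

public class BadIdea<T> : Fallible<T> {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}

// Hypothetical caller, eg a view model behind a customer details window
public class CustomerDeleter {
  public string Status { get; private set; }

  // deleteCustomer stands in for the real service call
  public void Delete(int id, Func<int, bool> deleteCustomer) {
    Fallible<bool> result = Fallible<bool>.Do(() => deleteCustomer(id));
    result.Match(
      _ => Status = "Customer deleted",
      (m, st) => Status = "Delete failed: " + m,    // onFailure
      (m, st) => Status = "Not a good idea: " + m   // onBadIdea
    );
  }
}
```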

The end result is a very simple approach that results in a very stable code base. Since implementing this on one of my projects, I’ve never had an unhandled exception! If anything goes wrong, the exception is handled gracefully, the user is informed, and I get an error report using the technique I described to show you what the user was doing when the error happened.

I must come clean at this point and admit that I haven’t shown all the code here. The classes and methods described above are useful for service calls that return data. For ones that don’t, you need to create non-generic versions of the above. The code is so similar to what I’ve shown that I didn’t think it was worth filling the post with it.
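As a rough guide, a non-generic version could look something like the following sketch. It is simply the generic code above with the T removed, so treat the details as an assumption rather than the definitive version…

```csharp
using System;

public class BadIdeaException : Exception {
  public BadIdeaException(string message) : base(message) { }
}

// Assumed shape of the non-generic classes, for calls that return no data
public abstract class Fallible {
  public static Fallible Do(Action f) {
    try {
      f();
      return new Success();
    }
    catch (BadIdeaException ex) {
      return new BadIdea { Message = ex.Message, StackTrace = ex.StackTrace };
    }
    catch (Exception ex) {
      return new Failure { Message = ex.Message, StackTrace = ex.StackTrace };
    }
  }

  public void Match(Action onSuccess, Action<string, string> onFailure,
                    Action<string, string> onBadIdea = null) {
    switch (this) {
      case Success _:
        // No value to pass on, just signal that the call worked
        onSuccess();
        break;
      case BadIdea badIdea:
        (onBadIdea ?? onFailure)(badIdea.Message, badIdea.StackTrace);
        break;
      case Failure failure:
        onFailure(failure.Message, failure.StackTrace);
        break;
    }
  }
}

public class Success : Fallible {
}

public class Failure : Fallible {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}

public class BadIdea : Fallible {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}
```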

Isn’t this just Either?

As I said, when I first wrote this code, I took inspiration from DUs, but knew little more about functional techniques, and didn’t know anything about monads, other than they were supposed to be mind-bendingly difficult to understand. More recently, I’ve realised that I basically implemented the Either monad, well, sort of. What I did had two differences.

First, I added a third case, which would make the name Either grammatically incorrect (correctly used, “either” means one of two options, not three). However, other than my mother, who is very particular about her grammar, I don’t think this would really bother anyone!

More significantly, real functional programmers would be throwing up their arms in horror at the way I did this. A real FPer would have added a Bind() method, and used that to pass the service call down through the layers, rather than just wrapping the top-level call in the Do() method, and relying on exceptions to control what type was returned. I didn’t do this, mainly because this code was to be used in a mature, traditionally-designed OO application, and the amount of work required to refactor the code to be fully FP would have been significant, and would have caused our CEO to have a fit!
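For the curious, a Bind() method on a simplified, two-case Fallible might look something like this. This is a sketch of the general technique, not code from my project: I've left out BadIdea to keep it short, and the Steps class with its ParseAge and CheckAdult methods is invented purely to have something to chain…

```csharp
using System;

public abstract class Fallible<T> {
  // Runs the next step only if this one succeeded,
  // otherwise passes the failure along untouched
  public Fallible<TOut> Bind<TOut>(Func<T, Fallible<TOut>> next) {
    switch (this) {
      case Success<T> success:
        return next(success.Value);
      case Failure<T> failure:
        return new Failure<TOut> {
          Message = failure.Message,
          StackTrace = failure.StackTrace
        };
      default:
        throw new InvalidOperationException("Unknown Fallible type");
    }
  }
}

public class Success<T> : Fallible<T> {
  public T Value { internal get; set; }
}

public class Failure<T> : Fallible<T> {
  public string Message { internal get; set; }
  public string StackTrace { internal get; set; }
}

// Hypothetical steps, each returning a Fallible instead of throwing
public static class Steps {
  public static Fallible<int> ParseAge(string s) {
    int age;
    return int.TryParse(s, out age)
      ? (Fallible<int>)new Success<int> { Value = age }
      : new Failure<int> { Message = "Not a number: " + s };
  }

  public static Fallible<int> CheckAdult(int age) {
    return age >= 18
      ? (Fallible<int>)new Success<int> { Value = age }
      : new Failure<int> { Message = "Under 18: " + age };
  }
}
```

With this in place, Steps.ParseAge("42").Bind(a => Steps.CheckAdult(a)) gives a Success, whereas a failure at either step comes out as a Failure, without any exceptions being thrown along the way.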

However, what I ended up with could very easily be used in the code we have, and provides some very significant benefits in terms of stability.

Postscript: WCF

One extra problem I hit when doing all of this was that WCF threw a hissy fit when I tried to return the Fallible classes. It turns out that this is because I was returning a derived class. When WCF generates the classes for your service calls, it only knows about base classes, and if you try to use classes that inherit from those, you get a really unhelpful, rather generic WCF error.

The answer turned out to be fairly simple. I needed to use the [ServiceKnownType] attribute on my service calls. As I was going to be doing this several times for each call (once for each of the generic Fallible types and once for each of the non-generic ones), I ended up writing a helper…

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class KnownTypeHelper {
  public static IEnumerable<Type> GetNonGenericFallibleTypes(ICustomAttributeProvider provider) {
    return new List<Type> {
      typeof(Fallible),
      typeof(Success),
      typeof(BadIdea),
      typeof(Failure)
    };
  }

  // Private, as WCF only needs to see the type-specific methods below
  private static IEnumerable<Type> GetFallibleTypes<T>(ICustomAttributeProvider provider) {
    return new List<Type> {
      typeof(Fallible<T>),
      typeof(Success<T>),
      typeof(BadIdea<T>),
      typeof(Failure<T>)
    }.Union(GetNonGenericFallibleTypes(provider));
  }

  public static IEnumerable<Type> GetIntFallibleTypes(ICustomAttributeProvider provider) {
    return GetFallibleTypes<int>(provider);
  }

  public static IEnumerable<Type> GetPatientFallibleTypes(ICustomAttributeProvider provider) {
    return GetFallibleTypes<Patient>(provider);
  }
}

We start off with the wittily named GetNonGenericFallibleTypes() method, which creates a List of the four non-generic Fallible types. We then have a generic version of this. Then, as we need them, we add methods like the last two, which are specific to the return type of a service call. For example, the GetIntFallibleTypes() method is appropriate for a service call that returns an int.

As an aside, at the time of writing, the .NET team has never felt the necessity to implement generic attributes, even though there is no technical reason not to (I saw this in a comment by Jon Skeet). For that reason, we need to add code like the above to get around this problem.

With this in hand, we can now use a variation of the [ServiceKnownType] attribute that takes a method name as follows…

[OperationContract]
[ServiceKnownType("GetPatientFallibleTypes", typeof(KnownTypeHelper))]
public Fallible<Patient> GetPatient(int id) {
  Fallible<Patient> patient =
        Fallible<Patient>.Do(() => PatientServiceLogic.GetPatient(id));
  return patient;
}

This is a bit of a pain, as I had to remember to add the attribute to all service methods, and had to create a new method in the helper class for each return type. However, it works!

A working example

You can grab a working sample that shows how to do this, as well as having the full code for the Fallible classes (including the non-generic versions) from my GitHub repository.