A Critique of Impure Reason

This post has been moved to

http://blog.higher-order.com/blog/2009/04/27/a-critique-of-impure-reason/

 

Advertisements

Higher-Order Java Parallelism, Part 3: Threadless Concurrency With Actors

Multi-core processing is changing the way we write programs. Modern computers come with multi-core or multiple processors, and modern applications no longer do just one thing at a time. But all of this has left Java a little bit behind. While languages like Erlang and Haskell, even other JVM languages like Scala, have powerful concurrency abstractions, concurrent programming in Java is still mostly based around the threading practices of way back in the time of applets.

Threading in Java has long been the domain of the brave or foolish, weaving a brittle and intricate web of manual synchronization between threads, or threads waiting on other threads to feed them work. We all know the examples from old Java books where there might be a thread running in a loop, sleeping for a second, and waking up to check if work needs to be done. This kind of implementation is still being taught to newcomers to the Java language. And when they start writing their own applications in Java, they find that this approach does not scale. Not along the axis of program complexity, nor the axis of the number of threads. In a complex application or one with many threads, you may end up with a program that stops doing anything at all for long periods of time, or worse, hangs forever, while still consuming all the operating system resources that go with those threads.

Blocking vs. Nonblocking Concurrency

The new concurrency library in Java 5 help this situation a great deal. At least some advancement has been made towards better concurrency abstractions. But there are still pitfalls. For instance, as we’ve seen, the Future and Callable interfaces are really rather jolly useful, and they can take us a long way towards the kinds of things possible in those other languages. But at the end of the day, something has to call Future.get(), and that something will block unless the value is already available, until such time that it is. This can result in deadlocks or starvation, as you may end up in a situation where all threads are blocking on Futures and none are available to advance your program forward. This would be bad. In fact, we could say that in a highly concurrent system, blocking on shared state is a catastrophic event.

The Java libraries are a veritable minefield of methods that, ultimately, block on shared memory monitors. There’s Thread.join(), Future.get(), BlockingQueue.take(), and the list goes on. But what if we could solve this problem with a simple abstraction? Could we solve the blocking problem once and re-use our solution for different situations? Let’s see.

Instead of sleeping or blocking, what we really want is the ability for our code to say: “When X happens, wake up a thread to execute me.” That is, in some sense we want threads to go on doing useful work, and have them be notified when an event happens that requires more work to be done. For example, when Bob comes into my office and asks me for this month’s TPS reports, he doesn’t stand around and wait for them, nor does he check periodically to see if I have them done, and he certainly doesn’t go to sleep to periodically wake up and poll me for them. He will continue with his work (or whatever it is he does all day), since I have instructions to send him those TPS reports as soon as they’re ready. We’re independent and concurrent actors, so we don’t wait on each other.

There is a model of computation, called the Actor Model, that works very much the same way. This is the model employed in the design of Erlang, and it was a major motivator for languages like Simula and Scheme. An implementation of Actors comes with the standard Scala library, and as it happens an implementation of it for Java has just been released as part of the 2.8 version of the Functional Java library. You can read more about the actor model on your own, but I will explain in some detail how it works in Functional Java.

Algorithm + Strategy = Parallelism

The first thing to explain is Parallel Strategies. I’ve talked about them before, but I’ll do so again here since they’re important to what I’m trying to demonstrate. The idea of the Parallel Strategy is that we can capture a threading pattern, or some method of concurrency in general, in a separate abstraction called a Strategy, and write our concurrent programs independently of the particular threading pattern.

In Java terms, a Strategy<A> is a way of turning any Java expression, whose evaluated type is A, into a value of type A. How this happens is implementation-specific, but we’d like it to be concurrent. First, we need a way of capturing the idea of an unevaluated A. It’s a bit like the idea behind Runnable, only that we can get a value out of it. Java 5 has the Callable interface, which is really close to representing “any expression”, since we can create Callables anonymously, but its call() method throws Exceptions, creating a kind of barrier for the programmer. Instead, we’ll use fj.P1<A>. It’s an abstract class with one abstract method: A _1()

The Strategy class has an instance method called par(P1<A>). This will evaluate the given P1 concurrently, immediately returning another P1 that can get the value once it’s ready. Behind the scenes, it actually creates a Callable, turns it into a Future, then returns a P1 that calls Future.get(), but all of that is implementation detail. Strategy comes with some static construction methods that implement basic threading patterns, and you can implement your own by passing a Future-valued function to the strategy(F<P1<A>, Future<A>>) method.

You might be concerned that P1 can’t throw any checked exceptions. This is by design. You’ll find that once you start working with concurrent effects, the value of checked exceptions goes out the window. Besides, if our P1 did need to throw errors, we would use Lazy Error Handling and declare this in the P1's type.

Strategy + Effect = Actor

So now we have concurrent evaluation and parallel transformations. But we’re still ending up with objects that require us to call a blocking method to get their value. The only way around that is to not return any values at all. Instead, we’re going to let our P1s have side-effects. So instead of returning a P1 that lets us wait for the value, our P1s are now going to update a variable or send the value somewhere.

To model a P1 that doesn’t return a useful value, I’m going to use the Unit type. It’s another one of those trivial but useful classes. The Unit type is part of Functional Java, and it’s simply a type that only has one possible value. It’s a lot like the standard library’s Void type, except Void is even further impoverished. When you use this type, you’re saying that the actual value is immaterial. So to describe a P1 that doesn’t return anything useful, we use the type P1<Unit>. We can also describe a transformation from some type T to Unit, with F<T, Unit>.

Functional Java comes with a more honest version of F<A, Unit>, which is Effect<A>. It’s more honest in the sense that it doesn’t pretend to be a transformation, which properly should have no side-effects. Effect<A> is explicit about having side-effects, which is what we intend for Actors. It has one method that doesn’t return a value at all: void e(A a).

We now have what we need to instantiate the Actor class. Here’s an example of how to create and use a simple actor:


  Strategy<Unit> s = Strategy.simpleThreadStrategy();

  Actor<String> a = Actor.actor(s, new Effect<String>()
    {public void e(String s)
      {System.out.println(s);}});

  a.act("Hello, actors!");


The actor receives “messages” on its act method, and the Strategy serves as a kind of mailbox for it. You will note that there’s no dependency at all on any particular threading strategy or any part of the concurrency library by the Actor class. The strategy could be sending our Effects to be evaluated by Threads, remote server farms, by ForkJoin tasks, or even by a Mechanical Turk.

An Actor With its Own Mailbox

The Actor class by itself doesn’t yet quite achieve the kind of actors model you see implemented in Erlang or Scala. The important difference is that the Actor above can process multiple “messages” simultaneously. This solution, then, is more general, although if an actor such as the above mutates some state, we’re likely to run into race conditions. Not to worry, it’s easy to construct the more specific case. We just add a queue and ensure that only one message is processed at a time. This construct is available as part of Functional Java, and it’s called QueueActor.

Of course, the “one thread at a time” requirement is not implemented using any blocking or synchronization. Instead, The QueueActor has two possible states–“suspended” or “running”–and, behind the scenes, this is enforced with an AtomicBoolean to keep it consistent in the face of concurrency. If the actor is suspended when it receives a message, it becomes running and its Effect is immediately handed off to its Strategy. If it’s already running, then callers will leave a message on its queue. The QueueActor's Effect is a concurrent, threadless recursion (i.e. it uses a Strategy rather than a Thread) that completely empties the queue, then puts the QueueActor's state back to “suspended”.

QueueActor puts some sanity into managing locally mutable state within an actor’s Effect, since it’s ensured that the state can only be mutated by one thread at a time. It is guaranteed to act on its messages in some unspecified order, but is otherwise semantically equivalent to Actor.

The Obligatory Example

We now have a light-weight implementation of the actor model, in Java. Don’t believe it? Have a look at this implementation of the canonical Ping-Pong example (imports omitted), and compare it to similar examples for Erlang and Scala.

First, a Ping actor that sends a number of pings to a given Pong actor, waiting for a pong reply each time before sending the next ping.


  public class Ping
    {private final Pong pong;
     private final Actor<Pong> ping;
     private final Actor<Integer> cb;
     private volatile int n;

     public Ping(final Strategy<Unit> s, final int i, final Pong pong, final int id, final Actor<Integer> callback)
       {n = i;
        this.pong = pong;
        cb = callback;
        ping = actor
          (s, new Effect<Pong>() {public void e(final Pong pong)
            {n--;
             if (n > 0)
                pong.act(Ping.this);
             else // Done. Notify caller.
                cb.act(id);}});}

    // Commence pinging
    public P1<Unit> start()
      {return pong.act(this);}

    // Receive a pong
    public P1<Unit> act(final Pong p)
      {return ping.act(p);}}

The Pong actor simply receives ping messages and responds.


  public class Pong
    {private final Actor<Ping> p;

    public Pong(final Strategy<Unit> s)
      {p = actor(s, new Effect<Ping>()
        {public void e(final Ping m)
          {m.act(Pong.this);}});}

    // Receive a ping
    public P1<Unit> act(final Ping ping)
      {return p.act(ping);}}

And here’s the main program that uses the Ping and Pong actors. There’s only one Pong actor that responds to any number of Ping actors pinging it concurrently. There’s also a further QueueActor that is contacted by each Ping actor once that Ping actor has done its work. The example uses a thread pool to back its Strategy. When all the Ping actors have sent all their pings and received all their pongs, the program is terminated by shutting down the thread pool.


  public class PingPong
    {private final int actors;
     private final int pings;
     private final Strategy<Unit> s;
     private final Actor<Integer> callback;
     private volatile int done;

     public PingPong(final ExecutorService pool, final int actors, final int pings)
       {this.actors = actors;
        this.pings = pings;
        s = Strategy.executorStrategy(pool);

        // This actor gives feedback to the user that work is being done
        // and also terminates the program when all work has been completed.
        callback = QueueActor.queueActor
          (s, new Effect<Integer>() {public void e(final Integer i)
            {done++;
             if (done >= actors)
               {System.out.println("All done.");
                pool.shutdown();}
             else if (actors < 10 || done % (actors / 10) == 0)
                     System.out.println(MessageFormat.format
                       ("{0} actors done ({1} total pongs).", done, pings * done));}})
          .asActor();}

    public static void main(final String[] args)
      {final int actors  = Integer.parseInt(args[0]);
       final int pings   = Integer.parseInt(args[1]);
       final int threads = Integer.parseInt(args[2]);
       new PingPong(Executors.newFixedThreadPool(threads), actors, pings).start();}

    public void start()
      {// We will use one Pong actor...
       final Pong pong = new Pong(s);
       // ...and an awful lot of Ping actors.
       for (int i = 1; i <= actors; i++)
           {new Ping(s, pings, pong, i, callback).start();
            if (actors < 10 || i % (actors / 10) == 0)
               System.out.println(MessageFormat.format("{0} actors started.", i));}}}

What follows is an example run of this Java program, with a million concurrent Ping actors pinging 7 times each. Each actor takes about 300 bytes of memory, so we need a sizable heap for one million of them, but 19 real Java Threads handle this quite nicely on my 8-core machine.

$ time java -Xmx600m -cp ../../../.build/classes/src:. concurrent.PingPong 1000000 7 19
100,000 actors started.
200,000 actors started.
300,000 actors started.
400,000 actors started.
500,000 actors started.
600,000 actors started.
700,000 actors started.
800,000 actors started.
900,000 actors started.
1,000,000 actors started.
100,000 actors done (700,000 total pongs).
200,000 actors done (1,400,000 total pongs).
300,000 actors done (2,100,000 total pongs).
400,000 actors done (2,800,000 total pongs).
500,000 actors done (3,500,000 total pongs).
600,000 actors done (4,200,000 total pongs).
700,000 actors done (4,900,000 total pongs).
800,000 actors done (5,600,000 total pongs).
900,000 actors done (6,300,000 total pongs).
All done.

real    1m16.376s
user    3m53.612s
sys     0m10.924s

As you see, these simple tools, built on basic components of the Java 5 concurrency library, paired with powerful abstractions from programming in functional style, makes it seem like we have millions of tasks running concurrently. We get a virtually unbounded degree of concurrency while never seeing any locks nor performing any blocking calls. The number of concurrent tasks is limited only by the size of your heap.

Again, everything you see here, and more, has been released just recently as part of the Functional Java library. So head over to their download page and put those extra cores to use with Java, today!

Higher-Order Java Parallelism, Part 1: Parallel Strategies and the Callable Monad

Now that even budget desktop and laptop computers are shipping with multi-core processors, it’s more important than ever to design programs so that they can take advantage of parallel processing. If you’re already writing software in Erlang or Parallel Haskell, then lucky you. But if you’re writing in Java (and are unable or unwilling to take up Scala), you will need to get organized.

Since Java 5, the JDK comes with a concurrency library that makes concurrent programming a lot easier than it used to be, and the library has been improved further in Java 6. This provides some basic abstractions that take care of the nitty-gritty of thread management for us. In this short series of articles, I’m going to show how we can build on top of that library to achieve concurrent programming of a higher order. We will employ some design patterns and simple abstract building blocks from functional programming. This style of writing concurrent programs will afford us the ability to:

  • Compose ordinary functions into concurrent programs.
  • Decouple parallel behavior from the algorithm itself.
  • Turn existing code into parallel code, without refactoring, at runtime.
  • Work with hypothetical results of concurrent computations, before they finish.

We will not use any locking or explicit synchronization. Basic familiarity with the java.util.concurrent package is assumed but not required. Familiarity with generics is highly recommended, and I recommend reading parts 1 and 2 of my Lazy Error Handling series as it explains some preliminaries. Most of the source code herein is already part of the Functional Java library, and I’ll use a fair bit of interfaces and classes from it, which I’ll introduce as we go along.

When we’re through, I will have demonstrated that a functional style of programming promotes code-reuse and modularity of parallel programs, above and beyond what can be achieved with the canonical object-orientated style.

The Callable Monad

The Callable interface is new in Java 5. It’s similar to the old Runnable interface that we all know and love, except that its call() method returns a result, and it may throw an Exception. If you followed my Lazy Error Handling articles, you will immediately notice that Callable is nearly identical to the Thrower interface described in that series, and you’ll know that it is indeed a potential monad. Treating Callable as a monad will allow us to work with it at a higher level of abstraction, letting us compose Callables, chain them, and project computations into them.

As you may already know, we need three ingredients (and only these three) to have a monad:

  1. Type construction. This comes free with the Callable interface, since it takes a type parameter (i.e. it’s generic).
  2. A unit function. This allows us to turn any value into a Callable that returns that value again.
  3. A binding function. This allows us to chain together Callables with functions, without actually calling them.

Here is the unit function for Callable (we will refer to it as unit in the text to avoid ambiguity):

  public static <A> Callable<A> callable(final A a) {
    return new Callable<A>() {
      public A call() {
        return a;
      }
    };
  }

And here is its binding function:

  public static <A, B> Callable<B> bind(final Callable<A> a, final F<A, Callable<B>> f) {
    return new Callable<B>() {
      public B call() throws Exception {
        return f.f(a.call()).call();
      }
    };
  }

Note: The F interface is from Functional Java. It represents a first-class function (a function that can be treated like any other value). F<A, B> has one method that takes an argument of type A and returns a value of type B.

Given the above two methods, Callable is now a monad. All monads are also functors, so we can define a Callable functor. What I mean by “functor” is that we can define a method to turn any existing function into a function on Callables, or, in other words, to apply any function to a value wrapped in a Callable, without actually calling it (i.e. lazily). This method is called fmap, and we can define it in terms of bind:

  public static <A, B> F<Callable<A>, Callable<B>> fmap(final F<A, B> f) {
    return new F<Callable<A>, Callable<B>>() {
      public Callable<B> f(final Callable<A> a) {
        return bind(a, new F<A, Callable<B>>() {
          public Callable<B> f(final A ab) {
            return new Callable<B>() {
              public B call() {
                return f.f(ab);
              }
            };
          }
        });
      }
    };
  }

Useful operations on Callables, that the methods above allow us to implement, include sequence—which is a method of turning a list of Callables into a single Callable that returns a list—and join, which peels one layer of Callables from a Callable<Callable<A>> so that it becomes just Callable<A>. You will find the source code for those as part of Functional Java.

Parallel Strategies

When working with the java.util.concurrent library, you normally don’t work with Threads directly. You might instead implement Callable<A> and submit instances of your implementation to an ExecutorService, which yields values of type Future<A> which represents a running computation that yields an A. Future has a method called get() that returns the result as soon as it’s ready. This is a great improvement over managing threads and Runnables ourselves, but it still ties our hands somewhat to the ExecutorService, and encourages tight coupling of our code to the parallelisation library.

Inspired by Haskell’s “Parallel Strategies”, let’s instead work with parallelisation in the abstract. For what is ExecutorService, really? It’s a method of turning Callables into Futures. That means it’s a kind of function, so we can abstract from it, using a function type: F<Callable<A>, Future<A>>. Such a function can then be backed by an ExecutorService, or by something else entirely, such as a load-balancing facility that serializes Callables to be executed on remote servers.

We will use a new class, Strategy<A>, that allows us to effectively separate parallelism from the algorithm itself:

  public final class Strategy<A> {

    private F<Callable<A>, Future<A>> f;

    private Strategy(final F<Callable<A>, Future<A>> f) {
      this.f = f;
    }

    public F<Callable<A>, Future<A>> f() {
      return f;
    }

    public static <A> Strategy<A> strategy(final F<Callable<A>, Future<A>> f) {
      return new Strategy<A>(f);
    }

  }

We’ll add a couple of static functions to create simple strategies:

  public static <A> Strategy<A> simpleThreadStrategy() {
    return strategy(new F<Callable<A>, Future<A>>() {
      public Future<A> f(final Callable<A> p) {
        final FutureTask<A> t = new FutureTask<A>(p);
        new Thread(t).start();
        return t;
      }
    });
  }

  public static <A> Strategy<A> executorStrategy(final ExecutorService s) {
    return strategy(new F<Callable<A>, Future<A>>() {
      public Future<A> f(final Callable<A> p) {
        return s.submit(p);
      }
    });
  }

One of the neat things that working with Strategies as functions allows us to do is use the Callable monad to compose them with existing functions. Any function can be lifted into the Callable monad using fmap, and then composed with a Strategy to yield a concurrent function. Moreover, we can use Strategies to convert existing functions to concurrent functions. The following method on Strategy will take any function and return the equivalent function that executes concurrently. Calling such a function will give you a Future value from which you can get the computed result whenever it’s ready.

  public <B> F<B, Future<A>> lift(final F<B, A> f) {
    final Strategy<A> self = this;
    return new F<B, Future<A>>() {
      public Future<A> f(final B b) {
        return self.f().f(new Callable<A>() {
          public A call() {
            return f.f(b);
          }
        });
      }
    };
  }

Alert readers will note that lift represents the Kleisli arrow for the Future monad. As for most monads, this arrow is a very useful kind of thing (see Lazy Error Handling, Part 2). In the Future monad, the Kleisli arrow provides parallel function application. If you already have a function f, then calling lift(f).f(x) will apply that function to x while simultaneously continuing with the next statement in the current program.

Lazy Futures

For functions involving Futures, we generally always want to have the Future type in the codomain (on the right-hand side, in the return type). If you think about it, you’ll see that functions that take Futures in their arguments won’t be able to do much without blocking on Future.get(). However, that doesn’t mean we can’t compose Future-valued functions together. It just means that we can only compose two of them together in such a way that we either wait for the first Future to obtain a value before firing off the second one, or we have to spark a new thread that does nothing but wait on the first Future. We’re much better off composing Callables and then turning those into Futures as needed. In fact, we might want to wrap values of type Future<A> inside of a Callable<A> again so that we can manipulate their return values while they are running:

  public static <A> Callable<A> obtain(final Future<A> x) {
    return new Callable<A>() {
      public A call() throws Exception {
        return x.get();
      }
    };
  }

And this takes us full circle back into the Callable monad, where we can compose computations, bind them to functions and map functions over them, all lazily. Which means: without actually asking what their values are until we absolutely need to know.

So we have accomplished what we set out to do in this first part of the series. We’ve written a Callable monad that lets us compose existing code into concurrent programs. We’ve implemented parallel strategies to run those programs in parallel, without regard to the actual parallelisation implementation. We have a Kleisli arrow that lets us run ordinary functions concurrently with each other, and finally we have a way of taking concurrently running computations back into the Callable monad. Not bad for a day’s work.

In the next part of this series, we will employ what we’ve built here to develop some higher-order parallel functions, including a parallel list functor. I hope you join me.

Lazy Error Handling in Java, Part 3: Throwing Away Throws

In parts 1 and 2, I covered how we could create first-class functions in Java that declare, in their types, the exceptions they might throw. I talked about exploiting a functor to lift any existing function so that it carries exception information with it, and how to compose functions of that kind by using a monad. In this third and last installment, we’re going to fully abandon checked exceptions and embrace the functional style of handling errors.

First, let’s examine what exceptions really are. During the evaluation of some expression (some function), unexpected conditions or data are encountered, with which it is undefined how to proceed. That is to say, for some inputs, the result of the computation is undefined. For example, when a parseInt function encounters a character that isn’t a digit (or a minus sign), what do we do? Well, we don’t know exactly, so we need some way to indicate failure. It would be invalid to return an integer. One could return null, but that isn’t exactly reliable, nor correct, and well, it would be lying since the type signature promises an integer.

So far, I’ve shown how we can use a Thrower monad to indicate that a computation can error. But the Thrower monad doesn’t do anything except wrap Java’s existing exception handling with a datastructure. And what do you think Java’s existing error handling is? It’s a monad. No, really, it is. In fact, Java is one big monad with bells on.

Note that throws changes the type of a method so that it may throw exceptions. Sound familiar? That’s what Thrower does (see part 1). Java’s throws keyword is the Kleisli arrow (see part 2) for Java’s exceptions functor. Binding is done with a combination of try and a Java keyword called “;” which I’m sure you use all the time without even batting an eye. You can then think of catch as a function that takes a monadic value and “extracts” from it, performing case analysis on the result.

Why did we need Thrower then? The difference is that throws needs to be hard-coded at method definition time, while our Kleisli arrow, which we called Partial, transforms function values “at runtime”.

Returning Errors

We now find that we can take this idea further. We don’t actually need the throws monad at all, and hence, we don’t need Thrower either. Don’t get me wrong. I like the Trower monad. I use the Thrower monad in production. But let’s explore what we can do to approach a more functional style of error handling, just for grins.

Instead of throwing Exceptions, we can just collapse error types into the return type. We’ll use a monad called Either, which looks very similar to Thrower except that errors are returned, not thrown. The Functional Java library luckily has Either already defined:


  public abstract class Either<A, B>

Either is the disjoint union of two types. A value of type Either<A, B> simply wraps a value of either type A or type B. The Either class comes with two static data constructors:


  public static <A, B> Either<A, B> left(final A a)

  public static <A, B> Either<A, B> right(final B b)

We’ll use the standard convention that left values represent failure, and that right values represent success.

Let’s see how we might use this data type for error handling. It’s really quite straightforward. The following declares the return type of parseInt to be either an Exception or an Integer (the left value can actually be any type, but… baby steps).


  public Either<Exception, Integer> parseInt(String s);

Oh, but wait. We’ve lost the laziness. Functional Java’s Either, by itself, is not lazy like Thrower is. If we do want laziness, we have to wrap the computation in a closure. For example, we can use fj.P1, which is just a generic interface with one method _1() which takes no arguments.


  public P1<Either<Exception, Integer>> parseInt(String s);

That’s better. Raising errors is done by returning a left value:


  return new P1<Either<Exception, Integer>>() {
    public Either<Exception, Integer> _1() {
      return Either.left(new Exception("You broke it!"));
    }
  };  

Sometimes you’re able to call functions that might fail, recovering with a default value:


  P1<Either<Exception, Integer>> a = parseInt(s);
  int addFive = (a._1().isLeft() ? 0 : a._1().right().value()) + 5;

Other times, you will be catching exceptions thrown by other people’s code and converting them to Either:


  public static P1<Either<Exception, Double>> divide(double x, double y) {
    return new P1<Either<Exception, Double>>() {
      public Either<Exception, Double> _1()
        try {
          return Either.right(x / y);
        } catch (Exception e) {
          return Either.left(e);
        }
      }
    };
  }

An error is sent to the caller simply by returning it. In the example below, the parse errors are explicitly caught and re-raised, while any errors raised by the divide method are implicitly returned.


  public static P1<Either<Exception, Double>> parseAndDivide(String sx, String sy) {
    final P1<Either<Exception, Integer>> x, y;
    x = parseInt(sx); y = parseInt(sy);
    return new P1<Either<Exception, Double>>() {
      public Either<Exception, Double> _1() {
        if (x.isLeft() || y.isLeft()) {
          return Either.left(x.isLeft() ? x.left().value() : y.left().value());
        }
        return divide(x.right().value(), y.right().value());
      }
    };
  }

Either is a monad, so we can lift existing functions into it, compose Eithers, sequence them, just like we did with Throwers. Here I’m taking an existing sine function and reusing it for values of type Either<Exception, Double>:


  F<Double, Double> sin = new F<Double, Double>() {
    public Double f(final Double a) {
      return Math.sin(a);
    };}

  P1<Either<Exception, Double>> z =
    divide(x, y).map(Either.rightMap_().f(sin));

If you need to work with more than one kind of error at a time, you can. Simply nest them on the right, like so: Either<E1, Either<E2, A>>. This is something we could not do with Thrower.

Nothing to Declare

Now, we’ve been returning Exceptions, but what does an Exception give you, really? You get some information about the failure, like an Exception type name, an error message, a stack trace, and possibly a nested Exception. If you know how to recover from an error, are you going to need the stack trace and the error message? No. You don’t actually have to return Exceptions to indicate errors if you don’t want to. You could just as well return String, Integer, fj.Unit, or your favourite data structure. Or you could return… Nothing.

There’s a simpler monad than Either which is often used in functional programming. It’s called Option (sometimes known as Maybe). An example is Functional Java’s fj.data.Option class, which I’ll demonstrate briefly here. Option<A> is a lot like a List<A> that can either be empty, or contain a single value of type A. For example:


  public static Option<Double> divide(double x, double y) {
    if (y == 0) {
      return Option.none();
    }
    return Option.some(x / y);
  }

This is similar to returning null on failure, except that callers won’t be surprised if the method fails to return a value. It’s declared in the method signature, after all. Option.none() has the correct type and will respond to methods declared on the Option class. If you want a lazy Option, simply wrap it in a P1 like we did with Either.

Not surprisingly, Option comes equipped with monadic methods like fmap (or simply “map”) to lift an F<A,B> to an F<Option<A>, Option<B>>, “bind” to chain Option-valued functions together, and “join” to collapse an Option<Option<A>> into an Option<A>. Option’s unit function is the static method Option.some. The instance method some() returns the value in the Option, if any. This last one is the Option equivalent to Thrower.extract().

So, now that we’re using Option and Either to indicate recoverable errors, what about unrecoverable errors? Those are the kinds of errors that we’re not going to catch, nor expect any reasonable application to catch. They should result in immediate and visible failure. For example, we might decide that it’s not reasonable to recover from a critical function if it returns nothing:


  Option<MyAppConfig> config = getAppConfig();
  if (config.isNone()) {
    throw new Error("No configuration!");
  } else {
    // Do stuff with config here.
  }

At this point, we have completely obviated Java’s checked exceptions mechanism. Recoverable errors are simply returned, and unrecoverable errors are thrown all the way out. We can now construct computations, and even if they might fail we can pass them around, compose them together, extract values from them, etc. All without a throws keyword in sight. The circle is complete.

That’s the end of Lazy Error Handling in Java. We’ve gone everywhere from throwing exceptions lazily, to higher-order chaining and manipulation of exception-throwing computations, to doing away with checked exceptions altogether. I hope you have found this short series educational and as fun to read as it was to write.

END

Lazy Error Handling in Java, Part 2: Thrower is a Monad

In Part 1, we talked about the Thrower interface, and how we could use it to separate error handling from the code that produces those errors. We also saw how we could exploit generics to create a functor that can promote an existing function to a function that can handle and throw errors.

In this post, we’ll see that the greatest benefit of Thrower is the ability to compose and sequence computations; even computations that may fail, all the while staying type-safe. We’ll introduce some new functions which allow us to perform a kind of Thrower arithmetic that in turn will let us manipulate values tucked away in Throwers, without ever having to extract from them, and without any exceptions being thrown or caught.

The ingredients we need are exactly the ingredients needed for a monad. That is: A type construction, a unit function, and a binding operation. The first ingredient, type construction, we already have. The Thrower interface is generic, so it lets us create a type Thrower<A, E> for any type A, given some Throwable type E.

The unit function is simple enough. It just creates a new Thrower that completely preserves the value that we put in it:

  
  public static <A, E extends Throwable> Thrower<A, E> unit(final A a) {
    return new Thrower<A, E>() {
      public A extract() {
        return a;
      }
    };
  }

The final ingredient for our Thrower monad is the bind function. It takes a Thrower and feeds its result to a function that returns another Thrower:

  
  public static <A, B, E extends Throwable> Thrower<B, E>
  bind(final Thrower<A, E> a, final F<A, Thrower<B, E>> f) {
    return new Thrower<B, E>() {
      public B extract() throws E {
        return f.f(a.extract()).extract();
      }
    };
  }

Check what this actually does. It creates a new Thrower, whose extract() method calls the extract() method on the first Thrower and applies the function f to the result. Nothing actually happens when we call bind, except that a new future call to the provided function is created. So we’re applying f to the contents of the Thrower a, but not suffering any consequences until later, whenever we decide to call extract() on the result of bind.

Note that we can only compose, in this manner, Throwers that throw the same Throwable type. To combine Throwers of different kinds of exceptions, you would need to return a Thrower<Thrower<A,E1>,E2>>, or something like a Thrower2<B,E1,E2> which can throw either E1 or E2. You cannot create a new generic Throwable like MyException<E1, E2> as the disjoint union of two Throwables, since subclasses of Throwable are not allowed to be generic due to Java’s runtime type erasure. Normally, I don’t care about the types of exceptions to that level of detail anyway, so if I need to work with different species of them, I just use the Exception type. That’s exactly what we’ll do in the example further down.

Now, notice the second argument to bind:

          
  final F<A, Thrower<B, E>> f

This is a pretty useful kind of thing. It’s a function that takes an A and either returns a B or throws an E. So it’s kind of like a Thrower that takes an argument. It’s actually a Kleisli Arrow for the Thrower monad. This is something we’ll need often, so let’s make them easy for ourselves to create:

  
  public abstract class Partial<A, B, E extends Throwable> implements F<A, Thrower<B, E>> {

  protected abstract B run(final A a) throws E;

    public final Thrower<B, E> f(final A a) {
      final Partial<A, B, E> self = this;
      return new Thrower<B, E>() {
        public B extract() throws E {
          return self.run(a);
        }
      };
    }
  }
   

It’s called Partial because it models a partial function, a function whose result is undefined for some values (i.e. it might throw an exception instead of returning a value). We’ve abstracted away all the Thrower stuff (it’s always going to work the same anyway), so concrete subclasses will only have to override the run method and do what a partial function would: turn an A into a B or throw an E.

It might seem that this has taken us full circle, that this was all an exercise in futility if we’re back to just writing methods that throw exceptions. But no, what we have so far actually gives us a lot of power under the hood.

For the token contrived example, let’s say that you have an abstract class which models an action to get some object from the network given a URLConnection, possibly throwing a network error:

    
  public abstract class NetTalker<A> extends Partial<URLConnection, A, Exception> {}

Even though this class has no body, its intent is very clear. Users of our API will just have to to override run to do something useful with a connection. But NetTalkers just use connections; the construction of URLConnections will be implementation-specific. We’ll use another class:

  
  public abstract class URLConnector extends Partial<URL, URLConnection, Exception> {}

Where concrete subclasses create different concrete URLConnections. To get an A from a URL, given some URLConnector c (that knows how to make HttpURLConnections) and some NetTalker n, we can do the following:

          
  Thrower<A, Exception> t = bind(c.f("http://foo.bar"), n);

Again, this doesn’t give you the object of type A, it merely gives you a Thrower which either returns the A or throws an exception. It’s lazy. You can take t and bind it to a Partial that takes an A and, say, writes it to a file, again returning a lazy Thrower. Then, when something calls extract() on it, the whole enchilada of operations will be executed in one go.

Or, if you have a whole list of Throwers, you could turn them into a Thrower that returns a list:

  

  public static <A, E extends Throwable> Thrower<List<A>, E> sequence(final List<Thrower<A, E>> ts)
  {
    return ts.foldRight(Throwers.<List<A>, E> unit(List.<A>nil()),
      Throwers.<A, List<A>, List<A>, E> liftM2(List.<A> cons()));
  }

The above collapses an entire list of throwers into a single thrower that returns all their results in one list. Here, List is a fj.data.List from Functional Java, not java.util.List, although this can be made to work quite easily with the Java Collections Framework by defining nil(), cons(), and foldRight(). liftM2 is almost exactly like fmap from Part 1, except that it lifts functions of arity-2.

We’ve only had a slight taste of what we can do with monads, even in Java, and there are relationships between fmap, bind, and unit that are worth understanding in detail. As you can see, it’s a powerful abstraction that can be applied to a wide variety of problems. It doesn’t stop with Throwers. Notice that NetTalker is also a generic type, with a single type parameter. Yes, it too is a functor. It too can be a monad, and can be composed as we did with Throwers. In fact, here’s fmap for NetTalker:

 
  public static <A,B> F<NetTalker<A>, NetTalker<B>>
    fmap(final F<A,B> f) {
      return new F<NetTalker<A>, NetTalker<B>>() {
        public NetTalker<B> f(final NetTalker<A> a) {
          return new NetTalker<B>() {
            public B run(URLConnection u) throws Exception {
              return f.f(a.f(u));
            }
          };
        }
      };
    }

I wonder what other types are monadic.

Unfortunately, you’ll find that you can’t write a generic Monad class in Java. Poor Java’s type system is not sophisticated enough to understand higher-kinded types, nor would that be sane without type inference in the compiler, which Java also lacks. You will have to write bind and unit for every kind of monad that you need, as well as all operations based on them (fmap, sequence, etc.), or dispense with the type system (in which case you may as well be writing Ruby), or switch to Scala.

I hope you enjoyed this little journey through monads and exceptions in Java. In Part 3 of this series, we will look at how we can remove try, catch, and throws entirely from our code, in favor of unit, bind, fmap, and join.

Lazy Error Handling in Java, Part 1: The Thrower Functor

How many try/catch blocks have you written in your day? Hundreds? Thousands? How many of the catch blocks look exactly the same? Is try/catch synonymous with copy/paste? Well, you may have written your last catch block. Follow me.

Let’s say that you’re coding in Java, and you want to call some code that might throw Exceptions, but you don’t want to be bothered with catching or throwing those exceptions in your code. Or you may want to call some existing code that doesn’t handle exceptions, passing it something that has methods which may throw exceptions. This can get ugly really fast, and you may end up with “throws Exception” declarations in places where you don’t want them, or try/catch blocks where you’re not sure whether to log the error or throw an unchecked exception, or what.

But, despair not. From very simple ingredients, we can conjure up powerful functional-style error handling to save the day. The first ingredient is a generic interface that declares, in its type expression, the exception thrown by its method:


  public interface Thrower<A, E extends Throwable>
   {public A extract() throws E;}
 

What we’re saying here is that there’s some operation preformed by extract() that returns an A and might throw an E. But the operation won’t take place (and won’t throw anything) until extract() is actually called.

What is such a thing useful for? Well, you could implement this interface, take parameters in a constructor, and perform some calculation in extract(). For example, reading a file and possibly throwing an IOException. But that’s a naive view of what generic interfaces are for. Thrower has higher aspirations than that. It’s not just an interface. It’s a functor. Observe:

  
  public static <A,B,E extends Throwable> F<Thrower<A,E>, Thrower<B,E>> fmap
    (final F<A,B> f)
    {return new F<Thrower<A,E>, Thrower<B,E>>()
      {public Thrower<B, E> f(final Thrower<A, E> a)
        {return new Thrower<B,E>()
          {public B extract() throws E
            {return f.f(a.extract());}};}};}
 

OK, so this is a static method on which class? It doesn’t matter, let’s say we put it in a utility class called Throwers (remembering to make it final and its constructor private). Just look at the type signature for now. The interface F<A,B> is from Functional Java, and it’s just an interface with one method B f(A a). So what fmap does is take a function from A to B, and return a function from Thrower<A,E> to Thrower<B,E>. In other words: fmap promotes the function f so that it works with computation that may throw checked errors, without f having been written to handle errors.

The implementation should be self-explanatory, albeit verbose (this is Java, after all). We construct an anonymous function to return, implementing its only method, which returns a new thrower which in turn yields the result of applying the function f to the contents extracted from the original thrower. It sounds a lot more complicated than it looks, so just read the code. All we’re doing is taking the A from a Thrower, turning it into a B, and wrapping that in another Thrower. If taking the A from the Thrower fails with an exception, we’ll carry that exception into the new Thrower.

The important thing to note here is that nothing actually happens until you call extract() on the resulting Thrower. Values of types F and Thrower are data structures that represent a computation to be carried out in the future. Anybody can pass Throwers around, but only a try/catch block, or a method that throws a Throwable of the right type may actually call extract() on it to run its computation and get its value. Java’s type system ensures that you can’t go wrong (although, as for unchecked exceptions, you’re on your own).

A simple example is in order. Just to illustrate the point, not to do anything very useful. This program reuses a function designed to operate on strings, so that it operates on operations that result in strings but might throw exceptions. The exception handling is done somewhere else, in ErrorHandler, outside of the main program, in an error handling library function called extractFrom, which we assume knows the right thing to do. Note that the main program has no try/catch block at all.

  
  public class Foo
   {public static void main (String[] args)
     {Thrower<String,IOException> readLn = new Thrower<String,IOException>()
       {public String extract() throws IOException
         {return new BufferedReader(new InputStreamReader(System.in)).readLine();}};
      System.out.println(ErrorHandler.extractFrom(Throwers.fmap(length).f(readLn)));}

    // A first-class function to get a String's length.
    public static F<String,Integer> length = new F<String,Integer>()
     {public Integer f(final String s)
       {return s.length();}};}
 

That’s right. You don’t have to handle exceptions at all in your code, even though you’re performing actions that throw exceptions. You just pass the buck to somebody who knows how to handle exceptions. It’s like outsourcing without the funny accents. OK, so you’re not really performing any IO actions, at least not syntactically speaking. ErrorHandler is doing that for you as well, since it’s the one calling extract() on the Thrower. This doesn’t make a difference to the compiled program except that the execution stack (and hence the stack trace, should things go terribly wrong) will look a bit different. If this makes you uncomfortable, here’s your hat and there’s the door. If you’ve ever used “Inversion of Control”, then this should be nothing new. In fact, in real life, you may want to inject something like ErrorHandler as a dependency, using Guice, or Spring, or something similar.

So there you go. That’s lazy error handling in Java, in a nutshell. But there’s more to come, as Thrower has higher aspirations still. It’s not just a functor. It’s a monad. We’ll talk about that in the next post.

A Terser Right Fold in Java

By using Functional Java‘s F2 and F3 interfaces, (functions of arity 2 and 3, respectively), I was able to make the right fold in Java look downright terse:


 public static <A, B> B foldr(final F2<A, B, B> f, B z, List<A> xs)
   {return fold(curry(new F3<F<B, B>, A, B, B>()
                 {public B f(final F<B, B> k, A a, B b)
                   {return k.f(f.f(a, b));}}),
                Function.<B> id(), xs).f(z);}
 

The curry method simply coerces an F3<A, B, C, D> by returning the equivalent F<A, F<B, F<C, D>>>.