Higher-Rank Polymorphism in Scala

I was reading Tim Carstens’s post about Haskell features he’d like to see in other languages, and one of those features was rank-2 types. Tim says that the killer application for this is ensuring the safety of resource access in the type system. In short, this feature lets us guarantee that if we have a value of a particular type, then it represents a system resource which can be safely accessed, and that it’s a compile-time type error to access a closed or unopened resource. I find it really fascinating that we can get this kind of assurance from a type system.

First, a brief explanation of rank-2 polymorphism. A regular (rank-1) polymorphic function in Scala might look like this:

def apply[A](f: A => A, a: A): A = f(a)

This will apply the given function to a given value of any particular type, and it ensures that the argument and return types of the function are the same as the type of the value to which it is applied. There’s only one type involved here, but it can be any particular type. But what if we wanted to say that the function f should work for all types? For example, consider a function that puts its argument in a list. Such a function should work for all types, as long as the element type of the resulting list matches the type of the argument. For example:

def singletonList[A](a: A): List[A] = List(a)

Now say we want to take such a function as an argument to another function. With just rank-1 polymorphism, we can’t do this:

def apply[A,B](f: A => List[A], b: B, s: String): (List[B], List[String]) =
  (f(b), f(s))

It’s a type error because B and String are not A. That is, the type A is fixed everywhere to the right of the quantifier [A,B], so within the body f is monomorphic. We really want the polymorphism of the argument to be maintained so we can apply it polymorphically in the body of our function. Here’s how that might be expressed if Scala had rank-n types:

def apply[B](f: (A => List[A]) forAll { A }, b: B, s: String): (List[B], List[String]) =
  (f(b), f(s))

That’s not legal Scala code, so let’s see how we can work around this. Note that a method can be polymorphic (it can take type parameters), but a value of type Function1, Function2, etc., is monomorphic. So what we do is represent a rank-2 polymorphic function with a new trait that accepts a type argument in its apply method:

trait ~>[F[_],G[_]] {
  def apply[A](a: F[A]): G[A]
}

This trait captures something more general than Function1, namely a natural transformation from functor F to functor G. Note that A appears nowhere in the type and isn’t given until the apply method is actually called. We can now model a function that takes a value and puts it in a list, as a natural transformation from the identity functor to the List functor:

type Id[A] = A

val singletonList = new (Id ~> List) {
  def apply[A](a: A): List[A] = List(a)
}

And we can now take such a function as an argument:

def apply[B](f: Id ~> List, b: B, s: String): (List[B], List[String]) =
  (f(b), f(s))

I wonder if we can use this to get static guarantees of safe resource access, as in the SIO monad detailed in Lightweight Monadic Regions.

Higher-Order Java Parallelism, Part 1: Parallel Strategies and the Callable Monad

Now that even budget desktop and laptop computers are shipping with multi-core processors, it’s more important than ever to design programs so that they can take advantage of parallel processing. If you’re already writing software in Erlang or Parallel Haskell, then lucky you. But if you’re writing in Java (and are unable or unwilling to take up Scala), you will need to get organized.

Since Java 5, the JDK has shipped with a concurrency library that makes concurrent programming a lot easier than it used to be, and the library was improved further in Java 6. It provides some basic abstractions that take care of the nitty-gritty of thread management for us. In this short series of articles, I’m going to show how we can build on top of that library to achieve concurrent programming of a higher order. We will employ some design patterns and simple abstract building blocks from functional programming. This style of writing concurrent programs will afford us the ability to:

  • Compose ordinary functions into concurrent programs.
  • Decouple parallel behavior from the algorithm itself.
  • Turn existing code into parallel code, without refactoring, at runtime.
  • Work with hypothetical results of concurrent computations, before they finish.

We will not use any locking or explicit synchronization. Basic familiarity with the java.util.concurrent package is helpful but not required. Familiarity with generics is highly recommended, and I recommend reading parts 1 and 2 of my Lazy Error Handling series, as they cover some preliminaries. Most of the source code herein is already part of the Functional Java library, and I’ll use a fair number of interfaces and classes from it, which I’ll introduce as we go along.

When we’re through, I will have demonstrated that a functional style of programming promotes code reuse and modularity of parallel programs, above and beyond what can be achieved with the canonical object-oriented style.

The Callable Monad

The Callable interface is new in Java 5. It’s similar to the old Runnable interface that we all know and love, except that its call() method returns a result, and it may throw an Exception. If you followed my Lazy Error Handling articles, you will immediately notice that Callable is nearly identical to the Thrower interface described in that series, and you’ll know that it is indeed a potential monad. Treating Callable as a monad will allow us to work with it at a higher level of abstraction, letting us compose Callables, chain them, and project computations into them.

As you may already know, we need three ingredients (and only these three) to have a monad:

  1. Type construction. This comes free with the Callable interface, since it takes a type parameter (i.e. it’s generic).
  2. A unit function. This allows us to turn any value into a Callable that returns that value again.
  3. A binding function. This allows us to chain together Callables with functions, without actually calling them.

Here is the unit function for Callable (we will refer to it as unit in the text to avoid ambiguity):

  public static <A> Callable<A> callable(final A a) {
    return new Callable<A>() {
      public A call() {
        return a;
      }
    };
  }

And here is its binding function:

  public static <A, B> Callable<B> bind(final Callable<A> a, final F<A, Callable<B>> f) {
    return new Callable<B>() {
      public B call() throws Exception {
        return f.f(a.call()).call();
      }
    };
  }

Note: The F interface is from Functional Java. It represents a first-class function (a function that can be treated like any other value). F<A, B> has one method that takes an argument of type A and returns a value of type B.

Given the above two methods, Callable is now a monad. All monads are also functors, so we can define a Callable functor. What I mean by “functor” is that we can define a method to turn any existing function into a function on Callables, or, in other words, to apply any function to a value wrapped in a Callable, without actually calling it (i.e. lazily). This method is called fmap, and we can define it in terms of bind:

  public static <A, B> F<Callable<A>, Callable<B>> fmap(final F<A, B> f) {
    return new F<Callable<A>, Callable<B>>() {
      public Callable<B> f(final Callable<A> a) {
        return bind(a, new F<A, Callable<B>>() {
          public Callable<B> f(final A ab) {
            return new Callable<B>() {
              public B call() {
                return f.f(ab);
              }
            };
          }
        });
      }
    };
  }

Useful operations on Callables that the methods above allow us to implement include sequence, which turns a list of Callables into a single Callable that returns a list, and join, which peels one layer of Callables from a Callable<Callable<A>> so that it becomes just Callable<A>. You will find the source code for those as part of Functional Java.
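
As a rough illustration (a sketch only, not the Functional Java implementation, and using java.util.List rather than Functional Java’s own List type), here is how join and sequence can be written in terms of what we already have:

  // join collapses a nested Callable: calling the outer Callable yields the
  // inner one, which is then called for the final result.
  public static <A> Callable<A> join(final Callable<Callable<A>> a) {
    return bind(a, new F<Callable<A>, Callable<A>>() {
      public Callable<A> f(final Callable<A> inner) {
        return inner;
      }
    });
  }

  // sequence turns a list of Callables into a single Callable that, when called,
  // calls each element in order and collects the results into a list.
  public static <A> Callable<java.util.List<A>> sequence(final java.util.List<Callable<A>> as) {
    return new Callable<java.util.List<A>>() {
      public java.util.List<A> call() throws Exception {
        final java.util.List<A> rs = new java.util.ArrayList<A>();
        for (final Callable<A> c : as) {
          rs.add(c.call());
        }
        return rs;
      }
    };
  }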

Parallel Strategies

When working with the java.util.concurrent library, you normally don’t work with Threads directly. You might instead implement Callable<A> and submit instances of your implementation to an ExecutorService, which yields values of type Future<A>, each representing a running computation that yields an A. Future has a method called get() that blocks until the result is ready and then returns it. This is a great improvement over managing threads and Runnables ourselves, but it still ties us to the ExecutorService and encourages tight coupling of our code to the parallelisation library.

Inspired by Haskell’s “Parallel Strategies”, let’s instead work with parallelisation in the abstract. For what is an ExecutorService, really? It’s a means of turning Callables into Futures. That means it’s a kind of function, so we can abstract over it using a function type: F<Callable<A>, Future<A>>. Such a function can then be backed by an ExecutorService, or by something else entirely, such as a load-balancing facility that serializes Callables to be executed on remote servers.

We will use a new class, Strategy<A>, that allows us to effectively separate parallelism from the algorithm itself:

  public final class Strategy<A> {

    private F<Callable<A>, Future<A>> f;

    private Strategy(final F<Callable<A>, Future<A>> f) {
      this.f = f;
    }

    public F<Callable<A>, Future<A>> f() {
      return f;
    }

    public static <A> Strategy<A> strategy(final F<Callable<A>, Future<A>> f) {
      return new Strategy<A>(f);
    }

  }

We’ll add a couple of static functions to create simple strategies:

  public static <A> Strategy<A> simpleThreadStrategy() {
    return strategy(new F<Callable<A>, Future<A>>() {
      public Future<A> f(final Callable<A> p) {
        final FutureTask<A> t = new FutureTask<A>(p);
        new Thread(t).start();
        return t;
      }
    });
  }

  public static <A> Strategy<A> executorStrategy(final ExecutorService s) {
    return strategy(new F<Callable<A>, Future<A>>() {
      public Future<A> f(final Callable<A> p) {
        return s.submit(p);
      }
    });
  }
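
Assuming these factories live on the Strategy class, backing a Strategy with a fixed-size thread pool from java.util.concurrent is then a one-liner (the method name and pool size here are arbitrary):

  // A small usage sketch: a Strategy backed by a four-thread pool.
  public static Strategy<String> pooledStrategy() {
    return Strategy.<String>executorStrategy(Executors.newFixedThreadPool(4));
  }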

One of the neat things that working with Strategies as functions allows us to do is use the Callable monad to compose them with existing functions. Any function can be lifted into the Callable monad using fmap, and then composed with a Strategy to yield a concurrent function. Moreover, we can use Strategies to convert existing functions to concurrent functions. The following method on Strategy will take any function and return the equivalent function that executes concurrently. Calling such a function will give you a Future value from which you can get the computed result whenever it’s ready.

  public <B> F<B, Future<A>> lift(final F<B, A> f) {
    final Strategy<A> self = this;
    return new F<B, Future<A>>() {
      public Future<A> f(final B b) {
        return self.f().f(new Callable<A>() {
          public A call() {
            return f.f(b);
          }
        });
      }
    };
  }

Alert readers will note that lift represents the Kleisli arrow for the Future monad. As with most monads, this arrow is a very useful kind of thing (see Lazy Error Handling, Part 2). In the Future monad, the Kleisli arrow provides parallel function application: if you already have a function f, then calling lift(f).f(x) will apply that function to x while the current program simultaneously continues with its next statement.
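
For instance, here is a hedged sketch in which fib stands in for any expensive F<Integer, Integer> we already have (the names example and fib are hypothetical):

  public static Integer example(final F<Integer, Integer> fib) throws Exception {
    final Strategy<Integer> s = Strategy.<Integer>simpleThreadStrategy();
    final Future<Integer> x = s.lift(fib).f(45); // fib(45) starts running on another thread
    // ... carry on with other work here while the computation proceeds ...
    return x.get(); // block only at the point where the result is actually needed
  }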

Lazy Futures

For functions involving Futures, we almost always want the Future type in the codomain (on the right-hand side, in the return type). If you think about it, you’ll see that functions that take Futures as arguments can’t do much without blocking on Future.get(). That doesn’t mean we can’t compose Future-valued functions together; it just means that to compose two of them we must either wait for the first Future to yield a value before firing off the second, or spark a new thread that does nothing but wait on the first Future. We’re much better off composing Callables and then turning those into Futures as needed. In fact, we might want to wrap values of type Future<A> inside a Callable<A> again so that we can manipulate their eventual results while they are running:

  public static <A> Callable<A> obtain(final Future<A> x) {
    return new Callable<A>() {
      public A call() throws Exception {
        return x.get();
      }
    };
  }

And this takes us full circle back into the Callable monad, where we can compose computations, bind them to functions and map functions over them, all lazily. Which means: without actually asking what their values are until we absolutely need to know.
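
Here is a sketch of that round trip, assuming the fmap and obtain methods defined above are in scope (the names pipeline, parse, and render are hypothetical placeholders):

  public static Callable<String> pipeline(final Strategy<Integer> s,
                                          final F<String, Integer> parse,
                                          final F<Integer, String> render) {
    final Future<Integer> running = s.lift(parse).f("42"); // starts executing immediately
    return fmap(render).f(obtain(running)); // composed lazily; nothing blocks until call()
  }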

So we have accomplished what we set out to do in this first part of the series. We’ve written a Callable monad that lets us compose existing code into concurrent programs. We’ve implemented parallel strategies to run those programs in parallel, without regard to the actual parallelisation implementation. We have a Kleisli arrow that lets us run ordinary functions concurrently with each other, and finally we have a way of taking concurrently running computations back into the Callable monad. Not bad for a day’s work.

In the next part of this series, we will employ what we’ve built here to develop some higher-order parallel functions, including a parallel list functor. I hope you join me.

Noumenal Null

Although most would not consider omniscience a reasonable requirement for a software system, it’s nevertheless a common complaint that computer systems are not omniscient. It’s not often worded quite that way, but we’re all familiar with the argument that a system (say, a database) must be able to represent a fact that it doesn’t contain, or rather, to represent missing values somehow.

That “somehow” normally takes the form of a special symbol, ω (omega), or “null”. This special symbol is assigned to variables that have not yet been initialized, that have no meaningful value in that context, or whose value is unknown, etc.

Now, let’s take a step back to reality. In reality (the world that’s actually out there), things either are or are not. Reality doesn’t care if we know about it or not, and there are no facts which aren’t facts about something in particular. And there are certainly no existents out there which are “not yet initialized”.

To be clear, the issue we are talking about pertains to variables in whatever language we’re using. A variable properly has some value, but it may be any value (of the appropriate type). The notion of null is that of a variable which has no value at all. It’s my opinion that this amounts to a philosophical error known as reification of the zero.

Implications of Null

A software system models some abstraction from reality. To model the facts of reality (or a virtual reality), a very simple logical system suffices, since things either are or are not (law of the excluded middle). A simple enough system represents only facts about reality (or, rather, abstractions from such facts), and those things which aren’t facts about reality are simply omitted. This is not to say that it represents all facts of reality (that would be absurd), but only the ones about which we have something to say in the context of our system.

For a simple but contrived example, let’s make a kind of language that represents facts about people’s birthdays. We have a type of expression saying “The birthday of Person p is on Date d.” Call it a logical theorem that there exists some person p whose birthday is on some date d. To prove this theorem, one merely needs to supply some (but any) person who has some (but any) birthday. According to the Curry-Howard isomorphism, we could think of it as a function, a constructor of values of a specific type. Call the type and the constructor function BD, and let it take two arguments, p and d, which represent a person and a date, respectively, yielding values of the form BD(p,d).

Now comes the question of what to do when a fact is not yet known, not applicable, or can’t be found out (because of some error, for example). There are several options for representing this, of which I will consider two. In a system that tracks people’s birthdays, we could handle unknown birthdays in two ways:

  1. State that the fact is unknown, using a new kind of expression: “Person p has an unknown birthday.” If this were a programming language, it would be something like a function UB which takes a value p and returns a value of the form UB(p). It’s like a logical formula with one variable p, such that any value for p provides a proof of the hypothesis that there exists some person whose birthday is unknown.
  2. Create a special null value called ω, such that for an unknown birthday we would simply say: “The birthday of person p is on date ω”, or: BD(p, ω), for some p. Or, if the person whose birthday it is is unknown: BD(ω, d), for some d.

These are different approaches to the problem in a profound way. In the former, we have stayed within the framework of representing “that which is”, i.e. truths only, and used the semantics of the language to represent facts about the unknown (the fact that it is unknown). In the latter, we have imported the notion of the unknown, from the meta-level, into our algebra. As far as our algebra is concerned, ω is a kind of something that is nothing in particular and has no specific type. It has become a way of formalizing vagueness in our logic. We have in fact changed the logic of our language in such a way that the Curry-Howard isomorphism no longer holds, and the law of excluded middle is no longer applicable. Supplying null for either of the variables of BD(p,d) creates a value of the form BD(p,d) that is not a proof of the logical formula that BD(p,d) represents. Observe that “The birthday of person John is on date ω” does not constitute a proof of the hypothesis that there exists some person p whose birthday is on some date d.

In short, the difference between these two approaches is that the former treats the fact that something is unknown as just another fact, using grammar that already exists, while the latter is an attempt to formalize “unknown” as a new grammatical construct.

But once we invite null into the system, we have to come up with the semantics for how to handle it. We have to ask the question: “What exactly do we mean by null?” For every operator, we must decide what it means for one of its operands to be null. What does ω = ω yield? In some languages (most imperative programming languages), ω is considered equal to itself. In others, it’s considered an error to even ask the question, resulting in a runtime error or a non-terminating program. In yet others (SQL), such an operation always yields ω itself. This last kind is a kind of three-valued logic, which is fraught with difficulties. For example, in SQL, A=A is not true for all values of A. In other words, we have arrived at a logic in which the law of identity is not a tautology!

The implication of this last point is that a Boolean expression in such a system is to be considered valid even though it is neither true nor false. This means we can’t consider such a system to deal with propositions, since all valid propositions are either true or false (a proposition that is neither true nor false is arbitrary, i.e. invalid). From the standpoint of propositional logic, then, a database is corrupt to the extent that it contains nulls.

In fact, once we allow it, we may find that we need more than one kind of null, with differing semantics depending on what we mean by it in that context. One kind may mean “unknown”, another might mean “inapplicable”, and of course we might need a third to say that “we do not know if this is applicable or not”. When taken to its logical conclusion, this leads to an explosion of kinds of null. As a concrete example, consider a database that tracks people’s death dates. For the predicate Person p has a death Date of d, one might need at least three kinds of null: one to say that the person is not yet dead (inapplicable), another to say that the person is dead, but we don’t know when they died (unknown), and a third to say that we don’t know whether or not they have died (unknown whether or not a value is applicable). And this is just one really simple example.

But in practice, the meaning of null is often decided ad hoc by the user of the system or language, and may even not be decided at all, in which case the result of a program employing nulls has been consigned to chance to some degree.

Obviating Null

It should be abundantly clear that introducing null into a language is not much of a solution to anything. We don’t do that with natural language, and certainly not in formal logic, so why do it with programming languages? I believe it’s partially because people are used to thinking of a programming language as a machine rather than a logic. They are still thinking in terms of initializing pointers or registers and transitioning between memory states. But the power of languages is to abstract these things away, so that we can concentrate on the algebra while the machine does the arithmetic. Yet, I think we owe the acceptance of null largely to the fact that people are used to sloppy thinking. They’re used to treating nothing as if it were a thing, and using concepts of which they only know the approximate meaning.

Fortunately, there are real-world solutions for obviating the notion of null. Functional programming, for example, does away with it by employing monads. Haskell has the Maybe monad, which abstracts over case analysis: a value of type Maybe A takes one of two possible forms, Nothing or Just a, where a is a value of type A. This might seem like the same thing as null, but the fact that only a fixed set of operations is defined on the Maybe type keeps everything well defined; in particular, you cannot use a Maybe A where an A is expected without explicitly handling the Nothing case. This concept is applicable to any language that supports a functional style, and can even be implemented in Ruby. In some other languages, this monad is called Option.
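
To make the idea concrete, here is a minimal sketch of such a type in Java (a hypothetical illustration, not Haskell’s Maybe or any library’s Option; F is the Functional Java function interface used earlier): a value is either nothing or just an A, and the only way to get at the A is to handle both cases.

  public abstract class Maybe<A> {

    // Case analysis: the caller must say what to do for Nothing and for Just.
    public abstract <B> B fold(B ifNothing, F<A, B> ifJust);

    public static <A> Maybe<A> nothing() {
      return new Maybe<A>() {
        public <B> B fold(final B ifNothing, final F<A, B> ifJust) {
          return ifNothing;
        }
      };
    }

    public static <A> Maybe<A> just(final A a) {
      return new Maybe<A>() {
        public <B> B fold(final B ifNothing, final F<A, B> ifJust) {
          return ifJust.f(a);
        }
      };
    }
  }

An unknown birthday is then Maybe.nothing() rather than a date-typed variable holding null, and any code that consumes it is forced to state what happens when there is no value.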

In the realm of databases, work is being done on more abstract database languages than most of us are used to. Chris Date and Hugh Darwen have been instrumental in putting database systems on the sound theoretical footing of predicate logic, although products based on their research are largely in the early stages of development.