This blog has moved

I have not seriously updated the old Apocalisp blog for quite some time. Mostly that’s because I have been spending all of my creative time outside of work on writing a book. It’s also partly that putting a post up on WordPress is a chore. It’s like building a ship in a bottle.

So I have decided to make posting really easy for myself by hosting the blog on GitHub. I am using a dead-simple markdown-based framework called Octopress. With this setup I can very easily write a new post from my command line and publish by pushing to GitHub. This is already part of my normal coding workflow, so it feels more friction-free.

The new blog is simply titled “Higher Order”, and is available at blog.higher-order.com. Check back soon for posts that I’ve been sitting on but have been too busy to post.

All of the old content and comments will still be available at the old address, and I’ll probably cross-post to both places for a little while.


Pascal’s Wager and the Problem of Computability

I was on a cable car yesterday morning, and I was intellectually tickled by an advertisement posted on one of the walls in the car. It was an ad for a Unitarian church, and in it they quoted Blaise Pascal:

It is incomprehensible that God should exist, and it is incomprehensible that He should not exist.

Pascal was a philosopher and mathematician, and he was trying to understand the question of whether or not one should believe in the existence of a God (specifically the Christian god, but that’s not important). He concluded that since the benefits of belief are supposedly enormous (or infinite), and one cannot actually know, one should wager in favor of there being a God, since then the benefit of being right is maximized and the cost of being wrong is minimized.

But Pascal made an error in his premises, one which touches on computability theory. He set out assuming that the statement “God exists” is either true or false. This is an unwarranted premise. Pascal was a rationalist, so he assumed the dichotomy of truth: that every statement is either true or false. But any programmer can tell you that this isn’t the case.

You see, Aristotle understood that not every statement is either true or false. The law of excluded middle says that a thing either is or is not. And truth or falsehood is an attribute only of statements that refer to things which are. For this reason, some statements are simply absurd, or arbitrary. Such statements cannot be examined for truth or falsehood. They can only be dismissed out of hand, as if nothing had been said. Because, in a very strict sense, nothing really has.

In computation, this is equivalent to the fact that not every program has an answer. Some programs simply bottom (crash or hang). There are types that are nonsensical and have no implementation except for programs whose answer is bottom. And there are questions that have no answer because they are nonsensical, and statements that are neither true nor false because they do not refer to any attributes or configurations of things which exist.
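To make this concrete in code (a hypothetical Scala illustration, not from the original post): the type A => B, for arbitrary A and B, promises to turn anything into anything, and the only “implementations” are programs whose answer is bottom.

def impossible[A, B](a: A): B = impossible(a)            // hangs: never produces a B
def alsoImpossible[A, B](a: A): B = sys.error("absurd")   // crashes: never produces a B

Both of these type-check, but neither ever answers.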

But it is absurd and impossible to suppose that the unknowable and indeterminate should contain and determine.

- Aristotle

Not only is it not right, it’s not even wrong!

- Wolfgang Pauli

On two occasions I have been asked,—”Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?” … I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

- Charles Babbage

Imperative vs Functional Programming

Recently I had a quasi-private discussion about philosophy in programming where somebody asked a question about functional programming. I’d like to relay part of the discussion here since it might be of interest to the community at large.

Jason wrote:

I have come out strongly in support of OO, and I think it has a very clear and strong basis in the facts of how computers work and how programmers think. In brief: Computers function by executing a series of instructions, at the level of machine language, which perform calculations and manipulate memory and I/O. In the earliest days of computing, programmers gave instructions to the computer directly at this level. As programs grew more complex, programmers needed ways to achieve greater unit-economy.

Often a set of variables truly do go together, such as a record representing an item for sale in a database, and certain functions operate primarily on this data. [This choice is not arbitrary.] Objects formalize this pattern and allow programmers to express it clearly and directly.

In general, imperative programming is rooted in the fact that computers actually run programs by executing a series of instructions.

I’d love to understand better what the advantages of functional programming are. So to its advocates, I ask: What are the facts of reality that give rise to it? Note that it is not enough to explain, for instance, why it is mathematically *possible* to represent any given thing in any given mathematical way. What I want to know is: How is functional style rooted in the *purpose* of the software and the *nature of the thinking* that programmers do?

On the validity of the FP approach

The fact of reality that gives rise to the FP approach is that computation is essentially a process of inferring quantities. Its purpose is to aid in human thinking. The conceptual faculty is a quantitative mechanism. This is laid out eloquently in David Harriman’s book “The Logical Leap”. See also “Concept-formation as a mathematical process” in OPAR. So the computer is in essence a quantifying device. Its invention was motivated by man’s need to quantify.

Calculating with numbers (quantities of something) gives rise to algebra and, importantly, to the abstraction “function”. And functional programming is simply writing programs using this abstraction. Functional programming was being done well before there were computers.

Imperatives are nonessential to computing. A mechanical computer has levers, wheels, and the like. We can imagine programming Babbage’s Analytical Engine by turning a few wheels to set the premises and then moving a lever to make it infer the consequent. Similarly, in a digital computer we fill some registers with the premises and then pull a virtual lever in the form of a CPU instruction. But it’s important to note that “instruction” here is a metaphor. What’s really going on is the selection of a function which computes a quantity based on the input given.

So in summary, the purpose of the computer is to calculate with quantities as functions of other quantities. It is a mathematical tool. Even (especially?) for things like games. Drawing a complex 3D scene on the screen in real-time involves calculating a particular quantity of light of each color to be used to illuminate each pixel on the screen, given a particular time.

On the utility of the FP approach

Referential transparency buys you modularity (the extent to which components of a program can be separated and recombined) and compositionality (the extent to which the whole program can be understood by understanding the components and the rules used to combine them).
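As a small Scala illustration of my own (not part of the original discussion): a referentially transparent expression can be named, factored out, or inlined freely, while an effectful one cannot, and that difference is precisely what limits how freely impure components can be separated and recombined.

val x = 2 + 3      // pure: naming the expression changes nothing
val twice = x + x  // identical in meaning to (2 + 3) + (2 + 3)

val s = scala.io.StdIn.readLine()
val pair1 = (s, s)                                                  // reads one line and uses it twice
val pair2 = (scala.io.StdIn.readLine(), scala.io.StdIn.readLine())  // reads two lines: a different program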

Type systems also get more sophisticated to the extent that programs are pure. It’s easier to infer types for pure programs than it is for impure ones, for example. You also gain things like type constructor polymorphism (a.k.a. higher-kinded types) and higher-rank types. It’s my experience that in a sufficiently sophisticated program (such as a compiler for a programming language), you either use these abstractions or you repeat yourself.
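For instance (a sketch of my own, not from the discussion), type constructor polymorphism lets you abstract over the container itself, so one definition serves List, Option, and anything else with a suitable map:

trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

val listFunctor = new Functor[List] {
  def map[A, B](fa: List[A])(f: A => B): List[B] = fa map f
}

val optionFunctor = new Functor[Option] {
  def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa map f
}

// Written once, usable with any Functor:
def mapTwice[F[_], A](fa: F[A])(f: A => A)(F: Functor[F]): F[A] =
  F.map(F.map(fa)(f))(f)

mapTwice(List(1, 2, 3))(_ + 1)(listFunctor)  // List(3, 4, 5)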

Sophisticated type systems then in turn aid in program inference, which is the extent to which the computer (or the programmer with the aid of the computer) can infer the correct program from type information (think tab completion, only more so). Program inference is currently an active area of research.

As a side note, a lot of OO folks are discovering the functional approach as a tool to aid in modular design. They call it “dependency injection”, “inversion of control”, the “interpreter pattern”, and the like.

On the way programmers think

I believe that the way programmers think is deeply influenced by their chosen tools. They learn to think in the concepts that pertain to the tool. I think ultimately the argument that OO mirrors how programmers think is an appeal to intuition. But as we know, there’s no such thing as intuition. When people say “intuitive”, they usually just mean “familiar”.

A word on OO and the arbitrary

In OO, as commonly practiced, the choice of distinguished argument is arbitrary. Consider this function:

K(x, y) = x

If we were to write this in OO style, then on which object, x or y, should the function K be dispatched? Should it be x.K(y), or y.K(x)? It’s arbitrary. And before you say that the function K itself is an arbitrary invention, I say no. It’s actually a constructor of empty lists in this implementation of a singly linked list data structure:

Empty(x, y) = x
Cons(a, b, x, y) = y(a, b(x, y))
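Here’s what that encoding might look like in Scala (an illustration of my own; the names are made up). A list is a function that takes a value for the empty case and a combining function for the cons case:

trait ChurchList[A] {
  def apply[R](nilCase: R, consCase: (A, R) => R): R
}

// Empty(x, y) = x  -- the K function: ignore the cons case, return the nil case
def empty[A]: ChurchList[A] = new ChurchList[A] {
  def apply[R](nilCase: R, consCase: (A, R) => R): R = nilCase
}

// Cons(a, b, x, y) = y(a, b(x, y))
def cons[A](head: A, tail: ChurchList[A]): ChurchList[A] = new ChurchList[A] {
  def apply[R](nilCase: R, consCase: (A, R) => R): R =
    consCase(head, tail(nilCase, consCase))
}

// Folding the list [1, 2] into its sum:
val total = cons(1, cons(2, empty[Int]))(0, (a: Int, r: Int) => a + r)  // 3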

On “types of languages”

I want to get clear on some concepts. First the question of “types of programming languages”. I don’t think it’s helpful to divide programming languages into “functional”, “object-oriented”, “imperative”, etc. Sure, certain languages are better suited to certain styles of programming, but it’s important to differentiate on essentials and not mere aesthetics.

Functional programming is a restriction that the programmer puts on his programs. Having a language with a compiler that will warn you if you’re breaking referential transparency is helpful, but not essential. I do functional programming in Java, for example, which most people consider an “imperative OO” language.

We can also do “OO” in Haskell, a purely functional language (in the sense that all Haskell programs are referentially transparent expressions). Haskell is also a very good imperative language. Here’s a Haskell program in imperative style:

main = do {
  putStrLn "What is your name?";
  name <- getLine;
  putStrLn ("Hello, " ++ name ++ "!");
}

Is this a sequence of instructions, or a referentially transparent expression? It’s both! Another way of writing the same program is:

main = putStrLn "What is your name?" >> getLine >>= (\name -> putStrLn ("Hello, " ++ name ++ "!"))

It’s important to note that these are not two different programs in any sense. The Haskell compiler literally translates the former into the latter.

The >> and >>= functions concatenate IO actions in exactly the same sense that the ++ function concatenates Strings. It’s possible for us to do this because e.g. calling getLine doesn’t immediately read from standard input. It is a referentially transparent expression that returns an IO action. The operating environment translates this IO action into instructions for the underlying machine at some convenient time. Or it may in fact not do that. The implementation of the IO data structure is abstracted away in a library, and there’s no reason that somebody couldn’t supply a different implementation with the same interface. This may be useful if you’re programming a missile silo and you need to be able to test the launchTheMissile action without starting a nuclear war.
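To sketch that idea in Scala (a made-up miniature of my own, not Haskell’s actual IO machinery): an action is just data describing an effect, and different interpreters can give it different meanings, including a test interpreter that never launches anything.

sealed trait Action
case class PutStrLn(line: String) extends Action
case object LaunchTheMissile extends Action

// The production interpreter actually performs the effects...
def runForReal(a: Action): Unit = a match {
  case PutStrLn(line)   => println(line)
  case LaunchTheMissile => () // this is where the real launch code would go
}

// ...while a test interpreter merely records what would have happened.
def runForTest(a: Action, log: scala.collection.mutable.Buffer[String]): Unit = a match {
  case PutStrLn(line)   => log += ("printed: " + line)
  case LaunchTheMissile => log += "pretended to launch"
}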

Another benefit is that main can be composed with other programs. We could, for example, write this program:

mainTwice = main >> main

Or this one:

mainForever = main >> mainForever

I hope that helps.

Towards an Effect System in Scala, Part 1: ST Monad

Referentially Transparent Mutable State

In their paper “Lazy Functional State Threads”, John Launchbury and Simon Peyton Jones present a way of securely encapsulating stateful computations that manipulate mutable objects. The result is Haskell’s ST monad. Its definition is very similar to the State data type. In Haskell, the ST monad is used to thread the manipulation of mutable state in such a way that the mutation is completely referentially transparent, because it is a type error for a mutable object to escape the monad.
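For comparison, here is roughly what the State data type looks like in Scala (a sketch, not the exact Scalaz definition): a computation that takes a state and produces a result together with a new state.

case class State[S, A](run: S => (S, A)) {
  def flatMap[B](g: A => State[S, B]): State[S, B] =
    State(s => { val (ns, a) = run(s); g(a).run(ns) })
  def map[B](g: A => B): State[S, B] =
    State(s => { val (ns, a) = run(s); (ns, g(a)) })
}

The ST definition below has exactly this shape; the difference lies in what the “state” is and in how carefully it is prevented from escaping.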

I would like to present an implementation of this in Scala, which I recently committed to the Scalaz library. I was inspired to write this by Tim Carstens last summer, but I never found a way of encoding the requisite rank-2 types in Scala’s type system such that what should work does and what shouldn’t doesn’t. Then Geoff Washburn got me going again. Following the technique on his blog of representing universal quantifiers as doubly negated existentials, I was able to encode ST in a way that’s surprisingly nice to use, and that actually does give you type errors if you try to access a naked mutable reference. And as Mark Harrah has pointed out, we end up not having to use the double negation after all. I’m surprised to find that doing this in the obvious way in Scala just works.

OK, let’s get to the money. In Scala, we can declare the ST data type as follows:

case class World[A]()

case class ST[S, A](f: World[S] => (World[S], A)) {
  def apply(s: World[S]) = f(s)
  def flatMap[B](g: A => ST[S, B]): ST[S, B] =
    ST(s => f(s) match { case (ns, a) => g(a)(ns) })
  def map[B](g: A => B): ST[S, B] =
    ST(s => f(s) match { case (ns, a) => (ns, g(a)) })
}

def returnST[S, A](a: => A): ST[S, A] = ST(s => (s, a))

This is a monad in the obvious way. The flatMap method is monadic bind and returnST is monadic unit.
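For example (a small example of my own), two state transformers can be sequenced with a for-comprehension, which desugars into the flatMap and map defined above:

def pairST[S]: ST[S, (Int, Int)] = for {
  a <- returnST[S, Int](1)
  b <- returnST[S, Int](2)
} yield (a, b)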

The World type represents some state of the world, and the ST type encapsulates a state transformer which receives the state of the world and returns a value which depends on that state together with a new state. Here, we are representing the world state by nothing at all. It turns out that for what we want to do with the ST monad, the contents of the state are not important, but its type very much is. A much more detailed explanation of how and why this works is given in the paper, but the punchline is that we are going to “transform the state” by mutating objects in place, and in spite of this the state transformer is going to be a pure function. This is achieved by guaranteeing that the type S for a given state transformer is unique. More on that in a second.

Purely Functional Mutable References

A simple object that we can mutate in place is one that holds a reference to another object through a mutable variable.

case class STRef[S, A](a: A) {
  private var value: A = a

  def read: ST[S, A] = returnST(value)
  def write(a: A): ST[S, STRef[S, A]] = ST((s: World[S]) => {value = a; (s, this)})
  def mod[B](f: A => A): ST[S, STRef[S, A]] = for {
    a <- read
    v <- write(f(a))
  } yield v
}

def newVar[S, A](a: => A): ST[S, STRef[S, A]] = returnST(STRef(a))

So we have monadic combinators to construct, read, write, and modify references. Note that the implementation of write blatantly mutates the object in place. The definition of mod shows how to compose state transformers in sequence, using monad comprehensions.

It’s important that an STRef is parameterized on a type S which represents the state thread that created it. This makes variables allocated by different state threads have incompatible types. Therefore, state threads cannot ever see each other’s mutable variables. Because state transformers can only be composed sequentially (with flatMap), it’s guaranteed that two of them can never simultaneously mutate the same STRef.
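For instance (an example of my own), a transformer that swaps the contents of two references can only be written for references tagged with the same state thread S; a reference created by some other state thread simply would not fit the signature.

def swap[S](r1: STRef[S, Int], r2: STRef[S, Int]): ST[S, Unit] = for {
  a <- r1.read
  b <- r2.read
  _ <- r1.write(b)
  _ <- r2.write(a)
} yield ()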

Running a State Transformer as a Pure Function

Note that the type of a reference to a value of type A in a state thread S is ST[S, STRef[S, A]]. If ST had a run function of type ST[S, A] => A, we would be able to get the reference out. But this type is more general than we want. What we want is for the compiler to reject code like newVar(10).run, which would give you access to the naked STRef, but to accept code like newVar(10).flatMap(_.mod(x => x + 1).flatMap(_.read)).run, which simply accesses an integer.

In Haskell, the type of runST is:

runST :: forall a. (forall s. ST s a) -> a

This is a rank-2 type which Scala’s type system does not directly support.

To see why this type would prevent the leaking of a mutable reference, consider the type you would need in order to get an STRef out of the ST monad.

forall a. (forall s. ST s (STRef s a)) -> STRef ??? a

What type should go in place of the three question marks? There is no type that could possibly fit the bill because the type s is bound (introduced) by the universal quantifier to the left of the arrow. It’s a local type variable in the domain of the function, so it can’t escape to the codomain. This is why ST state transformers are referentially transparent.

Of course, if you get the value out of a reference, then you can run that just fine. In Scala terms, you can always go from ST[S, A] to A, but you can never go from ST[S, F[S]] to F[S] for any F[_].

Writing runST in Scala

So the problem becomes how to represent a rank-2 polymorphic type in Scala. I’ve shown before how we can represent a rank-2 function type by encoding it as a natural transformation. And Mark has posted on how to write natural transformations using universally quantified values. (And I just now realized that he’s using functional state threads for non-observable mutation!)

First, we need a representation of universally quantified values:

trait Forall[P[_]] {
  def apply[A]: P[A]
}

Now that we have rank-2 polymorphism, the implementation of runST is straightforward:

  def runST[A](f: Forall[({type λ[S] = ST[S, A]})#λ]): A =
    f.apply.f(realWorld)._2

I’m using the “type lambda” trick here to declare the type constructor inline. The realWorld object is just a dummy value.

Some Examples

Here’s a simple example of a computation that creates a mutable reference and mutates it:

def e1[S]: ST[S, STRef[S, Int]] = for {
  r <- newVar[S, Int](0)
  x <- r.mod(_ + 1)
} yield x

And this expression creates a reference, mutates it, and then reads the value out:

def e2[A] = e1[A].flatMap(_.read)

Running the latter expression is fine, since it just returns an Int:

runST(new Forall[({type λ[S] = ST[S, Int]})#λ] { def apply[A] = e2 })

But running the former fails at compile-time because it exposes a mutable reference. Or rather, because when the compiler tries to unify with our existential type, it’s out of scope:

runST(new Forall[({type λ[S] = ST[S, STRef[S, Int]]})#λ] { def apply[A] = e1 })

found   : scalaz.Forall[[S(in type λ)]scalaz.ST[S(in type λ),scalaz.STRef[S(in type λ),Int]]]
required: scalaz.Forall[[S(in type λ)]scalaz.ST[S(in type λ),scalaz.STRef[_ >: (some other)S(in type λ) with (some other)S(in type λ), Int]]]

What are the practical implications of this kind of compile-time checking? I will just quote Peyton Jones and Launchbury:

It is possible to encapsulate stateful computations so that they appear to the rest of the program as pure (stateless) functions which are guaranteed by the type system to have no interactions whatever with other computations, whether stateful or otherwise (except via the values of arguments and results, of course).

Complete safety is maintained by this encapsulation. A program may contain an arbitrary number of stateful sub-computations, each simultaneously active, without concern that a mutable object from one might be mutated by another.

This can be taken much further than these simple examples. In Scalaz, we have STArrays, which are purely functional mutable arrays. There’s an example of a pure binsort which uses a mutable array for sorting.
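As a smaller illustration using only the combinators defined above (a sketch of my own, not the Scalaz STArray code), here is a pure function that uses a mutable reference internally; from the outside it is an ordinary function from List[Int] to Int.

def sumList(xs: List[Int]): Int = {
  def go[S]: ST[S, Int] = for {
    acc <- newVar[S, Int](0)
    _   <- xs.foldLeft(returnST[S, STRef[S, Int]](acc))((st, x) => st.flatMap(r => r.mod(_ + x)))
    n   <- acc.read
  } yield n
  runST[Int](new Forall[({type λ[S] = ST[S, Int]})#λ] { def apply[A] = go[A] })
}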

This technique can be extrapolated to implement Monadic Regions (currently underway for Scalaz), which allows compile-time tracking of not just mutable arrays and references, but file handles, database connections, and any other resource we care to track.

What we have here then is essentially the beginnings of an effect system for Scala. This allows us to compose programs from referentially transparent components which are internally implemented with mutation and effects, while those effects are guaranteed by the type system to be transparent to the rest of the program.