More Scala Typehackery

Inspired by this page on the Haskell Wiki, here’s a stripped down version of the lambda calculus encoded in the Scala type system:

trait Lambda {
  type subst[U <: Lambda] <: Lambda
  type apply[U <: Lambda] <: Lambda
  type Eval <: Lambda
}

// Application: apply S to T
trait App[S <: Lambda, T <: Lambda] extends Lambda {
  type subst[U <: Lambda] = App[S#subst[U], T#subst[U]]
  type apply[U <: Lambda] = Nothing
  type Eval = S#Eval#apply[T]
}

// Abstraction: a lambda binding the single variable X in its body T
trait Lam[T <: Lambda] extends Lambda {
  type subst[U <: Lambda] = Lam[T]
  type apply[U <: Lambda] = T#subst[U]#Eval
  type Eval = Lam[T]
}

// The variable
trait X extends Lambda {
  type subst[U <: Lambda] = U
  type apply[U <: Lambda] = Lambda
  type Eval = X
}

Now to see if it works. Here's a simple identity:

scala> type x = Equal[App[Lam[X], X]#Eval, X]
defined type alias x

Yay! You might want to know what Equal is. It's a trait that doesn't typecheck unless both of its type parameters are the same (thanks, Jesper):

trait Equal[T1 >: T2 <: T2, T2]
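As a quick sanity check of Equal itself (these aliases are just my illustration, not from the session above):

type ok = Equal[Int, Int]        // compiles: Int conforms to both bounds
// type bad = Equal[String, Int] // would not compile: String is neither >: Int nor <: Int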

Let's try an expression that diverges (never terminates) in the untyped lambda calculus:

scala> type x = App[Lam[App[X, X]], Lam[App[X, X]]]#Eval
:8: error: illegal cyclic reference involving type Eval

That's Scala 2.7.4. It figures out that we're trying to construct an infinite type. Hurray! But that also means it's not Turing-equivalent. There will be other, legitimate types that it can't check either.

Try the same thing in 2.8 and you get...

Exception in thread "main" java.lang.StackOverflowError
        at scala.tools.nsc.symtab.Symbols$Symbol.isPackageClass(Symbols.scala:407)
        at scala.tools.nsc.symtab.Types$Type.isGround(Types.scala:640)
        at scala.tools.nsc.symtab.Types$Type.isGround(Types.scala:640)
        at scala.tools.nsc.symtab.Types$Type.isGround(Types.scala:640)
        at scala.tools.nsc.symtab.Types$Type.isGround(Types.scala:640)
        at scala.tools.nsc.symtab.Types$Type.isGround(Types.scala:638)
        at scala.tools.nsc.symtab.Types$Type$$anonfun$isGround$1.apply(Types.scala:638)
        at scala.tools.nsc.symtab.Types$Type$$anonfun$isGround$1.apply(Types.scala:638)
        at scala.collection.generic.LinearSequenceTemplate$class.forall(LinearSequenceTemplate.scala:97)
        at scala.collection.immutable.List.forall(List.scala:27)

...which is what you would expect. So it looks like typechecking in 2.8 has become undecidable in general. But don't fret. You really have to try hard to get this kind of result.

Data types such as the Church numerals and Booleans can be encoded in this calculus. I might post those a bit later.

Scala rocks.

A Scala Puzzler

I’ve been toying around with Scala’s type system a little bit, and I’ve come to the conclusion that it’s completely brilliant. And I’m very encouraged by how rapidly it has improved.

There is, for example, limited support for type-level computation. But not so much that typing becomes undecidable.

In Scala, types can have type members, and those members can take type parameters. Type members can be bounded, and so can their parameters.


trait Lambda { type a[A <: Lambda] <: Lambda }

This feature gives you quite a lot of expressive power in the type system. Type members are accessed by use of the # symbol. You can nest types, so Scala effectively has closures for types. So you can do crazy stuff like this:
trait S extends Lambda { type a[X <: Lambda] = Lambda { type a[Y <: Lambda] = Lambda { type a[Z <: Lambda] = X#a[Z]#a[Y#a[Z]] } } }
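For comparison, here is a sketch of the K combinator in the same style (my own illustration); it takes two arguments and ignores the second:

trait K extends Lambda { type a[X <: Lambda] = Lambda { type a[Y <: Lambda] = X } }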
OK, since we basically have the power of the lambda calculus in the type system, let's see if we can do Church encoding of datatypes. It turns out that we can, to an extent. Here are booleans:
  trait Boolean { type cond[t, f] }
  trait TFalse extends Boolean { type cond[t, f] = f }
  trait TTrue extends Boolean { type cond[t, f] = t }

Let's test that out.

scala> val x: TTrue#cond[Int, String] = 10
x: TTrue#cond[Int,String] = 10

scala> val x: TTrue#cond[Int, String] = "10"
:6: error: type mismatch;
 found   : java.lang.String("10")
 required: TTrue#cond[Int,String]
 val x: TTrue#cond[Int, String] = "10"

It works. We can assign an Int to the value if the condition is TTrue, and a String if it's TFalse, but not vice versa.

How about Church numerals? Here they are:

trait Num { type a[F[_], X] }
trait Zero extends Num { type a[F[_], X] = X }
trait One extends Num { type a[F[_], X] = F[X] }
trait Two extends Num { type a[F[_], X] = F[F[X]] }
trait Three extends Num { type a[F[_], X] = F[F[F[X]]] }

The way this works is that the numerals take some type constructor F and a base type X. For example, to get lists of strings three levels deep, Three#a[List, String] is equivalent to the type List[List[List[String]]]:

scala> val x: Three#a[List, String] = List(List(List("Three")), List(List("Levels"), List("Deep")))
x: Three#a[List,String] = List(List(List(Three)), List(List(Levels), List(Deep)))

We can even do some arithmetic. Below are three type-level functions (along with a helper, PApp, for partial application). Respectively, they add two numbers together, increment a number by one, and multiply one number by another:

// PApp partially applies the binary type function F to the unary type
// constructor A, yielding the unary type function X => F[A, X].
trait PApp[F[P[_], _], A[_]] {
  type a[X] = F[A, X]
}

trait Plus[M <: Num, N <: Num] extends Num { type a[F[_], X] = M#a[F, N#a[F, X]] }
trait Succ[N <: Num] extends Num { type a[F[_], X] = F[N#a[F, X]] }
trait Mult[M <: Num, N <: Num] extends Num { type a[F[_], X] = PApp[N#a, PApp[M#a, F]#a]#a[X] }

Pretty sweet. For example, a List of Strings 20 levels deep would be of the type Mult[Succ[Three], Plus[Two, Three]]#a[List, String] (that is, 4 × 5 = 20 applications of List).
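As a smaller sanity check (a sketch, assuming the traits above are in scope), 1 + 2 should give three levels of nesting:

val y: Plus[One, Two]#a[List, String] = List(List(List("one plus two")))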

Now for the puzzler.

Exercise: Write a type-level function Exp[M, N] that yields the Church numeral M lifted to the Nth power. Can it be done? If not, why not?

Some Notes in Hostility Toward Subtyping

What follows is a session of thinking out loud, so it’s not to be taken too seriously.

The more I think about it, the more I convince myself that the idea of subtyping and class hierarchies is a mistake. “Inheritance” is a good way to lock down a design so that it becomes rigid and brittle. If you remember your (now scorched, I trust) GoF book, its main cargo was “composition over inheritance”. So why do we want this mechanism at all? Because we want type substitution. We say that if X is a subtype of Y, then X can be substituted wherever Y is expected. Informally, we want to say that an X is a Y. But what does this mean? It means that the set of all values in X is a subset of all the values in Y. So if you have a value x in X, then x is also in Y. In other words, X implies Y, or (X => Y). Hang on. This is just the type signature for a function.

When we observe this, we realise that we can substitute functions for subtyping. Moreover, we can make those functions implicit (as in Scala) and then type substitution will work just as if we were using subtypes (hopefully without loss of type inference). As a bonus, our implicit type substitutions have scope, while subtyping is a global declaration.

For example, take a look at the function type (A => B), recalling that this is conceptually the same as saying “A implies B”, or even “A is a subtype of B”. Using subtyping, we could have types Foo and Bar, and say “Bar extends Foo”. But instead of subtyping, in Scala we could have an implicit function:

implicit def fooBar(b: Bar): Foo = b.foo

The easiest way to implement this is to have Bar wrap a value of type Foo by accepting it in its constructor. Then the Bar.foo method simply returns this value. But it can really be implemented any way we want. For example, if both Foo and Bar take a parameter of type Int, then we can extract it from one when constructing the other.
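Here is a minimal sketch of the wrapping approach (Foo, Bar, and their members are illustrative names):

class Foo(val n: Int)
class Bar(val foo: Foo) // Bar simply wraps a Foo

implicit def fooBar(b: Bar): Foo = b.foo

def wantsFoo(f: Foo): Int = f.n
wantsFoo(new Bar(new Foo(42))) // compiles: Bar is implicitly converted to Foo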

There’s a slight problem with implicit functions like this. Somebody else may have defined a function with the same type already, with a totally different intent. To remedy that, we can create a type for a specific kind of conversion to Foo; one that implies the relation we want to express. Something like…

trait IsAFoo[A] {
  def apply(a: A): Foo
}

This is now unambiguous. This kind of construct is a typeclass. IsAFoo classifies all types that imply type Foo. We can supply an implicit instance of this typeclass for Bar:

implicit val bar_is_a_foo = new IsAFoo[Bar] {
  def apply(b: Bar) = b.foo
}

And wherever we want to accept something that “is a” Foo, we accept an implicit parameter as evidence that it is indeed a Foo.

def doSomethingWithFoo[A](a: A)(implicit foo: IsAFoo[A]) = foo(a).methodOnFoo

We can call this method with a value of type Bar, because of the existence of the implicit instance bar_is_a_foo. In fact, Scala has an even nicer syntax for this, using “view bounds”. I leave it to you to check that out. What I want to impress on you is how flexible typeclasses are. We’re not constrained to using this mechanism to substitute for subtyping. We can use it to do the converse, i.e. supertyping an existing type. Or we can have the conversion go both ways to express isomorphisms between types. The pattern here is that we want to be able to state any kind of relation between types.
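For instance, the “go both ways” case might look like this (a sketch; the name Iso is mine):

trait Iso[A, B] {
  def to(a: A): B
  def from(b: B): A
}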

If Scala had the ability to state functional dependencies, the typeclass mechanism truly could obviate subtyping, with the added bonus that we could state any kind of type-level relation that we want, rather than just type order. I could talk about the downside of this, but… I could go on forever.

Variance of Functors

So, speaking of type order, there’s a tie-in here with variance. This is one of the things that trips people up when thinking about class hierarchies. I know it trips me up. But variance is much easier to reason about if we think of the subtype/supertype relation as just a function.

To compare, here’s what Wikipedia has to say about variance in class hierarchies:

If class B is a subtype of class A, then all member functions of B must return the same or narrower set of types as A; the return type is said to be covariant. On the other hand, the member functions of B must take the same or broader set of arguments compared with the member functions of A; the argument type is said to be contravariant. The problem for instances of B is how to be perfectly substitutable for instances of A. The only way to guarantee type safety and substitutability is to be equally or more liberal than A on inputs, and to be equally or more strict than A on outputs.

OK, but why is it the case that this is the only way to guarantee substitutability and type-safety? To understand that, it helps to throw away the notion of “subtype”, and simply think only in terms of functions, instead of the mixed notions of “subtype” and “member function”. Try translating that snippet from Wikipedia so that it’s worded in terms of functions. It gets pretty convoluted.

Proceeding from the premise that functions and subtype relations are interchangeable, we can derive a definition of co- and contravariance simply from the definition of substitutability. We start with this:

If B is a subtype of A, then every subtype of B is also a subtype of A, and every supertype of A is also a supertype of B.

Now, if we take (A <: B) to mean “A is a subtype of B”, and (A => B) to mean “function from A to B”, or equivalently “A implies B”, we can write this property as:

(B <: A) <==> ((C <: B) => (C <: A)) and ((A <: C) => (B <: C)), for all C

Remember that you can substitute functions for subtyping and vice versa, so any (=>) sign above can be replaced with the (<:) sign, or the other way around, and the meaning stays intact. So let's restate this property purely in terms of functions: if there exists a function from B to A, then for every function from C to B there exists a function from C to A, and for every function from A to C there exists a function from B to C. That is:

(B => A) <==> ((C => B) => (C => A)) and ((A => C) => (B => C))

Taking the forward direction of the <==> and splitting the conjunction, we can infer two properties of (B => A):

(B => A) => (C => B) => (C => A)
and
(B => A) => (A => C) => (B => C)

Recalling that the function arrow is just logical implication, both of these properties evaluate to true, for all A, B, and C. Let's use Wolfram Alpha to confirm this for us.
See here and here.
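We can also witness both propositions directly in Scala, since the corresponding terms are just function composition (a sketch):

// (B => A) => (C => B) => (C => A)
def prop1[A, B, C](ba: B => A)(cb: C => B): C => A = c => ba(cb(c))

// (B => A) => (A => C) => (B => C)
def prop2[A, B, C](ba: B => A)(ac: A => C): B => C = b => ac(ba(b))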

Let’s now say that C is a fixed type. Remembering what we did above with typeclasses, this gives rise to two such typeclasses, representing (C => A) and (A => C), respectively, for all A:

trait FromC[A] {
  def fromC(c: C): A
}
trait ToC[A] {
  def toC(a: A): C
}

To restate the properties above:

(B => A) => FromC[B] => FromC[A]
(B => A) => ToC[A] => ToC[B]

In other words, mapping a function (B => A) over FromC[B] results in FromC[A]. So the implication is preserved across the mapping. Mapping a function (B => A) over ToC[A] results in ToC[B]. So the implication is reversed across the mapping. This means that FromC is a covariant functor and ToC is a contravariant functor, by the definition of co- and contravariance. So now we have anchored the notion of substitutability to the variance of functors. For reference, here’s my off-the-cuff definition of co- and contravariance, along with some preliminaries:

Definition of Higher-Order Function
A higher-order function (HOF) is a function that takes another function as its argument.

Definition of covariance
A unary type constructor T[_] is a covariant functor if and only if there exists a HOF with the following type, for all A and B:
(A => B) => T[A] => T[B].

Definition of contravariance
A type constructor T[_] is a contravariant cofunctor if and only if there exists a HOF with the following type, for all A and B:
(B => A) => T[A] => T[B].

Note that these last two are exactly the properties above that we got for FromC and ToC, which we derived directly from the definition of subtype and supertype.
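To make that concrete, here is a sketch of the two HOFs for FromC and ToC (assuming some fixed type C is in scope):

// (A => B) => FromC[A] => FromC[B], so FromC is a covariant functor
def mapFromC[A, B](f: A => B)(fa: FromC[A]): FromC[B] = new FromC[B] {
  def fromC(c: C): B = f(fa.fromC(c))
}

// (B => A) => ToC[A] => ToC[B], so ToC is a contravariant functor
def contramapToC[A, B](f: B => A)(ta: ToC[A]): ToC[B] = new ToC[B] {
  def toC(b: B): C = ta.toC(f(b))
}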

Conclusion

The benefits of subtyping, namely type substitution, can be expressed more naturally with a mechanism for classifying types.