Today, we’re going on a journey. It is a sojourn to the outer limits of the expressiveness of the Java type system, and to the edge of what can be considered sane programming. This is definitely one for the power users. You will need a firm grasp of the Java language, and an iron constitution for type annotations. But the reward will be something far greater than any treasure: understanding, entertainment, and perhaps even enlightenment. Remember that we choose to do these things in Java, not because they are easy, but because they are hard. Now then, to the ships.
A Most Versatile Vessel
In Java, we can create a list that contains values of type A by constructing a value of type List<A>. The type system will enforce that each element in the list is in fact of type A. But sometimes we want lists of values that aren’t necessarily of the same type. Normally, for such a purpose, we would use a heterogeneous list, which in Java is just the raw list type List<?> or List<Object>. Since every class in Java is a subclass of Object (and now that we have autoboxing), such a list can contain any Java value. There are many kinds of situation where this would be necessary. For example, a row of database results will comprise values that are not all of the same type.
However, there’s a problem with the raw list approach. In using the List<?> type, we are dispensing with the type system. When you get a value from the list, how do you know what it is? How do you know which operations it supports? Well, you will have to defer that discovery until runtime, and use explicit type casting. Most will shrug at this and say: “So what?” After all, this is what we did anyway, before generics. Ah, but what if we don’t have to? Can we create generic heterogeneous collections that are type-safe? Yes, we can. Sort of.
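To make the problem concrete, here is a minimal sketch that uses java.util.List rather than Functional Java’s list. The class name UnsafeRowDemo and the row contents are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the row contents are invented for illustration.
final class UnsafeRowDemo {
  // Builds a "database row" whose elements have unrelated types.
  static List<Object> row() {
    final List<Object> row = new ArrayList<Object>();
    row.add("One");   // a String
    row.add(2);       // autoboxed to Integer
    row.add(false);   // autoboxed to Boolean
    return row;
  }

  // Compiles without complaint, but guesses the wrong type for index 1.
  static boolean castFailsAtRuntime() {
    try {
      final String s = (String) row().get(1);
      return s == null; // never reached for this row
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(final String[] args) {
    final String first = (String) row().get(0); // a correct guess
    System.out.println(first.length());
    System.out.println(castFailsAtRuntime());
  }
}
```

The compiler accepts both casts. Only running the program tells us which guess was wrong.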
Products of Types
What we would like to see is if it’s possible to declare some constraints on the types of a heterogeneous collection, to achieve essential type-safety while maintaining the extensibility of a list. Of course, it’s easy to create types that are the product of two or more types:
public abstract class P2<A, B> {
  public abstract A _1();
  public abstract B _2();
}
But the length of this kind of product is as fixed as the length of a string in Pascal. It isn’t extensible, so it’s more like a type-safe heterogeneous array than a list. If you want products of different lengths, you will need to declare separate classes for P3<A, B, C>, P4<A, B, C, D>, etc. What we’re trying to achieve is a product of arbitrary length, whose length might even vary at runtime. There’s no reason we couldn’t create products of products in a chain, like P2<A, P2<B, P2<C, D>>>, and this is more or less the approach that we will take.
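As a rough sketch of such a chain, the following nests P2 three deep. The factory method p and the class NestedProductDemo are assumptions added for the example; they are not part of the P2 class shown above.

```java
// P2 as in the article, plus a hypothetical factory method `p`.
abstract class P2<A, B> {
  public abstract A _1();
  public abstract B _2();

  static <A, B> P2<A, B> p(final A a, final B b) {
    return new P2<A, B>() {
      public A _1() { return a; }
      public B _2() { return b; }
    };
  }
}

final class NestedProductDemo {
  // A product of four types, built from nothing but pairs.
  static P2<String, P2<Integer, P2<Boolean, Double>>> chain() {
    return P2.p("One", P2.p(2, P2.p(false, 4.0)));
  }

  public static void main(final String[] args) {
    final P2<String, P2<Integer, P2<Boolean, Double>>> x = chain();
    final int second = x._2()._1();       // statically an Integer
    final double last = x._2()._2()._2(); // statically a Double
    System.out.println(second + " " + last);
  }
}
```

Every access is checked at compile time, but notice that the last position holds a bare Double rather than another pair. Nothing in the types marks where the chain ends.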
Introducing HList
To achieve our goal, we’re going to implement linked lists in the type system. Let’s remind ourselves what a linked list looks like. A List<T> is essentially either the empty list or a value of type T paired with a List<T>. In Java, using the List<A> type from Functional Java, an unsafe heterogeneous list might be constructed in a manner like the following:
List<?> x = cons("One", cons(2, cons(false, nil())));
The cons method constructs a list, and the nil method returns the empty list. With just these two methods, we can create any homogeneous list. A list has two methods to access its members, head() which returns the first element, and tail() which returns the rest of the list. Getting the head or tail of the empty list is an error at runtime.
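For readers who haven’t used Functional Java, the shape of such a list can be sketched in a few lines. ConsList below is a stand-in written for this example, not Functional Java’s actual List class, and it makes no attempt to match that library’s API beyond the four method names.

```java
// A minimal stand-in for a functional linked list. This is NOT Functional
// Java's List class, only a sketch with the same cons/nil/head/tail shape.
abstract class ConsList<T> {
  abstract boolean isEmpty();
  abstract T head();
  abstract ConsList<T> tail();

  // The empty list: asking for its head or tail fails at runtime.
  static <T> ConsList<T> nil() {
    return new ConsList<T>() {
      boolean isEmpty() { return true; }
      T head() { throw new IllegalStateException("head of empty list"); }
      ConsList<T> tail() { throw new IllegalStateException("tail of empty list"); }
    };
  }

  // A nonempty list: one element paired with the rest of the list.
  static <T> ConsList<T> cons(final T h, final ConsList<T> t) {
    return new ConsList<T>() {
      boolean isEmpty() { return false; }
      T head() { return h; }
      ConsList<T> tail() { return t; }
    };
  }
}

final class ConsListDemo {
  // Every element is only known to be an Object, as with List<?>.
  static ConsList<Object> sample() {
    final ConsList<Object> empty = ConsList.nil();
    return ConsList.cons("One", ConsList.cons(2, ConsList.cons(false, empty)));
  }

  // True if taking the head of the empty list throws.
  static boolean headOfNilFails() {
    try {
      ConsList.<Object>nil().head();
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(final String[] args) {
    System.out.println(sample().tail().head());
    System.out.println(headOfNilFails());
  }
}
```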
Let’s now take a step up into the type system, and say that a list of types is either the empty list or a type paired with a list of types. This gives rise to our heterogeneous list type:
public abstract class HList<A extends HList<A>> {
  private HList() {}
  private static final HNil nil = new HNil();
  public static HNil nil() {
    return nil;
  }
  public static <E, L extends HList<L>> HCons<E, L> cons(final E e, final L l) {
    return new HCons<E, L>(e, l);
  }
  public static final class HNil extends HList<HNil> {
    private HNil() {}
  }
  public static final class HCons<E, L extends HList<L>> extends HList<HCons<E, L>> {
    private E e;
    private L l;
    private HCons(final E e, final L l) {
      this.e = e;
      this.l = l;
    }
    public E head() {
      return e;
    }
    public L tail() {
      return l;
    }
  }
}
That’s not a lot of code, and it’s all relatively straightforward Java. The HList class is parameterised with a parameterised subclass of itself. There are only two concrete subclasses of HList that can possibly occupy that slot: the type HNil and the type constructor HCons. These represent the empty list and the list constructor, respectively. HCons takes two type parameters, the first representing the first element of the list, and the second being another HList, allowing us to form a chain of them. HNil does not take type parameters, so it terminates the chain.
As with regular old lists, you can access the head() and tail() of the list. Note, however, that the fact that you cannot get the head or tail of the empty list is now enforced by the type system. There’s a nil method to get the empty list, and a cons method to construct a nonempty list, just like with regular lists.
Here’s an example of how we would construct a heterogeneous list using this new type:
HCons<String, HCons<Integer, HCons<Boolean, HNil>>> x = cons("One", cons(2, cons(false, nil())));
This is more verbose than the unsafe version before, but not by much. Obviously, the HList example assumes a static import of HList.cons and the List<?> example assumes a static import of List.cons. Using the type-safe version is, however, much nicer. Compare these two contrived examples:
// Type-safe HList
if (x.tail().tail().head()) {
  return x.head().length() == x.tail().head();
}
// Unsafe List<?> (index is zero-based)
if ((boolean) x.index(2)) {
  return ((String) x.head()).length() == (int) x.index(1);
}
The latter, of course, offers no static guarantees and may throw ClassCastExceptions, or we might inadvertently get the head or tail of the empty list at runtime. The former will always work as long as it compiles, guaranteed.
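Here is a self-contained sketch of the type-safe half of that comparison. It uses a flattened copy of the HList definition, with HNil and HCons as siblings of HList inside a hypothetical TypedAccessDemo class, purely so that the example compiles on its own.

```java
// A flattened copy of the article's HList, wrapped in a hypothetical
// TypedAccessDemo class so that the sketch compiles on its own.
final class TypedAccessDemo {
  static abstract class HList<A extends HList<A>> {
    private HList() {}
  }

  static final class HNil extends HList<HNil> {
    private HNil() {}
  }

  static final class HCons<E, L extends HList<L>> extends HList<HCons<E, L>> {
    private final E e;
    private final L l;
    private HCons(final E e, final L l) { this.e = e; this.l = l; }
    public E head() { return e; }
    public L tail() { return l; }
  }

  private static final HNil NIL = new HNil();

  static HNil nil() { return NIL; }

  static <E, L extends HList<L>> HCons<E, L> cons(final E e, final L l) {
    return new HCons<E, L>(e, l);
  }

  static HCons<String, HCons<Integer, HCons<Boolean, HNil>>> sample() {
    return cons("One", cons(2, cons(false, nil())));
  }

  // No casts: head() is known to be a String and tail().head() an Integer.
  static boolean lengthMatches(final HCons<String, HCons<Integer, HCons<Boolean, HNil>>> x) {
    return x.head().length() == x.tail().head();
  }

  public static void main(final String[] args) {
    final HCons<String, HCons<Integer, HCons<Boolean, HNil>>> x = sample();
    final boolean flag = x.tail().tail().head(); // statically known to be a Boolean
    System.out.println(flag + " " + lengthMatches(x));
    // x.tail().tail().tail().head(); // rejected by the compiler: HNil has no head()
  }
}
```

Uncommenting the last line turns a would-be runtime failure into a compile error, which is the whole point of the exercise.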
Concatenating HLists
Now let’s do something more interesting with these lists. Notice that the cons methods for both type-safe and unsafe lists prepend an element to a list rather than appending. Sometimes we want to append a list to the end of another. This is unsurprisingly uncomplicated for unsafe lists:
List<?> c = a.append(b);
Behind the scenes, we can think of append as reversing the first list and consing each element to the second list in reverse order. Doing that for HList is a little more involved. We have to construct a chain of types in exactly the right way, at compile-time.
Appending an HList to another is a function that takes two HList-valued arguments and returns an HList. Using first-class functions from Functional Java, the append operation for HLists of specific types L and R would be a function of the following type:
F2<L extends HList<L>, R extends HList<R>, LR extends HList<LR>>
Where LR is the type of the concatenated HList. Now, since we necessarily have the two arguments, we know the specific types of L and R. But Java’s type inference is too limited to figure out the specific type of LR automatically. We will have to supply it as a type annotation. Not to worry. Even though Java won’t infer this type for us, it can be coerced into doing some type arithmetic. All we have to do is a little inductive reasoning.
Types as Formulae
According to the Curry-Howard isomorphism, a program is a proof, and the hypothesis that it proves is a type for the program. In this sense, Java’s type system is a kind of crude theorem prover. Put another way, a type is a predicate, and values of that type represent the terms for which the predicate holds. The function type above therefore asserts that for any two HLists, L and R, there exists some program to derive the HList LR. The function type by itself does not put any constraints on LR, however. It can be derived by any function, not just the concatenation function. We will remedy that presently. We need a formula that states that the two types L and R imply a third type LR which is the HList concatenation of L and R, given some concatenation function. Here is the type that represents that formula:
public static final class HAppend<L, R, LR> {
  private final F2<L, R, LR> append;
  private HAppend(final F2<L, R, LR> f) {
    append = f;
  }
  public LR append(final L l, final R r) {
    return append.f(l, r);
  }
}
At this point, HAppend is still just a hypothesis without evidence. Remember that a value of a type is proof of the formula that it represents. So we will need to supply two proofs in the form of constructors for values of this type; one for the base case of appending to the empty list HNil, and another for the case of appending to an HCons. The base case is easy. Appending anything to the empty list should result in that same thing. So the HAppend constructor for appending to the empty list looks like this:
public static <L extends HList<L>> HAppend<HNil, L, L> append() {
  return new HAppend<HNil, L, L>(new F2<HNil, L, L>() {
    public L f(final HNil hNil, final L l) {
      return l;
    }
  });
}
The case for the nonempty list is not quite as easy. Consider its type:
public static <X, A extends HList<A>, B, C extends HList<C>, H extends HAppend<A, B, C>>
    HAppend<HCons<X, A>, B, HCons<X, C>> append(final H h)
Read the return type first. This returns an HAppend that appends some B to an HCons<X, A>. The type of the head of the first list (X) becomes the type of the head of the concatenated list. The tail of the concatenated list is C. The type constraints state that C must be an HList, and that there must exist some way to append B (the second list) to A (the tail of the first list) so that they make C. We must supply proof that this last constraint holds, and you’ll see that such a proof is in fact supplied as an argument (in the form of the value h).
What this is saying is that, given the premise that A and B can be concatenated, the concatenation of HCons<X, A> and B can be inferred. A value of type HAppend<A, B, C> is precisely proof of the hypothesis that A and B can be concatenated, since there are only these two cases and we’ve supplied a proof for both. In other words, if we can append to the empty list, then that’s proof enough that we can append to a list of one element, which proves that we can append to a list of two elements, and so on. Given this, we can construct a chain of proofs. This concatenated proof, then, is a function that concatenates lists of the corresponding types.
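Only the signature of the nonempty case is given above, so here is one way its body might look. Treat this as a sketch under stated assumptions rather than the library’s source: F2 is swapped for java.util.function.BiFunction, the anonymous classes become lambdas, and everything is flattened into a hypothetical HListAppendSketch class so that it compiles without Functional Java.

```java
import java.util.function.BiFunction;

// Hypothetical wrapper class; HList is flattened and F2 is replaced by
// java.util.function.BiFunction so the sketch needs no library.
final class HListAppendSketch {
  static abstract class HList<A extends HList<A>> {
    private HList() {}
  }

  static final class HNil extends HList<HNil> {
    private HNil() {}
  }

  static final class HCons<E, L extends HList<L>> extends HList<HCons<E, L>> {
    private final E e;
    private final L l;
    private HCons(final E e, final L l) { this.e = e; this.l = l; }
    public E head() { return e; }
    public L tail() { return l; }
  }

  private static final HNil NIL = new HNil();

  static HNil nil() { return NIL; }

  static <E, L extends HList<L>> HCons<E, L> cons(final E e, final L l) {
    return new HCons<E, L>(e, l);
  }

  // A value of this type is evidence that L and R concatenate to LR.
  static final class HAppend<L, R, LR> {
    private final BiFunction<L, R, LR> append;
    private HAppend(final BiFunction<L, R, LR> f) { append = f; }
    public LR append(final L l, final R r) { return append.apply(l, r); }
  }

  // Base case: appending any list to the empty list yields that same list.
  static <L extends HList<L>> HAppend<HNil, L, L> append() {
    return new HAppend<HNil, L, L>((hNil, l) -> l);
  }

  // Inductive case: keep the head, and let the premise h append the tail.
  static <X, A extends HList<A>, B, C extends HList<C>, H extends HAppend<A, B, C>>
      HAppend<HCons<X, A>, B, HCons<X, C>> append(final H h) {
    return new HAppend<HCons<X, A>, B, HCons<X, C>>(
        (xs, b) -> cons(xs.head(), h.append(xs.tail(), b)));
  }

  // Appends a one-element list to a two-element list, one proof per element.
  static HCons<String, HCons<Integer, HCons<Boolean, HNil>>> demo() {
    final HCons<String, HCons<Integer, HNil>> a = cons("Foo", cons(3, nil()));
    final HCons<Boolean, HNil> b = cons(true, nil());
    final HAppend<HNil, HCons<Boolean, HNil>, HCons<Boolean, HNil>> zero = append();
    final HAppend<HCons<Integer, HNil>, HCons<Boolean, HNil>,
        HCons<Integer, HCons<Boolean, HNil>>> one = append(zero);
    final HAppend<HCons<String, HCons<Integer, HNil>>, HCons<Boolean, HNil>,
        HCons<String, HCons<Integer, HCons<Boolean, HNil>>>> two = append(one);
    return two.append(a, b);
  }

  public static void main(final String[] args) {
    final HCons<String, HCons<Integer, HCons<Boolean, HNil>>> x = demo();
    System.out.println(x.head() + " " + x.tail().head() + " " + x.tail().tail().head());
  }
}
```

The body of the inductive case mirrors the prose: it conses the head of the first list onto whatever the premise h produces from the tail and the second list. Each step of the chain in demo() peels one element off the front of the first list’s type.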
OK, so how do we use this? Well, here’s an example program that appends one list to another:
// Assumes static imports of cons, nil and append, and imports of HCons, HNil and HAppend.
public class HList_append {
  public static void main(final String[] args) {
    // The two lists
    final HCons<String, HCons<Integer, HCons<Boolean, HNil>>> a =
        cons("Foo", cons(3, cons(true, nil())));
    final HCons<Double, HCons<String, HCons<Integer[], HNil>>> b =
        cons(4.0, cons("Bar", cons(new Integer[]{1, 2}, nil())));
    // A lot of type annotation
    final HAppend<HNil,
        HCons<Double, HCons<String, HCons<Integer[], HNil>>>,
        HCons<Double, HCons<String, HCons<Integer[], HNil>>>> zero = append();
    final HAppend<HCons<Boolean, HNil>,
        HCons<Double, HCons<String, HCons<Integer[], HNil>>>,
        HCons<Boolean, HCons<Double, HCons<String, HCons<Integer[], HNil>>>>> one = append(zero);
    final HAppend<HCons<Integer, HCons<Boolean, HNil>>,
        HCons<Double, HCons<String, HCons<Integer[], HNil>>>,
        HCons<Integer, HCons<Boolean, HCons<Double, HCons<String, HCons<Integer[], HNil>>>>>> two = append(one);
    final HAppend<HCons<String, HCons<Integer, HCons<Boolean, HNil>>>,
        HCons<Double, HCons<String, HCons<Integer[], HNil>>>,
        HCons<String, HCons<Integer, HCons<Boolean, HCons<Double, HCons<String, HCons<Integer[], HNil>>>>>>> three = append(two);
    // And all of that lets us append one list to the other.
    final HCons<String, HCons<Integer, HCons<Boolean, HCons<Double, HCons<String, HCons<Integer[], HNil>>>>>> x =
        three.append(a, b);
    // And we can access the components of the concatenated list in a type-safe manner
    if (x.tail().tail().head())
      System.out.println(x.tail().tail().tail().tail().tail().head()[1] * 2); // 4
  }
}
Holy pointy brackets, Batman! Do we really need all of that? Well, look at what it’s doing. It’s constructing a concatenation function of the appropriate type, by supplying the premise at each step. If this seems mechanical, then that’s because it is. There is only one possible implementation for the HAppend we need, but Java does not have any mechanism for figuring this out, nor does it provide a facility for the programmer to tell it how.
Contrast that to Scala. The above is a clear example of where Scala’s implicit arguments come in handy. If we port this to Scala, we can make both of the append functions implicit, and we can further make the H argument to the append function for nonempty lists implicit as well. There can be only one possible implementation of each function, so it can be declared once and used implicitly wherever proofs of the corresponding types are required. Jesper Nordenberg has implemented an HList library for Scala that demonstrates this well. With implicits and Scala, the whole middle section of our program is condensed from 12 lines of type annotations to just:
val x = a.append(b)
Now, if you’re really into this Java stuff, you’re probably thinking: “implicits are just dependency injection”. Well, in a sense, you would be right. Both dependency injection and inheritance are degenerate forms of implicits. However, there is currently no dependency injection framework for Java that can abstract over type constructors such that it provides injection of parameterised types with injection of type parameters also. If you can prove me wrong, by all means send me evidence in the form of working code.
Conclusion
Clearly, Java is not very useful for this kind of type-safe programming. I was actually quite surprised that you can do this in Java at all, but we’ve definitely hit the outer boundary of what can be considered reasonably expressible.
The code you’ve seen in this article uses the new HList package that was released with Functional Java 2.16, and is based on the Haskell HList library by Oleg Kiselyov.
Yikes! Java really hurts when you push its type system this hard. But it’s fun stuff, and I’ve added the original paper on this to my reading list: “Strongly typed heterogeneous collections”.
Also, I thought I’d comment on one detail that confused me for a bit, to clarify if any other newbie like me is wondering: the “self” generic for HList (A) isn’t needed for this example (but it’s used in Functional Java’s version, which is presumably the reason it’s here).
Matt, thanks for linking the original paper. You’re right, the type parameter to HList isn’t required for this example. It’s used for the extend method on Functional Java’s HLists.
“there’s no objective definition of what “object-oriented” refers to”
Alan Kay invented the term, and he said: “OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things”.
Granted, it’s been corrupted by countless languages and countless^2 programmers, but it did actually mean something once.
Referring to your statement “Normally, for such a purpose, we would use a heterogeneous list, which in Java is just the raw list type List<?> or List<Object>.” Are List<?> and List<Object> really raw types? According to http://docs.oracle.com/javase/tutorial/java/generics/rawTypes.html: “A raw type is the name of a generic class or interface without any type arguments.”
In Java, a type constructor can be used without type arguments. Remember life before generics? Back then, every type was “raw”. When you use just List, that’s equivalent to List<Object>.
6 years after … Thanks for this effort and self generics headaches.
Can we hope for an implementation of this using Java 8 functions and Java 7 generic inference?
Thanks.
Hi Rúnar, been a while. Anyway, at least one of your code samples has unencoded HTML entities plainly visible. That abstraction leaked.