## Injective type families for Haskell

For the last few months I have been working on extending the Glasgow Haskell Compiler with injective type families. At first this seemed like a fairly simple project but in the end it turned out to be much more interesting than I initially thought. (I suppose that’s how research mostly turns out.) There are still some rough edges in the implementation and it will take a few more weeks before my branch gets merged into the master development branch of GHC. But for now there is a draft paper “Injective type families for Haskell”, written by me, Simon Peyton Jones, and Richard Eisenberg, that we submitted to this year’s Haskell Symposium. This is not yet the final version of the paper, so any feedback will be appreciated.

The idea behind injective type families is to infer the arguments of a type family from the result. For example, given a definition:

```
type family F a = r | r -> a where
  F Char = Bool
  F Bool = Char
  F a    = a
```

if we know `(F a ~ Bool)`¹ then we want to infer `(a ~ Char)`. And if we know `(F a ~ Double)` then we want to infer `(a ~ Double)`. Going one step further, if we know `(F a ~ F b)` then – knowing that `F` is injective – we want to infer `(a ~ b)`.
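To see what this inference buys us in practice, here is a small self-contained sketch; the `deduce` and `flipped` names are my own illustration, not from the paper, and I am assuming GHC with the `TypeFamilyDependencies` extension as described here:

```
{-# LANGUAGE TypeFamilies           #-}
{-# LANGUAGE TypeFamilyDependencies #-}

type family F a = r | r -> a where
  F Char = Bool
  F Bool = Char
  F a    = a

-- Without the "| r -> a" annotation this signature is rejected as
-- ambiguous: 'a' appears only under F. With injectivity the
-- argument is recoverable from the result, so the check passes.
deduce :: F a -> F a
deduce x = x

-- At this call site GHC solves (F a ~ Bool) as (a ~ Char):
flipped :: Bool
flipped = deduce True
```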

Notice that in order to declare `F` as injective I used new syntax. Firstly, I used “`= r`” to introduce a name for the result returned by the type family. Secondly, I used syntax borrowed from functional dependencies to declare injectivity. For multi-argument type families this syntax allows us to declare injectivity in only some of the arguments, e.g.:

`type family G a b c = r | r -> a c`

Actually, you can even have kind injectivity, assuming that the type arguments have polymorphic kinds.

Obviously, to make use of the injectivity declared by the user, GHC needs to check that the injectivity annotation is actually true. And that’s the really tricky part that the paper focuses on. Here’s an example:

```
type family T a = r | r -> a where
  T [a] = a
```

This type family returns the type of elements stored in a list. It certainly looks injective. Surprisingly, it is not. Say we have `(T [T Int])`. By the only equation of `T` this gives us `(T [T Int] ~ T Int)`. And by injectivity we have `([T Int] ~ Int)`. We just proved that lists and integers are equal, which is a disaster.

The above is only a short teaser. The paper covers much more: more corner cases, our algorithm for verifying the user’s injectivity annotations, details of exploiting knowledge of injectivity inside the compiler, and the relationship of injective type families to functional dependencies. The extended version of the paper also comes with proofs of soundness and completeness of our algorithm.

1. `~` denotes equality of types. Think of “`~`” as “having a proof that two types are equal”.

## Smarter conditionals with dependent types: a quick case study

> Find the type error in the following Haskell expression:
>
> `if null xs then tail xs else xs`
>
> You can’t, of course: this program is obviously nonsense unless you’re a typechecker. The trouble is that only certain computations make sense if the `null xs` test is `True`, whilst others make sense if it is `False`. However, as far as the type system is concerned, the type of the then branch is the type of the else branch is the type of the entire conditional. Statically, the test is irrelevant. Which is odd, because if the test really were irrelevant, we wouldn’t do it. Of course, `tail []` doesn’t go wrong – well-typed programs don’t go wrong – so we’d better pick a different word for the way they do go.

The above quote is the opening paragraph of Conor McBride’s “Epigram: Practical Programming with Dependent Types” paper. As always, Conor makes a good point: this test is completely irrelevant to the typechecker, although it is very relevant at run time. Clearly the type system fails to accurately approximate the runtime behaviour of our program. In this short post I will show how to fix this in Haskell using dependent types.

The problem is that the types used in this short program carry no information about the manipulated data. This is true both for the `Bool` returned by `null xs`, which carries no evidence of what it is the result of, and for lists, which store no information about their length. As some of you probably realize, the latter is easily fixed by using vectors, ie. length-indexed lists:

```
data N = Z | S N  -- natural numbers

data Vec a (n :: N) where
  Nil  :: Vec a Z
  Cons :: a -> Vec a n -> Vec a (S n)
```

The type of a vector encodes its length, which means that the type checker now knows whether it is dealing with an empty vector. Now let’s write `null` and `tail` functions that work on vectors:

```
vecNull :: Vec a n -> Bool
vecNull Nil        = True
vecNull (Cons _ _) = False

vecTail :: Vec a (S n) -> Vec a n
vecTail (Cons _ tl) = tl
```

`vecNull` is nothing surprising – it returns `True` for an empty vector and `False` for a non-empty one. But the tail function for vectors differs from its implementation for lists. `tail` from Haskell’s standard prelude is not defined for an empty list, so calling `tail []` results in an exception (that would be the case in Conor’s example). But the type signature of `vecTail` requires that the input vector be non-empty. As a result we can rule out the `Nil` case. That also means that Conor’s example will no longer typecheck¹. But how can we write a correct version of this example, one that removes the first element of a vector only when it is non-empty? Here’s an attempt:

```
shorten :: Vec a n -> Vec a m
shorten xs = case vecNull xs of
  True  -> xs
  False -> vecTail xs
```

That however won’t compile: now that we have written a type-safe tail function, the typechecker requires a proof that the vector passed to it as an argument is non-empty. The weak link in this code is the `vecNull` function. It tests whether a vector is empty but delivers no type-level proof of the result. In other words we need:

`vecNull' :: Vec a n -> IsNull n`

ie. a function whose result type carries information about the length of the list. This data type will have a runtime representation isomorphic to `Bool`, ie. it will be an enumeration with two constructors, and its type index will correspond to the length of the vector:

```
data IsNull (n :: N) where
  Null    :: IsNull Z
  NotNull :: IsNull (S n)
```

`Null` represents empty vectors, `NotNull` represents non-empty ones. We can now implement a version of `vecNull` that carries a proof of the result at the type level:

```
vecNull' :: Vec a n -> IsNull n
vecNull' Nil        = Null
vecNull' (Cons _ _) = NotNull
```

The type signature of `vecNull'` says that the return type must have the same index as the input vector. Pattern matching on the `Nil` case provides the type checker with the information that the `n` index of `Vec` is `Z`. This means that the return value in this case must be `Null` – the `NotNull` constructor is indexed with `S`, which obviously does not match `Z`. Similarly, in the `Cons` case the return value must be `NotNull`. However, replacing `vecNull` in the definition of `shorten` with our new `vecNull'` will again result in a type error. The problem comes from the type signature of `shorten`:

`shorten :: Vec a n -> Vec a m`

By indexing the input and output vectors with different length indices – `n` and `m` – we tell the typechecker that they are completely unrelated. But that is not true! Knowing the input length `n`, we know exactly what the result length should be: if the input vector is empty the result vector is also empty; if the input vector is non-empty it should be shortened by one. Since we need to express this at the type level we will use a type family:

```
type family Pred (n :: N) :: N where
  Pred Z     = Z
  Pred (S n) = n
```

(In a fully-fledged dependently-typed language we would write a normal function and then apply it at the type level.) Now we can finally write:

```
shorten :: Vec a n -> Vec a (Pred n)
shorten xs = case vecNull' xs of
  Null    -> xs
  NotNull -> vecTail xs
```

This definition cannot go wrong. Trying to swap the expressions in the branches will result in a type error.
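For completeness, here is the whole development as a single compiling module. The `toList` helper and the `example` value at the end are my own additions for demonstration; everything else is as developed above:

```
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies   #-}

data N = Z | S N  -- natural numbers

data Vec a (n :: N) where
  Nil  :: Vec a Z
  Cons :: a -> Vec a n -> Vec a (S n)

data IsNull (n :: N) where
  Null    :: IsNull Z
  NotNull :: IsNull (S n)

vecNull' :: Vec a n -> IsNull n
vecNull' Nil        = Null
vecNull' (Cons _ _) = NotNull

vecTail :: Vec a (S n) -> Vec a n
vecTail (Cons _ tl) = tl

type family Pred (n :: N) :: N where
  Pred Z     = Z
  Pred (S n) = n

shorten :: Vec a n -> Vec a (Pred n)
shorten xs = case vecNull' xs of
  Null    -> xs
  NotNull -> vecTail xs

-- Helper (my addition): forget the length index so we can show results.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

example :: [Int]
example = toList (shorten (Cons 1 (Cons 2 Nil)))  -- [2]
```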

1. Assuming we don’t abuse Haskell’s unsoundness as a logic, eg. by using `undefined`.

## The basics of coinduction

I don’t remember when I first heard the terms “coinduction” and “corecursion”, but it must have been quite long ago. I had the impression that they were yet another of those difficult theoretical concepts that I should learn about one day. That “one day” happened recently, while reading chapter 5 of “Certified Programming with Dependent Types”. It turns out that the basics of coinduction are actually quite simple. In this post I’ll share with you what I have learned on the subject so far.

Let’s begin by looking at Haskell, because it is a good example of a language that does not formalize coinduction in any way. Two features of Haskell are of interest to us. The first is laziness. Thanks to Haskell being lazy we can write definitions like these (in GHCi):

```
ghci> let ones = 1 : ones
ghci> let fib = zipWith (+) (1:fib) (1:1:fib)
```

`ones` is – as the name implies – an infinite sequence (list) of ones. `fib` is a sequence of Fibonacci numbers. Both definitions produce infinite lists, but we can use them safely because laziness allows us to force only a finite number of elements of the sequence:

```
ghci> take 5 ones
[1,1,1,1,1]
ghci> take 10 fib
[2,3,5,8,13,21,34,55,89,144]
```

Now consider this definition:

`ghci> let inf = 1 + inf`

No matter how hard we try there is no way to use the definition of `inf` in a safe way. It always causes an infinite loop:

```
ghci> (0 /= inf)
*** Exception: <<loop>>
```

The difference between the definitions of `ones` and `fib` and the definition of `inf` is that the former use what is called guarded recursion. The term “guarded” comes from the fact that the recursive reference to self is hidden under a datatype constructor (or: guarded by a constructor). The way lazy evaluation is implemented guarantees that we can stop the recursion by not evaluating the recursive constructor argument. This kind of infinite recursion can also be called productive recursion, which means that although the recursion is infinite, each recursive call is guaranteed to produce something (in my examples either a 1 or the next Fibonacci number). By contrast, the recursion in the definition of `inf` is not guarded or productive in any way.
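As one more illustration of guardedness (this example is my own, not from CPDT): a productive definition of all natural numbers, where the recursive call is guarded by the `(:)` constructor:

```
-- Productive: each step emits one element before recursing, so any
-- finite prefix can be computed in finite time.
nats :: [Integer]
nats = go 0
  where go n = n : go (n + 1)

firstFive :: [Integer]
firstFive = take 5 nats  -- [0,1,2,3,4]
```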

Haskell happily accepts the definition of `inf` even though it is completely useless. When we write Haskell programs we of course don’t want them to fall into silly infinite loops, but the only tool we have to prevent us from writing such code is our intelligence. The situation changes when it comes to…

# Dependently-typed programming languages

These languages care deeply about termination. By “termination” I mean ensuring that a program written by the user is guaranteed to terminate for any input. I am aware of two reasons why these languages care about termination. The first reason is theoretical: without termination the resulting language is inconsistent as a logic. This happens because a non-terminating term can prove any proposition. Consider this non-terminating Coq definition:

`Fixpoint evil (A : Prop) : A := evil A.`

If that definition were accepted we could use it to prove any proposition. Recall that when it comes to viewing propositions as types and programs as evidence, “proving a proposition” means constructing a term of the given type. `evil` would allow us to construct a term inhabiting any type `A`. (`Prop` is the kind of logical propositions, so `A` is a type.) Since dependently-typed languages aim to be consistent logics they must reject non-terminating programs. The second reason for checking termination is practical: dependently-typed languages admit functions in type signatures. If we allowed non-terminating functions then typechecking itself could become non-terminating, and again this is something we don’t want. (Note that Haskell gives you `UndecidableInstances`, which can cause typechecking to fall into an infinite loop.)

Now, if you paid attention in your Theoretical Computer Science classes, all of this should ring a bell: the halting problem! The halting problem says that determining whether a given Turing machine (read: a given computer program) will ever terminate is undecidable. So how is it possible that languages like Agda, Coq or Idris can answer that question? That’s simple: they are not Turing-complete (or at least their terminating subsets are not Turing-complete). (UPDATE: but see Conor McBride’s comment below.) They prohibit the user from using certain constructs, probably the most important one being general recursion. Think of general recursion as any kind of recursion imaginable. Dependently-typed languages instead require structural recursion on subterms of the arguments. That means that if a function receives an argument of an inductive data type (think: algebraic data type/generalized algebraic data type) then it can only make recursive calls on terms that are syntactic subcomponents of that argument. Consider this definition of `map` in Idris:

```
map : (a -> b) -> List a -> List b
map f []      = []
map f (x::xs) = f x :: map f xs
```

In the second equation we use pattern matching to deconstruct the list argument. The recursive call is made on `xs`, which is structurally smaller than the original argument. This guarantees that any call to `map` will terminate. There is a silent assumption here that the `List a` argument passed to `map` is finite, but with the rules given so far it is not possible to construct an infinite list.

So we just eliminated non-termination by limiting what can be done with recursion. This means that our Haskell definitions of `ones` and `fib` would not be accepted in a dependently-typed language, because they don’t recurse on an argument that gets smaller and as a result they construct an infinite data structure. Does that mean we are stuck with only finite data structures? Luckily, no.

# Coinduction to the rescue

Coinduction provides a way of defining and operating on infinite data structures, as long as we can prove that our operations are safe, that is, that they are guarded and productive. In what follows I will use Coq, because it seems to have better support for coinduction than Agda or Idris (and if I’m wrong here please correct me).

Coq, Agda and Idris all require that a datatype that can contain infinite values has a special declaration. Coq uses the `CoInductive` keyword instead of the `Inductive` keyword used for standard inductive data types. In a similar fashion Idris uses `codata` instead of `data`, while Agda requires an ∞ annotation on a coinductive constructor argument.

Let’s define a type of infinite `nat` streams in Coq:

```
CoInductive stream : Set :=
| Cons : nat -> stream -> stream.
```

I could have defined a polymorphic stream, but for the purpose of this post a stream of nats will do. I could have also defined a `Nil` constructor to allow finite coinductive streams – declaring a datatype as coinductive means it can have infinite values, not that it must have them.
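As a point of comparison (a sketch of my own, not from CPDT): in Haskell no special keyword is needed for such a type, because every Haskell datatype already admits infinite values – which is precisely the lack of formalization mentioned at the start of this post:

```
-- A Haskell rendition of the Coq stream type.
data Stream = SCons Int Stream

-- An infinite stream of ones, like the Haskell 'ones' from earlier.
onesS :: Stream
onesS = SCons 1 onesS

-- Observe a finite prefix, mirroring 'take' on lists.
takeS :: Int -> Stream -> [Int]
takeS n (SCons x xs)
  | n <= 0    = []
  | otherwise = x : takeS (n - 1) xs
```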

Now that we have infinite streams let’s revisit our examples from Haskell: `ones` and `fib`. `ones` is simple:

`CoFixpoint ones : stream := Cons 1 ones.`

We just had to use the `CoFixpoint` keyword to tell Coq that our definition will be corecursive, and it is happily accepted, even though a similar recursive definition (ie. one using the `Fixpoint` keyword) would be rejected. Allow me to quote directly from CPDT:

> whereas recursive definitions were necessary to use values of recursive inductive types effectively, here we find that we need co-recursive definitions to build values of co-inductive types effectively.

That one sentence pins down an important difference between induction and coinduction.

Now let’s define `zipWith` and try our second example `fib`:

```
CoFixpoint zipWith (f : nat -> nat -> nat)
                   (a : stream) (b : stream) : stream :=
  match a, b with
  | Cons x xs, Cons y ys => Cons (f x y) (zipWith f xs ys)
  end.

CoFixpoint fib : stream :=
  zipWith plus (Cons 1 fib) (Cons 1 (Cons 1 fib)).
```

Unfortunately, this definition is rejected by Coq due to an “unguarded recursive call”. What exactly goes wrong? Coq requires that all recursive calls in a corecursive definition be:

1. direct arguments to a data constructor
2. not inside function arguments

Our definition of `fib` violates the second condition – both recursive calls to `fib` are hidden inside arguments to the `zipWith` function. Why does Coq enforce such a restriction? Consider this simple example:

```
Definition tl (s : stream) : stream :=
  match s with
  | Cons _ tl' => tl'
  end.

CoFixpoint bad : stream := tl (Cons 1 bad).
```

`tl` is a standard tail function that discards the first element of a stream and returns its tail. Just like our definition of `fib`, the definition of `bad` places the corecursive call inside a function argument. I hope it is easy to see that accepting the definition of `bad` would lead to non-termination – inlining the definition of `tl` and simplifying leads us to:

`CoFixpoint bad : stream := bad.`

and that is bad. You might be thinking that the definition of `bad` really has no chance of working, whereas our definition of `fib` could in fact be run safely without the risk of non-termination. So how do we persuade Coq that our corecursive definition of `fib` is in fact valid? Unfortunately, there seems to be no simple answer. What was meant to be a simple exercise in coinduction turned out to be a real research problem. This past Monday I spent well over an hour with my friend staring at the code and trying to come up with a solution. We didn’t find one, but instead we found a really nice paper, “Using Structural Recursion for Corecursion” by Yves Bertot and Ekaterina Komendantskaya. The paper presents a way of converting definitions like `fib` to a guarded and productive form accepted by Coq. Unfortunately, the converted definition loses the linear computational complexity of the original, so the conversion method is far from perfect. I encourage you to read the paper. It is not long and is written in a very accessible way. Another set of possible solutions is given in chapter 7 of CPDT, but I am very far from labelling them as “accessible”.

I hope this post demonstrates that the basic ideas behind coinduction are actually quite simple. To me the whole subject of coinduction looks really fascinating and I plan to dive deeper into it. I already have my eyes set on several research papers about coinduction, so there’s a good chance that I’ll write more about it in future posts.
