## Expressing foldl in terms of foldr

In chapter 4 of Real World Haskell there is a challenge for the reader: express `foldl` using `foldr`. Authors warn that this is not trivial and I must admit that I have not attempted this exercise leaving it for later. Yesterday I read Graham Hutton’s “Tutorial on the universality and expressiveness of fold” and it happens that, among other things, it presents an approach that can be applied to solve this problem. Hutton first presents a specific case in which `sum` function is defined using `foldl` expressed with `foldr`. Then he gives a general formula for expressing `foldl` with `foldr`. Despite having a derivation for one specific case and a ready result (without its derivation) it wasn’t straightforward to provide my own derivation of the general solution. This was a mind bending exercise and I’d like to go through the details here.

The solution is based on using the so called universal property of `foldr`1. This property states that if we have some function `g` defined as:

```g [] = v g (x:xs) = f x (g xs)```

then

`g = foldr f v`

Indeed, if we substitute `foldr f v` into definition of `g` we get a definition of `foldr`:

```foldr f v [] = v foldr f v (x:xs) = f x (foldr f v xs)```

Moreover, `foldr f v` is a unique solution to the defining equations of `g`.

Recall that `foldl` is defined as:

```foldl f v [] = v foldl f v (x:xs) = foldl f (f v x) xs```

The base case of `foldr` and `foldl` is identical, but the recursive one is not. Moreover, the recursive case of `foldl` cannot be rewritten in the form `f x (g xs)`. This means that we need to apply some transformation to definition of `foldl` so it can be rewritten in that form. Let’s create a function `foldl2` defined as:

```foldl2 f [] v = v foldl2 f (x:xs) v = foldl2 f xs (f v x)```

Nothing special so far. We just made a function that is the same as `foldl`, but has the last two parameters swapped. We can rewrite the base case as:

`foldl2 f [] v = id v`

`id` is the identity function that accepts one parameter and returns that parameter unchanged. Now we remove the `v` parameter:

`foldl2 f [] = id`

Such transformation is known as ?-reduction2. Let us now concentrate on the recursive case. It can be rewritten as

`foldl2 f (x:xs) v = (\w -> foldl2 f xs (f w x)) v`

We created an anonymous function with one of the parameters of `f` factored out. This expression can also be ?-reduced:

`foldl2 f (x:xs) = \w -> foldl2 f xs (f w x)`

Let’s factor out second parameter of function `f` in the same manner:

`foldl2 f (x:xs) = (\y w -> foldl2 f xs (f w y)) x`

And finally let’s factor out `foldl2 f xs` and just pass it as another parameter to the lambda expression:

`foldl2 f (x:xs) = (\y h w -> h (f w y)) x (foldl2 f xs)`

We’re almost there. Recall that the universal property requires function of the form3:

`g (x:xs) = k x (g xs)`

And it so happens that we just converted `foldl2` to that form, where `k = \y h w -> h (f w y)`. Comparing last two equation we see that `g = foldl2 f`, but from the universal property we also know that `g = foldr k v`, which means that `foldl2 f = foldr k v`. The notation here might be a bit confusing. From the base case we determined the value of `v` in the last equality to be equal to `id` function, which yields `foldl2 f = foldr k id`. Substituting the value of `k` we get:

`foldl2 f = foldr (\y h w -> h (f w y)) id`

Original definition of `foldl2` had two more parameters, but they were removed by ?-reductions. Let’s restore these two parameters by adding them to both lhs and rhs:

`foldl2 f xs v = foldr (\y h w -> h (f w y)) id xs v`

Recall that `foldl2` was only a convenience function we used for the derivation. Going back to the original `foldl` function yields the final result:

`foldl f v xs = foldr (\y h w -> h (f w y)) id xs v`

OK, formulas don’t lie, but this result is definitely not an intuitive one and deserves some good explanation. You may be surprised that four parameters are passed into `foldr`, but this should become clear in a moment. We will play with it to get some intuition on how this works.

Let us begin with verifying that this expression is type-correct. Type of `foldr` is:

```ghci> :t foldr foldr :: (a -> b -> b) -> b -> [a] -> b```

So the first parameter to `foldr` should be of type `(a -> b -> b)`. Lambda that we pass to `foldr` as the first parameter uses `f`. This is the function that is passed as first parameter to `foldl`. Since `foldl` has type:

```ghci> :t foldl foldl :: (a -> b -> a) -> a -> [b] -> a```

We require that `f` has type `a -> b -> a`. Let’s define simplest possible function that has that type and then check the type of lambda passed to `foldr`:

```ghci> let f = \a b -> a ghci> :t (\y h w -> h (f w y)) (\y h w -> h (f w y)) :: t2 -> (t1 -> t) -> t1 -> t```

Recall that `->` is right-associative, which means that above type is equivalent to `t2 -> (t1 -> t) -> (t1 -> t)`. Parentheses at the end can be added and the meaning is the same. This corresponds to our expected type of `(a -> b -> b)`. Here, the value of `b` is assumed to be `t1 -> t`. If we substitute (`t1 -> t`) for `b` in the type signature of `foldr` we get

`(a -> (t1 -> t) -> (t1 -> t)) -> (t1 -> t) -> [a] -> (t1 -> t)`

Note that last parentheses can be dropped, which result in function that has four parameters:

`(a -> (t1 -> t) -> (t1 -> t)) -> (t1 -> t) -> [a] -> t1 -> t`

We already verified that lambda passed to `foldr` is of type `(a -> (t1 -> t) -> (t1 -> t))`. The second parameter, `id` function, is of type `(a -> a)`, which corresponds to `(t1 -> t)` in the type signature. Therefore usage of `id` imposes additional restriction that `t1` ~ `t`4, which means that type signature can be rewritten as:

`(a -> (t -> t) -> (t -> t)) -> (t -> t) -> [a] -> t -> t`

`[a]` corresponds to parameter `xs`, the list that we are folding. `t` corresponds to initial value of accumulator and the last `t` is the return type.

Now that we have verified type-correctness of the solution, let’s see how it works in practice. Let’s say we want to fold a list of three elements using `(+)` as folding function and `0` as initial value of the accumulator. In other words, we want to calculate the sum of elements in a list. If we use `foldr`, the evaluation process will look like this:

```foldr (+) 0 [1,2,3] = (+) 1 (foldr (+) 0 [2,3]) = (+) 1 ((+) 2 (foldr (+) 0 [3])) = (+) 1 ((+) 2 ((+) 3 (foldr (+) 0 []))) = (+) 1 ((+) 2 ((+) 3 0 )) = (+) 1 ((+) 2 3) = (+) 1 5 = 6```

If we instead use `foldl`, evaluation will look like this:

```foldl (+) 0 [1,2,3] = foldl ((+) 0 1) [2,3] = foldl ((+) ((+) 0 1) 2) [3] = foldl ((+) ((+) ((+) 0 1) 2) 3) [] = ((+) ((+) ((+) 0 1) 2) 3) = ((+) ((+) 1 2) 3) = ((+) 3 3) = 6```

Both folds produce the same result, which is a direct consequence of first duality theorem for folds5. Now let’s see how evaluation will proceed if we use `foldl` expressed using `foldr`:

```foldl (+) 0 [1,2,3] = foldr (\y h w -> h ((+) w y)) id [1,2,3] 0 = (\h w -> h ((+) w 1)) (foldr (\y h w -> h ((+) w y)) id [2,3]) 0 = (\h w -> h ((+) w 1)) (\h w -> h ((+) w 2)) (foldr (\y h w -> h ((+) w y)) id [3]) 0 = (\h w -> h ((+) w 1)) (\h w -> h ((+) w 2)) (\h w -> h ( (+) w 3) ) (foldr (\y h w -> h ( (+) w y) ) id []) 0 = (\h w -> h ((+) w 1)) (\h w -> h ((+) w 2)) (\h w -> h ((+) w 3)) id 0 = (\h w -> h ((+) w 1)) (\h w -> h ((+) w 2)) (w -> id ((+) w 3)) 0 = (\h w -> h ((+) w 1)) (w -> id ((+) ((+) w 2) 3)) 0 = (w -> id ((+) ((+) ((+) w 1) 2) 3)) 0 = id ((+) ((+) ((+) 0 1) 2) 3))```

We’ve reached expression that is the same as the one we reached when evaluating `foldl`. Well, in fact that is what we expected. After all this is also `foldl`! So the whole trick is based on using `foldr` to generate a function that accepts initial value of the accumulator and produces the same expression we would get when using `foldl` (plus the identity function).

I hope that this post made it clear how to express `foldl` using `foldr`. This is of course by no means an exhaustive treatment of the subject of folds. There’s a lot more. I think that Hutton’s paper is a good starting point. Bird’s and Wadler’s “Introduction to Functional Programming” also seems to be a very valuable resource, though I’ve read only chapter about folds6. There’s still some more stuff to figure out about folds, like the difference in behaviour of `foldr` and `foldl` for infinite lists or expressing `foldr` using `foldl` (for finite lists only).

1. Graham Hutton in his paper uses the name fold to denote `foldr`. I’m sticking with `foldr` so my derivation remains consistent with Haskell []
2. ? is pronounced eta. []
3. Notice that I renamed the name of the first function from `f` to `k` to prevent the name clash. []
4. ~ notation means that two types are the same. []
5. See: Bird, R., Wadler, P. “Introduction to functional programming”, 1st ed., p. 68 []
6. There’s also a second edition of this book authored only by Bird, but I wasn’t able to find it. []

## Monomorphism restriction strikes again

A while ago I blogged about monomorphsim restriction. I thought I understood most of it, but just yesterday I was surprisingly hit by it. I was working my way through chapter 9 of Real World Haskell. In this chapter authors present a case study of a library that is designed for searching the file system. The central point in the design of this library are predicates that are used to perform queries. A predicate function is given some information about the file system entry: path, permissions (they allow to determine if the entry is a file or a directory), size and modification time. Based on that information a predicate function can return some information, e.g. name of the file or `Bool` value stating if the file meets some given criteria. To save a bit of typing RWH authors introduce a type synonym:

```import System.Directory (Permissions(..)) import System.Time (ClockTime(..))   type InfoP a = FilePath -> Permissions -> Maybe Integer -> ClockTime -> a```

‘P’ at the end of name stands for ‘predicate’. As you can see this type defines a function type with polymorphic return value acting as type parameter. Here’s an example of usage:

```pathP :: InfoP FilePath pathP path _ _ _ = path   sizeP :: InfoP Integer sizeP _ _ (Just size) _ = size sizeP _ _ Nothing _ = -1```

These functions retrieve path to a file and its size, respectively. We’d certainly would like to do something with those information. We might want to know if the file is larger than some specified size. We might also want to join predicates using logical AND and OR. Here’s an example:

```equalP :: (Eq a) => InfoP a -> a -> InfoP Bool equalP f k = \w x y z -> f w x y z == k```

This function takes a predicate and a value and compares them for equality, e.g. `equalP sizeP 1024`. This can be generalized to any comparison operator by introducing additional parameter1 :

```liftP :: (a -> b -> c) -> InfoP a -> b -> InfoP c liftP q f k = \w x y z -> f w x y z `q` k   equalP :: (Eq a) => InfoP a -> a -> InfoP Bool equalP = liftP (==)```

So `liftP` takes an operator `q` (infix notation is used to show that explicitly) and applies it to the result of `f` and `k`. Note that both `liftP` and previous version of `equalP` return anonymous functions that require four more parameters to yield a result. So far so good. We can easily define other predicates like `lesserP` and `greaterP` only by passing different comparison operator to `liftP`. The `liftP` function can be further generalized to allow logical operators:

```liftP2 :: (a -> b -> c) -> InfoP a -> InfoP b -> InfoP c liftP2 q f g = \w x y z -> f w x y z `q` g w x y z   andP = liftP2 (&&) orPĀ  = liftP2 (||)```

As you can see `liftP` and `liftP2` are very similar and in fact the former one can be written in the terms of the latter:

```constP :: a -> InfoP a constP k _ _ _ _ = k   liftP :: (a -> b -> c) -> InfoP a -> b -> InfoP c liftP q f k = liftP2 q f (constP k)```

This part was very confusing to me at first, but it becomes clear when you replace the InfoP type synonym in constP type declaration with actual type:

```constP :: a -> FilePath -> Permissions -> Maybe Integer -> ClockTime -> a constP k _ _ _ _ = k```

Now it becomes clear that `constP` takes a constant and four other parameters for the predicate, discards these parameters four and returns the constant.

Now comes the key part. Notice that logical predicates `andP` and `orP` were written without type signature. That’s perfectly fine. However, omitting type signature for `equalP` function causes an error:

```Ambiguous type variable `b0' in the constraint: (Eq b0) arising from a use of `==' Possible cause: the monomorphism restriction applied to the following: equalP :: InfoP b0 -> b0 -> InfoP Bool (bound at rwh9.hs:20:1) Probable fix: give these definition(s) an explicit type signature or use -XNoMonomorphismRestriction In the first argument of `liftP', namely `(==)' In the expression: liftP (==) In an equation for `equalP': equalP = liftP (==)```

I’m not sure why this happens when we try to lift `(==)`, but doesn’t happen when `(&&)` is lifted in same manner. It seems that this happens because `(==)` operator introduces additional type class constraints (`Eq` type class), but I don’t see why these constraints would cause monomorphsim restriction to kick in. These mystery is yet to be solved.

1. My code slightly differs from what you’ll find in RWH []

Staypressed theme by Themocracy