During the last few weeks I have gotten a pretty good grip on the basics of dependent types and Agda. Programming in Agda is fun, but I nevertheless decided to experiment with other dependently-typed programming languages. Back in March I attempted to learn Idris from one of Edwin Brady’s presentations, but having no knowledge of dependent types I had to give up after about 30 minutes of the first video. Now that I know the basics of Agda I decided to give Idris another try. This time it was much simpler. Reading the official Idris tutorial and doing some experiments took me about 5 hours. Below are some of my first impressions (I’m underlining that phrase to make it clear that some of these opinions may change in the future).
- The standard library in Idris feels friendlier than Agda’s. It is bundled with the compiler and doesn’t require additional installation (unlike Agda’s). The Prelude is imported into every module by default, so the programmer can use Nat, Bool, lists and so on out of the box. There are also some similarities with the Haskell Prelude. All in all, the standard library in Idris is much less daunting than Agda’s.
- Idris is really a programming language, i.e. one can write programs that actually run. Agda feels more like a proof assistant. According to one of the tutorials I’ve read you can run programs written in Agda, but it is not as straightforward as in Idris. I personally haven’t run a single Agda program – I’m perfectly happy that they typecheck.
- Compared to Agda, Idris has limited Unicode support. I’d never felt the need to use Unicode in my source code until I started programming in Agda – after just a few weeks it feels like an essential thing. I think Idris allows Unicode only in identifiers but not in operators, which means I have to use awkward operators like `<!=` instead of ≤. I recall seeing some discussions about Unicode on the #idris channel, so I wouldn’t be surprised if that changed soon.
- One of the biggest differences between Agda and Idris is the approach to proofs. In Agda a proof is part of a function’s code. The programmer is assisted by agda-mode (in Emacs), which guides code writing according to the types (a common feature in dependently typed languages). Over the past few weeks I’ve come to appreciate the convenience offered by agda-mode: automatic generation of case analysis, refinement of holes, and autocompletion of code based on types, to name a few. Idris-mode for Emacs doesn’t support interactive development. One has to use the interactive proof mode provided in the Idris REPL – this means switching between terminal windows, which might be a bit inconvenient. Proofs in Idris can be separated from the code they are proving, which makes the code itself much clearer. In proof mode one can use tactics, which are methods used to convert proof terms in order to reach a certain goal. The generated proof can then be added to the source file. It is hard for me to decide which method I prefer. The final result is more readable in Idris, but using tactics is not always straightforward. I also like the interactive development offered by Agda. Tough choice.
- Both languages are poorly documented. That said, Idris has much less documentation (mostly papers and presentations by Edwin Brady). I expect this to change, as the Idris community seems to be growing (slowly, but still).
- One thing I didn’t like in Idris is the visibility qualifiers used to define how functions and datatypes are exported from a module. There are three: public (export name and implementation), private (don’t export anything) and abstract (export the type signature, but not the implementation). This is slightly different than in Haskell – I think the difference comes from properties of dependent types. What I didn’t like are the rules and syntax used to define export visibility. Visibility for a function or datatype can be defined by annotating it with one of the three keywords: public, private, abstract. If no definition in a module is annotated then everything is public. But if there is at least one annotation, everything without an annotation is private. Unless you changed the default visibility, in which case everything without an annotation can be abstract! In other words, if you see a definition without an annotation it means that: a) it can be public, but you have to check whether all other definitions lack annotations; b) it can be private, if at least one other definition is annotated – again, you have to check the whole file; c) but it can be abstract as well – you need to check the file to see if the default export level was set. The only way to be sure – except for nuking the entire site from orbit – is annotating every function with an export modifier, but that feels very verbose. I prefer Haskell’s syntax for defining what is exported and what is not, and I think it could easily be extended to support three levels of export visibility.
- Unlike Agda, Idris has case expressions. They have some limitations, however. I’m not sure whether these limitations come from properties of dependently typed languages or whether they are just simplifications in the Idris implementation that could theoretically be avoided.
- Idris has lots of other cool features. Idiom brackets are syntactic sugar for applicative style: you can write `[| f a b c |]` instead of `pure f <*> a <*> b <*> c`. Idris has syntax extensions designed to support the development of EDSLs. Moreover, tuples are available out of the box, there’s do-notation for monadic expressions, and there are list comprehensions and a Foreign Function Interface.
- One feature that I’m a bit sceptical about is “implicit conversions”, which allow one to define implicit casts between arguments and write expressions like `"Number " ++ x`, where `x` is an `Int`. I can imagine this could be a misfeature.
- Idris has a “using” notation that allows one to introduce definitions that are visible throughout a block of code. The most common use seems to be in definitions of data types. Agda does this better IMO by introducing type parameters into the scope of data constructors.
- Idris seems to be developed more actively. The repos are stored on GitHub, so anyone can easily contribute. This is not the case with Agda, which has Darcs repos and whose whole process feels closed (in the sense of “not opened to the community”). On the other hand, the mailing list for Idris is set up on Google Groups, which is a blocker for me.
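To make the idiom-bracket point above concrete, here is a minimal sketch in plain Haskell (not Idris) of the applicative style that `[| f a b c |]` abbreviates; `addThree` and `result` are names invented for this example, and `Maybe` stands in for an arbitrary `Applicative`:

```haskell
-- The applicative style that Idris' idiom brackets abbreviate.
-- In Idris one could write [| addThree (Just 1) (Just 2) (Just 3) |];
-- the explicit desugaring uses pure and <*>:
addThree :: Int -> Int -> Int -> Int
addThree x y z = x + y + z

result :: Maybe Int
result = pure addThree <*> Just 1 <*> Just 2 <*> Just 3   -- Just 6

main :: IO ()
main = print result
```

If any argument is `Nothing`, the whole expression is `Nothing`, which is exactly the short-circuiting behaviour the applicative style buys you.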
All in all, programming in Idris is also fun, although it is a slightly different kind of fun than in Agda. I must say that I miss two features from Agda: interactive development in Emacs and Unicode support. Given how actively Idris is developed, I imagine it could soon become more popular than Agda. Perhaps these “missing” features will also be added one day?
As an exercise I rewrote the code from the “Why dependent types matter” paper from Agda (see my previous post) to Idris. The code is available on GitHub.
During my MSR internship I had a chance to learn from people working on type systems. I already had an interest in types before going to Cambridge, but somehow I hadn’t put my full effort into the topic. I had only made some attempts at learning dependent types, but didn’t get further than the standard vector examples. The best thing that happened in Cambridge was Conor McBride’s course on Dependently-typed metaprogramming (in Agda), organized by Ohad Kammar at the University’s Computer Laboratory. I learned about the course after it had started, so I had only a single day to learn the basics of Agda before going to the second session of lectures and labs. I admit that the material covered by Conor was difficult for me, and I will probably need a few months to digest it.
Recently I decided to solidify my knowledge of the basics of dependent types by reading “Why Dependent Types Matter”. This unpublished paper was written by Thorsten Altenkirch, Conor McBride and James McKinna, somewhere around 2006 I believe. It gives a great overview of dependent types and various design decisions related to their usage. Most of all, the paper shows how to write a provably correct merge-sort algorithm. Proving the correctness of algorithms is something I find very interesting, so this paper was a must-read for me.
There is only one catch with “Why Dependent Types Matter”. All the code is written in Epigram, a dependently typed functional language designed by Conor McBride and James McKinna. The problem is that Epigram’s webpage has been offline for a few months now and the language basically seems dead. Anyway, since my dependently-typed language of choice is Agda (for the moment at least – I’ve been thinking a lot about Idris recently), I decided to rewrite all the code in the paper in Agda. For the most part this was a straightforward task, once I learned how to read Epigram’s unusual syntax. There were, however, a few bumps along the way. One problem I encountered early on was Agda’s termination checker complaining about some functions. Luckily, the Agda community is as helpful as Haskell’s, and within a day I was given a detailed explanation of what goes wrong. A slightly larger problem was that the paper elides the details of some proofs. If I wanted working Agda code, I had to fill in those details myself. Since I didn’t know how to do that, I had to pause for a day and go through the online materials for Thorsten Altenkirch’s course on Computer Aided Formal Reasoning. In the end I managed to fill in all the missing gaps. My code is available on GitHub. Now I feel ready to prove the correctness of a few more algorithms on my own.
Conor will be giving his course on “Dependently typed metaprogramming” in November and December at the University of Edinburgh. See here for details. Be sure not to miss it if you have a chance to attend. The code repository for the course is available here.
An unofficial mirror of Epigram’s sources is available on GitHub.
For the last 4 months I have been working on improving the compilation of the intermediate Cmm language used by GHC. Recently I have been noticing some recurring problems in the design of the Cmm pipeline.
I already mentioned in my earlier post that Cmm is a low-level language, something between C and assembly. Cmm is produced from another intermediate language, STG. A single Cmm procedure is represented as a directed graph. Each node in the graph is a Cmm Block of low-level instructions. Exactly one Cmm Block in a graph is the entry point – the block from which execution starts when the procedure represented by the graph is called. Most Cmm Blocks in a graph have at least one successor, that is, node(s) to which control flows from a given Cmm Block. A Cmm Block may have no successor if it is a call to another procedure, i.e. it passes the flow of control to another Cmm graph. Each Cmm Block consists of a linear list of Cmm Nodes. A Cmm Node represents a single Cmm instruction, like a store to a register, a conditional jump or a call to a function.
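A rough Haskell sketch may help as a mental model of that structure. These are simplified, hypothetical types invented for illustration – GHC’s actual `CmmGraph`, `CmmBlock` and `CmmNode` definitions are considerably richer:

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical, simplified model of a Cmm procedure; not GHC's real types.
newtype BlockId = BlockId Int deriving (Eq, Ord, Show)

data CmmNode
  = Assign String String               -- store to a register
  | CondBranch String BlockId BlockId  -- conditional jump to one of two blocks
  | Call String                        -- call another procedure (no successor)
  deriving Show

-- A block is a linear list of instructions.
newtype CmmBlock = CmmBlock [CmmNode] deriving Show

-- A procedure is a graph of blocks with exactly one entry point.
data CmmGraph = CmmGraph
  { entryBlock  :: BlockId
  , graphBlocks :: Map BlockId CmmBlock
  } deriving Show

exampleGraph :: CmmGraph
exampleGraph = CmmGraph
  { entryBlock  = BlockId 0
  , graphBlocks = Map.fromList
      [ (BlockId 0, CmmBlock [CondBranch "Sp - 88 < SpLim" (BlockId 1) (BlockId 1)])
      , (BlockId 1, CmmBlock [Call "stg_gc_fun"])
      ]
  }
```

The key points the sketch captures: the graph is a map from block ids to blocks, one id is distinguished as the entry, and a block ending in a call has no successors within the graph.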
The Cmm representation produced by the STG -> Cmm pass is incomplete. For example, operations on the stack are represented in an abstract way. It is also far from optimal, as it may contain lots of empty blocks or control-flow paths that will never be taken. That’s why we have the Cmm pipeline. It takes the Cmm representation produced by the code generator and applies a series of transformations to it. Some of them are mandatory (like stack layout), while others perform optimisations and are completely optional. Here’s a rough overview of the pipeline:
- Control Flow Optimisations. Optimises structure of a graph by concatenating blocks and omitting empty blocks.
- Common Block Elimination (optional). Eliminates duplicate blocks.
- Minimal Proc-point set. Determines a minimal set of proc-points.
- Stack Layout. Turns abstract stack representation into explicit stack pointer references. Requires proc-point information computed in step 3. Creates stack maps.
- Sinking pass (optional). Moves variable declarations closer to their usage sites. Inlines some literals and registers in a way similar to constant propagation.
- CAF analysis. Does analysis of constant-applicative forms (top-level declarations that don’t have any parameters). CAF information is returned by the Cmm pipeline together with optimized Cmm graph.
- Proc-point analysis and proc-point splitting (optional). Here the pipeline splits into two alternative flows. They are identical except for the fact that one branch begins by performing proc-point splitting. This means that blocks that were determined to be proc-points are now turned into separate procedures. Requires proc-point information computed in step 3.
- Info Tables. Populates info tables of each Cmm function with stack usage information. Uses stack maps created by the stack layout (step 4).
- Control Flow Optimisations (optional). Repeats the control flow optimisations (step 1), but this time the pass is optional.
- Unreachable Blocks Removal. Eliminates blocks that don’t have a predecessor in the Cmm graph.
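The overall shape of the pipeline described above – an ordered sequence of passes, some mandatory and some run only when optimising – can be sketched as a fold. This is a hypothetical model for illustration, not GHC’s actual code; `Pass` and `runPipeline` are invented names:

```haskell
-- Hypothetical model of the pipeline shape, not GHC's actual code.
-- A pass transforms the graph; optional passes run only when
-- optimisations are enabled.
type Pass g = g -> g

runPipeline :: Bool              -- are optimisations enabled?
            -> [(Bool, Pass g)]  -- (is the pass mandatory?, the pass)
            -> g -> g
runPipeline optimise passes g0 = foldl step g0 passes
  where
    step g (mandatory, pass)
      | mandatory || optimise = pass g
      | otherwise             = g
```

With `Int` standing in for the graph, `runPipeline False [(True, (+1)), (False, (*10))] 0` runs only the mandatory pass and yields 1, while enabling optimisations yields 10.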
As an example, consider this Cmm produced by the code generator:

```
if ((old + 0) - <highSp> < SpLim) goto c4x9; else goto c4xa;
R2 = _s2Rv::P64;
R1 = _s2Ru::P64;
call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
```
After going through the Cmm pipeline the empty blocks are eliminated and the abstract stack representation (`(old + 0)` and `<highSp>`) is turned into explicit stack usage:

```
if (Sp - 88 < SpLim) goto c4x9; else goto c4xa;
R2 = _s2Rv::P64;
R1 = _s2Ru::P64;
call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
```
During my work on the Cmm pipeline I encountered the following bugs and design issues:
- In some cases the Stack Layout phase (step 4) can invalidate the minimal set of proc-points by removing a block that was earlier determined to be a proc-point. However, the minimal proc-point information is later used by the proc-point analysis, and GHC would panic if a proc-point block had been removed. This was bug #8205. I solved it by adding an additional check in the proc-point analysis that ensures that all proc-points actually exist in the graph. This was a simple, one-line solution, but it doesn’t feel right. It accepts the fact that our data is in an inconsistent state and places the responsibility for dealing with that on the algorithms relying on that state. In other words, any algorithm that relies on the minimal proc-point set after stack layout must check whether the given proc-points still exist in the graph.
- I observed that in some circumstances the control flow optimisation pass may lead to unexpected duplication of blocks (see #8456). After some investigation it turned out that this pass began by computing the set of predecessors for each block and then modified the graph based on this information. The problem was that the list of predecessors was not being updated as the graph was modified. This problem is the same as the previous one: we compute a fact about our data structure, and based on that fact we start modifying the data, but we don’t update the fact as we go.
- The control flow optimisation pass may produce unreachable blocks. They remain in the data structure representing the graph, but they are not reachable from any block in the graph. The common block elimination pass will remove the unreachable blocks, but until it does, the data is in an inconsistent state.
- The same thing happens later: the stack layout pass may create unreachable blocks and relies on later passes to remove them.
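The common pattern behind these issues can be distilled into a toy example (hypothetical code invented for illustration, not GHC’s): a fact computed about a graph silently becomes stale once the graph is modified.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Toy graph: block id -> successor ids.
type Graph = Map Int [Int]

-- Compute a "fact" about the graph: the predecessors of every block.
predecessors :: Graph -> Map Int [Int]
predecessors g = Map.fromListWith (++)
  [ (s, [blk]) | (blk, succs) <- Map.toList g, s <- succs ]

-- A transformation: redirect every edge into 'old' to point at 'new'.
redirect :: Int -> Int -> Graph -> Graph
redirect old new = Map.map (map (\s -> if s == old then new else s))

main :: IO ()
main = do
  let g     = Map.fromList [(1, [2]), (2, [3]), (3, [])]
      preds = predecessors g   -- fact computed up front
      g'    = redirect 2 3 g   -- graph modified afterwards
  -- The precomputed fact still claims block 2 has a predecessor,
  -- but in g' nothing jumps to block 2 any more: fact and data disagree.
  print (Map.member 2 preds)               -- True  (stale fact)
  print (Map.member 2 (predecessors g'))   -- False (recomputed truth)
```

Any consumer that trusts `preds` after the call to `redirect` is in exactly the position of the proc-point analysis in bug #8205: acting on information the transformation has invalidated.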
I listed only the problems I am aware of, but I believe there are more and that they will manifest themselves one day. I have spent the last couple of days thinking about how to solve these issues. Of course, fixing each of them separately is a relatively simple task, but I’m trying to come up with some general design that would prevent us from introducing inconsistencies into our data structures.