Coq’Art, CPDT and SF: a review of books on the Coq proof assistant

I have been pretty quiet on the blog in the past couple of months. One of the reasons for this is that I have spent most of my time learning Coq. I had my first contact with Coq well over a year ago when I started reading CPDT. Back then I only wanted to learn the basics of Coq to see how it works and what it has to offer compared to other languages with dependent types. This time I wanted to apply Coq to some ideas I had at work, so I was determined to be much more thorough in my learning. Coq is far from being a mainstream language, but it nevertheless has some really good learning resources. Today I would like to present a brief overview of what I believe are the three most important books on Coq: “Interactive Theorem Proving and Program Development. Coq’Art: The Calculus of Inductive Constructions” (which I will briefly refer to as Coq’Art) by Yves Bertot and Pierre Castéran, “Certified Programming with Dependent Types” (CPDT) by Adam Chlipala, and “Software Foundations” (SF for short) by Benjamin Pierce and over a dozen other contributors. All three books differ significantly in their scope and focus. CPDT and Coq’Art are standard, printed books. CPDT is also available online for free. Software Foundations is only available as an online book. Interestingly, there is also a version of SF that seems to be in the process of being revised.

I believe Coq’Art was the first book published on Coq. There are two editions – a 2004 hardcover version and a 2010 paperback version – but as far as I know there are no differences between them. Too bad the 2010 edition was not updated for the newest versions of Coq – some of the code examples don’t work with the newest compiler. Coq’Art takes a theoretical approach, i.e. it teaches Coq largely by explaining how the rules of the Calculus of Inductive Constructions work. There are also practical elements, like case studies and exercises, but they do not dominate the book. Personally, I found Coq’Art to be a very hard read. Not because it dives too deep into theory – it doesn’t – but because the presentation seems chaotic. For example, the description of a single tactic can be spread across several places in the book. In principle, I don’t object to extending an earlier presentation with new details once the reader has got hold of some new concepts, but I feel that Coq’Art definitely goes too far. Coq’Art also presents material in a very unusual order. Almost every introduction to Coq, or to any other functional language, begins with defining data types; Coq’Art introduces them in chapter 6. On the other hand, sorts and universes – something I would consider an advanced concept for anyone who is not familiar with type-level programming – are presented in the second chapter. (Note that the first chapter is a very brief overview of the language.) By contrast, CPDT goes into a detailed discussion of universes in chapter 12, and SF does not seem to cover them at all. Overall, Coq’Art is of limited usefulness to me. To tell the truth, this is not because of its focus on theory rather than practice, but because of the writing style, which I find rather inaccessible. Many times I had problems understanding passages I was reading, forcing me to re-read them again and again, trying to figure out what message the authors were trying to convey. I did not have such problems with CPDT, SF, or any other book I have read in the past few years. At the moment I have given up on the idea of reading the book from cover to cover. Nevertheless, I find Coq’Art a good supplementary reading for SF, most importantly because of the sections that explain in detail the inner workings of various tactics.

As mentioned at the beginning, I already wrote a first impressions post about CPDT. Back then I said the book “is a great demonstration of what can be done in Coq but not a good explanation of how it can be done”. Having read all of it, I stand by that claim. CPDT does not provide a thorough and systematic coverage of the basics, but instead focuses on advanced topics. As such, it is not the best place to start for beginners, but it is a priceless resource for Coq practitioners. The main focus of the book is proof automation with Ltac, Coq’s language for writing automated proof procedures. The reader is exposed to Ltac early on in the book, but a detailed treatment of Ltac is delayed until chapter 14. This is quite surprising, given that it is hard to understand the earlier chapters without knowing Ltac. Luckily, the chapters are fairly independent of each other and can be read in any order the reader wishes. It is definitely worth diving into chapter 14 and fragments of chapter 13 as early as possible – it makes understanding the book a whole lot easier. So far I have read chapter 14 three times. As I learn more and more Coq, I discover new bits of knowledge with each read. In fact, I expect to keep going back to CPDT regularly.

Coq’Art and CPDT approach teaching Coq in totally different ways. It might then be surprising that Software Foundations uses yet another approach. Unlike Coq’Art, it is focused on practice, and unlike CPDT, it places a very strong emphasis on learning the basics. I feel that SF makes the Coq learning curve as flat as possible. The main focus of SF is applying Coq to formalizing programming language semantics, especially type systems. This should not come as a big surprise given that Benjamin Pierce, the lead author of SF, also authored “Types and Programming Languages” (TAPL), the best book on type systems and programming language semantics I have seen. It should also not be surprising that a huge chunk of material overlaps between TAPL and SF. I find this to be among the best things about SF. All the proofs that I read in TAPL make a lot more sense to me when I can convert them into a piece of code. This gives me a much deeper insight into the meaning of lemmas and theorems. Also, when I get stuck on an exercise, I can take a look at TAPL to see the general idea behind the proof I am implementing.

SF is packed with material and thus is a very long read. Three months after beginning the book, and spending about two days a week with it, I am halfway through. The main strength of SF is its plethora of exercises. (Coq’Art has some exercises, but not too many; CPDT has none.) They can take a lot of time – and I really mean a lot – but I think this is the only way to learn a programming language. Besides, the exercises are very rewarding. One downside of the exercises is that the book provides no solutions, which is bad for self-study. Moreover, the authors ask people not to publish the solutions on the internet, since “having solutions easily available makes [SF] much less useful for courses, which typically have graded homework assignments”. That being said, there are plenty of GitHub repositories that contain the solved exercises (I also plead guilty!). Although it goes against the authors’ will, I consider it a really good thing for self-study: many times I have been stuck on exercises and was able to make progress only by peeking at someone else’s solution. This doesn’t mean I copied the solutions. I just used them to overcome difficulties, and in some cases I ended up with proofs more elegant than the ones I had found. As a side note, I’ll add that I do not share the belief that publishing solutions on the web makes SF less useful for courses. Students who want to cheat will get the solutions from other students anyway. At least that has been my experience as an academic teacher.

To sum up, each of the books presents a different approach. Coq’Art focuses on learning Coq by understanding its theoretical foundations. SF focuses on learning Coq through practice. CPDT focuses on advanced techniques for proof automation. Personally, I feel I’ve learned the most from SF, with CPDT a close second. YMMV.

Installing OCaml under openSUSE 11.4, or: “the compilation of conf-ncurses failed”

Recently I decided to learn the basics of OCaml, and I spent yesterday installing the compiler and some basic tools. On my machine at work I have a Debian 7 installation, while on my home laptop I have openSUSE 11.4. Both systems are quite dated and ship with an OCaml 3.x compiler, which is five years old. Obviously, I wanted to have the latest version of the language. I could have compiled OCaml from sources – and in fact I have done that in the past to compile the latest version of Coq – but luckily there is a tool called OPAM (OCaml Package Manager). OPAM can be used to easily download and install a desired version of the OCaml compiler. As the name implies, OPAM can also be used for managing the packages installed for a particular compiler version.
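For the record, the whole process boils down to a handful of commands. This is roughly what I ran – treat it as a sketch, since exact invocations differ between OPAM versions, and 4.02.3 merely stands in for whatever compiler release is current:

opam init                   # set up OPAM's package repository
opam switch 4.02.3          # download and build the chosen compiler version
eval `opam config env`      # point the current shell at the new compiler
opam install ocamlfind      # install packages for that compiler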

The installation process went very smoothly on my Debian machine, but on openSUSE I ran into problems. After getting the latest compiler I wanted to install ocamlfind, a tool required by a project I wanted to play with. To my disappointment, the installation ended with an error:

[ERROR] The compilation of conf-ncurses failed at "pkg-config ncurses".
 
This package relies on external (system) dependencies that may be missing.
`opam depext conf-ncurses.1' may help you find the correct installation for your 
system.

I verified that I indeed had the development files for the ncurses library installed, as well as the pkg-config tool. Running the suggested opam command didn’t find any missing dependencies either, and the log files from the installation turned out to be completely empty, so I was left clueless. Googling revealed that I was not the first to encounter this problem, but it offered no solution. I did some more reading on pkg-config and learned that: a) it is a tool that provides meta-information about installed libraries, and b) in order to recognize that a library is installed, it requires extra configuration files (the *.pc files) provided by the library. Running pkg-config --list-all revealed that ncurses was not recognized as installed on my system, which suggested that the relevant *.pc files were missing. Some more googling revealed that the ncurses library can be configured and then compiled with the --enable-pc-files switch, which should build the files needed by pkg-config. I got the sources for the ncurses version installed on my system (5.7), only to learn that this build option is unsupported there. This explained why the files were missing on my system. I then got the sources for the latest version of ncurses (6.0), configured them with --enable-pc-files and compiled, only to find that the *.pc files were still not built. After several minutes of debugging I realized that, for some unexplained reason, the configure-generated script that should build the *.pc files (located at misc/gen-pkgconfig) did not receive the +x (executable) permission. After adding this permission manually I ran the script and got five *.pc files for the ncurses 6.0 library. Then I had to edit the files to match the version of ncurses on my system – the relevant information can be obtained by running ncurses5-config --version. The only remaining thing was to place the five *.pc files where pkg-config can find them. On openSUSE this was /usr/local/pkgconfig, but the location can differ between Linux flavours.
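Condensed into commands, the ncurses part of the fix looks roughly like this – a sketch of what I did rather than a recipe, since the exact location of the generated *.pc files and the pkg-config search path (/usr/local/pkgconfig in my case) may differ on other systems:

./configure --enable-pc-files    # in the ncurses 6.0 source tree
make
chmod +x misc/gen-pkgconfig      # the script was generated without the executable bit
./misc/gen-pkgconfig             # produces the five *.pc files
# edit the Version: field in each *.pc file to match `ncurses5-config --version`, then:
cp *.pc /usr/local/pkgconfig/    # or wherever pkg-config looks on your system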

After all these magical incantations the installation of ocamlfind went through fine, and I can now enjoy a working OCaml installation on both of my machines. Now I’m waiting for the “Real World OCaml” book ordered from Amazon (orders shipped from UK Amazon to Poland tend to take around two weeks to arrive).

Typed holes support in Template Haskell

Back in April I found myself in need of typed holes in Template Haskell. To my disappointment, it turned out that typed holes were not implemented in TH. Sadly, this happens too often: a feature is added to GHC but no Template Haskell support is implemented for it. This was the time when I was working on injective type families, and I already had some experience in extending the TH implementation. I figured that adding support for typed holes should be a trivial task, no more than 30 minutes of coding. I created a feature request on Trac and started coding. I quickly realized that it wouldn’t be that simple. Not that the amount of required work was that extensive – I simply tripped over the way GHC handles names internally. As a result the work stalled for several months, and I only finished it two weeks ago, thanks to help from Richard Eisenberg.

My patch allows you to do several interesting things. Firstly, it allows you to quote typed holes, i.e. expressions whose name starts with an underscore:

[d| i :: a -> a
    i x = _ |]

This declaration quote will represent _ using the UnboundVarE constructor. Secondly, you can now splice unbound variables:

i :: a -> a
i x = $( return $ VarE (mkName "_") )
 
j :: a -> a
j x = $( return $ UnboundVarE (mkName "_") )

Notice that in a splice you can use either VarE or UnboundVarE to represent an unbound variable – they are treated the same.
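If you want to see this for yourself, a quick check in GHCi works well. This assumes a GHC that contains the patch (i.e. 8.0 or newer) and only sketches the idea – the names in the printed AST will carry GHC-generated uniques:

ghci> :set -XTemplateHaskell
ghci> import Language.Haskell.TH
ghci> decs <- runQ [d| i :: a -> a; i x = _ |]
ghci> print decs   -- the hole shows up as UnboundVarE in the printed AST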

A very important side effect of my implementation is that you can actually quote unbound variables. This means that you can now use nested pattern splices, as demonstrated by one of the tests in the GHC testsuite:

baz = [| \ $( return $ VarP $ mkName "x" ) -> x |]

Previously this code was rejected. The reason was that:

  1. a nested pattern splice is not compiled immediately, because it may refer to local variables defined outside of the bracket;
  2. the bracket is renamed immediately at the declaration site, and all variables were required to be in scope at that time.

The combination of the above means that the pattern splice does not bring anything into scope (because it is not compiled until the outer bracket is spliced in), which led to x being out of scope. But now it is perfectly fine to have unbound variables in a bracket, so the above definition of baz is accepted. When the bracket is first renamed, x is treated as an unbound variable, which is now fine; when the bracket is spliced in, the inner splice is compiled and correctly brings a binding for x into scope. Getting nested pattern splices to work was not my intention when I started implementing this patch, but it turned out we essentially got this feature for free.
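For completeness, here is a sketch of how baz might be used once it is accepted. The splice has to live in a different module than the one defining baz (the usual TH stage restriction), and the name idViaBaz is mine, not something from the testsuite:

-- In a module importing the one that defines baz:
idViaBaz :: Int -> Int
idViaBaz = $(baz)   -- splices in \x -> x; the nested pattern splice brings x into scope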

One stumbling block during my work was typed Template Haskell. With normal, untyped TH I can place a splice at the top level of a file:

$(return [
   -- m :: forall a. a -> a
   SigD (mkName "m")
        (ForallT [PlainTV (mkName "a")]
                 []
                 (AppT (AppT ArrowT (VarT (mkName "a"))) (VarT (mkName "a"))))
   -- m x = x
 , FunD (mkName "m")
        [Clause [VarP (mkName "x")] (NormalB (VarE (mkName "x"))) [] ]
   ])

and this will build a definition that will be spliced into the source code. But converting this into a typed splice, by saying $$(return ...), resulted in a compiler panic. I reported this as #10945. The reason turned out to be quite tricky. When Template Haskell is enabled, top-level expressions are allowed. Each such expression is treated as an implicit splice. The problem is that a typed splice doesn’t really make sense at the top level, so it should have been treated as an implicit splice; yet it was treated as an explicit one, which resulted in a panic later in the compiler pipeline.

Another issue that came up with typed TH was that typed holes cannot be quoted, again leading to a panic. I reported this as #10946. This issue has not been solved yet.
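For illustration, the kind of code that triggers this second panic is a hole inside a typed quotation, something along these lines (the name blah is mine and serves only as an example):

blah :: Q (TExp a)
blah = [|| _ ||]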

The above work is now merged into GHC HEAD and will be available in GHC 8.0.
