Category: general

DeepSpec Summer School 2017 – a summary

I have spent the last two and a half week in Philadelphia attending the first DeepSpec Summer School, In this post I want to summarize the event and give an overview of all the courses.

The DeepSpec Project is a research project lead by several US East Coast universities (University of Pennsylvania, MIT, Yale University and Princeton University) and aims to “push forward the state of the art in applying computer proof assistants to verify realistic software and hardware stacks at scale”. It consists of several smaller projects, including a formal verification of a hypervisor (CertiKOS), LLVM (Vellvm), Coq compiler (CertiCoq) and GHC’s Core language (CoreSpec).

The goal of DeepSpec Summer School was to introduce people to real-life formal verification using Coq proof assistant. School was divided into three parts. All the lectures can be found on a YouTube channel. Coq code for the courses is available on GitHub. Summer school’s web page also provides installation instructions as well as other supplementary material (click on a given lecture or select from “Lectures” tab).

Week 0: Coq Intensive

First three days of the summer school were a very intensive introductory course on Coq lead by Benjamin Pierce. This essentially covered the first volume of Software Foundations. (Aside: For those of you who don’t know yet, original Software Foundations online book has been split into two volumes: Logical Foundations and Programming Language Foundations. Also, a third volume has been added to the series: Verified Functional Algorithms by Andrew Appel. All three volumes can be found here, although expect that this link will likely become broken soon when this draft versions become an official release. There are also plans for two more volumes, one on Separation Logic and another one on Systems Verification.)

Week 1: Programming Language Verification

First full week of the school consisted of four courses centred around programming language verification:

  • Property-based random testing with QuickChick by Benjamin Pierce. I assume many of you heard about Haskell library called QuickCheck. It offers property-based testing: programmer writes properties that the should hold for a given piece of code and QuickCheck tests whether they hold for randomly generated test data. QuickChick is implementation of the same idea in Coq. Now, you might wonder what is the point of doing such a thing in Coq. After all, Coq is about formally proving that a given property is always true, not randomly testing whether it holds. I was sceptical about this as well, but it actually turns to be quite a good idea. The point is, specifications are difficult to write and often even more difficult to prove. They are especially difficult to prove when they are false ;-) And this is exactly when QuickChick can be beneficial: by trying to find a counter-example for which a stated property does not hold. This can indeed save programmer from spending hours on trying to prove something that is false. If QuickChick doesn’t find a counter-example we can start writing a formal proof.
    This course also gives a nice overview of type classes in Coq.
  • The structure of verified compiler by Xavier Leroy. This series of lectures was based on CompCert, which is a formally verified C compiler. The ideas behind formal verification of a compiler were presented on a compiler of Imp (a toy imperative language used in Software Foundations) to a simple virtual machine. Fourth, final lecture covered the CompCert project itself. To me this was the most interesting course of the summer school.
  • Language specification and variable binding by Stephanie Weirich. Software Foundations is a great book, but it completely omits one topic that is very important in formalizing programming languages: dealing with variable bindings. In this courses Stephanie presented “locally nameless” representation of variable bindings. This is something I had planned to learn for a very long time but couldn’t find the time.
  • Vellvm: Verifying the LLVM by Steve Zdancewic. For a change, in this course Imp was compiled to a simplified variant of LLVM, the compilation process being verified of course. Also, a nice introduction to LLVM.

Week 2: Systems Verification

Courses during the second week put more focus on verifying computer systems. Again, there were four courses:

  • Certifying software with crashes by Frans Kaashoek and Nickolai Zeldovich. The topic of this course was certification of a hard-drive operating routines, including bad-sector remapping and a simple virtual RAID 1 implementation. Although still using toy examples, specifications presented during this course were much more abstract giving a good idea how to scale to a real-world system verification. I found this course very difficult to follow, although the lectures were really superb. Note: materials for this one course are available in a separate GitHub repo.
  • CertiKOS: Certified kit operating systems by Zhong Shao. Ok, I admit I was completely unable to follow this series of lectures. Way to difficult. In fact, I skipped two out of four lectures because I figured out it will make more sense to work on homework assignments for other lectures.
  • Program-specific proof automation by Adam Chlipala. Unsurprisingly to those who know Adam’s “Certified Programming with Dependent Types” book, his course focused on proof automation using Ltac. One lecture was specifically dedicated to proofs by reflection.
  • Verified Functional Algorithms by Andrew Appel. This course covered a majority of third volume of new Software Foundations.

Summary

First and foremost let me say this: DeepSpec Summer School was the best research meeting I have ever attended. The courses were really good and inspiring, but the most important thing that made this summer school so great were fantastic people who attended it. Spending evening hours together working on homework assignments was especially enjoyable.

There might be a 2018 edition of the summer school so be on the lookout – this is a really great event for anyone interested in Coq and formal verification.

Moving to University of Edinburgh

I wanted to let you all know that after working for 8 years as a Lecturer at the Institute of Information Technology (Lodz University of Technology, Poland), I have received a sabbatical leave to focus solely on research. Yesterday I began my work as a Research Associate at the Laboratory for Foundations of Computer Science, University of Edinburgh. This is a two-year post-doc position. I will be part of the team working on the Skye project under supervision of James Cheney. This means that from now on I will mostly focus on developing the Links programming language.

Coq’Art, CPDT and SF: a review of books on Coq proof assistant

I have been pretty quiet on the blog in the past couple of months. One of the reasons for this is that I have spent most of my time learning Coq. I had my first contact with Coq well over a year ago when I started reading CPDT. Back then I only wanted to learn the basics of Coq to see how it works and what it has to offer compared to other languages with dependent types. This time I wanted to apply Coq to some ideas I had at work, so I was determined to be much more thorough in my learning. Coq is far from being a mainstream language but nevertheless it has some really good learning resources. Today I would like to present a brief overview of what I believe are the three most important books on Coq: “Interactive Theorem Proving and Program Development. Coq’Art: The Calculus of Inductive Constructions” (which I will briefly refer to as Coq’Art) by Yves Bertot and Pierre Castéran, “Certified Programming with Dependent Types” (CPDT) by Adam Chlipala and “Software Foundations” (SF for short) by Benjamin Pierce and over a dozen over contributors. All three books significantly differ in their scope and focus. CPDT and Coq’Art are standard, printed books. CPDT is also available online for free. Software Foundations is only available as an online book. Interestingly, there is also a version of SF that seems to be in the process of being revised.

I believe Coq’Art was the first book published on Coq. There are two editions – 2004 hardcover version and a 2010 paperback version – but as far as I know there are no differences between them. Too bad the 2010 edition was not updated for the newest versions of Coq – some of the code examples don’t work in the newest compiler. Coq’Art takes a theoretical approach, ie. it teaches Coq largely by explaining how the rules of Calculus of Constructions work. There are also practical elements like case studies and exercises but they do not dominate the book. Personally I found Coq’Art to be a very hard read. Not because it dives too much in theory – it doesn’t – but because the presentation seems to be chaotic. For example, description of a single tactic can be spread throughout deveral places in the book. In principle, I don’t object to extending earlier presentation with new details once the reader gets a hold of some new concepts, but I feel that Coq’Art definitely goes too far. Coq’Art also presents material in a very unusual order. Almost every introduction to Coq or any other functional language begins with defining data types. Coq’Art introduces them in chapter 6. On the other hand sorts and universes – something I would consider an advanced concept for anyone who is not familiar with type-level programming – are presented in the second chapter. (Note that first chapter is a very brief overview of the language.) By contrast, CPDT goes into detailed discussion of universes in chapter 12 and SF does not seem to cover them at all. Overall, Coq’Art is of limited usefulness to me. To tell the truth this is not because of its focus on theory rather than practice, but because of language style, which I find rather inaccessible. Many times I had problems understanding passages I was reading, forcing me to re-read them again and again, trying to figure out what is the message that the authors are trying to convey. I did not have such problems with CPDT, SF, nor any other book I have read in the past few years. At the moment I have given up on the idea of reading the book from cover to cover. Nevertheless I find Coq’Art a good supplementary reading for SF. Most importantly because of the sections that explain in detail the inner workings of various tactics.

As mentioned at the beginning, I already wrote a first impressions post about CPDT. Back then I said the book “is a great demonstration of what can be done in Coq but not a good explanation of how it can be done”. Having read all of it I sustain my claim. CPDT does not provide a thorough and systematic coverage of basics, but instead focuses on advanced topics. As such, it is not the best place to start for beginners but it is a priceless resource for Coq practitioners. The main focus of the book is proof automation with Ltac, Coq’s language for writing automated proof procedures. Reader is exposed to Ltac early on in the book, but detailed treatment of Ltac is delayed until chapter 14. Quite surprisingly, given that it is hard to understand earlier chapters without knowing Ltac. Luckily, the chapters are fairly independent of each other and can be read in any order the reader wishes. Definitely it is worth to dive into chapter 14 and fragments of apter 13 as early as possible – it makes understanding the book a whole lot easier. So far I have already read chapter 14 three times. As I learn Coq more and more I discover new bits of knowledge with each read. In fact, I expect to be going back regularly to CPDT.

Coq’Art and CPDT approach teaching Coq in totally different ways. It might then be surprising that Software Foundations uses yet another approach. Unlike Coq’Art it is focused on practice and unlike CPDT it places a very strong emphasis on learning the basics. I feel that SF makes Coq learning curve as flat as possible. The main focus of SF is applying Coq to formalizing programming languages semantics, especially their type systems. This should not come as a big surprise given that Benjamin Pierce, the author of SF, authored also “Types and Programming Languages” (TAPL), the best book on the topic of type systems and programming language semantics I have seen. It should not also be surprising that a huge chunk of material overlaps between TAPL and SF. I find this to be amongst the best things about SF. All the proofs that I read in TAPL make a lot more sense to me when I can convert them to a piece of code. This gives me a much deeper insight into the meaning of lemmas and theorems. Also, when I get stuck on an exercise I can take a look at TAPL to see what is the general idea behind the proof I am implementing.

SF is packed with material and thus it is a very long read. Three months after beginning the book and spending with it about two days a week I am halfway through. The main strength of SF is a plethora of exercises. (Coq’Art has some exercises, but not too many. CPDT has none). They can take a lot of time – and I really mean a lot – but I think this is the only way to learn a programming language. Besides, the exercises are very rewarding. One downside of the exercises is that the book provides no solutions, which is bad for self-studying. Moreover, the authors ask people not to publish the solutions on the internet, since “having solutions easily available makes [SF] much less useful for courses, which typically have graded homework assignments”. That being said, there are plenty of github repositories that contain the solved exercises (I also pledge guilty!). Although it goes against the authors’ will I consider it a really good thing for self-study: many times I have been stuck on exercises and was able to make progress only by peeking at someone else’s solution. This doesn’t mean I copied the solutions. I just used them to overcome difficulties and in some cases ended up with proofs more elegant than the ones I have found. As a side note I’ll add that I do not share the belief that publishing solutions on the web makes SF less useful for courses. Students who want to cheat will get the solutions from other students anyway. At least that has been my experience as an academic teacher.

To sum up, each of the books presents a different approach. Coq’Art focuses on learning Coq by understanding its theoretical foundations. SF focuses on learning Coq through practice. CPDT focuses on advanced techniques for proof automation. Personally, I feel I’ve learned the most from SF, with CPDT closely on the second place. YMMV

Staypressed theme by Themocracy