Some books about compilers

The online Compilers course at Stanford University began last Monday. Although it should be self-contained, I still prefer to have some books at hand, especially since I’ll be leaving for 10 days and want to self-study the material for the next week. On Monday I paid a visit to my university’s library. Here’s a quick glimpse at some of the literature I found.

First of all, there’s the Dragon Book, officially known as “Compilers: Principles, Techniques, and Tools”, by Aho, Lam, Sethi, and Ullman. This is a classic, originally published in 1986. There’s nothing to worry about, though – there has been no revolution in the field of compiler construction, so what was true in 1986 is mostly still valid now. The book was re-released in 2006 with some updates about garbage collection and code optimization (OK, I lied, there has been some progress in a few aspects of compiler engineering). I have a Polish translation of the first edition. In fact, I borrowed it long before the course started, hoping to read it on my own. The book is very detailed and extremely verbose (750 pages total), which is a bit discouraging to me. I’ve read only a couple of sections, so I don’t have a clear opinion about it yet.

The second book recommended as a textbook for the Compilers course is “Engineering a Compiler” by Cooper and Torczon. This one was more difficult to find. I only got it today and haven’t read it yet. With over 800 pages it’s longer than the Dragon Book. Skimming through it, I noticed that it seems detailed and precise, with an emphasis on practical aspects. The chapter about lexical analysis has a section about implementing a lexer. I’m looking forward to this one, since I still don’t have a clear vision of how to implement a lexer efficiently. These are only first impressions, of course; I’ll verify them when I start reading (expect a follow-up on this one).
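For reference, here’s roughly the kind of hand-written lexer I currently have in mind – a minimal toy sketch of my own (the token names and the tiny language are made up, not taken from the book):

```haskell
import Data.Char (isAlpha, isDigit, isSpace)

-- Tokens for a tiny hypothetical language (my own toy example).
data Token = TIdent String | TNumber Integer | TSymbol Char
  deriving (Show, Eq)

-- A simple hand-written lexer: look at the first character and
-- consume the longest matching prefix (the "maximal munch" rule).
tokenize :: String -> [Token]
tokenize [] = []
tokenize (c:cs)
  | isSpace c = tokenize cs
  | isAlpha c = let (rest, more) = span isAlpha cs
                in TIdent (c:rest) : tokenize more
  | isDigit c = let (rest, more) = span isDigit cs
                in TNumber (read (c:rest)) : tokenize more
  | otherwise = TSymbol c : tokenize cs
```

For example, `tokenize "x + 42"` gives `[TIdent "x", TSymbol '+', TNumber 42]`. Whether this naive approach counts as “efficient” is exactly what I hope the book will tell me.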

The third of the recommended books – “Modern Compiler Implementation in C/Java/ML” (there are three different editions of this book) by Appel – was also available in my library. I’ve already read the chapter about lexical analysis and I think that, in contrast to the Dragon Book, it was a bit too concise. It covers the subject of Deterministic Finite Automata (DFA) and Non-deterministic Finite Automata (NFA), which luckily I am familiar with (more on that in a moment), so that wasn’t much of a problem. However, if this had been my first contact with DFAs and NFAs, I would have had some trouble understanding what it’s all about. I hope the next chapters will be more detailed. I was very happy to see that this book has chapters about functional languages and polymorphic types; the latter covers the Hindley-Milner type inference algorithm in a form that seems accessible.
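For anyone who hasn’t met DFAs before: a DFA is just a transition function plus a set of accepting states, and running one is a left fold over the input. A tiny sketch of my own (not code from Appel’s book) – a DFA that accepts binary strings containing an even number of zeros:

```haskell
-- States: 0 = even number of zeros seen so far (accepting),
--         1 = odd number of zeros seen so far.
type State = Int

-- Deterministic transition function: a '0' flips the parity,
-- any other character leaves the state unchanged.
step :: State -> Char -> State
step s '0' = 1 - s
step s _   = s

-- Running the DFA is a fold from the start state; accept iff
-- we end in the accepting state.
accepts :: String -> Bool
accepts input = foldl step 0 input == 0
```

So `accepts "0110"` is `True` (two zeros) and `accepts "0"` is `False`. An NFA is the same idea except the transition function returns a *set* of possible states.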

Now, here’s a surprise! Just a few days ago I found out about the book “The Implementation of Functional Programming Languages” by Simon Peyton Jones (see my previous post). To my greatest surprise it was in the library! I stumbled upon it by pure accident while searching the shelves for Appel’s book – I never would have thought it was available in my university’s library. Do I have to mention that I borrowed it right away? I’ve read the introduction and some random parts of the first chapter about the lambda calculus. I must say the book looks very accessible, so I hope to find some time to read at least a couple of chapters. What’s interesting about the book (or rather about functional languages) is that Haskell didn’t exist when this book was written, yet much of the code in it is almost Haskell (in fact it’s Miranda).

I’ve got one more book, entitled “Compiler Construction” by Waite and Goos. This one is old (1984) and, aside from universal knowledge about parsing, semantic analysis, error handling and so on, it seems to cover many aspects that are a bit outdated. I think I’ll just rely on the previous books and skip reading this one.

I also enrolled in the Automata course. Initially I didn’t plan to, but I figured I could just give it a try and, if I don’t have enough time, simply download the videos and watch them during the summer holiday. Luckily the first week covers DFAs and NFAs, so I got 100% on the first quiz without even watching the lectures. I’m afraid the next weeks won’t be that easy.

Sunday web overview

Here’s an overview of some interesting web resources I’ve found recently:

  • First of all, I’ve found a great question on StackOverflow that outlines the stages of mastering Haskell. According to it, I’m somewhere at the beginner level.
  • I’ve encountered many interesting pages in the Haskell Wiki.
    • First of all, there is a great FAQ. I wish I had found it earlier – it would have saved me a lot of time and effort (and others’ effort as well), since I was asking some of these questions on the #haskell channel.
    • There is a very short and nice introduction to I/O in Haskell. I’m already beyond that point in my Haskell education, but earlier it would have been useful.
    • A more in-depth tutorial called IO Inside. I’ve read some parts of it to clarify information from Real World Haskell.
    • A list of research papers about Haskell.
    • A tutorial on how to maintain laziness in Haskell.
    • A discussion of the monomorphism restriction. I tried reading the Haskell 2010 Report to understand why the restriction exists, but I didn’t understand a thing.
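For my own notes, here is the standard illustration of where the restriction bites, as far as I understand it (my own minimal example, not from the report). A binding like `len` below, written without arguments and without a type signature, would be forced to a single monomorphic type, so using it at both `Int` and `Double` would fail to compile; an explicit polymorphic signature keeps it usable at both types:

```haskell
import Data.List (genericLength)

-- Without the signature on `len`, the monomorphism restriction
-- would pin `len` to one concrete numeric type, and using it as
-- both an Int and a Double below would be a type error.
pairLen :: [a] -> (Int, Double)
pairLen xs = (len, len)
  where
    len :: Num b => b
    len = genericLength xs

main :: IO ()
main = print (pairLen "abc")  -- prints (3,3.0)
```

The (claimed) rationale is that without the restriction, a shared binding like `len` could silently be recomputed at every use site instead of being computed once – but don’t quote me on that until I actually understand the report.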


Stanford opens new online courses about Compilers and Automata

A few months ago Stanford University started a great initiative: it offered free online courses for everyone. The courses were mostly about computer science (Natural Language Processing, Machine Learning, Computer Vision, etc.). If you didn’t take part in these courses, you should know that they are just like normal classes at the university: there are lectures (about 2 hours per week), tests after each week, final exams and rather demanding programming assignments. I enrolled in Natural Language Processing, but I have to admit that I couldn’t keep the pace and stopped following the course after about 3 weeks. Perhaps I wasn’t motivated enough, since NLP has no connection to my research and I wouldn’t be able to use the gained knowledge in practice. Yesterday, however, my motivation jumped to maximum, because I got a newsletter about a whole bunch of new courses. Among them are two that are of high interest to me: Compilers and Automata.

Recently I got very interested in the subject of compiler construction. I got myself the Dragon Book, but it’s really huge, so I don’t think I’ll read all of it. I also visited the #compilers IRC channel and got some very valuable hints on how to dig into the subject of constructing a compiler. I was advised to begin with some sort of “Lisp interpreter in Lisp” tutorial. I chose Scheme as my Lisp dialect and – knowing nothing about Scheme – read the R5RS language specification (only 50 pages!) and The Little Schemer (chapter 10 shows how to write a Scheme interpreter in Scheme). I’ve also found the Write Yourself a Scheme in 48 Hours tutorial, which shows how to write a Scheme interpreter in Haskell. This is a bit more advanced and, despite what the author claims, is not suitable for Haskell newbies like myself. I decided to get back to that tutorial after learning some more Haskell, especially monads and Parsec, both used heavily by the author. The guys at #compilers further suggested that after reading this kind of tutorial I should read Niklaus Wirth’s “Compiler Construction”. Indeed, the book looks very accessible. Another useful piece of advice I received was “the older the better”, since the fundamentals of compilers haven’t changed much and the only substantial progress has been in back-end1 development. However, the most motivating part was the estimate that writing a simple compiler from scratch should take about 6 months. Nevertheless, learning everything by myself from scratch was a very slow process and I wished I had someone to teach me. The Compilers course from Stanford is therefore a real blessing – looks like my wish was granted.
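The core idea behind all those “interpreter in a weekend” tutorials, as I understand it: represent programs as a recursive data type and write eval as structural recursion over it. A toy sketch of my own (a tiny Scheme-flavoured arithmetic subset – no parser, no environments, names invented by me):

```haskell
-- An expression tree for a tiny language: numbers, addition,
-- multiplication, and a conditional (0 counts as false, for brevity).
data Expr = Num Integer
          | Add Expr Expr
          | Mul Expr Expr
          | If  Expr Expr Expr
  deriving Show

-- The evaluator is plain structural recursion: each constructor
-- gets one equation that evaluates its subexpressions.
eval :: Expr -> Integer
eval (Num n)    = n
eval (Add a b)  = eval a + eval b
eval (Mul a b)  = eval a * eval b
eval (If c t e) = if eval c /= 0 then eval t else eval e
```

For example, `eval (Add (Num 1) (Mul (Num 2) (Num 3)))` is `7`. A real Scheme interpreter adds a parser, an environment for variable bindings, and closures for lambdas – which is exactly what the tutorials above walk through.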

The Automata course is also something very exciting. I have the book “Introduction to Automata Theory, Languages, and Computation” by John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman. I approached it twice but always got stuck around the third chapter – not because I couldn’t understand it; it was my motivation that failed. Anyway, I feel that as a computer scientist I should have a basic knowledge of this subject. Believe it or not, my studies didn’t cover the subject of automata and computation!

There are a lot of other courses, not only about computer science, so I suggest taking a look – perhaps you’ll find something interesting for yourself. Participating in the NLP course showed me that these courses are very demanding and require a lot of time and effort. This means I’m going to pick only the Compilers course and concentrate solely on it, hoping that the Automata course will be repeated in the future. Both courses start this Monday (April 23rd).

  1. The back-end is the part of a compiler responsible for generating the final code (most often machine code).
