Benchmarking GHC HEAD with Criterion
So you’re developing GHC. You make some changes that affect the performance of compiled programs, but how do you check whether performance has really improved? If you’re making general optimisations (a new Core-to-Core transformation, perhaps), then you can use the NoFib benchmark suite, which is a commonly accepted method of measuring GHC performance. But what if you’re developing very specific optimisations that are unlikely to be exercised by NoFib? What if you extended the compiler so that you can write faster code in a way that was previously impossible, and there is no way for NoFib to measure your improvements? Sounds like writing some criterion benchmarks would be a Good Thing.
There’s a problem though: installing criterion with GHC HEAD. Criterion has lots of dependencies, but you cannot install them automatically with cabal-install, because cabal-install usually doesn’t work with GHC HEAD (although the Cabal library is one of the GHC boot libraries). On the other hand, installing dependencies manually is a pain. Besides, many libraries will not compile with GHC HEAD. So how do you write criterion benchmarks for HEAD? I faced this problem some time ago and found a solution which, although not perfect, works fine for me.
In principle my idea is nothing fancy:
- download all the required dependencies from hackage to the disk and extract them in a single directory,
- determine the order in which they need to be installed,
- build each library with GHC HEAD, resolving build errors where necessary,
- register each library with GHC HEAD (see Appendix below).
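The ordering step can be partly mechanised. As a minimal sketch, assuming you have already written down each package's dependencies by hand, the standard `tsort` utility produces a valid installation order. The dependency pairs below are illustrative only, and `install_order` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: derive an installation order from hand-written dependency pairs.
# Each line reads "DEPENDENCY PACKAGE", meaning DEPENDENCY must be installed
# before PACKAGE. A line naming the same package twice declares a package
# that has no dependencies inside the set.
# The pairs below are illustrative, not a complete dependency list.
install_order() {
  tsort
}

deps='primitive vector
primitive vector-algorithms
vector vector-algorithms
dlist dlist'

order=$(printf '%s\n' "$deps" | install_order)
echo "$order"
```

Several orders can be valid for the same input, so the only property to rely on is that each package appears after everything it depends on.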
Doing these things for the first time was very tedious and took me about 2-3 hours. Determining package dependencies was probably the most time-consuming part. Resolving build errors wasn’t that bad, though there were a couple of difficulties. It turned out that many packages put an upper bound on the version of the base package, and removing this bound is the only change required to build such a package.
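Removing the upper bound on base can be scripted for the most common spelling of the constraint. The following is only a sketch: it assumes the constraint is written as `base >= X && < Y` on a single line, and `relax_base_bound` is a hypothetical helper name. Any other form is left untouched and still needs a manual edit.

```shell
#!/bin/bash
# Sketch: drop the upper bound from a "base >= X && < Y" constraint.
# Reads .cabal text on stdin and writes the relaxed version to stdout.
# Only this one constraint shape is handled; other spellings
# (e.g. "base < 5" or "base == 4.*") are left untouched.
relax_base_bound() {
  sed -E 's/(^|[^[:alnum:]-])(base[[:space:]]*>=[[:space:]]*[0-9.]+)[[:space:]]*&&[[:space:]]*<=?[[:space:]]*[0-9.]+/\1\2/'
}

# Example on a single made-up build-depends line:
printf '  build-depends: base >= 4 && < 4.7, deepseq >= 1.1 && < 1.4\n' | relax_base_bound
```

Because the function is a plain filter, you can write its output to a new file and inspect the diff before overwriting the package's .cabal file.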
The key to my solution is that once you figure out in what order packages should be installed and remove the build errors, you can write a shell script that builds and installs packages automatically. This means that after installing GHC HEAD in a sandbox (see Appendix below) you can run the script to build and install all the packages. This will give you a fully working GHC installation in which you can write Criterion benchmarks for new features that you implemented in the compiler. Here’s what the script looks like (full version available here):
    #!/bin/bash

    # Packages in installation order: dependencies first.
    PKGS="\
    primitive-0.5.0.1 \
    vector-0.10.0.1 \
    dlist-0.5 \
    vector-algorithms-0.5.4.2 \
    ..." # more packages in this list

    if [[ $# -gt 1 ]]; then
      echo "Too many parameters"
      exit 1
    elif [[ $# -eq 1 ]]; then
      if [[ $1 == "clean" ]]; then
        # Remove build artefacts from every package directory.
        echo -n "Cleaning"
        for i in $PKGS
        do
          echo -n "."
          cd "$i" || exit 1
          rm -rf dist
          rm -f Setup Setup.o Setup.hi
          cd ..
        done
        echo "done"
      else
        echo "Invalid parameter: $1"
        exit 1
      fi
    else
      # Build and install every package; stop at the first failure.
      for i in $PKGS
      do
        echo "Installing package $i"
        cd "$i" || exit 1
        ( (if [[ -f Setup.lhs ]]; then ghc Setup.lhs; else ghc Setup.hs; fi) && \
          ./Setup configure --user --enable-shared \
          && ./Setup build && ./Setup install) \
          || exit 1
        cd ..
      done
    fi
The script is nothing elaborate. Running it without any parameters will build and install all packages on the list. If you run it with the “clean” parameter, it will remove build artefacts from the package directories. If for some reason the script fails, e.g. because one of the libraries fails to build, you can comment out the already installed libraries so that the script resumes from the point where it previously stopped.
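Commenting out packages by hand works, but resuming can also be parameterised. The sketch below is not part of my script. It assumes a hypothetical helper, `packages_from`, that prints the tail of the package list starting at the package that failed; the install loop could then iterate over its output instead of the full list.

```shell
#!/bin/bash
# Sketch: print the packages from a list starting at a given package,
# so that a failed run can resume without editing PKGS by hand.
# Usage: packages_from START PKG...
# The body runs in a subshell so its variables do not leak to the caller.
packages_from() (
  start=$1
  shift
  seen=0
  for p in "$@"; do
    if [ "$p" = "$start" ]; then seen=1; fi
    if [ "$seen" -eq 1 ]; then echo "$p"; fi
  done
)

PKGS="primitive-0.5.0.1 vector-0.10.0.1 dlist-0.5 vector-algorithms-0.5.4.2"

# Resume from dlist-0.5: prints dlist-0.5 and vector-algorithms-0.5.4.2.
packages_from dlist-0.5 $PKGS
```

If the starting package is not on the list, nothing is printed, so a typo results in no work being done.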
Summary
Using the approach described above I can finally write criterion benchmarks for GHC HEAD. There are a couple of considerations though:
- things are likely to break as HEAD gets updated. Be prepared to add new libraries as dependencies, change compilation parameters or fix new build errors,
- for some time now you have needed to pass the --enable-shared flag to cabal configure when building a shared library. This causes every library to be compiled twice. I don’t know if there’s anything one can do about that,
- you need to manually download new versions of libraries,
- fixing build errors manually may not be easy,
- rerunning the script when something fails may be tedious,
- changes in HEAD might cause performance problems in libraries you are using. If this goes unnoticed the benchmarking results might be invalid (I think this problem is hypothetical).
You can download my script and the source code for all the modified packages here. I’m not giving you any guarantee that it will work for you, since HEAD changes all the time. It’s also quite possible that you don’t need some of the libraries I’m using, for example Repa.
Appendix: Sandboxing GHC
For the above method to work effectively you need to have a sandboxed installation of GHC. There are tools designed for sandboxing GHC (e.g. hsenv) but I use a method described here. It’s perfectly suited for my needs. I like to have full manual control when needed but I also have this shell script to automate switching of sandboxes:
    #!/bin/bash

    SANDBOX_DIR="/path/to/ghc-sandbox/"
    ACTIVE_SYMLINK="${SANDBOX_DIR}active"
    STARTCOLOR="\e[32m"
    ENDCOLOR="\e[0m"

    # Name of the sandbox the "active" symlink currently points to.
    active_link_name=$(readlink "${ACTIVE_SYMLINK}")
    active_name=$(basename "${active_link_name}")

    # No parameter: list all sandboxes and highlight the active one.
    if [[ $# -lt 1 ]]; then
      for i in $(ls "${SANDBOX_DIR}"); do
        if [[ $i != "active" ]]; then
          if [[ $i == "$active_name" ]]; then
            echo -e "* $STARTCOLOR$i$ENDCOLOR"
          else
            echo "  $i"
          fi
        fi
      done
      exit
    fi

    # One parameter: point the "active" symlink at the requested sandbox.
    for i in $(ls "${SANDBOX_DIR}"); do
      if [[ $i == "$1" ]]; then
        cd "$SANDBOX_DIR" || exit 1
        rm "${ACTIVE_SYMLINK}"
        ln -s "$1" "${ACTIVE_SYMLINK}"
        exit
      fi
    done

    echo "Sandbox $1 not found"
    exit 1
It displays a list of sandboxes when run without any parameter (the active sandbox is displayed in green and marked with an asterisk) and switches the active sandbox when given a command-line parameter. Together with bash auto-completion, switching between different GHC versions is a matter of seconds.
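The switch itself can also be written without a separate rm. The following is a sketch, not the script I use: `switch_sandbox` is a hypothetical function name, the sandbox names in the demonstration are made up, and the `-n` option of `ln` is not in POSIX, although both GNU and BSD `ln` provide it.

```shell
#!/bin/bash
# Sketch: switch the "active" symlink in one step.
# Usage: switch_sandbox SANDBOX_DIR NAME
# -s makes a symbolic link, -f replaces an existing one, and -n stops ln
# from following the old "active" link into the directory it points to.
switch_sandbox() (
  sandbox_dir=$1
  name=$2
  if [ "$name" = "active" ] || [ ! -d "$sandbox_dir/$name" ]; then
    echo "Sandbox $name not found" >&2
    exit 1
  fi
  ln -sfn "$name" "$sandbox_dir/active"
)

# Demonstration in a throwaway directory with two made-up sandbox names.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/ghc-head" "$demo_dir/ghc-stable"
switch_sandbox "$demo_dir" ghc-head
switch_sandbox "$demo_dir" ghc-stable
readlink "$demo_dir/active"
rm -rf "$demo_dir"
```

The link target is stored relative to the sandbox directory, as in the script above, so the whole directory can be moved without breaking the active link.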


