<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Yet Another Lambda Blog &#187; haskell</title>
	<atom:link href="http://lambda.jstolarek.com/category/haskell/feed/" rel="self" type="application/rss+xml" />
	<link>http://lambda.jstolarek.com</link>
	<description>various functional programming stuff</description>
	<lastBuildDate>Thu, 09 May 2013 08:37:50 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Benchmarking GHC HEAD with Criterion</title>
		<link>http://lambda.jstolarek.com/2013/05/benchmarking-ghc-head-with-criterion/</link>
		<comments>http://lambda.jstolarek.com/2013/05/benchmarking-ghc-head-with-criterion/#comments</comments>
		<pubDate>Thu, 09 May 2013 08:37:50 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[ghc]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=1181</guid>
		<description><![CDATA[So you&#8217;re developing GHC. You make some changes that affect performance of compiled programs, but how do you check whether the performance is really improved? Well, if you&#8217;re making some general optimisations &#8211; a new Core-to-Core transformation perhaps &#8211; then you can use the NoFib benchmark suite, which is a commonly accepted method of measuring [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">So you&#8217;re developing GHC. You make some changes that affect performance of compiled programs, but how do you check whether the performance is really improved? Well, if you&#8217;re making some general optimisations &#8211; a new Core-to-Core transformation perhaps &#8211; then you can use the <a href="http://hackage.haskell.org/trac/ghc/wiki/Building/RunningNoFib">NoFib</a> benchmark suite, which is a commonly accepted method of measuring GHC performance. But what if you&#8217;re developing some very specific optimisations that are unlikely to be benchmarked by NoFib? What if you extended the compiler so that you can write faster code in a way that was previously impossible, and there is no way for NoFib to measure your improvements? Sounds like writing some <a href="http://hackage.haskell.org/package/criterion">criterion</a> benchmarks would be a Good Thing. There&#8217;s a problem though &#8211; installing criterion with GHC HEAD. Criterion has lots of dependencies, but you cannot install them automatically with cabal-install, because cabal-install usually doesn&#8217;t work with GHC HEAD (although the Cabal library is one of GHC&#8217;s boot libraries). On the other hand, installing dependencies manually is a pain. Besides, many libraries will not compile with GHC HEAD. So how do you write criterion benchmarks for HEAD? I faced this problem some time ago and found a solution which, although not perfect, works fine for me.</p>
<p style="text-align: justify;">In principle my idea is nothing fancy:</p>
<ol>
<li>download all the required dependencies from Hackage to the disk and extract them in a single directory,</li>
<li>determine the order in which they need to be installed,</li>
<li>build each library with GHC HEAD, resolving the build errors if necessary,</li>
<li>register each library with GHC HEAD (see Appendix below).</li>
</ol>
<p style="text-align: justify;">Doing these things for the first time was very tedious and took me about 2-3 hours. Determining package dependencies was probably the most time-consuming part. Resolving build errors wasn&#8217;t that bad, though there were a couple of difficulties. It turned out that many packages put an upper bound on the version of the base package, and removing that bound is the only change required to build such a package.</p>
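<p style="text-align: justify;">Working out the installation order (step 2) amounts to a topological sort of the dependency graph. Below is a minimal sketch of that idea in Haskell, under stated assumptions rather than as part of the script: it relies on <code>Data.Graph</code> from the <code>containers</code> package that ships with GHC, and the dependency table <code>deps</code> is made up for illustration. The real edges have to be read from the <code>build-depends</code> field of each package&#8217;s <code>.cabal</code> file.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">import Data.Graph (flattenSCCs, stronglyConnComp)

-- Illustrative dependency table: (package, packages it depends on).
-- These edges are assumptions made for this sketch.
deps :: [(String, [String])]
deps =
  [ ("vector-algorithms", ["vector", "primitive"])
  , ("vector",            ["primitive"])
  , ("primitive",         [])
  , ("dlist",             [])
  ]

-- stronglyConnComp returns components in reverse topological order,
-- so every package is listed after the packages it depends on.
installOrder :: [(String, [String])] -&gt; [String]
installOrder table =
  flattenSCCs (stronglyConnComp [ (pkg, pkg, ds) | (pkg, ds) &lt;- table ])

main :: IO ()
main = mapM_ putStrLn (installOrder deps)</pre></td></tr></table></div>

<p style="text-align: justify;">Running it prints one package name per line, with each package appearing after everything it depends on. Such a list could then be pasted into a build script in that order.</p>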
<p style="text-align: justify;">The key to my solution is that once you figure out in what order packages should be installed and remove the build errors, you can write a shell script that builds and installs packages automatically. This means that after installing GHC HEAD in a sandbox (see Appendix below) you can run the script to build and install all the packages. This will give you a fully working GHC installation in which you can write Criterion benchmarks for new features that you implemented in the compiler. Here&#8217;s what the script looks like (full version available <a href="https://gist.github.com/jstolarek/5546184">here</a>):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/bash</span>
&nbsp;
<span style="color: #007800;">PKGS</span>=<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\
</span>primitive-0.5.0.1 <span style="color: #000099; font-weight: bold;">\
</span>vector-0.10.0.1 <span style="color: #000099; font-weight: bold;">\
</span>dlist-0.5 <span style="color: #000099; font-weight: bold;">\
</span>vector-algorithms-0.5.4.2 <span style="color: #000099; font-weight: bold;">\
</span>...&quot;</span> <span style="color: #666666; font-style: italic;"># more packages in this list</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$#</span> <span style="color: #660033;">-gt</span> <span style="color: #000000;">1</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
    <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;Too many parameters&quot;</span>
    <span style="color: #7a0874; font-weight: bold;">exit</span> <span style="color: #000000;">1</span>
<span style="color: #000000; font-weight: bold;">elif</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$#</span> <span style="color: #660033;">-eq</span> <span style="color: #000000;">1</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$1</span> == <span style="color: #ff0000;">&quot;clean&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">&quot;Cleaning&quot;</span>
        <span style="color: #000000; font-weight: bold;">for</span> i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #007800;">$PKGS</span>
        <span style="color: #000000; font-weight: bold;">do</span>
            <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">&quot;.&quot;</span>
            <span style="color: #7a0874; font-weight: bold;">cd</span> <span style="color: #007800;">$i</span>
            <span style="color: #c20cb9; font-weight: bold;">rm</span> <span style="color: #660033;">-rf</span> dist
            <span style="color: #c20cb9; font-weight: bold;">rm</span> <span style="color: #660033;">-f</span> Setup Setup.o Setup.hi
            <span style="color: #7a0874; font-weight: bold;">cd</span> ..
        <span style="color: #000000; font-weight: bold;">done</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;done&quot;</span>
    <span style="color: #000000; font-weight: bold;">else</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;Invalid parameter: $1&quot;</span>
        <span style="color: #7a0874; font-weight: bold;">exit</span> <span style="color: #000000;">1</span>
    <span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #000000; font-weight: bold;">else</span>
    <span style="color: #000000; font-weight: bold;">for</span> i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #007800;">$PKGS</span>
    <span style="color: #000000; font-weight: bold;">do</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;Installing package <span style="color: #007800;">$i</span>&quot;</span>
        <span style="color: #7a0874; font-weight: bold;">cd</span> <span style="color: #007800;">$i</span>
        <span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #660033;">-f</span> Setup.lhs <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span> ghc Setup.lhs; <span style="color: #000000; font-weight: bold;">else</span> ghc Setup.hs; <span style="color: #000000; font-weight: bold;">fi</span><span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #000000; font-weight: bold;">&amp;&amp;</span> \
            .<span style="color: #000000; font-weight: bold;">/</span>Setup configure <span style="color: #660033;">--user</span> <span style="color: #660033;">--enable-shared</span> \
            <span style="color: #000000; font-weight: bold;">&amp;&amp;</span> .<span style="color: #000000; font-weight: bold;">/</span>Setup build <span style="color: #000000; font-weight: bold;">&amp;&amp;</span> .<span style="color: #000000; font-weight: bold;">/</span>Setup <span style="color: #c20cb9; font-weight: bold;">install</span><span style="color: #7a0874; font-weight: bold;">&#41;</span> \
            <span style="color: #000000; font-weight: bold;">||</span> <span style="color: #7a0874; font-weight: bold;">exit</span>
        <span style="color: #7a0874; font-weight: bold;">cd</span> ..
    <span style="color: #000000; font-weight: bold;">done</span>
<span style="color: #000000; font-weight: bold;">fi</span></pre></td></tr></table></div>

<p style="text-align: justify;">The script is nothing elaborate. Running it without any parameters will build and install all packages on the list. If you run it with the &#8220;<code>clean</code>&#8221; parameter it will remove build artefacts from package directories. If for some reason the script fails &#8211; e.g. one of the libraries fails to build &#8211; you can comment out already installed libraries so that the script resumes from where it previously stopped.</p>
<h1 style="text-align: justify;">Summary</h1>
<p style="text-align: justify;">Using the approach described above I can finally write criterion benchmarks for GHC HEAD. There are a couple of considerations though:</p>
<ul>
<li style="text-align: justify;">things are likely to break as HEAD gets updated. Be prepared to add new libraries as dependencies, change compilation parameters or fix new build errors,</li>
<li style="text-align: justify;">for some time now you have needed to pass the <code>--enable-shared</code> flag to <code>cabal configure</code> when building a shared library. This causes every library to be compiled twice. I don&#8217;t know if there&#8217;s anything one can do about that,</li>
<li style="text-align: justify;">you need to manually download new versions of libraries,</li>
<li style="text-align: justify;">fixing build errors manually may not be easy,</li>
<li style="text-align: justify;">rerunning the script when something fails may be tedious,</li>
<li style="text-align: justify;">changes in HEAD might cause performance problems in libraries you are using. If this goes unnoticed the benchmarking results might be invalid (I think this problem is hypothetical).</li>
</ul>
<p style="text-align: justify;">You can download my script and the source code for all the modified packages <a href="http://lambda.jstolarek.com/downloads/ghc-head-pkgs.tar.gz">here</a>. I&#8217;m not giving you any guarantee that it will work for you, since HEAD changes all the time. It&#8217;s also quite possible that you don&#8217;t need some of the libraries I&#8217;m using, for example <a href="http://hackage.haskell.org/package/repa">Repa</a>.</p>
<h1 style="text-align: justify;">Appendix: Sandboxing GHC</h1>
<p style="text-align: justify;">For the above method to work effectively you need to have a sandboxed installation of GHC. There are tools designed for sandboxing GHC (e.g. <a href="https://github.com/Paczesiowa/hsenv">hsenv</a>) but I use the method described <a href="http://www.edsko.net/2013/02/10/comprehensive-haskell-sandboxes/">here</a>. It&#8217;s perfectly suited to my needs. I like to have full manual control when needed, but I also have this shell script to automate switching between sandboxes:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/bash</span>
&nbsp;
<span style="color: #007800;">SANDBOX_DIR</span>=<span style="color: #ff0000;">&quot;/path/to/ghc-sandbox/&quot;</span>
<span style="color: #007800;">ACTIVE_SYMLINK</span>=<span style="color: #ff0000;">&quot;<span style="color: #007800;">${SANDBOX_DIR}</span>active&quot;</span>
<span style="color: #007800;">STARTCOLOR</span>=<span style="color: #ff0000;">&quot;\e[32m&quot;</span>;
<span style="color: #007800;">ENDCOLOR</span>=<span style="color: #ff0000;">&quot;\e[0m&quot;</span>;
&nbsp;
<span style="color: #007800;">active_link_name</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">readlink</span> <span style="color: #800000;">${ACTIVE_SYMLINK}</span><span style="color: #000000; font-weight: bold;">`</span>
<span style="color: #007800;">active_name</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">basename</span> <span style="color: #800000;">${active_link_name}</span><span style="color: #000000; font-weight: bold;">`</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$#</span> <span style="color: #660033;">-lt</span> <span style="color: #000000;">1</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
  <span style="color: #000000; font-weight: bold;">for</span> i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">ls</span> <span style="color: #800000;">${SANDBOX_DIR}</span><span style="color: #000000; font-weight: bold;">`</span>; <span style="color: #000000; font-weight: bold;">do</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$i</span> <span style="color: #000000; font-weight: bold;">!</span>= <span style="color: #ff0000;">&quot;active&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
      <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$i</span> == <span style="color: #007800;">$active_name</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">&quot;* <span style="color: #007800;">$STARTCOLOR</span><span style="color: #007800;">$i</span><span style="color: #007800;">$ENDCOLOR</span>&quot;</span>
      <span style="color: #000000; font-weight: bold;">else</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;  <span style="color: #007800;">$i</span>&quot;</span>
      <span style="color: #000000; font-weight: bold;">fi</span>
    <span style="color: #000000; font-weight: bold;">fi</span>
  <span style="color: #000000; font-weight: bold;">done</span>
  <span style="color: #7a0874; font-weight: bold;">exit</span>
<span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">for</span> i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">ls</span> <span style="color: #800000;">${SANDBOX_DIR}</span><span style="color: #000000; font-weight: bold;">`</span>; <span style="color: #000000; font-weight: bold;">do</span>
  <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #007800;">$i</span> == <span style="color: #007800;">$1</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>; <span style="color: #000000; font-weight: bold;">then</span>
    <span style="color: #7a0874; font-weight: bold;">cd</span> <span style="color: #007800;">$SANDBOX_DIR</span>
    <span style="color: #c20cb9; font-weight: bold;">rm</span> <span style="color: #800000;">${ACTIVE_SYMLINK}</span>
    <span style="color: #c20cb9; font-weight: bold;">ln</span> <span style="color: #660033;">-s</span> <span style="color: #007800;">$1</span> <span style="color: #800000;">${ACTIVE_SYMLINK}</span>
    <span style="color: #7a0874; font-weight: bold;">exit</span>
  <span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #000000; font-weight: bold;">done</span>
&nbsp;
<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;Sandbox $1 not found&quot;</span></pre></td></tr></table></div>

<p style="text-align: justify;">It displays a list of sandboxes when run without any parameter (the active sandbox is displayed in green and marked with an asterisk) and switches the active sandbox when given a command-line parameter. Together with bash&#8217;s auto-completion feature, this makes switching between different GHC versions a matter of seconds.</p>
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2013/05/benchmarking-ghc-head-with-criterion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Haskell as fast as C: A case study</title>
		<link>http://lambda.jstolarek.com/2013/04/haskell-as-fast-as-c-a-case-study/</link>
		<comments>http://lambda.jstolarek.com/2013/04/haskell-as-fast-as-c-a-case-study/#comments</comments>
		<pubDate>Tue, 02 Apr 2013 14:02:39 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[c]]></category>
		<category><![CDATA[ghc]]></category>
		<category><![CDATA[llvm]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=1122</guid>
		<description><![CDATA[Once in a while someone, most likely new to Haskell, asks how Haskell performance compares with C. In fact, when I was beginning with Haskell, I asked exactly the same question. During the last couple of days I&#8217;ve been playing a bit with squeezing out some performance from a very simple piece of Haskell code. [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">Once in a while someone, most likely new to Haskell, asks how Haskell performance compares with C. In fact, when I was beginning with Haskell, I asked exactly the same question. During the last couple of days I&#8217;ve been playing a bit with squeezing out some performance from a very simple piece of Haskell code. It turned out that the results I got are comparable with C, so I thought I might share this. This will be a short case study, so I don&#8217;t intend to cover the whole subject of Haskell vs. C performance. A lot has been written on this already, so I encourage you to search through the Haskell-cafe archives as well as some other blogs. Most of all I suggest reading <a href="http://donsbot.wordpress.com/2008/05/06/write-haskell-as-fast-as-c-exploiting-strictness-laziness-and-recursion/">this</a> and <a href="http://donsbot.wordpress.com/2008/06/04/haskell-as-fast-as-c-working-at-a-high-altitude-for-low-level-performance/">this</a> post on Don Stewart&#8217;s blog.</p>
<p style="text-align: justify;">Here is my simple piece of code:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">sumSqrL <span style="color: #339933; font-weight: bold;">::</span> <span style="color: green;">&#91;</span><span style="color: #cccc00; font-weight: bold;">Int</span><span style="color: green;">&#93;</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
sumSqrL <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">sum</span> <span style="color: #339933; font-weight: bold;">.</span> <span style="font-weight: bold;">map</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">^</span><span style="color: red;">2</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">.</span> <span style="font-weight: bold;">filter</span> <span style="font-weight: bold;">odd</span></pre></td></tr></table></div>

<p style="text-align: justify;">It takes a list of <code>Int</code>s, removes all even numbers from it, squares the remaining odd numbers and computes the sum. This is idiomatic Haskell code: it uses built-in list processing functions from the standard Prelude and relies on function composition to get code that is both readable and modular. So how can we make that faster? The simplest thing to do is to switch to a more efficient data structure, namely an unboxed <code>Vector</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">import</span> Data<span style="color: #339933; font-weight: bold;">.</span>Vector<span style="color: #339933; font-weight: bold;">.</span>Unboxed <span style="color: #06c; font-weight: bold;">as</span> U
&nbsp;
sumSqrV <span style="color: #339933; font-weight: bold;">::</span> U<span style="color: #339933; font-weight: bold;">.</span>Vector <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
sumSqrV <span style="color: #339933; font-weight: bold;">=</span> U<span style="color: #339933; font-weight: bold;">.</span><span style="font-weight: bold;">sum</span> <span style="color: #339933; font-weight: bold;">.</span> U<span style="color: #339933; font-weight: bold;">.</span><span style="font-weight: bold;">map</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">^</span><span style="color: red;">2</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">.</span> U<span style="color: #339933; font-weight: bold;">.</span><span style="font-weight: bold;">filter</span> <span style="font-weight: bold;">odd</span></pre></td></tr></table></div>

<p style="text-align: justify;">The code practically does not change, except for the type signature and namespace prefix to avoid clashing with the names from Prelude. As you will see in a moment this code is approximately three times faster than the one working on lists.</p>
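<p style="text-align: justify;">As a quick sanity check that the two versions compute the same thing, here is a small self-contained sketch. It assumes the <code>vector</code> package is installed, and it uses a qualified import (my choice for this example only) so that both definitions fit in one module without name clashes. The input <code>[1 .. 10]</code> is arbitrary.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">import qualified Data.Vector.Unboxed as U

sumSqrL :: [Int] -&gt; Int
sumSqrL = sum . map (^2) . filter odd

sumSqrV :: U.Vector Int -&gt; Int
sumSqrV = U.sum . U.map (^2) . U.filter odd

main :: IO ()
main = do
  let xs = [1 .. 10] :: [Int]
  -- The odd elements are 1, 3, 5, 7 and 9; their squares sum to 165.
  print (sumSqrL xs)
  print (sumSqrV (U.fromList xs))</pre></td></tr></table></div>

<p style="text-align: justify;">Both calls print <code>165</code>. This only checks correctness on one input; it says nothing about the relative speed of the two functions.</p>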
<p style="text-align: justify;">Can we do better than that? Yes, we can. The code below is three times faster than the one using <code>Vector</code>, but there is a price to pay. We need to sacrifice modularity and elegance of the code:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">sumSqrPOp <span style="color: #339933; font-weight: bold;">::</span> U<span style="color: #339933; font-weight: bold;">.</span>Vector <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
sumSqrPOp vec <span style="color: #339933; font-weight: bold;">=</span> runST <span style="color: #339933; font-weight: bold;">$</span> <span style="color: #06c; font-weight: bold;">do</span>
  <span style="color: #06c; font-weight: bold;">let</span> add a x <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">do</span>
        <span style="color: #06c; font-weight: bold;">let</span> <span style="color: #339933; font-weight: bold;">!</span><span style="color: green;">&#40;</span>I# v#<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">=</span> x
            <span style="font-weight: bold;">odd</span>#     <span style="color: #339933; font-weight: bold;">=</span> v# `andI#` <span style="color: red;">1</span>#
        <span style="font-weight: bold;">return</span> <span style="color: #339933; font-weight: bold;">$</span> a <span style="color: #339933; font-weight: bold;">+</span> I# <span style="color: green;">&#40;</span><span style="font-weight: bold;">odd</span># <span style="color: #339933; font-weight: bold;">*</span># v# <span style="color: #339933; font-weight: bold;">*</span># v#<span style="color: green;">&#41;</span>
  foldM' add <span style="color: red;">0</span> vec</pre></td></tr></table></div>

<p style="text-align: justify;">This code works on an unboxed vector. The <code>add</code> function, used to fold the vector, takes an accumulator <code>a</code> (initialised to <code>0</code> in the call to <code>foldM'</code>) and an element of the vector. To check the parity of the element the function unboxes it and zeros all its bits except the least significant one. If the vector element is even then <code>odd#</code> will contain <code>0</code>; if the element is odd then <code>odd#</code> will contain <code>1</code>. By multiplying the square of the vector element by <code>odd#</code> we avoid a conditional branch instruction at the expense of possibly performing an unnecessary multiplication and addition for even elements.</p>
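<p style="text-align: justify;">The masking trick itself does not depend on primops. The sketch below restates it on ordinary boxed <code>Int</code>s and plain lists, using <code>.&amp;.</code> from <code>Data.Bits</code> in place of <code>andI#</code>. The names <code>sumSqrBranch</code> and <code>sumSqrMask</code> are hypothetical and exist only for this illustration; the sketch shows that the arithmetic is equivalent and makes no claim about performance.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">import Data.Bits ((.&amp;.))
import Data.List (foldl')

-- Branching version: skip even elements explicitly.
sumSqrBranch :: [Int] -&gt; Int
sumSqrBranch = foldl' step 0
  where step acc x = if odd x then acc + x * x else acc

-- Masking version: (x .&amp;. 1) is 1 for odd x and 0 for even x,
-- so even elements contribute 0 to the sum.
sumSqrMask :: [Int] -&gt; Int
sumSqrMask = foldl' step 0
  where step acc x = acc + (x .&amp;. 1) * x * x

main :: IO ()
main = do
  let xs = [-3, -2, 0, 1, 2, 3, 4, 5] :: [Int]
  -- The odd elements are -3, 1, 3 and 5; their squares sum to 44.
  print (sumSqrBranch xs)
  print (sumSqrMask xs)</pre></td></tr></table></div>

<p style="text-align: justify;">Both functions print <code>44</code>. Negative odd numbers work too, because in two&#8217;s complement their least significant bit is also <code>1</code>.</p>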
<p style="text-align: justify;">Let&#8217;s see how these functions compile into Core intermediate language. The <code>sumSqrV</code> looks like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #339933; font-weight: bold;">$</span>wa <span style="color: #339933; font-weight: bold;">=</span>
  \vec <span style="color: #339933; font-weight: bold;">-&gt;</span>
    <span style="color: #06c; font-weight: bold;">case</span> vec <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span> Vector vecAddressBase vecLength vecData <span style="color: #339933; font-weight: bold;">-&gt;</span>
    letrec <span style="color: green;">&#123;</span>
      workerLoop <span style="color: #339933; font-weight: bold;">=</span>
        \index acc <span style="color: #339933; font-weight: bold;">-&gt;</span>
          <span style="color: #06c; font-weight: bold;">case</span> <span style="color: #339933; font-weight: bold;">&gt;=</span># index vecLength <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
            False <span style="color: #339933; font-weight: bold;">-&gt;</span>
              <span style="color: #06c; font-weight: bold;">case</span> indexIntArray# vecData <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># vecAddressBase index<span style="color: green;">&#41;</span>
              <span style="color: #06c; font-weight: bold;">of</span> element <span style="color: green;">&#123;</span> <span style="color: #339933; font-weight: bold;">__</span>DEFAULT <span style="color: #339933; font-weight: bold;">-&gt;</span>
              <span style="color: #06c; font-weight: bold;">case</span> remInt# element <span style="color: red;">2</span> <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
                <span style="color: #339933; font-weight: bold;">__</span>DEFAULT <span style="color: #339933; font-weight: bold;">-&gt;</span>
                  workerLoop <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># index <span style="color: red;">1</span><span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># acc <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">*</span># element element<span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>;
                <span style="color: red;">0</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> workerLoop <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># index <span style="color: red;">1</span><span style="color: green;">&#41;</span> acc
              <span style="color: green;">&#125;</span>
              <span style="color: green;">&#125;</span>;
            True <span style="color: #339933; font-weight: bold;">-&gt;</span> acc
          <span style="color: green;">&#125;</span>; <span style="color: green;">&#125;</span> <span style="color: #06c; font-weight: bold;">in</span>
    workerLoop <span style="color: red;">0</span> <span style="color: red;">0</span>
    <span style="color: green;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">while <code>sumSqrPOp</code> compiles to:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #339933; font-weight: bold;">$</span>wsumSqrPrimOp <span style="color: #339933; font-weight: bold;">=</span>
  \ vec <span style="color: #339933; font-weight: bold;">-&gt;</span>
    runSTRep
      <span style="color: green;">&#40;</span> <span style="color: green;">&#40;</span>\ <span style="color: #339933; font-weight: bold;">@</span> s<span style="color: #339933; font-weight: bold;">_</span>X1rU <span style="color: #339933; font-weight: bold;">-&gt;</span>
          <span style="color: #06c; font-weight: bold;">case</span> vec <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span> Vector vecAddressBase vecLength vecData <span style="color: #339933; font-weight: bold;">-&gt;</span>
          <span style="color: green;">&#40;</span>\ w1<span style="color: #339933; font-weight: bold;">_</span>s37C <span style="color: #339933; font-weight: bold;">-&gt;</span>
             letrec <span style="color: green;">&#123;</span>
               workerLoop <span style="color: #339933; font-weight: bold;">=</span>
                 \ state index acc <span style="color: #339933; font-weight: bold;">-&gt;</span>
                   <span style="color: #06c; font-weight: bold;">case</span> <span style="color: #339933; font-weight: bold;">&gt;=</span># index vecLength <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
                     False <span style="color: #339933; font-weight: bold;">-&gt;</span>
                       <span style="color: #06c; font-weight: bold;">case</span> indexIntArray# vecData <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># vecAddressBase index<span style="color: green;">&#41;</span>
                       <span style="color: #06c; font-weight: bold;">of</span> element <span style="color: green;">&#123;</span> <span style="color: #339933; font-weight: bold;">__</span>DEFAULT <span style="color: #339933; font-weight: bold;">-&gt;</span>
                       workerLoop
                         state
                         <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># index <span style="color: red;">1</span><span style="color: green;">&#41;</span>
                         <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">+</span># acc <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">*</span># <span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">*</span># <span style="color: green;">&#40;</span>andI# element <span style="color: red;">1</span><span style="color: green;">&#41;</span> element<span style="color: green;">&#41;</span> element<span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>
                       <span style="color: green;">&#125;</span>;
                     True <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span># state<span style="color: #339933; font-weight: bold;">,</span> I# acc #<span style="color: green;">&#41;</span>
                   <span style="color: green;">&#125;</span>; <span style="color: green;">&#125;</span> <span style="color: #06c; font-weight: bold;">in</span>
             workerLoop w1<span style="color: #339933; font-weight: bold;">_</span>s37C <span style="color: red;">0</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span>
          <span style="color: green;">&#125;</span><span style="color: green;">&#41;</span>
       <span style="color: green;">&#41;</span></pre></td></tr></table></div>

<p style="text-align: justify;">I cleaned up the code a bit to make it easier to read. In the second version there is some noise from the ST monad, but aside from that both pieces of code are very similar. They differ in how the worker loop is called inside the innermost case expression. The first version branches on the parity of the element and makes one of two possible calls to <code>workerLoop</code>, whereas the second version makes a single unconditional call. This may not seem like much, but it turns out to make the difference between code that is comparable in performance with C and code that is three times slower.</p>
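<p style="text-align: justify;">To make the two loop shapes easier to compare, here is a minimal sketch that reproduces them on plain lists, using only <code>base</code>. It is not the code that produced the Core above: the names <code>sumSqrBranching</code> and <code>sumSqrBranchless</code> are made up for illustration, and "branch-free" refers only to the parity test (the loop still pattern-matches on the list).</p>

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Bits ((.&.))

-- Branching shape: the recursive call is selected by a parity test,
-- mirroring the two calls to workerLoop in the first Core listing.
-- (Hypothetical name, for illustration only.)
sumSqrBranching :: [Int] -> Int
sumSqrBranching = go 0
  where
    go !acc []       = acc
    go !acc (x : xs)
      | x `rem` 2 == 0 = go acc xs
      | otherwise      = go (acc + x * x) xs

-- Branch-free shape: (x .&. 1) is 1 for odd x and 0 for even x,
-- so even elements add zero and there is a single recursive call,
-- mirroring the second Core listing. (Hypothetical name.)
sumSqrBranchless :: [Int] -> Int
sumSqrBranchless = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + (x .&. 1) * x * x) xs

main :: IO ()
main = do
  let xs = [-5 .. 10]
  -- Both shapes should agree, including on negative elements.
  print (sumSqrBranching xs, sumSqrBranchless xs)
  print (sumSqrBranching xs == sumSqrBranchless xs)
```

<p style="text-align: justify;">The multiplication by <code>x .&amp;. 1</code> does the same job as the <code>andI#</code> in the second Core listing: it replaces a control-flow decision with arithmetic.</p>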
<p style="text-align: justify;">Let&#8217;s take a look at the assembly generated by the LLVM backend. The main loop of <code>sumSqrV</code> compiles to:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="asm" style="font-family:monospace;"><span style="color: #339933;">.</span>LBB1_4<span style="color: #339933;">:</span>
    imulq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdx</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdx</span>
    addq     <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdx</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span>
<span style="color: #339933;">.</span>LBB1_1<span style="color: #339933;">:</span>
    leaq    <span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">r8</span><span style="color: #339933;">,%</span><span style="color: #46aa03; font-weight: bold;">rsi</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdx</span>
    leaq    <span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rcx</span><span style="color: #339933;">,%</span><span style="color: #46aa03; font-weight: bold;">rdx</span><span style="color: #339933;">,</span><span style="color: #ff0000;">8</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdi</span>
    <span style="color: #339933;">.</span><span style="color: #0000ff; font-weight: bold;">align</span>  <span style="color: #ff0000;">16</span><span style="color: #339933;">,</span> <span style="color: #ff0000;">0x90</span>
<span style="color: #339933;">.</span>LBB1_2<span style="color: #339933;">:</span>
    cmpq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rax</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rsi</span>
    <span style="color: #00007f; font-weight: bold;">jge</span>     <span style="color: #339933;">.</span>LBB1_5
    incq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rsi</span>
    <span style="color: #b00040;">movq</span>    <span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdi</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdx</span>
    addq    <span style="color: #0000ff; font-weight: bold;">$</span><span style="color: #ff0000;">8</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdi</span>
    testb   <span style="color: #0000ff; font-weight: bold;">$</span><span style="color: #ff0000;">1</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">dl</span>
    <span style="color: #00007f; font-weight: bold;">je</span>      <span style="color: #339933;">.</span>LBB1_2
    <span style="color: #00007f; font-weight: bold;">jmp</span>     <span style="color: #339933;">.</span>LBB1_4</pre></td></tr></table></div>

<p style="text-align: justify;">While the main loop of <code>sumSqrPOp</code> compiles to:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="asm" style="font-family:monospace;"><span style="color: #339933;">.</span>LBB0_4<span style="color: #339933;">:</span>
    <span style="color: #b00040;">movq</span>    <span style="color: #009900; font-weight: bold;">&#40;</span><span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rsi</span><span style="color: #009900; font-weight: bold;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span>
    <span style="color: #b00040;">movq</span>    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rax</span>
    imulq   <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rax</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rax</span>
    andq    <span style="color: #0000ff; font-weight: bold;">$</span><span style="color: #ff0000;">1</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span>
    negq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span>
    andq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rax</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span>
    addq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rbx</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rcx</span>
    addq    <span style="color: #0000ff; font-weight: bold;">$</span><span style="color: #ff0000;">8</span><span style="color: #339933;">,</span> <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rsi</span>
    decq    <span style="color: #339933;">%</span><span style="color: #46aa03; font-weight: bold;">rdi</span>
    <span style="color: #00007f; font-weight: bold;">jne</span>     <span style="color: #339933;">.</span>LBB0_4</pre></td></tr></table></div>

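<p style="text-align: justify;">You don&#8217;t need to be an assembly expert to see that the second version is much denser: its loop body is straight-line code ending in a single <code>jne</code>, while the first version needs three jumps (<code>jge</code>, <code>je</code> and <code>jmp</code>) to get around the loop.</p>
<p style="text-align: justify;">I promised you a comparison with C. Here&#8217;s the code:</p>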

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">long</span> <span style="color: #993333;">int</span> c_sumSqrC<span style="color: #009900;">&#40;</span> <span style="color: #993333;">long</span> <span style="color: #993333;">int</span><span style="color: #339933;">*</span> xs<span style="color: #339933;">,</span> <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> xn <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> index   <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
  <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> result  <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
  <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> element <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
 Loop<span style="color: #339933;">:</span>
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span>index <span style="color: #339933;">==</span> xn<span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">goto</span> Return<span style="color: #339933;">;</span>
  element <span style="color: #339933;">=</span> xs<span style="color: #009900;">&#91;</span>index<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
  index<span style="color: #339933;">++;</span>
  <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #009900;">&#40;</span>0x1L <span style="color: #339933;">&amp;</span> element<span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #0000dd;">0</span><span style="color: #009900;">&#41;</span> <span style="color: #b1b100;">goto</span> Loop<span style="color: #339933;">;</span>
  result <span style="color: #339933;">+=</span> element <span style="color: #339933;">*</span> element<span style="color: #339933;">;</span>
  <span style="color: #b1b100;">goto</span> Loop<span style="color: #339933;">;</span>
 Return<span style="color: #339933;">:</span>
  <span style="color: #b1b100;">return</span> result<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">You&#8217;re probably wondering why the hell did I use <code>goto</code>s. The reason is that the whole idea of this sum-square-of-odds function was taken from the paper <a href="http://dl.acm.org/citation.cfm?id=102806">&#8220;Automatic transformation of series expressions into loops&#8221;</a> by Richard Waters and I intended to closely mimic the solution produced by his fusion framework.</p>
<p style="text-align: justify;">I used criterion to compare the performance of four presented implementations: based on list, base on vector, based on vector using <code>foldM</code>+primops and C. I used FFI to call C implementation from Haskell so that I can benchmark it with criterion as well. Here are the results for a list/vector containing one million elements:</p>
<p style="text-align: center;"><a href="http://lambda.jstolarek.com/wp-content/uploads/2013/04/sumsqrperf.png"><img class="aligncenter  wp-image-1123" alt="Performance of sumSqr" src="http://lambda.jstolarek.com/wp-content/uploads/2013/04/sumsqrperf.png" width="545" height="70" /></a></p>
<p style="text-align: justify;">The C version is still faster than the one based on primops by about 8%. I think this is a very good achievement given that the version based on the vector library is three times slower.</p>
<h1 style="text-align: justify;">A few words of summary</h1>
<p style="text-align: justify;">The <a href="http://hackage.haskell.org/package/vector">vector</a> library uses stream fusion under the hood to optimize code working on vectors. In the blog posts I mentioned at the beginning Don Stewart talks a bit about stream fusion, but if you want to learn more you&#8217;ll probably be interested in two papers: <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.7401">Stream Fusion. From Lists to Streams to Nothing at All</a> and <a href="http://www.eecs.harvard.edu/~mainland/publications/mainland12simd.pdf">Haskell Beats C Using Generalized Stream Fusion</a>. My <code>sumSqrPOp</code> function, although nearly as fast as C, is in fact pretty ugly and I wouldn&#8217;t recommend that anyone write Haskell code this way. You might have noticed that while the efficiency of <code>sumSqrPOp</code> comes from avoiding the conditional instruction within the loop, the C version does in fact use a conditional within the loop to determine the parity of the vector element. The interesting thing is that this conditional is eliminated by <code>gcc</code> during compilation.</p>
<p style="text-align: justify;">As you can see it might be possible to write Haskell code that is as fast as C. The bad thing is that to get efficient code you might be forced to sacrifice the elegance and abstraction of functional programming. I hope that one day Haskell will have a fusion framework capable of doing more optimisations than the frameworks existing today and that we will be able to have both the elegance of code and high performance. After all, if <code>gcc</code> is able to get rid of unnecessary conditional instructions then it should be possible to make GHC do the same.</p>
<h1 style="text-align: justify;">A short appendix</h1>
<p style="text-align: justify;">To dump the Core produced by GHC use the <code>-ddump-simpl</code> flag during compilation. I also recommend using the <code>-dsuppress-all</code> flag, which suppresses type information and most other annotations &#8211; this makes the Core much easier to read.</p>
<p style="text-align: justify;">To dump the assembly produced by GHC use the <code>-ddump-asm</code> flag. When compiling with the LLVM backend you need to use the <code>-keep-s-files</code> flag instead.</p>
<p style="text-align: justify;">To disassemble compiled object files (e.g. compiled C files) use the <code>objdump -d</code> command.</p>
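<p style="text-align: justify;">If you only want to inspect one module, the Core-dumping flags can also be set in the source file with an <code>OPTIONS_GHC</code> pragma instead of on the command line. The sketch below is only an illustration: the function <code>sumSqrOdd</code> is hypothetical, and the use of <code>-O2</code> is my assumption about a sensible optimisation level rather than something taken from the benchmarks above.</p>

```haskell
{-# OPTIONS_GHC -O2 -ddump-simpl -dsuppress-all #-}
-- Compiling this file should print its optimised Core during the build.
-- The function below is hypothetical and exists only to give GHC
-- something to optimise.

sumSqrOdd :: [Int] -> Int
sumSqrOdd xs = sum [x * x | x <- xs, odd x]

main :: IO ()
main = print (sumSqrOdd [1 .. 10])
```

<p style="text-align: justify;">Keeping the flags in the file means that every rebuild of that module dumps its Core, which is handy while experimenting but is probably something to remove before committing.</p>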
<h1 style="text-align: justify;">Update &#8211; discussion on Reddit</h1>
<p style="text-align: justify;">There was some discussion about this post on <a href="http://www.reddit.com/r/haskell/comments/1bikvs/yet_another_lambda_blog_haskell_as_fast_as_c_a/">reddit</a> and I&#8217;d like to address some of the objections that were raised there and in the comments below.</p>
<p style="text-align: justify;">Mikhail Glushenkov pointed out that the following Haskell code produces the same result as my <code>sumSqrPOp</code> function:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">sumSqrB <span style="color: #339933; font-weight: bold;">::</span> U<span style="color: #339933; font-weight: bold;">.</span>Vector <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
sumSqrB <span style="color: #339933; font-weight: bold;">=</span> U<span style="color: #339933; font-weight: bold;">.</span><span style="font-weight: bold;">sum</span> <span style="color: #339933; font-weight: bold;">.</span> U<span style="color: #339933; font-weight: bold;">.</span><span style="font-weight: bold;">map</span> <span style="color: green;">&#40;</span>\x <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">.</span>&amp;<span style="color: #339933; font-weight: bold;">.</span> <span style="color: red;">1</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">*</span> x <span style="color: #339933; font-weight: bold;">*</span> x<span style="color: green;">&#41;</span></pre></td></tr></table></div>

<p style="text-align: justify;">I admit I didn&#8217;t notice this simple solution and could have come with a better example were such a solution would not be possible.</p>
<p style="text-align: justify;">There was a request to compare performance with idiomatic C code, because the C code I have shown clearly is not idiomatic. So here&#8217;s the most idiomatic C code I can come up with (not necessarily the fastest one):

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">long</span> <span style="color: #993333;">int</span> c_sumSqrC<span style="color: #009900;">&#40;</span> <span style="color: #993333;">long</span> <span style="color: #993333;">int</span><span style="color: #339933;">*</span> xs<span style="color: #339933;">,</span> <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> xn <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> result <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
  <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> i <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
  <span style="color: #993333;">long</span> <span style="color: #993333;">int</span> e<span style="color: #339933;">;</span>
  <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span> <span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> xn<span style="color: #339933;">;</span> i<span style="color: #339933;">++</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    e <span style="color: #339933;">=</span> xs<span style="color: #009900;">&#91;</span> i <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span> e <span style="color: #339933;">%</span> <span style="color: #0000dd;">2</span> <span style="color: #339933;">!=</span> <span style="color: #0000dd;">0</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
      result <span style="color: #339933;">+=</span> e <span style="color: #339933;">*</span> e<span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#125;</span>
  <span style="color: #b1b100;">return</span> result<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">The performance turns out to be the same as before (&#8220;Bits&#8221; represents Mikhail Glushenkov&#8217;s solution, &#8220;C&#8221; now represents the new C code):</p>
<p><a href="http://lambda.jstolarek.com/wp-content/uploads/2013/04/sumsqrperf1.png"><img src="http://lambda.jstolarek.com/wp-content/uploads/2013/04/sumsqrperf1.png" alt="sumsqrperf" width="545" height="75" class="aligncenter size-full wp-image-1155" /></a></p>
<p style="text-align: justify;">There was a suggestion to use the following C code:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #b1b100;">for</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">int</span> i <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> xn<span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    result <span style="color: #339933;">+=</span> xs<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">*</span> xs<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">*</span> <span style="color: #009900;">&#40;</span>xs<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">&amp;</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">The author claims that this code is faster than the version I proposed, but I cannot confirm that on my machine &#8211; I get results that are noticeably slower (2.7ms vs 1.7ms for vectors of 1 million elements). Perhaps this comes from me using GCC 4.5, while the latest available version is 4.8.</p>
<p style="text-align: justify;">Finally, there were questions about the overhead added by calling C code via FFI. I was also concerned about this when I first wanted to benchmark my C code via FFI. After some experiments it turned out that this overhead is so small that it can be ignored. For more information see <a href="http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/">this post</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2013/04/haskell-as-fast-as-c-a-case-study/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Don&#8217;t panic! It&#8217;s only an upgrade</title>
		<link>http://lambda.jstolarek.com/2013/02/dont-panic-its-only-an-upgrade/</link>
		<comments>http://lambda.jstolarek.com/2013/02/dont-panic-its-only-an-upgrade/#comments</comments>
		<pubDate>Fri, 15 Feb 2013 14:07:47 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[ghc]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=1079</guid>
		<description><![CDATA[Time for another upgrade of my GHC installation. OK, I know I already posted about this twice but yet again the process was different from the previous ones. My first attempts of installing GHC and the Haskell Platform a year ago relied on using packages from my distribution&#8217;s repository. This quickly turned out to be [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">Time for another upgrade of my GHC installation. OK, I know I already posted about this twice but yet again the process was different from the previous ones.</p>
<p style="text-align: justify;">My <a href="http://lambda.jstolarek.com/2012/03/installing-ghc-on-opensuse-linux/">first attempts at installing GHC and the Haskell Platform</a> a year ago relied on using packages from my distribution&#8217;s repository. This quickly turned out to be problematic so I decided on a direct installation of the Haskell Platform. This worked perfectly fine except for the fact that Haskell packages were installed in different subdirectories of <code>/usr/local</code>, which led to a bit of a mess and problems with controlling what is installed where (this is useful if you want to remove a package). So the second time I was installing the Haskell Platform <a href="http://lambda.jstolarek.com/2012/06/upgrading-haskell-platform-on-opensuse/">I was smarter and refined the whole process</a>. This time I confined the installation to a single directory so that both GHC and all the packages are located in a single, easy to find place.</p>
<p style="text-align: justify;">Yesterday I figured out it would be great to get a new version of GHC. GHC 7.6.1 was released on 6th September 2012 and the updated 7.6.2 version is only two weeks old. While GHC 7.6.1 has been out for over 5 months it is still not part of the Haskell Platform <a href="http://trac.haskell.org/haskell-platform/wiki/ReleaseTimetable">and it won&#8217;t be for the next three months</a>. That&#8217;s too long a wait for me so I decided to send the Platform to <code>/dev/null</code> and just install GHC and its environment from scratch.</p>
<p style="text-align: justify;">My plan to install GHC from precompiled binaries went up the spout:</p>
<blockquote>
<p style="text-align: justify;">This build requires <code>libgmp.so.3</code>.</p>
</blockquote>
<p style="text-align: justify;">Watwatwat? Now what is that supposed to mean? Previously released binaries didn&#8217;t depend on one particular version of the <code>libgmp</code> library. Of course my system has <code>libgmp.so.10</code> and any attempt to install an older version results in breaking package dependencies. I downloaded the binaries anyway and tried to run them:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">[killy@xerxes : ~/ghc-7.6.2/ghc/stage2/build/tmp] ./ghc-stage2 --interactive
 ./ghc-stage2: error while loading shared libraries: libgmp.so.3: cannot open shared object file: No such file or directory</pre></td></tr></table></div>

<p style="text-align: justify;">OK, so that requirement is true &#8211; you need the exact version of <code>libgmp</code>. So what now? I know! Compilation from sources! I&#8217;ve been hacking on GHC recently so I already have the sources on my drive. Unfortunately it turned out that after switching the GHC repo and all its subrepos to the <code>ghc-7.6</code> branch I got some compilation errors. I wasn&#8217;t in the mood for debugging this so I switched everything back to master and <a href="http://www.haskell.org/ghc/dist/stable/dist/">downloaded the source snapshot</a>. From now on things are easy, assuming that you already have an older version of GHC on your system. After extracting the sources I copied <code>$(TOP)/mk/build.mk.sample</code> to <code>$(TOP)/mk/build.mk</code> (<code>$(TOP)</code> refers to the directory containing the GHC sources) and uncommented the line <code>BuildFlavour = perf-llvm</code>. This gives me a fully optimized build using LLVM. Now the compilation:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">perl boot
./configure --prefix=/usr/local/ghc-7.6.2
make</pre></td></tr></table></div>

<p style="text-align: justify;">This will build GHC and prepare it for installation in <code>/usr/local/ghc-7.6.2</code>. A fully optimized build takes well over an hour on all 4 cores. After the build is done all one needs to do is run <code>make install</code> as root. At this stage the old GHC can be removed. You of course need to add <code>/usr/local/ghc-7.6.2/bin</code> to the <code>PATH</code> environment variable. As I have already mentioned, I have the habit of installing all the packages system-wide in a single directory. For that I need to edit the <code>/root/.cabal/config</code> file by adding the following entry:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">install-dirs global
    prefix:/usr/local/ghc-7.6.2</pre></td></tr></table></div>

<p style="text-align: justify;">All that is left now is installing <a href="http://hackage.haskell.org/package/cabal-install">cabal-install</a>. Grab the sources from hackage, extract them and run (as root) <code>sh bootstrap.sh --global</code> in the source directory. This installs cabal-install with its dependencies. Now you can start installing other packages that you need (a.k.a. compile the World).</p>
<p style="text-align: justify;">This completes Yet Another Installation of GHC.</p>
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2013/02/dont-panic-its-only-an-upgrade/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Taking magic out of GHC or: Tracing compilation by transformation</title>
		<link>http://lambda.jstolarek.com/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/</link>
		<comments>http://lambda.jstolarek.com/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/#comments</comments>
		<pubDate>Sat, 26 Jan 2013 10:10:08 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[compilers]]></category>
		<category><![CDATA[ghc]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=908</guid>
		<description><![CDATA[When I was planning to learn about compilers I heard one phrase that still sticks in my mind: Taking compilers course is a good thing because it takes magic out of compilers. It may have been put into words differently but this was the meaning. Having taken compilers course and learning about GHC&#8217;s internals I [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">When I was planning to learn about compilers I heard one phrase that still sticks in my mind:</p>
<blockquote>
<p style="text-align: justify;">Taking compilers course is a good thing because it takes magic out of compilers.</p>
</blockquote>
<p style="text-align: justify;">It may have been put into words differently but this was the meaning. Having taken a compilers course and learned about GHC&#8217;s internals, I think this is so true. A compiler is no longer a black box. It&#8217;s just a very complicated program built according to some general rules. In this post I want to reveal a little bit of the magic that GHC does behind the scenes when compiling your Haskell program.</p>
<p style="text-align: justify;">Imagine you are writing an image processing algorithm and you want to check whether pixel coordinates <code>x</code> and <code>y</code> are within the image. That&#8217;s simple: if either of these coordinates is less than <code>0</code>, or if <code>x</code> equals or exceeds the image width, or if <code>y</code> equals or exceeds the image height, then the coordinates are not within the image. Here&#8217;s how we can express this condition in Haskell:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1 <span style="color: #5d478b; font-style: italic;">-- do this when (x,y) is within the image</span>
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2 <span style="color: #5d478b; font-style: italic;">-- do that when (x,y) is outside of image</span></pre></td></tr></table></div>

<p style="text-align: justify;">When compiling a Haskell program GHC uses a language called Core as an intermediate representation. It&#8217;s a very simplified<sup><a href="http://lambda.jstolarek.com/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/#footnote_0_908" id="identifier_0_908" class="footnote-link footnote-identifier-link" title="We usually call it &ldquo;desugared&rdquo;, because simplifying Haskell to Core simply removes syntactic sugar.">1</a></sup> form of Haskell that has only let and case expressions and type annotations. You don&#8217;t need any knowledge of Core to understand this post but if you want to learn more I suggest starting with <a href="http://www.haskellforall.com/2012/10/hello-core.html">a blog post by Gabriel Gonzalez</a> and then taking a look at <a href="http://blog.ezyang.com/2011/04/tracing-the-compilation-of-hello-factorial/">Edward Z. Yang&#8217;s post</a>, which also shows GHC&#8217;s other intermediate languages: STG and Cmm. You can see the Core representation into which your program was transformed by passing the <code>-ddump-simpl</code> flag to GHC (I also use <code>-dsuppress-all</code> to get rid of all the type information that obscures the output). If you compile the above case expression with optimisations (pass the <code>-O2</code> option to GHC) you end up with the following Core representation:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: #339933; font-weight: bold;">&lt;</span># x <span style="color: red;">0</span># <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
  False <span style="color: #339933; font-weight: bold;">-&gt;</span>
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: #339933; font-weight: bold;">&gt;=</span># x width <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
      False <span style="color: #339933; font-weight: bold;">-&gt;</span>
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: #339933; font-weight: bold;">&lt;</span># y <span style="color: red;">0</span># <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
          False <span style="color: #339933; font-weight: bold;">-&gt;</span>
            <span style="color: #06c; font-weight: bold;">case</span> <span style="color: #339933; font-weight: bold;">&gt;=</span># y height <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #339933; font-weight: bold;">_</span> <span style="color: green;">&#123;</span>
              False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1;
              True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2;
            <span style="color: green;">&#125;</span>;
          True <span style="color: #339933; font-weight: bold;">-&gt;</span> e2;
        <span style="color: green;">&#125;</span>;
      True <span style="color: #339933; font-weight: bold;">-&gt;</span> e2;
    <span style="color: green;">&#125;</span>;
  True <span style="color: #339933; font-weight: bold;">-&gt;</span> e2;
<span style="color: green;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">GHC turned our infix comparison operators into prefix notation. It also unboxed integer variables. This can be noticed by <code>#</code> appended to integer literals and comparison operators. There&#8217;s also some syntactic change in <code>case</code> expressions: there are braces surrounding the branches, semicolon is used to delimit branches from each other and there is a mysterious underscore after the keyword <code>of</code>. We can rewrite this in a more familiar Haskell syntax (I will also reverse the order of <code>True</code> and <code>False</code> branches &#8211; it will be more readable):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span> <span style="color: #06c; font-weight: bold;">of</span>
  True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
  False <span style="color: #339933; font-weight: bold;">-&gt;</span>
    <span style="color: #06c; font-weight: bold;">case</span> x <span style="color: #339933; font-weight: bold;">&gt;=</span> width <span style="color: #06c; font-weight: bold;">of</span> 
      True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
      False <span style="color: #339933; font-weight: bold;">-&gt;</span>
        <span style="color: #06c; font-weight: bold;">case</span> y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span> <span style="color: #06c; font-weight: bold;">of</span>
          True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
          False <span style="color: #339933; font-weight: bold;">-&gt;</span>
            <span style="color: #06c; font-weight: bold;">case</span> y <span style="color: #339933; font-weight: bold;">&gt;=</span> height <span style="color: #06c; font-weight: bold;">of</span>
              True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
              False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1</pre></td></tr></table></div>

<p style="text-align: justify;">The most noticeable thing however is that our original <code>case</code> expression has suddenly turned into four nested <code>case</code>s, which resulted in duplicating expression <code>e2</code> four times. In this post I will show you how GHC arrived at this representation.</p>
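<p style="text-align: justify;">If you want to convince yourself that the nested form really computes the same thing, here is a minimal sketch you can compile and run. The names <code>outsideOr</code>, <code>outsideNested</code> and <code>agree</code> are made up for this example, and both functions return a <code>Bool</code> instead of choosing between <code>e1</code> and <code>e2</code>. The nested version is written by hand to mirror the shape of the Core shown above; it is not actual compiler output.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where

-- Hypothetical helper names; the post only shows bare case expressions.

-- The condition as written in the source program.
-- True means that (x, y) lies outside the image.
outsideOr :: Int -&gt; Int -&gt; Int -&gt; Int -&gt; Bool
outsideOr width height x y =
    (x &lt; 0) || (x &gt;= width) || (y &lt; 0) || (y &gt;= height)

-- The same condition written as four nested cases,
-- mirroring the shape of the optimised Core.
outsideNested :: Int -&gt; Int -&gt; Int -&gt; Int -&gt; Bool
outsideNested width height x y =
    case x &lt; 0 of
      True  -&gt; True
      False -&gt;
        case x &gt;= width of
          True  -&gt; True
          False -&gt;
            case y &lt; 0 of
              True  -&gt; True
              False -&gt; y &gt;= height

-- Compare both versions on a small grid around a 4x3 image.
agree :: Bool
agree = and [ outsideOr 4 3 x y == outsideNested 4 3 x y
            | x &lt;- [-2 .. 6], y &lt;- [-2 .. 5] ]

main :: IO ()
main = print agree</pre></td></tr></table></div>

<p style="text-align: justify;">Compiling this file with <code>-O2 -ddump-simpl -dsuppress-all</code> lets you inspect the Core for both functions. The exact output depends on your GHC version, so don&#8217;t expect it to match the listing above character for character.</p>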
<h1 style="text-align: justify;">A bit of theory</h1>
<p style="text-align: justify;">There are some things you need to know in order to understand how GHC transformed the code. First is the definition of <code>(||)</code> operator (logical or):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">||</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Bool</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Bool</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Bool</span>
<span style="color: green;">&#40;</span><span style="color: #339933; font-weight: bold;">||</span><span style="color: green;">&#41;</span> x y <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #06c; font-weight: bold;">case</span> x <span style="color: #06c; font-weight: bold;">of</span>
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> y</pre></td></tr></table></div>

<p style="text-align: justify;">When optimizations are turned on GHC performs inlining of short functions. This means that function calls are replaced by function definitions and this will be the case with <code>(||)</code> function.</p>
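<p style="text-align: justify;">To make inlining concrete, here is a small sketch that performs it by hand. I use a local copy of the definition called <code>orElse</code> (a made-up name, chosen so that it does not clash with the Prelude), and the functions <code>beforeInlining</code> and <code>afterInlining</code> are also just illustrations. This only imitates what the compiler does; GHC decides on its own whether a function gets inlined.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where

-- A local copy of the definition shown above (hypothetical name).
orElse :: Bool -&gt; Bool -&gt; Bool
orElse x y = case x of
    True  -&gt; True
    False -&gt; y

-- A call site before inlining...
beforeInlining :: Int -&gt; Int -&gt; Bool
beforeInlining x width = (x &lt; 0) `orElse` (x &gt;= width)

-- ...and the same call site with the body of orElse substituted by hand:
-- the first argument becomes the scrutinee, the second one the False branch.
afterInlining :: Int -&gt; Int -&gt; Bool
afterInlining x width = case x &lt; 0 of
    True  -&gt; True
    False -&gt; x &gt;= width

main :: IO ()
main = print [ beforeInlining x 4 == afterInlining x 4 | x &lt;- [-1, 0, 3, 4] ]</pre></td></tr></table></div>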
<p style="text-align: justify;">The second thing is the case-of-case code transformation. Imagine a code fragment like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
  <span style="color: #06c; font-weight: bold;">case</span> C <span style="color: #06c; font-weight: bold;">of</span> 
      B1 <span style="color: #339933; font-weight: bold;">-&gt;</span> F1
      B2 <span style="color: #339933; font-weight: bold;">-&gt;</span> F2
 <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
    A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2</pre></td></tr></table></div>

<p style="text-align: justify;">We have a <code>case</code> expression nested within a scrutinee<sup><a href="http://lambda.jstolarek.com/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/#footnote_1_908" id="identifier_1_908" class="footnote-link footnote-identifier-link" title="A scrutinee is an expression whose value is checked by the case expression. The scrutinee is placed between the words &lsquo;case&rsquo; and &lsquo;of&rsquo;.">2</a></sup> of another <code>case</code> expression. You may be thinking that you would never write such code and you are right. GHC however compiles programs by performing successive Core-to-Core transformations and such nesting of <code>case</code> expressions is often generated during that process (as we will see in a moment). If nested <code>case</code> expressions appear in the Core representation of a program they are turned inside out by the case-of-case transformation: the nested <code>case</code> scrutinising <code>C</code> becomes the outer <code>case</code> expression and the outer <code>case</code> expression is pushed into branches <code>B1</code> and <code>B2</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> C <span style="color: #06c; font-weight: bold;">of</span>    
    B1 <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">case</span> F1 <span style="color: #06c; font-weight: bold;">of</span>
              A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
              A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2
    B2 <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">case</span> F2 <span style="color: #06c; font-weight: bold;">of</span>
              A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
              A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2</pre></td></tr></table></div>

<p style="text-align: justify;">You see that the code for <code>E1</code> and <code>E2</code> has been duplicated. This is the worst-case scenario. In real-life programs one of the branches can often be simplified using the case-of-known-constructor transformation. See what happens when the expression returned by a branch of the nested case is a constructor that is matched by the outer case (<code>A1</code> in this example):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
  <span style="color: #06c; font-weight: bold;">case</span> C <span style="color: #06c; font-weight: bold;">of</span> 
      B1 <span style="color: #339933; font-weight: bold;">-&gt;</span> A1
      B2 <span style="color: #339933; font-weight: bold;">-&gt;</span> F2
 <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
    A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2</pre></td></tr></table></div>

<p style="text-align: justify;">After performing case-of-case transformation we end up with:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> C <span style="color: #06c; font-weight: bold;">of</span>    
    B1 <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">case</span> A1 <span style="color: #06c; font-weight: bold;">of</span>
              A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
              A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2
    B2 <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">case</span> F2 <span style="color: #06c; font-weight: bold;">of</span>
              A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
              A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2</pre></td></tr></table></div>

<p style="text-align: justify;">In the first branch of outer case expression we are now matching <code>A1</code> against <code>A1</code>. So we know that first branch will be taken and thus can get rid of this <code>case</code> expression reducing it to <code>E1</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> C <span style="color: #06c; font-weight: bold;">of</span>    
    B1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
    B2 <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #06c; font-weight: bold;">case</span> F2 <span style="color: #06c; font-weight: bold;">of</span>
              A1 <span style="color: #339933; font-weight: bold;">-&gt;</span> E1
              A2 <span style="color: #339933; font-weight: bold;">-&gt;</span> E2</pre></td></tr></table></div>

<p style="text-align: justify;">Thus only <code>E1</code> was duplicated. We will see that happen a lot in a moment.</p>
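<p style="text-align: justify;">Both transformations preserve the meaning of the program, which is easy to check on a toy example. In the sketch below the types <code>A</code> and <code>B</code>, the functions <code>before</code> and <code>after</code>, and the strings standing in for <code>E1</code> and <code>E2</code> are all made up; the arbitrary expression <code>F2</code> is modelled as a function parameter <code>f2</code>.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where

-- Made-up types standing in for the abstract constructors above.
data A = A1 | A2
data B = B1 | B2

-- Before: a case nested in the scrutinee of another case.
-- The B1 branch returns the known constructor A1.
before :: A -&gt; B -&gt; String
before f2 c =
    case (case c of
            B1 -&gt; A1
            B2 -&gt; f2) of
      A1 -&gt; "E1"
      A2 -&gt; "E2"

-- After case-of-case followed by case-of-known-constructor.
after :: A -&gt; B -&gt; String
after f2 c =
    case c of
      B1 -&gt; "E1"
      B2 -&gt; case f2 of
              A1 -&gt; "E1"
              A2 -&gt; "E2"

-- Check every combination of inputs.
main :: IO ()
main = print (and [ before f c == after f c | f &lt;- [A1, A2], c &lt;- [B1, B2] ])</pre></td></tr></table></div>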
<h1 style="text-align: justify;">The fun begins</h1>
<p style="text-align: justify;">Knowing all this we can begin optimizing our code:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">First thing that happens is inlining of <code>(||)</code> operators. The call to <code>(x &lt; 0) || (x &gt;= width)</code> is replaced by definition of <code>(||)</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>  <span style="color: #5d478b; font-style: italic;">-- this case comes from definition of ||</span>
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">Let&#8217;s inline next use of <code>(||)</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>  <span style="color: #5d478b; font-style: italic;">-- this case is introduced by inlining of second ||</span>
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span>
        <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span> 
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">||</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">One more inlining and we&#8217;re done with <code>(||)</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>  <span style="color: #5d478b; font-style: italic;">-- this case is introduced by inlining of last ||</span>
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>
            <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span>
            <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span> 
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span>
        <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span> 
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">We ended up with three <code>case</code>s nested as scrutinees of other <code>case</code>s &#8211; I told you this would happen. Now GHC will start applying the case-of-case transformation to get rid of all this nesting. Let&#8217;s focus on the two innermost <code>case</code>s for simplicity:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span> 
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span></pre></td></tr></table></div>

<p style="text-align: justify;">Performing case-of-case transformation on them gives:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>  <span style="color: #5d478b; font-style: italic;">-- nested case is floated out</span>
    True <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> True <span style="color: #06c; font-weight: bold;">of</span>  <span style="color: #5d478b; font-style: italic;">-- outer case is pushed into this branch...</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span> <span style="color: #5d478b; font-style: italic;">-- ...and into this branch</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span></pre></td></tr></table></div>

<p style="text-align: justify;">Looking at the first nested <code>case</code> we see that the case-of-known-constructor transformation can be applied:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True  <span style="color: #5d478b; font-style: italic;">-- case-of-known-constructor eliminated </span>
                   <span style="color: #5d478b; font-style: italic;">-- case expression in this branch</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span></pre></td></tr></table></div>
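<p style="text-align: justify;">The three stages we have just gone through can be written down as ordinary functions and compared. This is only a sketch: the names <code>step0</code>, <code>step1</code>, <code>step2</code> and <code>allAgree</code> are made up, and the code is transcribed by hand from the listings above rather than produced by GHC.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where

-- The two innermost cases, as produced by inlining (||).
step0 :: Int -&gt; Int -&gt; Int -&gt; Bool
step0 width x y =
    case (case x &lt; 0 of
            True  -&gt; True
            False -&gt; x &gt;= width) of
      True  -&gt; True
      False -&gt; y &lt; 0

-- After case-of-case: the outer case is copied into both branches.
step1 :: Int -&gt; Int -&gt; Int -&gt; Bool
step1 width x y =
    case x &lt; 0 of
      True  -&gt; case True of
                 True  -&gt; True
                 False -&gt; y &lt; 0
      False -&gt; case x &gt;= width of
                 True  -&gt; True
                 False -&gt; y &lt; 0

-- After case-of-known-constructor: "case True of" is resolved statically.
step2 :: Int -&gt; Int -&gt; Int -&gt; Bool
step2 width x y =
    case x &lt; 0 of
      True  -&gt; True
      False -&gt; case x &gt;= width of
                 True  -&gt; True
                 False -&gt; y &lt; 0

-- All three stages must give the same answer for every input we try.
allAgree :: Bool
allAgree = and [ f 4 x y == step2 4 x y
               | f &lt;- [step0, step1], x &lt;- [-2 .. 6], y &lt;- [-2 .. 2] ]

main :: IO ()
main = print allAgree</pre></td></tr></table></div>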

<p style="text-align: justify;">Now let&#8217;s put these <code>case</code>s back into our expression:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                    False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span>
        <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">Now we only have two <code>case</code>s nested as scrutinees of other <code>case</code>s. Applying case-of-case one more time will get rid of the first nesting:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
        True <span style="color: #339933; font-weight: bold;">-&gt;</span>  <span style="color: #5d478b; font-style: italic;">-- we can use case-of-known-constructor here</span>
            <span style="color: #06c; font-weight: bold;">case</span> True <span style="color: #06c; font-weight: bold;">of</span>  
                True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
            <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> <span style="color: #5d478b; font-style: italic;">-- these nested cases weren't here before!</span>
                <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                    True <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                    False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span>
                <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">Hey, there&#8217;s something new here! We eliminated nested <code>case</code>s in one place only to introduce them in another. But we know what to do with nested <code>case</code>s &#8211; use case-of-case, of course. Let&#8217;s apply it to the second branch and case-of-known-constructor to the first one:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True  <span style="color: #5d478b; font-style: italic;">-- case-of-known-constructor used here</span>
        False <span style="color: #339933; font-weight: bold;">-&gt;</span>       
            <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>    <span style="color: #5d478b; font-style: italic;">-- case-of-case used here</span>
                True <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                    <span style="color: #06c; font-weight: bold;">case</span> True <span style="color: #06c; font-weight: bold;">of</span>    <span style="color: #5d478b; font-style: italic;">-- what about this case?</span>
                        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">We just got another chance to perform case-of-known-constructor:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span> 
    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
        False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
            <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True  <span style="color: #5d478b; font-style: italic;">-- case-of-known-constructor</span>
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">We have one more nested <code>case</code> to eliminate. Let&#8217;s hit it:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    True <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> True <span style="color: #06c; font-weight: bold;">of</span>
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>
            <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                        True <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
                <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
                True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">Do you see how expressions <code>e1</code> and <code>e2</code> got duplicated? Let&#8217;s apply case-of-known-constructor in the first branch and case-of-case + case-of-known-constructor in the second one:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>
                    <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                        True  <span style="color: #339933; font-weight: bold;">-&gt;</span> True
                        False <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span>
                    <span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                    False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
                    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">One more case-of-case:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
            False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                    True <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                        <span style="color: #06c; font-weight: bold;">case</span> True <span style="color: #06c; font-weight: bold;">of</span>
                            False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
                            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
                    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                        <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                            False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
                            True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">And one more case-of-known-constructor:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
    True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
    False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
         <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>x <span style="color: #339933; font-weight: bold;">&gt;=</span> width<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
             True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
             False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                 <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&lt;</span> <span style="color: red;">0</span><span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                     True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2
                     False <span style="color: #339933; font-weight: bold;">-&gt;</span> 
                         <span style="color: #06c; font-weight: bold;">case</span> <span style="color: green;">&#40;</span>y <span style="color: #339933; font-weight: bold;">&gt;=</span> height<span style="color: green;">&#41;</span> <span style="color: #06c; font-weight: bold;">of</span>
                             False <span style="color: #339933; font-weight: bold;">-&gt;</span> e1
                             True  <span style="color: #339933; font-weight: bold;">-&gt;</span> e2</pre></td></tr></table></div>

<p style="text-align: justify;">And we&#8217;re done! We arrived at the same expression that GHC compiled. Wasn&#8217;t that simple?</p>
<h1>Summary</h1>
<p style="text-align: justify;">This should give you an idea of how GHC&#8217;s core-to-core transformations work. I&#8217;ve only shown you two of them &#8211; case-of-case and case-of-known-constructor &#8211; but there are many more. If you&#8217;re interested in learning about the others, take a look at the paper by Simon Peyton Jones and Andre Santos, <a href="http://research.microsoft.com/pubs/67064/comp-by-trans-scp.ps.gz">&#8220;A transformation-based optimiser for Haskell&#8221;</a>. If you want more details than the paper provides, see Andre Santos&#8217; PhD thesis <a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/santos-thesis.ps.gz">&#8220;Compilation by Transformation in Non-Strict Functional Languages&#8221;</a>. You can also take a look at the discussion in GHC <a href="http://hackage.haskell.org/trac/ghc/ticket/6135">ticket 6135</a>.</p>
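<p style="text-align: justify;">To make the mechanics concrete, here is a minimal sketch of both transformations as a rewrite function over a toy expression type. Everything in it (<code>Expr</code>, <code>orE</code>, <code>simplify</code>) is hypothetical and far simpler than GHC&#8217;s Core and its simplifier. In particular it handles only <code>Bool</code> scrutinees and freely duplicates branches, just as we did by hand. The nesting of the <code>orE</code> calls in <code>start</code> is an assumption chosen to mirror the walkthrough:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main where
&nbsp;
-- A toy expression language; hypothetical and much simpler than GHC Core.
data Expr
    = Atom String          -- opaque expression, e.g. &quot;x &lt; 0&quot; or &quot;e1&quot;
    | Lit Bool             -- a known constructor: True or False
    | Case Expr Expr Expr  -- Case scrutinee trueBranch falseBranch
    deriving (Eq, Show)
&nbsp;
-- a || b written as: case a of { True -&gt; True; False -&gt; b }
orE :: Expr -&gt; Expr -&gt; Expr
orE a b = Case a (Lit True) b
&nbsp;
simplify :: Expr -&gt; Expr
-- case-of-known-constructor: the scrutinee is already True or False
simplify (Case (Lit True)  t _) = simplify t
simplify (Case (Lit False) _ f) = simplify f
-- case-of-case: push the outer case into both branches of the inner one
simplify (Case (Case s a b) t f) =
    simplify (Case s (Case a t f) (Case b t f))
-- opaque scrutinee: keep the case and simplify its branches
simplify (Case s t f) = Case s (simplify t) (simplify f)
simplify e = e
&nbsp;
a, b, c, d, e1, e2 :: Expr
a  = Atom &quot;x &lt; 0&quot;
b  = Atom &quot;x &gt;= width&quot;
c  = Atom &quot;y &lt; 0&quot;
d  = Atom &quot;y &gt;= height&quot;
e1 = Atom &quot;e1&quot;
e2 = Atom &quot;e2&quot;
&nbsp;
-- Nested cases before simplification (assumed nesting).
start :: Expr
start = Case (orE (orE a (orE b c)) d) e2 e1
&nbsp;
-- The shape we arrived at by hand.
expected :: Expr
expected = Case a e2 (Case b e2 (Case c e2 (Case d e2 e1)))
&nbsp;
main :: IO ()
main = print (simplify start == expected)</pre></td></tr></table></div>

<p style="text-align: justify;">Note that <code>simplify</code> copies <code>t</code> and <code>f</code> into both branches every time it applies case-of-case, which is exactly the duplication of <code>e1</code> and <code>e2</code> seen in the walkthrough.</p>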
<ol class="footnotes"><li id="footnote_0_908" class="footnote">We usually call it &#8220;desugared&#8221;, because simplifying Haskell to Core simply removes syntactic sugar.</li><li id="footnote_1_908" class="footnote">A scrutinee is an expression whose value is checked by the <code>case</code> expression. The scrutinee is placed between the words &#8216;case&#8217; and &#8216;of&#8217;.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2013/01/taking-magic-out-of-ghc-or-tracing-compilation-by-transformation/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Real World Haskell &#8211; impressions after initial chapters</title>
		<link>http://lambda.jstolarek.com/2013/01/real-world-haskell-impressions-after-initial-chapters/</link>
		<comments>http://lambda.jstolarek.com/2013/01/real-world-haskell-impressions-after-initial-chapters/#comments</comments>
		<pubDate>Sun, 06 Jan 2013 16:33:19 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[books]]></category>
		<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=213</guid>
		<description><![CDATA[December was a busy month for me with not much time for blogging. It looks like January will be even busier, but I want to write some posts I didn&#8217;t have time to finish in the last couple of months. I have some book reviews on my mind and today I will begin with [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">December was a busy month for me with not much time for blogging. It looks like January will be even busier, but I want to write some posts I didn&#8217;t have time to finish in the last couple of months. I have some book reviews on my mind and today I will begin with Real World Haskell.</p>
<p style="text-align: justify;"><a href="http://lambda.jstolarek.com/wp-content/uploads/2013/01/rwh_cover.jpg"><img class="alignleft  wp-image-911" style="margin-left: 10px; margin-right: 10px;" alt="rwh_cover" src="http://lambda.jstolarek.com/wp-content/uploads/2013/01/rwh_cover-228x300.jpg" width="160" height="210" hspace="10" /></a>Real World Haskell &#8211; commonly referred to as RWH &#8211; was written by <a href="http://www.serpentine.com/blog/">Bryan O&#8217;Sullivan</a>, <a href="http://www.complete.org/">John Goerzen</a> and <a href="http://donsbot.wordpress.com/">Don Stewart</a> and published in 2008 by O&#8217;Reilly. With 28 chapters on 670 pages it is <em>the</em> book about Haskell. There is no more comprehensive book on Haskell at this moment. The best thing is <a href="http://book.realworldhaskell.org/read/">it is available on-line for free</a>. I don&#8217;t like reading large amounts of text from a computer screen, so when I decided to learn Haskell 10 months ago I mailed my university&#8217;s library and asked if they could get the book for me. Three weeks later a <a href="http://lambda.jstolarek.com/2012/03/a-glance-at-some-haskell-books/">brand new copy of RWH was on my desk</a>. So far I have read chapters 1-11, 17 and 25, which is about half of the book. Perhaps a review should be based on reading the whole book, but after over 300 pages I already have my opinion and I don&#8217;t think it will change much after reading the remaining chapters.</p>
<p style="text-align: justify;">The book assumes no prior knowledge of Haskell or functional programming. It starts off with simple, introductory topics and explains the concepts of the functional approach to programming. While the first code examples are rather simple, the book quickly moves to real-world applications (as the title rightly suggests). Aside from standard Haskell topics like lazy evaluation, typeclasses or monads, the book also covers parsing, databases, GUI programming, testing, profiling and interfacing with C. There are also many case studies. The good thing about the book is that chapters about particular technologies (that would be from 16 onwards) are self-contained and can be read in any order. On the other hand, the initial chapters seem to be a bit messy &#8211; they contain lots of valuable information scattered around in completely unexpected places.</p>
<p style="text-align: justify;">While the book is a great source of information, there are some reasons not to be happy with it. First of all, I think that it is not suitable for most beginners. While RWH assumes no prior knowledge, the examples quickly get complicated and it might be hard to figure out what is really going on in the code. The authors often use a top-down approach, that is, they present a lot of code out of nowhere and then go into explaining what it does (this is the analytical approach). For that reason I had a hard time with the parsing described in chapter 10, and after reading the chapter twice I still don&#8217;t understand all of it. I think that in some cases a bottom-up (a.k.a. synthetic) approach to algorithm construction would be more suitable. This is of course very subjective &#8211; some people might be able to understand everything without problems. Anyway, the book is demanding and the reader should be prepared for it. Also, be prepared that chapters about more advanced technologies give mostly an overview and are not a comprehensive treatment of the subject. After reading the chapter about FFI I wasn&#8217;t able to write my own C bindings, and only after reading the official FFI specification did things become clear.</p>
<p style="text-align: justify;">I was disappointed with the lack of discussion of GHC internals. I think it would be useful to explain things like thunks and WHNF, as well as the lazy evaluation model and possible performance issues related to it (memory leaks for example). There is one chapter about performance and profiling in general, but it doesn&#8217;t go into details of how the compiler and runtime system work.</p>
<p style="text-align: justify;">Speaking of the performance chapter, RWH is also affected by changes made to GHC since the book was published. Luckily this is not a major issue, as almost all examples work as they should. The chapter about performance is seriously affected though. I wasn&#8217;t able to reproduce any of the presented results with GHC 7.4.2, but of course the authors can&#8217;t be blamed for that. Also, <a href="http://stackoverflow.com/questions/10578572/the-handle-function-and-real-world-haskell">one code snippet does not work because of changes made in GHC</a>, but again this is not a big problem.</p>
<p style="text-align: justify;">I had a feeling that some basic concepts are not explained clearly enough. For example, the book uses an <code>if</code> expression in a place where a more experienced functional programmer would probably use pattern matching (page 30). This might be fine, since at this point the reader has not yet been introduced to pattern matching. The problem is that this is only clarified on page 207, way too late in my opinion, especially given that patterns are introduced on page 50. I also had a feeling that currying was not explained clearly and stressed enough. Moreover, the book does not mention the term &#8220;referential transparency&#8221; (it might be a good thing to teach wannabe Haskell programmers some jargon they are likely to encounter), and from comments on the web I noticed that people not familiar with the concept of &#8220;side effects&#8221; are confused by the book&#8217;s explanation of that term. I also found that omitting type signatures in some function definitions makes the code harder to understand. Finally, one note about the editorial side of the book. The book uses different kinds of crosses, stars, paragraph marks and other signs to denote footnotes. It looks as though the editors decided not to use the same mark twice, which is a bit annoying for me, but this is of course very subjective and definitely not related to the quality of the book itself.</p>
<p style="text-align: justify;">Although I complained that some concepts could have been explained in a different way, I still think that RWH is a must-read for any serious Haskell programmer, mostly because it gathers a lot of practical information in one place. I hope to find more time to finish reading it, as I&#8217;m very curious about the chapters on monads and monad transformers.</p>
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2013/01/real-world-haskell-impressions-after-initial-chapters/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Strange benchmarking results for FFI bindings</title>
		<link>http://lambda.jstolarek.com/2012/12/strange-benchmarking-results-for-ffi-bindings/</link>
		<comments>http://lambda.jstolarek.com/2012/12/strange-benchmarking-results-for-ffi-bindings/#comments</comments>
		<pubDate>Sat, 01 Dec 2012 05:32:27 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[c]]></category>
		<category><![CDATA[ffi]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=880</guid>
		<description><![CDATA[It looks like I am getting pretty good at getting hit by Haskell bugs. My previous post described behaviour that turned out to be a bug in GHC (thanks to Joachim Breitner for pointing this out). Now I have found problems with benchmarking FFI bindings using the method described a month ago. I am working on a project [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">It looks like I am getting pretty good at getting hit by Haskell bugs. My <a href="http://lambda.jstolarek.com/2012/11/waiting-for-garbage-collection-can-kill-parallelism/">previous post</a> described behaviour that turned out to be <a href="http://hackage.haskell.org/trac/ghc/ticket/367">a bug in GHC</a> (thanks to Joachim Breitner for pointing this out). Now I have found problems with benchmarking FFI bindings using the <a href="http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/">method described a month ago</a>.</p>
<p style="text-align: justify;">I am working on a project in which the same algorithm is implemented using different data structures &#8211; one implementation is done in C, another uses the Vector library and yet another uses Repa. Everything is benchmarked with Criterion and the C implementation is the fastest one (look at the first value after <code>mean</code> &#8211; this is the mean time of running a function):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">benchmarking DWT/C1
mean: 87.26403 us, lb 86.50825 us, ub 90.05830 us, ci 0.950
std dev: 6.501161 us, lb 1.597160 us, ub 14.81257 us, ci 0.950
&nbsp;
benchmarking DWT/Vector1
mean: 209.4814 us, lb 208.8169 us, ub 210.5628 us, ci 0.950
std dev: 4.270757 us, lb 2.978532 us, ub 6.790762 us, ci 0.950</pre></td></tr></table></div>

<p style="text-align: justify;">This algorithm uses a simpler <code>lattice</code> function that is repeated a couple of times. I wrote benchmarks that measure the time needed by a single invocation of <code>lattice</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">benchmarking C1/Lattice Seq
mean: 58.36111 us, lb 58.14981 us, ub 58.65387 us, ci 0.950
std dev: 1.260742 us, lb 978.6512 ns, ub 1.617153 us, ci 0.950
&nbsp;
benchmarking Vector1/Lattice Seq
mean: 34.97816 us, lb 34.87454 us, ub 35.14377 us, ci 0.950
std dev: 661.5554 ns, lb 455.7412 ns, ub 1.013466 us, ci 0.950</pre></td></tr></table></div>

<p style="text-align: justify;">Hey, what&#8217;s this!? Vector implementation is suddenly faster than C? Not possible given that DWT in C is faster than DWT using Vector. After some investigation it turned out that the first C benchmark runs correctly while subsequent benchmarks of C functions take performance hit. I managed to create a simple code that demonstrates the problem in as few lines as possible. I implemented a copy function in C that takes an array and copies it to another array. Here&#8217;s <code>copy.c</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#include &lt;stdlib.h&gt;</span>
<span style="color: #339933;">#include &quot;copy.h&quot;</span>
&nbsp;
<span style="color: #993333;">double</span><span style="color: #339933;">*</span> c_copy<span style="color: #009900;">&#40;</span> <span style="color: #993333;">double</span><span style="color: #339933;">*</span> inArr<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> arrLen <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">double</span><span style="color: #339933;">*</span> outArr <span style="color: #339933;">=</span> <span style="color: #000066;">malloc</span><span style="color: #009900;">&#40;</span> arrLen <span style="color: #339933;">*</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span> <span style="color: #993333;">double</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span> <span style="color: #993333;">int</span> i <span style="color: #339933;">=</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> arrLen<span style="color: #339933;">;</span> i<span style="color: #339933;">++</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    outArr<span style="color: #009900;">&#91;</span> i <span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> inArr<span style="color: #009900;">&#91;</span> i <span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #b1b100;">return</span> outArr<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">and <code>copy.h</code>:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#ifndef _COPY_H_</span>
<span style="color: #339933;">#define _COPY_H_</span>
&nbsp;
<span style="color: #993333;">double</span><span style="color: #339933;">*</span> c_copy<span style="color: #009900;">&#40;</span> <span style="color: #993333;">double</span><span style="color: #339933;">*,</span> <span style="color: #993333;">int</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #339933;">#endif</span></pre></td></tr></table></div>

<p style="text-align: justify;">I wrote a simple binding for that function and benchmarked it multiple times in a row:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">module</span> Main <span style="color: #06c; font-weight: bold;">where</span>
&nbsp;
<span style="color: #06c; font-weight: bold;">import</span> Criterion<span style="color: #339933; font-weight: bold;">.</span>Main
<span style="color: #06c; font-weight: bold;">import</span> Data<span style="color: #339933; font-weight: bold;">.</span>Vector<span style="color: #339933; font-weight: bold;">.</span>Storable <span style="color: #06c; font-weight: bold;">hiding</span> <span style="color: green;">&#40;</span>copy<span style="color: green;">&#41;</span>
<span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">Monad</span> <span style="color: green;">&#40;</span>liftM<span style="color: green;">&#41;</span>
<span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">Foreign</span> <span style="color: #06c; font-weight: bold;">hiding</span> <span style="color: green;">&#40;</span>unsafePerformIO<span style="color: green;">&#41;</span>
<span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">Foreign</span><span style="color: #339933; font-weight: bold;">.</span>C
<span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">IO</span><span style="color: #339933; font-weight: bold;">.</span>Unsafe <span style="color: green;">&#40;</span>unsafePerformIO<span style="color: green;">&#41;</span>
&nbsp;
foreign <span style="color: #06c; font-weight: bold;">import</span> ccall unsafe <span style="background-color: #3cb371;">&quot;copy.h&quot;</span>
  c<span style="color: #339933; font-weight: bold;">_</span>copy <span style="color: #339933; font-weight: bold;">::</span> Ptr CDouble <span style="color: #339933; font-weight: bold;">-&gt;</span> CInt <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span>Ptr CDouble<span style="color: green;">&#41;</span>
&nbsp;
signal <span style="color: #339933; font-weight: bold;">::</span> Vector <span style="color: #cccc00; font-weight: bold;">Double</span>
signal <span style="color: #339933; font-weight: bold;">=</span> fromList <span style="color: green;">&#91;</span><span style="color: red;">1.0</span> <span style="color: #339933; font-weight: bold;">..</span> <span style="color: red;">16384.0</span><span style="color: green;">&#93;</span>
&nbsp;
copy <span style="color: #339933; font-weight: bold;">::</span> Vector <span style="color: #cccc00; font-weight: bold;">Double</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> Vector <span style="color: #cccc00; font-weight: bold;">Double</span>
copy sig <span style="color: #339933; font-weight: bold;">=</span> unsafePerformIO <span style="color: #339933; font-weight: bold;">$</span> <span style="color: #06c; font-weight: bold;">do</span>
    <span style="color: #06c; font-weight: bold;">let</span> <span style="color: green;">&#40;</span>fpSig<span style="color: #339933; font-weight: bold;">,</span> <span style="color: #339933; font-weight: bold;">_,</span> lenSig<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">=</span> unsafeToForeignPtr sig
    pLattice <span style="color: #339933; font-weight: bold;">&lt;-</span> liftM castPtr <span style="color: #339933; font-weight: bold;">$</span> withForeignPtr fpSig <span style="color: #339933; font-weight: bold;">$</span> \ptrSig <span style="color: #339933; font-weight: bold;">-&gt;</span>
                c<span style="color: #339933; font-weight: bold;">_</span>copy <span style="color: green;">&#40;</span>castPtr ptrSig<span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">fromIntegral</span> lenSig<span style="color: green;">&#41;</span>
    fpLattice <span style="color: #339933; font-weight: bold;">&lt;-</span> newForeignPtr finalizerFree pLattice
    <span style="font-weight: bold;">return</span> <span style="color: #339933; font-weight: bold;">$</span> unsafeFromForeignPtr0 fpLattice lenSig
&nbsp;
&nbsp;
main <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span>
main <span style="color: #339933; font-weight: bold;">=</span> defaultMain <span style="color: green;">&#91;</span>
         bgroup <span style="background-color: #3cb371;">&quot;FFI&quot;</span> <span style="color: green;">&#91;</span>
           bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: #339933; font-weight: bold;">,</span> bench <span style="background-color: #3cb371;">&quot;C binding&quot;</span> <span style="color: #339933; font-weight: bold;">$</span> whnf copy signal
         <span style="color: green;">&#93;</span>
       <span style="color: green;">&#93;</span></pre></td></tr></table></div>

<p style="text-align: justify;">Compiling and running this benchmark with:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">$ ghc -O2 -Wall -optc -std=c99 ffi_crit.hs copy.c
$ ./ffi_crit -g</pre></td></tr></table></div>

<p style="text-align: justify;">gave me these results:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">benchmarking FFI/C binding
mean: 17.44777 us, lb 16.82549 us, ub 19.84387 us, ci 0.950
std dev: 5.627304 us, lb 968.1911 ns, ub 13.18222 us, ci 0.950
&nbsp;
benchmarking FFI/C binding
mean: 45.46269 us, lb 45.17545 us, ub 46.01435 us, ci 0.950
std dev: 1.950915 us, lb 1.169448 us, ub 3.201935 us, ci 0.950
&nbsp;
benchmarking FFI/C binding
mean: 45.79727 us, lb 45.55681 us, ub 46.26911 us, ci 0.950
std dev: 1.669191 us, lb 1.029116 us, ub 3.098384 us, ci 0.950</pre></td></tr></table></div>

<p style="text-align: justify;">The first run takes about 17μs, later runs take about 45μs. I found this result repeatable across different runs, although in about 10-20% of runs all benchmarks &#8211; including the first one &#8211; took about 45μs. I obtained these results on GHC 7.4.1, 64-bit openSUSE Linux with a 2.6.37 kernel, and an <a href="http://ark.intel.com/products/43560/Intel-Core-i7-620M-Processor-4M-Cache-2_66-GHz">Intel Core i7 M 620</a> CPU. I posted this on Haskell-cafe and #haskell. Surprisingly, nobody could replicate the result! I was confused, so I gave it a try on my second machine: 64-bit Debian Squeeze, GHC 7.4.2, 2.6.32 kernel, <a href="http://ark.intel.com/products/33099/Intel-Core2-Duo-Processor-T8300-3M-Cache-2_40-GHz-800-MHz-FSB">Intel Core 2 Duo T8300</a> CPU. At first the problem did not appear:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">benchmarking FFI/C binding
mean: 107.3837 us, lb 107.2013 us, ub 107.5862 us, ci 0.950
std dev: 983.6046 ns, lb 822.6750 ns, ub 1.292724 us, ci 0.950
&nbsp;
benchmarking FFI/C binding
mean: 108.1152 us, lb 107.9457 us, ub 108.3052 us, ci 0.950
std dev: 916.2469 ns, lb 793.1004 ns, ub 1.122127 us, ci 0.950</pre></td></tr></table></div>

<p style="text-align: justify;">All benchmarks took about 107μs. Now watch what happens when I increase the size of the copied vector from 16K elements to 32K:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">benchmarking FFI/C binding
mean: 38.50100 us, lb 36.71525 us, ub 46.87665 us, ci 0.950
std dev: 16.93131 us, lb 1.033678 us, ub 40.23900 us, ci 0.950
&nbsp;
benchmarking FFI/C binding
mean: 209.9733 us, lb 209.5316 us, ub 210.4680 us, ci 0.950
std dev: 2.401398 us, lb 2.052981 us, ub 2.889688 us, ci 0.950</pre></td></tr></table></div>

<p style="text-align: justify;">The first run is roughly 2.8 times faster (!) than with the 16K vector (38.5μs instead of 107μs), while all other runs are about two times slower. While the latter could be expected, the former certainly is not.</p>
<p style="text-align: justify;">So what exactly is going on? I tried analysing the eventlog of the program but I wasn&#8217;t able to figure out the cause of the problem. I noticed that if I comment out the loop in the C function, so that it only allocates memory and returns an uninitialized vector, then the problem disappears. Someone on Haskell-cafe suggested that these are cache effects, but I am sceptical about this explanation. If this is caused by the cache, then why did the first benchmark speed up when the size of the vector was increased? And why does this effect occur for vectors of length 16K on a machine with a 4MB cache, while a machine with a 3MB cache needs a vector twice as long for the problem to occur? So if anyone has a clue what causes this strange behaviour, please let me know. I would be happy to resolve this, since right now the results of my benchmarks are distorted (perhaps yours are too, only you didn&#8217;t notice).</p>
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2012/12/strange-benchmarking-results-for-ffi-bindings/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Waiting for garbage collection can kill parallelism?</title>
		<link>http://lambda.jstolarek.com/2012/11/waiting-for-garbage-collection-can-kill-parallelism/</link>
		<comments>http://lambda.jstolarek.com/2012/11/waiting-for-garbage-collection-can-kill-parallelism/#comments</comments>
		<pubDate>Sat, 17 Nov 2012 12:23:18 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[ghc]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=859</guid>
		<description><![CDATA[I am reposting my mail from Haskell-cafe, since I got no replies in over a week and I think it is an interesting case. I was reading &#8220;Parallel Performance Tuning for Haskell&#8221; by Jones, Marlow and Singh and wanted to replicate the results for their first case study. The code goes like this: module Main [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">I am reposting my mail from Haskell-cafe, since I got no replies in over a week and I think it is an interesting case. I was reading <a href="http://community.haskell.org/~simonmar/papers/threadscope.pdf">&#8220;Parallel Performance Tuning for Haskell&#8221;</a> by Jones, Marlow and Singh and wanted to replicate the results for their first case study. The code goes like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">module</span> Main <span style="color: #06c; font-weight: bold;">where</span>
&nbsp;
<span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span>Parallel
&nbsp;
main <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span>
main <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">print</span> <span style="color: #339933; font-weight: bold;">.</span> parSumFibEuler <span style="color: red;">38</span> <span style="color: #339933; font-weight: bold;">$</span> <span style="color: red;">5300</span>
&nbsp;
fib <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
fib <span style="color: red;">0</span> <span style="color: #339933; font-weight: bold;">=</span> <span style="color: red;">0</span>
fib <span style="color: red;">1</span> <span style="color: #339933; font-weight: bold;">=</span> <span style="color: red;">1</span>
fib n <span style="color: #339933; font-weight: bold;">=</span> fib <span style="color: green;">&#40;</span>n <span style="color: #339933; font-weight: bold;">-</span> <span style="color: red;">1</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">+</span> fib <span style="color: green;">&#40;</span>n <span style="color: #339933; font-weight: bold;">-</span> <span style="color: red;">2</span><span style="color: green;">&#41;</span>
&nbsp;
mkList <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#91;</span><span style="color: #cccc00; font-weight: bold;">Int</span><span style="color: green;">&#93;</span>
mkList n <span style="color: #339933; font-weight: bold;">=</span> <span style="color: green;">&#91;</span><span style="color: red;">1</span><span style="color: #339933; font-weight: bold;">..</span>n<span style="color: #339933; font-weight: bold;">-</span><span style="color: red;">1</span><span style="color: green;">&#93;</span>
&nbsp;
relprime <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Bool</span>
relprime x y <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">gcd</span> x y <span style="color: #339933; font-weight: bold;">==</span> <span style="color: red;">1</span>
&nbsp;
euler <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
euler n <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">length</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">filter</span> <span style="color: green;">&#40;</span>relprime n<span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span>mkList n<span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>
&nbsp;
sumEuler <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
sumEuler <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">sum</span> <span style="color: #339933; font-weight: bold;">.</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">map</span> euler<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">.</span> mkList
&nbsp;
sumFibEuler <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
sumFibEuler a b <span style="color: #339933; font-weight: bold;">=</span> fib a <span style="color: #339933; font-weight: bold;">+</span> sumEuler b
&nbsp;
parSumFibEuler <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">Int</span>
parSumFibEuler a b <span style="color: #339933; font-weight: bold;">=</span> f `par` <span style="color: green;">&#40;</span>e `pseq` <span style="color: green;">&#40;</span>e <span style="color: #339933; font-weight: bold;">+</span> f<span style="color: green;">&#41;</span><span style="color: green;">&#41;</span>
    <span style="color: #06c; font-weight: bold;">where</span> f <span style="color: #339933; font-weight: bold;">=</span> fib a
          e <span style="color: #339933; font-weight: bold;">=</span> sumEuler b</pre></td></tr></table></div>

<p style="text-align: justify;">In the paper the authors show that this code performs the computation of <code>fib</code> and <code>sumEuler</code> in parallel and that a good speed-up is achieved:</p>
<blockquote>
<p style="text-align: justify;">To make the parallelism more robust, we need to be explicit about the evaluation order we intend. The way to do this is to use <code>pseq</code> in combination with <code>par</code>, the idea being to ensure that the main thread works on <code>sumEuler</code> while the sparked thread works on <code>fib</code>. (&#8230;) This version does not make any assumptions about the evaluation order of <code>+</code>, but relies only on the evaluation order of <code>pseq</code>, which is guaranteed to be stable.</p>
</blockquote>
<p style="text-align: justify;">These results were obtained on an older GHC version<sup><a href="http://lambda.jstolarek.com/2012/11/waiting-for-garbage-collection-can-kill-parallelism/#footnote_0_859" id="identifier_0_859" class="footnote-link footnote-identifier-link" title="Paper does not mention which version exactly. I believe it was 6.10, since &ldquo;Runtime Support for Multicore Haskell&rdquo; by the same authors released at the same time uses GHC 6.10">1</a></sup>. However, compiling the program with:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">$ ghc -O2 -Wall -threaded -rtsopts -fforce-recomp -eventlog parallel.hs</pre></td></tr></table></div>

<p style="text-align: justify;">and then running on GHC 7.4.1 using:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">$ ./parallel +RTS -qa -g1 -s -ls -N2</pre></td></tr></table></div>

<p style="text-align: justify;">yields a completely different result. These are statistics for a parallel run on two cores:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">SPARKS: 1 (1 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
&nbsp;
INIT    time    0.00s  (  0.00s elapsed)
MUT     time    2.52s  (  2.51s elapsed)
GC      time    0.03s  (  0.05s elapsed)
EXIT    time    0.00s  (  0.00s elapsed)
Total   time    2.55s  (  2.56s elapsed)</pre></td></tr></table></div>

<p style="text-align: justify;">Running the same code on one core results in:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="text" style="font-family:monospace;">SPARKS: 1 (0 converted, 0 overflowed, 0 dud, 1 GC'd, 0 fizzled)
&nbsp;
INIT    time    0.00s  (  0.00s elapsed)
MUT     time    2.51s  (  2.53s elapsed)
GC      time    0.03s  (  0.05s elapsed)
EXIT    time    0.00s  (  0.00s elapsed)
Total   time    2.55s  (  2.58s elapsed)</pre></td></tr></table></div>

<p style="text-align: justify;">Looking at <code>MUT</code> (mutator time), it appears that there is no speed-up at all. Investigating the eventlog using ThreadScope sheds some light on the execution of the program:</p>
<p style="text-align: center;"><a href="http://lambda.jstolarek.com/wp-content/uploads/2012/11/parallel_general_view1.png"><img class="aligncenter  wp-image-863" title="parallel_general_view" src="http://lambda.jstolarek.com/wp-content/uploads/2012/11/parallel_general_view1.png" alt="" width="460" height="272" /></a></p>
<p style="text-align: justify;">Both threads start computation, but HEC 1 soon blocks and only resumes when HEC 0 finishes its computation. Zooming in, it looks like HEC 1 stops because it requests garbage collection, but HEC 0 does not respond to that request, so GC begins only when HEC 0 is done with its computation:</p>
<p style="text-align: center;"><a href="http://lambda.jstolarek.com/wp-content/uploads/2012/11/parallel_detailed_view.png"><img class="aligncenter  wp-image-865" title="parallel_detailed_view" src="http://lambda.jstolarek.com/wp-content/uploads/2012/11/parallel_detailed_view.png" alt="" width="460" height="272" /></a></p>
<p style="text-align: justify;">Why does this happen? I am no expert on GHC&#8217;s garbage collection; my only knowledge of it comes from section 6 of &#8220;<a href="http://community.haskell.org/~simonmar/papers/multicore-ghc.pdf">Runtime Support for Multicore Haskell</a>&#8221;. If I understood it correctly, this should not happen &#8211; it certainly didn&#8217;t happen when the paper was published. Do we have a regression or am I misunderstanding something?</p>
<ol class="footnotes"><li id="footnote_0_859" class="footnote">Paper does not mention which version exactly. I believe it was 6.10, since &#8220;Runtime Support for Multicore Haskell&#8221; by the same authors released at the same time uses GHC 6.10</li></ol>]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2012/11/waiting-for-garbage-collection-can-kill-parallelism/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>How to shoot yourself in the foot with Haskell</title>
		<link>http://lambda.jstolarek.com/2012/11/how-to-shoot-yourself-in-the-foot-with-haskell/</link>
		<comments>http://lambda.jstolarek.com/2012/11/how-to-shoot-yourself-in-the-foot-with-haskell/#comments</comments>
		<pubDate>Tue, 06 Nov 2012 13:38:02 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[quickcheck]]></category>
		<category><![CDATA[repa]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=849</guid>
		<description><![CDATA[Haskell is &#8220;advertised&#8221; as a safe language that does all type checking upfront, making sure that you don&#8217;t experience runtime type errors, null pointers and all that kind of stuff. It also gives you ways to bypass some of the safety mechanisms, so a conscious programmer can use unsafe functions to get a boost in [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">Haskell is &#8220;advertised&#8221; as a safe language that does all type checking upfront, making sure that you don&#8217;t experience runtime type errors, null pointers and all that kind of stuff. It also gives you ways to bypass some of the safety mechanisms, so a programmer who knows what they are doing can use unsafe functions to get a boost in performance (e.g. by not performing bounds checking when indexing a vector).</p>
<p style="text-align: justify;">I&#8217;ve written some very ugly Haskell code that creates a vector using destructive updates. It is in fact an imperative algorithm, not a functional one. When the initialization is over the vector is frozen using <a href="http://hackage.haskell.org/packages/archive/vector/latest/doc/html/Data-Vector-Unboxed.html#v:unsafeFreeze"><code>unsafeFreeze</code></a>. I wrote my code using <a href="http://hackage.haskell.org/packages/archive/vector/latest/doc/html/Data-Vector-Storable-Mutable.html#v:read"><code>read</code></a> and <a href="http://hackage.haskell.org/packages/archive/vector/latest/doc/html/Data-Vector-Storable-Mutable.html#v:write"><code>write</code></a> functions, tested it using QuickCheck and when the tests passed I switched to <a href="http://hackage.haskell.org/packages/archive/vector/latest/doc/html/Data-Vector-Storable-Mutable.html#v:unsafeRead"><code>unsafeRead</code></a> and <a href="http://hackage.haskell.org/packages/archive/vector/latest/doc/html/Data-Vector-Storable-Mutable.html#v:unsafeWrite"><code>unsafeWrite</code></a> to make my program faster. Some time later I started getting random segfaults when running my tests. This never happened before in any of my Haskell programs so I almost panicked. At first I didn&#8217;t had a slightest idea how to even approach this problem. I suspected that this might even be a bug in GHC. Then I started disabling groups of tests trying to track down the bug and finally managed to locate a single test that was causing the problem. Guess what &#8211; it was the test of my vector initialization with unsafe functions. What happened is that after switching to <code>unsafeRead</code>/<code>unsafeWrite</code> I refactored the function and its test. I made a mistake in the testing code and passed incorrect data that resulted with an attempt to write an element at address -1. Finding this bug took me a little over an hour. 
A factor that made debugging harder was that disabling tests that seemed to cause the segfault resulted in problems appearing in a completely different part of the program &#8211; or so I thought by looking at the output of my program. Looks like I completely forgot about lazy evaluation and unspecified evaluation order!</p>
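<p style="text-align: justify;">To make the workflow above more concrete, here is a minimal sketch of the pattern (it is not the code from my project: the function <code>squares</code> and what it computes are made up for illustration, and it assumes the <code>vector</code> package is installed). The vector is filled with destructive updates and then frozen. The sketch uses the bounds-checked <code>write</code>; replacing it with <code>unsafeWrite</code> removes the check, so a wrong index such as -1 is no longer reported as an error but silently writes outside the vector.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main where
&nbsp;
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import qualified Data.Vector.Storable as V
import qualified Data.Vector.Storable.Mutable as MV
&nbsp;
-- Hypothetical example: fill a vector imperatively, then freeze it.
squares :: Int -&gt; V.Vector Double
squares n = runST $ do
    mv &lt;- MV.new n
    forM_ [0 .. n - 1] $ \i -&gt;
        -- bounds-checked write; MV.unsafeWrite would skip the check
        MV.write mv i (fromIntegral (i * i))
    -- freezing without a copy is fine here only because mv is
    -- never modified after this point
    V.unsafeFreeze mv
&nbsp;
main :: IO ()
main = print (squares 5)</pre></td></tr></table></div>

<p style="text-align: justify;">One reasonable habit, given the story above, is to keep the checked functions in place whenever the function or its tests are being refactored, and to switch to the unsafe variants only once the tests pass again.</p>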
<p style="text-align: justify;">The second bug I encountered was even trickier. I wrote functions that perform cyclic shifts of a signal by any value. For example, shifting <code>[1,2,3,4]</code> left by 1 yields <code>[2,3,4,1]</code>. Note that shifting by 5, 9, 13 and so on gives exactly the same result &#8211; the shift function is periodic. You might recall that I used shift functions to demonstrate <a href="http://lambda.jstolarek.com/2012/10/code-testing-in-haskell/">code testing</a>. This time, however, I wrote the shifts using the <a href="http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial">Repa</a> library. Then I created a QuickCheck property stating that shifting any signal left and then right by the same value yields the original signal. This is a pretty obvious property and the tests passed without any problems. Later, when writing some other tests, I ended up with one of the tests failing. Using ghci I checked that this error should not be happening at all, but after careful debugging it turned out that during the actual tests some of the values in the results became zeros. After two hours of debugging I realized that the actual bug was in the shifting functions &#8211; they worked only for the basic period, that is for shift values from 0 to the signal length. Why didn&#8217;t QuickCheck manage to falsify my property of left/right shift composition? Repa is a smart (and tricky) library that attempts to fuse many operations on an array into one. And it fused the application of a left shift followed by a right shift into an identity transform! Well, this is great. After all, this is the kind of optimization we would like to have in our programs. But it turns out that it can also impact tests! After realizing what was going on it was actually a matter of 5 minutes to fix the bug, but finding it was not a trivial task.</p>
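<p style="text-align: justify;">As an illustration of the periodicity described above, here is a small sketch of cyclic shifts on plain lists. It is not the Repa code and it does not reproduce the fusion behaviour; the names <code>shiftL</code>, <code>shiftR</code>, <code>prop_roundTrip</code> and <code>prop_periodic</code> are made up for this example. The point is the second property: it compares a shift inside the basic period with one outside of it, so it does not depend on a left shift and a right shift cancelling each other out.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main where
&nbsp;
-- Cyclic left shift on lists. Reducing the shift modulo the length
-- makes it work outside the basic period, including negative shifts.
shiftL :: Int -&gt; [a] -&gt; [a]
shiftL _ [] = []
shiftL n xs = back ++ front
  where
    k             = n `mod` length xs
    (front, back) = splitAt k xs
&nbsp;
-- A right shift is a left shift by the opposite amount.
shiftR :: Int -&gt; [a] -&gt; [a]
shiftR n = shiftL (negate n)
&nbsp;
-- The round-trip property: shift left, then right by the same value.
prop_roundTrip :: Int -&gt; [Int] -&gt; Bool
prop_roundTrip n xs = shiftR n (shiftL n xs) == xs
&nbsp;
-- A property that pins down the period directly.
prop_periodic :: Int -&gt; [Int] -&gt; Bool
prop_periodic n xs = shiftL (n + length xs) xs == shiftL n xs
&nbsp;
main :: IO ()
main = do
    print (shiftL 1 [1, 2, 3, 4 :: Int])
    print (shiftL 5 [1, 2, 3, 4 :: Int])
    print (and [prop_roundTrip n [1 .. 7] | n &lt;- [-20 .. 20]])
    print (and [prop_periodic  n [1 .. 7] | n &lt;- [-20 .. 20]])</pre></td></tr></table></div>

<p style="text-align: justify;">Both shifts print <code>[2,3,4,1]</code> and both checks print <code>True</code>. The two properties have types that QuickCheck can test directly, so in a real test suite they could be passed to <code>quickCheck</code> instead of being checked over a fixed range as in this sketch.</p>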
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2012/11/how-to-shoot-yourself-in-the-foot-with-haskell/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Benchmarking C functions using Foreign Function Interface</title>
		<link>http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/</link>
		<comments>http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/#comments</comments>
		<pubDate>Fri, 02 Nov 2012 14:10:16 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[c]]></category>
		<category><![CDATA[ffi]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=813</guid>
		<description><![CDATA[I am currently working on implementing Discrete Wavelet Transform (DWT) in Haskell. I want to make use of Haskell&#8217;s parallel programming capabilities to implement an algorithm that can take advantage of multiple CPU cores. My previous posts on testing and benchmarking were by-products of this project, as I needed to ensure reliability of my implementation [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">I am currently working on implementing the Discrete Wavelet Transform (DWT) in Haskell. I want to make use of Haskell&#8217;s parallel programming capabilities to implement an algorithm that can take advantage of multiple CPU cores. My previous posts on <a href="http://lambda.jstolarek.com/2012/10/code-testing-in-haskell/">testing</a> and <a href="http://lambda.jstolarek.com/2012/10/code-benchmarking-in-haskell/">benchmarking</a> were by-products of this project, as I needed to ensure the reliability of my implementation and to measure its performance. The key question that is in my head all the time is &#8220;can I write Haskell code that outperforms C when given more CPU cores?&#8221;. To answer this question I needed a way to benchmark the performance of an algorithm written in C, and I must admit that this problem was giving me a real headache. One obvious solution was to implement the algorithm in C and measure its running time. This didn&#8217;t seem acceptable. I use <a href="http://hackage.haskell.org/package/criterion">Criterion</a> for benchmarking and it does lots of fancy stuff like measuring clock resolution and calculating <a href="http://en.wikipedia.org/wiki/Kernel_density_estimation">kernel density estimates</a>. So unless I implemented these features in C (read: re-implemented the whole library) the results of the measurements would not be comparable.</p>
<p style="text-align: justify;">Luckily for me there is a better solution: the Foreign Function Interface (FFI). This is an extension of the Haskell 98 standard &#8211; and part of Haskell 2010 &#8211; that allows calling functions written in C<sup><a href="http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/#footnote_0_813" id="identifier_0_813" class="footnote-link footnote-identifier-link" title="Specification mentions also the calling conventions for other languages and platforms (Java VM, .Net and C++) but I think that currently there is no implementation of these.">1</a></sup>. This means that I could write my function in C, wrap it in a pure Haskell function and benchmark that wrapper with Criterion. The results would be comparable with the Haskell implementation, but I was afraid that overheads related to data copying would affect the performance measurements. As it turned out I was wrong.</p>
<p style="text-align: justify;">I started with <a href="http://book.realworldhaskell.org/read/interfacing-with-c-the-ffi.html">chapter 17 of Real World Haskell</a>. It presents a real world example &#8211; I guess the title of the book is very apt &#8211; of creating bindings for an already existing library. Sadly, after reading it I felt very confused. I had a general idea of what should be done but I didn&#8217;t understand many of the details. I had serious doubts about the proper usage of the <code>Ptr</code> and <code>ForeignPtr</code> data types, and these are in fact very important when working with FFI. Someone on #haskell advised me to read the <a href="http://www.cse.unsw.edu.au/~chak/haskell/ffi/">official specification of FFI</a> and this was spot-on advice. This is actually one of the few official specifications that are a real pleasure to read (if you have read <a href="http://www.schemers.org/Documents/Standards/R5RS/">R5RS</a> then you know what I mean). It is concise (30 pages) and provides a comprehensive overview of all data types and functions used for making foreign calls.</p>
<p style="text-align: justify;">After reading the specification it was rather straightforward to write my own bindings to C. Here&#8217;s the prototype of the called C function, located in the <code>dwt.h</code> header file:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">double</span><span style="color: #339933;">*</span> c_dwt<span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span><span style="color: #339933;">*</span> ls<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> ln<span style="color: #339933;">,</span> <span style="color: #993333;">double</span><span style="color: #339933;">*</span> xs<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> xn<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p style="text-align: justify;">The corresponding <code>dwt.c</code> source file contains:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="c" style="font-family:monospace;"><span style="color: #993333;">double</span><span style="color: #339933;">*</span> c_dwt<span style="color: #009900;">&#40;</span> <span style="color: #993333;">double</span><span style="color: #339933;">*</span> ls<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> ln<span style="color: #339933;">,</span> <span style="color: #993333;">double</span><span style="color: #339933;">*</span> xs<span style="color: #339933;">,</span> <span style="color: #993333;">int</span> xn <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">double</span><span style="color: #339933;">*</span> ds <span style="color: #339933;">=</span> <span style="color: #000066;">malloc</span><span style="color: #009900;">&#40;</span> xn <span style="color: #339933;">*</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span> <span style="color: #993333;">double</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">// fill ds array with result</span>
&nbsp;
  <span style="color: #b1b100;">return</span> ds<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p style="text-align: justify;">The important thing is that the C function allocates new memory with <code>malloc</code>, which we will later manage using Haskell&#8217;s garbage collector. A Haskell binding for such a function looks like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">foreign <span style="color: #06c; font-weight: bold;">import</span> ccall unsafe <span style="background-color: #3cb371;">&quot;dwt.h&quot;</span>
  c<span style="color: #339933; font-weight: bold;">_</span>dwt <span style="color: #339933; font-weight: bold;">::</span> Ptr CDouble <span style="color: #339933; font-weight: bold;">-&gt;</span> CInt <span style="color: #339933; font-weight: bold;">-&gt;</span> Ptr CDouble <span style="color: #339933; font-weight: bold;">-&gt;</span> CInt <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span>Ptr CDouble<span style="color: green;">&#41;</span></pre></td></tr></table></div>

<p style="text-align: justify;">Here&#8217;s what it does: <code>ccall</code> denotes the C calling convention, <code>unsafe</code> improves performance of the call at the cost of safety<sup><a href="http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/#footnote_1_813" id="identifier_1_813" class="footnote-link footnote-identifier-link" title="Calls need to be safe only when called C code calls Haskell code, which I think is rare">2</a></sup> and <code>"dwt.h"</code> points to a header file. Finally, I define the name of the function and its type. This name is the same as the name of the original C function, but if it were different I would have to specify the name of the C function in the same string, after the name of the header file. You probably already noticed that type <code>int</code> from C is represented by <code>CInt</code> in Haskell and <code>double</code> by <code>CDouble</code>. You can convert between <code>Int</code> and <code>CInt</code> with <code>fromIntegral</code> and between <code>Double</code> and <code>CDouble</code> with <code>realToFrac</code>. Pointers from C become <code>Ptr</code>, so <code>double*</code> from C is represented as <code>Ptr CDouble</code> in the Haskell binding. What might be surprising about this type signature is that the result is in the <code>IO</code> monad, that is, our function from C is marked as impure. The reason for this is that every time we run the <code>c_dwt</code> function <code>malloc</code> may return a different memory address, so the function can indeed return different results given the same input. In my function however the array addressed by that pointer will always contain exactly the same values (for the same input data), so in fact my function is pure. The problem is that Haskell doesn&#8217;t know that and we will have to fix that problem using the infamous <code>unsafePerformIO</code>. For that we have to create a wrapper function that has a pure interface:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="code"><pre class="haskell" style="font-family:monospace;"><span style="color: #06c; font-weight: bold;">import</span> Control<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">Monad</span> <span style="color: green;">&#40;</span>liftM<span style="color: green;">&#41;</span>
<span style="color: #06c; font-weight: bold;">import</span> Data<span style="color: #339933; font-weight: bold;">.</span>Vector<span style="color: #339933; font-weight: bold;">.</span>Storable
<span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">Foreign</span> <span style="color: #06c; font-weight: bold;">hiding</span> <span style="color: green;">&#40;</span>unsafePerformIO<span style="color: green;">&#41;</span>
<span style="color: #06c; font-weight: bold;">import</span> <span style="color: #06c; font-weight: bold;">Foreign</span><span style="color: #339933; font-weight: bold;">.</span>C
<span style="color: #06c; font-weight: bold;">import</span> System<span style="color: #339933; font-weight: bold;">.</span><span style="color: #cccc00; font-weight: bold;">IO</span><span style="color: #339933; font-weight: bold;">.</span>Unsafe
&nbsp;
dwt <span style="color: #339933; font-weight: bold;">::</span> Vector <span style="color: #cccc00; font-weight: bold;">Double</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> Vector <span style="color: #cccc00; font-weight: bold;">Double</span> <span style="color: #339933; font-weight: bold;">-&gt;</span> Vector <span style="color: #cccc00; font-weight: bold;">Double</span>
dwt ls sig <span style="color: #339933; font-weight: bold;">=</span> unsafePerformIO <span style="color: #339933; font-weight: bold;">$</span> <span style="color: #06c; font-weight: bold;">do</span>
    <span style="color: #06c; font-weight: bold;">let</span> <span style="color: green;">&#40;</span>fpLs <span style="color: #339933; font-weight: bold;">,</span> <span style="color: #339933; font-weight: bold;">_,</span> lenLs <span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">=</span> unsafeToForeignPtr ls
        <span style="color: green;">&#40;</span>fpSig<span style="color: #339933; font-weight: bold;">,</span> <span style="color: #339933; font-weight: bold;">_,</span> lenSig<span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">=</span> unsafeToForeignPtr sig
    pDwt <span style="color: #339933; font-weight: bold;">&lt;-</span> liftM castPtr <span style="color: #339933; font-weight: bold;">$</span> withForeignPtr fpLs <span style="color: #339933; font-weight: bold;">$</span> \ptrLs <span style="color: #339933; font-weight: bold;">-&gt;</span>
            withForeignPtr fpSig <span style="color: #339933; font-weight: bold;">$</span> \ptrSig <span style="color: #339933; font-weight: bold;">-&gt;</span>
                c<span style="color: #339933; font-weight: bold;">_</span>dwt <span style="color: green;">&#40;</span>castPtr ptrLs <span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">fromIntegral</span> lenLs <span style="color: green;">&#41;</span>
                      <span style="color: green;">&#40;</span>castPtr ptrSig<span style="color: green;">&#41;</span> <span style="color: green;">&#40;</span><span style="font-weight: bold;">fromIntegral</span> lenSig<span style="color: green;">&#41;</span>
    fpDwt <span style="color: #339933; font-weight: bold;">&lt;-</span> newForeignPtr finalizerFree pDwt
    <span style="font-weight: bold;">return</span> <span style="color: #339933; font-weight: bold;">$</span> unsafeFromForeignPtr0 fpDwt lenSig</pre></td></tr></table></div>

<p style="text-align: justify;">Our wrapper function takes two <code>Vector</code>s as input and returns a new <code>Vector</code>. To interface with C we need to use <a href="http://hackage.haskell.org/packages/archive/vector/0.10.0.1/doc/html/Data-Vector-Storable.html#t:Storable">storable</a> vectors, which store data that can be written to raw memory (that&#8217;s what the C function is doing). I wasn&#8217;t able to figure out what the difference between storable and unboxed vectors is. It seems that both store primitive values in a contiguous memory block and therefore both offer similar performance (assumed, not verified). The first thing to do is to get <code>ForeignPtr</code>s out of the input vectors. A <code>ForeignPtr</code> is a <code>Ptr</code> with a finalizer attached. A finalizer is a function called when the object is no longer in use and needs to be garbage collected. For the result of the C function we need a finalizer that will free memory allocated with <code>malloc</code>. This is a common task, so the FFI implementation already provides a <code>finalizerFree</code> function for that. The actual call to the foreign function is made on lines 11-14. We can operate on the <code>Ptr</code> value stored in a <code>ForeignPtr</code> using the <code>withForeignPtr</code> function. However, since we have vectors of <code>Double</code>s as input, we also have <code>Ptr Double</code>, not the <code>Ptr CDouble</code> that the <code>c_dwt</code> function expects. There are two possible solutions to that problem. One would be to copy memory, converting every value in a vector using <code>realToFrac</code>. I did not try that, assuming this would kill performance. Instead I used <code>castPtr</code>, which casts a pointer of one type to a pointer of another type. This is potentially dangerous and relies on the fact that <code>Double</code> and <code>CDouble</code> have the same internal structure. This is in fact expected, but it is by no means guaranteed by any specification! 
I wouldn&#8217;t be surprised if that didn&#8217;t work on some sort of exotic hardware architecture. Anyway, I wrote tests to make sure that this cast does work the way I want it to. This little trick makes it possible to avoid copying the input data. The output pointer has to be cast from <code>Ptr CDouble</code> to <code>Ptr Double</code> and, since the result is in the <code>IO</code> monad, the <code>castPtr</code> has to be lifted with <code>liftM</code>. After getting the result as <code>Ptr Double</code> we wrap it in a <code>ForeignPtr</code> with a memory-freeing finalizer (line 15) and use that foreign pointer to construct the resulting vector of <code>Double</code>s.</p>
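<p style="text-align: justify;">The assumption behind <code>castPtr</code> can be checked with a small program. What follows is only a minimal sketch, not the test from my project: <code>allocResult</code> is a hypothetical stand-in for the C function (it allocates a buffer with <code>mallocArray</code>, which uses C&#8217;s <code>malloc</code>), and the program relies on nothing but the <code>Foreign</code> modules from base. It compares the size and alignment of <code>Double</code> and <code>CDouble</code>, then reads a buffer written as <code>CDouble</code>s back as <code>Double</code>s through a <code>ForeignPtr</code> with <code>finalizerFree</code> attached:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where
&nbsp;
import Foreign
import Foreign.C.Types (CDouble)
&nbsp;
-- Hypothetical stand-in for the C function: returns a malloc'ed
-- buffer filled with CDoubles.
allocResult :: [CDouble] -&gt; IO (Ptr CDouble)
allocResult xs = do
    p &lt;- mallocArray (length xs)
    pokeArray p xs
    return p
&nbsp;
main :: IO ()
main = do
    -- The cast is only sound if both types share size and alignment.
    print (sizeOf (0 :: Double) == sizeOf (0 :: CDouble))
    print (alignment (0 :: Double) == alignment (0 :: CDouble))
    -- Write CDoubles, then read the same memory back as Doubles.
    p  &lt;- allocResult [1.5, -2.25, 3]
    fp &lt;- newForeignPtr finalizerFree (castPtr p) :: IO (ForeignPtr Double)
    ys &lt;- withForeignPtr fp $ \q -&gt; peekArray 3 q
    print ys</pre></td></tr></table></div>

<p style="text-align: justify;">If either comparison prints <code>False</code> on your platform, the cast is not safe there and the values have to be converted with <code>realToFrac</code> instead.</p>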
<h1>Summary</h1>
<p style="text-align: justify;">I had two concerns when writing this binding. The first was the possible performance overhead. Thanks to using pointer casts it was possible to avoid any sort of data copying and that makes this binding really quick. Measuring execution time with criterion shows that calling a C function that does only memory allocation (as shown in this post) takes about 250µs. After adding the rest of the C code that actually does the computation the execution time jumps to about 55ms, so the FFI calling overhead does not skew the performance tests. Big thanks go to Mikhail Glushenkov who convinced me with <a href="http://stackoverflow.com/questions/13009728/how-to-reliably-compare-runtime-of-haskell-and-c">his answer on StackOverflow</a> to use FFI. My second concern was the necessity to use many functions with the word &#8220;unsafe&#8221;, especially <code>unsafePerformIO</code>. I googled a bit and it seems that this is a normal thing when working with FFI, and I guess there is no reason to worry, provided that the binding is thoroughly tested. So in the end I am very happy with the result. It is fast, Haskell manages garbage collection of memory allocated with C and, most importantly, I can benchmark C code using Criterion.</p>
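<p style="text-align: justify;">For a binding you can try without my <code>dwt.c</code>, here is a minimal sketch of the same benchmarking approach applied to <code>sin</code> from the C math library. The names <code>c_sin</code> and <code>sinC</code> are made up for this example and I assume that the criterion package is installed. Because the Haskell name differs from the C name, the C name is given in the import string after the header name. The result of C&#8217;s <code>sin</code> depends only on its argument, so I import it as a pure function and no <code>unsafePerformIO</code> is needed:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where
&nbsp;
import Criterion.Main (bench, defaultMain, whnf)
import Foreign.C.Types (CDouble)
&nbsp;
-- Binding to sin from math.h, imported under a different Haskell name.
foreign import ccall unsafe &quot;math.h sin&quot;
  c_sin :: CDouble -&gt; CDouble
&nbsp;
-- Pure wrapper that converts between Double and CDouble.
sinC :: Double -&gt; Double
sinC = realToFrac . c_sin . realToFrac
&nbsp;
main :: IO ()
main = defaultMain
    [ bench &quot;sin (Haskell)&quot;   $ whnf sin  (0.5 :: Double)
    , bench &quot;sin (C via FFI)&quot; $ whnf sinC 0.5
    ]</pre></td></tr></table></div>

<p style="text-align: justify;">Both benchmarks go through the same Criterion machinery, so their results can be compared directly.</p>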
<ol class="footnotes"><li id="footnote_0_813" class="footnote">The specification also mentions calling conventions for other languages and platforms (Java VM, .Net and C++) but I think that currently there is no implementation of these.</li><li id="footnote_1_813" class="footnote">Calls need to be safe only when the called C code calls Haskell code, which I think is rare.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2012/11/benchmarking-c-functions-using-foreign-function-interface/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Code benchmarking in Haskell &#8211; some thoughts about random data generation</title>
		<link>http://lambda.jstolarek.com/2012/10/code-benchmarking-in-haskell-some-thoughts-about-random-data-generation/</link>
		<comments>http://lambda.jstolarek.com/2012/10/code-benchmarking-in-haskell-some-thoughts-about-random-data-generation/#comments</comments>
		<pubDate>Wed, 24 Oct 2012 17:11:29 +0000</pubDate>
		<dc:creator>Jan Stolarek</dc:creator>
				<category><![CDATA[haskell]]></category>
		<category><![CDATA[benchmarking]]></category>

		<guid isPermaLink="false">http://lambda.jstolarek.com/?p=799</guid>
		<description><![CDATA[In my last post I showed you how to use the criterion library to write benchmarks for Haskell code. In the tutorial project that I created to demonstrate my ideas I decided to generate random data for benchmarking. Bryan O&#8217;Sullivan has commented on my approach that &#8220;the code (&#8230;) that generates random inputs on every run would [...]]]></description>
				<content:encoded><![CDATA[<p style="text-align: justify;">In my last post I showed you how to use the <a href="http://hackage.haskell.org/package/criterion">criterion</a> library to write benchmarks for Haskell code. In the <a href="https://github.com/jstolarek/haskell-testing-stub/">tutorial project</a> that I created to demonstrate my ideas I decided to generate random data for benchmarking. <a href="http://www.serpentine.com/blog/">Bryan O&#8217;Sullivan</a> <a href="http://www.reddit.com/r/haskell/comments/11w5c1/code_benchmarking_in_haskell_using_criterion_and/">commented on my approach</a>, saying that &#8220;the code (&#8230;) that generates random inputs on every run would be a good antipattern for performance testing.&#8221; After giving some thought to his words I think I see his point.</p>
<p style="text-align: justify;">The code that Bryan refers to looks like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">main <span style="color: #339933; font-weight: bold;">::</span> <span style="color: #cccc00; font-weight: bold;">IO</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span>
main <span style="color: #339933; font-weight: bold;">=</span> newStdGen <span style="color: #339933; font-weight: bold;">&gt;&gt;=</span> defaultMainWith benchConfig <span style="color: green;">&#40;</span><span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">.</span> benchmarks
&nbsp;
benchmarks <span style="color: #339933; font-weight: bold;">::</span> RandomGen g <span style="color: #339933; font-weight: bold;">=&gt;</span> g <span style="color: #339933; font-weight: bold;">-&gt;</span> <span style="color: green;">&#91;</span>Benchmark<span style="color: green;">&#93;</span>
benchmarks <span style="color: #339933; font-weight: bold;">=</span> <span style="color: #339933; font-weight: bold;">...</span></pre></td></tr></table></div>

<p style="text-align: justify;">Each time a benchmark suite is run a different random number generator is created with <code>newStdGen</code>. This generator is then used by the <code>benchmarks</code> function to create the values used for benchmarking. When I designed this <strong>I made an assumption that values of the data don&#8217;t influence the flow of computations</strong>. I believe that this holds for the shift functions I benchmarked in my tutorial. It doesn&#8217;t really matter what values are in the shifted list. As long as the lists have the same length on different runs of the benchmark the results are comparable, but if you want to have the same random values generated on each run you can create a <code>StdGen</code> based on a seed that you supply. The modified <code>main</code> function would look like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">main <span style="color: #339933; font-weight: bold;">=</span> <span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span>mkStdGen <span style="color: red;">123456</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">&gt;&gt;=</span> defaultMainWith benchConfig <span style="color: green;">&#40;</span><span style="font-weight: bold;">return</span> <span style="color: green;">&#40;</span><span style="color: green;">&#41;</span><span style="color: green;">&#41;</span> <span style="color: #339933; font-weight: bold;">.</span> benchmarks</pre></td></tr></table></div>
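<p style="text-align: justify;">To convince yourself that a fixed seed gives identical data on every run, here is a minimal sketch using the random package. The names <code>sampleA</code> and <code>sampleB</code> and the list length are arbitrary choices made for this example:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where
&nbsp;
import System.Random (mkStdGen, randoms)
&nbsp;
-- Two generators built from the same seed yield the same sequence.
sampleA, sampleB :: [Int]
sampleA = take 5 (randoms (mkStdGen 123456))
sampleB = take 5 (randoms (mkStdGen 123456))
&nbsp;
main :: IO ()
main = print (sampleA == sampleB)  -- prints True</pre></td></tr></table></div>
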

<p style="text-align: justify;">What happens however when data values do influence the flow of computation? In that case you definitely don&#8217;t want <code>newStdGen</code>, as it would make benchmark results incomparable between different runs: you wouldn&#8217;t know if a speed-up is caused by changes in the code or in the data. It is also very likely that you don&#8217;t want to use <code>mkStdGen</code>. Why? Well, you would certainly get results comparable between different runs. The problem is that you wouldn&#8217;t know the characteristics of the data used for this particular benchmark. For example let&#8217;s assume that your algorithm executes faster when the data it processes contains many zeros. You benchmark the algorithm with random values created with a fixed <code>StdGen</code> and get a very good result. But how many zeros were in the data used for benchmarking? Perhaps 50% of the input were zeros? You don&#8217;t know that. In this case you definitely want to prepare your own input data sets (e.g. one with many zeros and one without any) to measure the performance of your code depending on the input it receives. I guess Bryan is right here &#8211; careless use of random data generation for benchmarking can be a shot in the foot.</p>
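<p style="text-align: justify;">Here is a minimal sketch of what such prepared data sets could look like. Everything in it is hypothetical: <code>process</code> is a made-up function that does less work for zeros, and the two inputs are hand-crafted so that their characteristics are known. I assume the criterion package is installed:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="haskell" style="font-family:monospace;">module Main (main) where
&nbsp;
import Criterion.Main (bench, bgroup, defaultMain, nf)
import Data.List (foldl')
&nbsp;
-- Made-up algorithm whose running time depends on the data.
process :: [Int] -&gt; Int
process = foldl' step 0
  where
    step acc 0 = acc                    -- cheap path for zeros
    step acc x = acc + (x * x) `mod` 7  -- more work otherwise
&nbsp;
-- Hand-crafted inputs: 75% zeros and no zeros at all.
manyZeros, noZeros :: [Int]
manyZeros = take 10000 (cycle [0, 0, 0, 1])
noZeros   = take 10000 (cycle [1, 2, 3, 4])
&nbsp;
main :: IO ()
main = defaultMain
    [ bgroup &quot;process&quot;
        [ bench &quot;many zeros&quot; $ nf process manyZeros
        , bench &quot;no zeros&quot;   $ nf process noZeros
        ]
    ]</pre></td></tr></table></div>

<p style="text-align: justify;">Each data set gets its own named benchmark, so the report shows how the running time changes with the kind of input.</p>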
]]></content:encoded>
			<wfw:commentRss>http://lambda.jstolarek.com/2012/10/code-benchmarking-in-haskell-some-thoughts-about-random-data-generation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>