
On Thu, Nov 5, 2009 at 11:25 AM, Bulat Ziganshin wrote:
Hello John,
Thursday, November 5, 2009, 1:59:14 PM, you wrote:
I hope you don't mind me asking this, but when you state that "idiomatic haskell is much slower than idiomatic C", what do you mean by "much slower"? 2 times? 3 times? 20 times? Order of magnitude?
It depends on the task, of course. For number-crunching code it may be hundreds or even thousands, but that application area isn't typical for Haskell, after all. For typical applications like text processing it may be dozens. And when we get to real life, the use of FFI libraries and OS calls makes the difference even smaller, so I think the practical difference is 3-10 times in most cases.
Ok, this is a reasonable starting point.
I'm not going to disagree with you; however, it seems to me that the gap is closing.
Except for the ByteString library, I don't see many practical improvements. GHC itself was made ~20% faster with pointer tagging.
I'm really only interested in looking at comparison data here.
My opinion may be colored by having just examined Masayuki Takagi's SPH code, which I consider very idiomatic Haskell and which, for me, performed equivalently to his C++ code after the changes I detailed in an earlier email (none of which, in my opinion, involved any substantial change to his code or style).
I don't know why his Haskell code is so fast - it may be due to improvements in GHC's compilation of tight loops, suboptimal C code, or limits imposed by memory speed. Well, I do know one problem with his C compilation - he didn't use -O3, -fexcess-precision and other funny optimization tricks.
True enough. Since I did so with GHC, it's only fair to give g++ the same chance. That said, when I tried -O3 -ffast-math, I didn't see any gross change in runtime speed compared to -O2. There may be benefits, but it would take statistical analysis to determine whether they are significant. Also, you may not believe it, but the results I posted were for the first unfolding thresholds I picked; those could probably be tuned further as well.
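For concreteness, the invocations being compared are roughly of this shape (the source file names and the threshold value here are only illustrative, not the exact ones I used; -funfolding-use-threshold is the kind of knob I mean by "unfolding thresholds"):

    g++ -O3 -ffast-math sph.cpp -o sph-cpp
    ghc -O2 -funfolding-use-threshold=32 --make sph.hs -o sph-hs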
There is a difference in how favourable and unfavourable comparisons are treated. When someone shows that Haskell is much slower, there are lots of suggestions for improving the code: add strictness, use other libraries, apply tricky compiler options, even unroll loops by hand. When someone shows a small gap between C and Haskell, no one checks that the C code is optimized as much as possible. Of course, that's mainly because many Haskellers can't optimize C code, but this should lead to the conclusion "I'm incompetent at comparison" rather than "Haskell is fast".
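(As an aside, for anyone following along: the "add strictness" suggestion usually amounts to something as small as the made-up snippet below - forcing an accumulator with foldl' and bang patterns - rather than any restructuring of the program.

    {-# LANGUAGE BangPatterns #-}
    import Data.List (foldl')

    -- strict left fold with bang patterns so the running sum and
    -- count are forced at every step instead of building up thunks
    mean :: [Double] -> Double
    mean xs = s / fromIntegral n
      where
        (s, n) = foldl' step (0, 0 :: Int) xs
        step (!acc, !len) x = (acc + x, len + 1)

That is the flavour of change being asked of the Haskell code in these threads.)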
Re: the most optimal C code, I will agree that for most discussions on this list nobody checks that the C versions are as fast as possible; however, I would suggest it's for a different reason. Many C libraries and programs are already widely available in highly optimized form, often as the result of years of work. Programs like the shootout have a similar effect (that is, Haskellers don't bother checking that the C is optimized because the C submitters will be doing that). When comparing against code like that, I think it's fair to assume the C versions are already, if not as fast as possible, within a close enough percentage to establish a fair benchmark.

For this specific case, I would look at it slightly differently. The C++ version has a level of performance the author was satisfied with. It presumably isn't optimized to the hilt, but it's very readable. The Haskell version is similarly readable, but the author wasn't happy with its performance (I never saw anything as poor as originally reported, BTW) and asked how it could be improved. I made a few suggestions, as did you and others, none of which IMO makes the code non-idiomatic Haskell, and none of which takes more than a few minutes to implement.

Could the speed of the C++ version be improved? Probably, but it wouldn't look as nice. The same is likely true of the Haskell code. Doing either would take a significant amount of work. Neither may be fully optimized, but I would argue that the C++ and Haskell programs (with the suggestions from this list incorporated) are essentially equivalent in how idiomatic they are and in performance. You could say that both are optimized to the same standard, which is good enough for the task at hand.

Even without any changes, for me the starting difference was roughly 50%, which is under your threshold of 3x for "much slower". Most of that difference was made up with compiler flags, whereas I doubt that any flags you pass to g++ (on top of -O3) will yield anything close to a 40% improvement.

Cheers,
John