
On Mon, Dec 13, 2004 at 11:56:45AM +0100, Daniel Fischer wrote:
I had the idea that, instead of using 'seq', one might check for parity:
countLines2 aux (_:l) | even aux = ... | odd aux = ...
that prevents stack overflow, but is much slower than countLines1 aux (_:l) | aux `seq` False = ... | otherwise = ...
I also tried countLines3 aux (_:l) | even aux || odd aux = ... | otherwise = ... and countLines4 aux (_:l) | even aux || True = ... | otherwise = ... which also are slower than countLines1. Well, checking for parity takes time, so that is no real surprise. What somehow puzzles me is that, compiling with -O, all versions of countLines are equally fast, if I give the type
countLines :: Int -> [a] -> Int,
but if I replace Int with Integer, the plain version without guards and the 'seq' version are equally fast whereas the parity-check versions are equally fast among themselves but slower than plain.
This is likely because since Int is builtin to the compiler and compiled directly into 'case' statments in core, it can deduce that one of even or odd must be true and via normal case simplifications transform (even x || odd x) directly into (x `seq` True). Integer, however is implemented partially by the gmp library, so ghc is less privy to its internals and probably was unable to deduce that one of even or odd must be true for Integer. In particular, pattern matching on Integers is implemented via nested equality tests rather than direct case statements. (AFAIK)
If I omit the type-declaration, all versions perform differently, plain producing stack overflow, countLines1 being by far the fastest, countLines4 coming in second, then countLines3 and last countLines2.
This is because if you omit the type declaration, ghc will deduce the much more general type of countLines :: Num a => [x] -> a which in ghc's implementation of overloading will take an extra argument describing how to work on 'a's and call the overloaded functions, the compiler cannot know how they are implemented so is unable to apply certain transformations.
If I give the type Int -> [a] -> Int and use interpreted code, plain overflows, countLines1 is by far the fastest, but then things change. Now countLines2 comes in second, third is countLines4, relatively close, last, at a distance, is countLines3. (Qualitatively the same, but slower, without type-declaration).
With type Int -> [a] -> Int, but compiled without -O, the results are (for countLinesX 0 [1 .. 1500000]) plain : stack overflow, 1 : 0.43 secs, 2 : 1.10 secs, 3 : 1.26 secs, 4 : 0.93 secs.
Now obviously -O does a good job (though 'seq' isn't so bad even without -O), but why does checking for parity take extra time for Integer and not for Int?
see first response above for my guess :)
And why does compilation without -O change the order of versions 2-4 ?
This is likely because of strictness analysis. What strictness analysis does in effect is analyze the program to determine whether a result will definitly be required, it is a rather expensive operation in the presence of higher order and mutually recursive functions, but the end result is that if ghc deduces that a value will always be needed, such as the first argument to countLines', it places (the equivalant) of a seq in there for you :). This is called the 'let-to-case' transformation in the compiler because a 'case' always evaluates its argument in the internal language while a let allocates a lazy thunk. It is just something that one must get used to in haskell programming (with ghc at least) that -O can have drastic order changing effects compared to the relativly incremental ones for other languages. The various papers available on the net regarding ghc's internals are fascinating if you are interested in the various optimizations it trys.
If anybody knows, mind telling me?
nope :). A lot of the above is speculation, I am by no means a ghc expert, but I belive it is what is happening. -- John Meacham - ⑆repetae.net⑆john⑈