
On 28/10/2010 14:21, Bertram Felgenhauer wrote:
Hi,
let (first,rest) = break (const False) input in print (length (first ++ rest))
When I compile this program using -O2 and use a large text file as input the code runs in constant space. If I understand correctly, the program runs in constant space because ghc uses an optimization similar to the one proposed by Wadler/Sparud.
Right. The optimization works by producing special thunks for tuple selectors which the garbage collector can recognize and evaluate during GC.
However the implementation in GHC is quite brittle. See
http://hackage.haskell.org/trac/ghc/ticket/2607
I suspect your program is another instance of this behaviour.
If I define the following function which is based on break
splitBy :: (a -> Bool) -> [a] -> [[a]] splitBy _ [] = [] splitBy p xs = l : case ys of [] -> [] _:zs -> splitBy' p zs where (l,ys) = break p xs
I haven't looked in detail; what follows is a guess of what ghc may be doing. It could be verified by looking at the generated core.
The where-binding desugars to something like
let q = break p xs ys = case q of (_, ys) -> ys l = case q of (l, _) -> l in ...
ys can be inlined into splitBy, producing
l : case (case q of (l, ys) -> ys) of [] -> [] _:zs -> splitBy' p zs
l : case q of (l, ys) -> case ys of [] -> [] _:zs -> splitBy' p zs
and now the tuple selector is no longer recognizable.
Yes, that's exactly what happens. Cheers, Simon