
Well for one thing, note that allPairs3 produces the result in reverse order:
allPairs1 "abc" [('a','b'),('a','c'),('b','c')] allPairs2 "abc" [('a','b'),('a','c'),('b','c')] allPairs3 "abc" [('b','c'),('a','c'),('a','b')]
allPairs2 uses "guarded recursion" which the optimizer probably likes,
although I don't quite see why that version would use twice as much memory
as allPairs3.
See also:
http://www.haskell.org/haskellwiki/Tail_recursion
Here's a fun alternative for you to benchmark, using an old trick. I kind
of doubt that this one will optimize as nicely as the others, but I am by
no means an optimization guru:
allPairsS :: [a] -> [(a, a)]
allPairsS xs = go xs [] where
go [] = id
go (y:ys) = (map (\a -> (y, a)) ys ++) . go xs
Further reading:
http://www.haskell.org/haskellwiki/Difference_list
-- Dan Burton
On Tue, Sep 3, 2013 at 3:28 PM, Scott Pakin
I'm a Haskell beginner, and I'm baffled trying to reason about code performance, at least with GHC. For a program I'm writing I needed to find all pairs of elements of a list. That is, given the list "ABCD" I wanted to wind up with the list [('A','B'),('A','C'),('A','D')**,('B','C'),('B','D'),('C','D')**], where the order of elements in the resulting list is immaterial, but I don't want both ('A', 'B') and ('B', 'A'), for example.
My first attempt looked like this:
allPairs1 :: [a] -> [(a, a)] allPairs1 [] = [] allPairs1 (x:xs) = map (\a -> (x, a)) xs ++ allPairs1 xs
Benchmarking with ghci's ":set +s" reveals the following performance (median of three runs shown):
$ ghci GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Prelude> :!ghc -c -O2 allpairs.hs Prelude> :load allpairs Ok, modules loaded: AllPairs. Prelude AllPairs> :m +Control.DeepSeq Prelude Control.DeepSeq AllPairs> :show modules AllPairs ( allpairs.hs, allpairs.o ) Prelude Control.DeepSeq AllPairs> :set +s Prelude Control.DeepSeq AllPairs> deepseq (allPairs1 [1..10000]) True True (4.85 secs, 4004173984 bytes)
A colleague suggested getting rid of the list append as follows:
allPairs2 :: [a] -> [(a, a)] allPairs2 [] = [] allPairs2 (x:xs) = allPairs2' x xs xs where allPairs2' x [] [] = [] allPairs2' x (y:ys) zs = (x,y) : allPairs2' x ys zs allPairs2' x [] (z:zs) = allPairs2' z zs zs
Not surprisingly, that runs faster:
Prelude Control.DeepSeq AllPairs> deepseq (allPairs2 [1..10000]) True True (2.14 secs, 4403686376 bytes)
I then figured I could do even better by rewriting the code tail-recursively:
allPairs3 :: [a] -> [(a, a)] allPairs3 [] = [] allPairs3 (x:xs) = allPairs3' x xs xs [] where allPairs3' :: a -> [a] -> [a] -> [(a, a)] -> [(a, a)] allPairs3' h (x:xs) ys result = allPairs3' h xs ys ((h, x):result) allPairs3' _ [] (y:ys) result = allPairs3' y ys ys result allPairs3' _ [] [] result = result
This takes half the memory as the above (yay!) but runs substantially *slower* (and appears to be thrashing my computer's memory system):
Prelude Control.DeepSeq AllPairs> deepseq (allPairs3 [1..10000]) True True (10.58 secs, 2403820152 bytes)
What gives? Why does my tail-recursive implementation run even slower and presumably produce more garbage than my initial, naive implementation? Is there some optimization GHC is performing for allPairs1 and allPairs2 that it can't apply to allPairs3?
Similarly, how should I, as a newcomer who wants to write efficient Haskell code, reason about whether a code change is likely to improve rather than degrade performance? Guidelines like "per-iteration appends = bad" and "tail recursion = good" are great, but the preceding data indicate that something subtler is taking precedence over those guidelines with respect to code performance.
Thanks in advance, -- Scott
______________________________**_________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe