
apfelmus
While your observation that merge may create an implicit heap is true, it doesn't happen in your code :) When unfolding the foldr1, we get something like
2:.. `merge'` (3:.. `merge'` (5:.. `merge1` (...)))
i.e. just a linear chain of merges. Retrieving the least element is linear time in the worst case. This shape will not change with subsequent reductions of merge. In other words, it's the responsibility of fold to build a heap. Mergesort shows how a fold can build a heap:
http://thread.gmane.org/gmane.comp.lang.haskell.general/15007
For primes , the heap shape has to be chosen carefully in order to ensure termination. It's the same problem that forces you to use foldr1 merge' instead of foldr1 merge .
There's also a long thread about prime sieves
Indeed. Your answer sent my head spinning, giving me something to think about on a flight AMS to SFO. Thanks! Here is a prime sieve that can hang within a factor of two of the fastest code in that thread, until it blows up on garbage collection: ----------------------------------------------------------------- diff :: Ord a => [a] -> [a] -> [a] diff xs@(x:xt) ys@(y:yt) = case compare x y of LT -> x : (diff xt ys) EQ -> (diff xt yt) GT -> (diff xs yt) diff _ _ = undefined union :: Ord a => [a] -> [a] -> [a] union xs@(x:xt) ys@(y:yt) = case compare x y of LT -> x : (union xt ys) EQ -> x : (union xt yt) GT -> y : (union xs yt) union _ _ = undefined twig :: Ord a => [a] -> [a] -> [a] twig (x:xt) ys = x : (union xt ys) twig _ _ = undefined pair :: Ord a => [[a]] -> [[a]] pair (x:y:xs) = twig x y : (pair xs) pair _ = undefined tree :: Ord a => [[a]] -> [a] tree xs = let g (x:xt) = x : (g $ pair xt) g _ = undefined in foldr1 twig $ g xs seed :: Integral a => [a] seed = [2,3,5,7,11,13] wheel :: Integral a => [a] wheel = drop 1 [ 30*j+k | j <- [0..], k <- [1,7,11,13,17,19,23,29] ] multiples :: Integral a => [a] multiples = tree ps where f p n = mod n p /= 0 g (_,ns) p = ([ n*p | n <- ns ], filter (f p) ns) ps = map fst $ tail $ scanl g ([], wheel) $ drop 3 primes primes :: Integral a => [a] primes = seed ++ (diff (drop 3 wheel) multiples) ----------------------------------------------------------------- Here are some timings: [Integer] -O 10^4 10^5 10^6 10^7 ----------------------------------------------------------------- ONeillPrimes | 0m0.023s | 0m0.278s | 0m3.682s | 0m53.920s primes | 0m0.022s | 0m0.341s | 0m5.664s | 8m12.239s This differs from your code in that it works with infinite lists, so it can't build a balanced tree; the best it can do is to build a vine of subtrees that double in size. My conclusion so far from this and other experiments is that pushing data structures into the lazy evaluation model does make them run faster, but at the expense of space, which eventually leads to the code's untimely demise. I can imagine a lazy functional language that would support reification of suspended closures, so one could incrementally balance the suspended computation above. As far as I can tell, Haskell is not such a language. I'd love it, however, if someone could surprise me by showing me the idiom I'm missing here. I will post a version of this code (I have faster but less readable versions) to the prime sieve thread. First, I'm waiting for the other shoe to drop, I still feel like I'm missing something.