
On Saturday 09 January 2010 08:04:20, Will Ness wrote:
Daniel Fischer writes:
On Friday 08 January 2010 19:45:47, Will Ness wrote:
Daniel Fischer writes:
It's not tail-recursive; the recursive call is inside a (<:).
It is (spMerge that is).
No. "In computer science, tail recursion (or tail-end recursion) is a special case of recursion in which the last operation of the function, the tail call, is a recursive call." The last operation of spMerge is a call to celebrate or the pair constructor (be that P or (,)). Doesn't matter, though, as for lazy languages, tail recursion isn't very important.
It calls the tail-recursive (<:) in a tail position. What you've done is to eliminate the outstanding context by moving it inward. Your detailed explanation is clearer than that. :)
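An aside for readers following along: in a lazy language, what buys constant-space operation is guarded recursion, producing a constructor before the recursive call, rather than tail recursion in the strict-language sense. A minimal illustration, with names that are mine and not from the thread:

-- Tail-recursive in the textbook sense: the last operation is the
-- recursive call. It cannot yield anything until the whole input has
-- been consumed, and without a strict accumulator it builds a thunk chain.
sumAcc :: Int -> [Int] -> Int
sumAcc acc []     = acc
sumAcc acc (x:xs) = sumAcc (acc + x) xs

-- Not tail-recursive: the recursive call sits under the (:) constructor.
-- Precisely because of that, consumers can demand one element at a time
-- in constant space, even from infinite input. This is the shape spMerge
-- has, with its recursive call guarded by consSP.
mapSucc :: [Int] -> [Int]
mapSucc []     = []
mapSucc (x:xs) = x + 1 : mapSucc xs

-- take 3 (mapSucc [1..]) == [2,3,4], despite the infinite input.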
BTW when I run VIP code it is consistently slower than using just pairs,
I can't reproduce that. Ceteris paribus, I get the exact same allocation and GC figures whether I use People or (,), and the running times are identical for all practical purposes (the difference between People and (,) is smaller than the difference between runs of the same program; the difference between the fastest and the slowest run of the two is less than 0.5%). I think it must be the other changes you made.
modified with wheel and feeder and all. So what's needed is to re-implement your approach for pairs:
mergeSP (a,b) ~(c,d) = let (bc,bd) = spMerge b c d in (a ++ bc, bd)
  where
    spMerge u [] d = ([], merge u d)
    spMerge u@(x:xs) w@(y:ys) d = case compare x y of
        LT -> consSP x $ spMerge xs w d
        EQ -> consSP x $ spMerge xs ys d
        GT -> consSP y $ spMerge u ys d

consSP x ~(a,b) = (x:a, b)   -- don't forget that magic `~` !!!
I called that (<:).
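That `~` is doing real work. Here is a minimal demonstration of why the irrefutable pattern is needed when the pair being matched is defined in terms of the result being constructed; consStrict and consLazy are hypothetical names, not from the thread:

-- Strict version: matching (a,b) forces the argument to WHNF first.
consStrict :: a -> ([a],[a]) -> ([a],[a])
consStrict x (a,b) = (x:a, b)

-- Lazy version: ~(a,b) defers the match until a or b is demanded,
-- so the (:) cell can be produced immediately.
consLazy :: a -> ([a],[a]) -> ([a],[a])
consLazy x ~(a,b) = (x:a, b)

-- Knot-tied values, analogous to spMerge consuming its own output:
onesGood, onesBad :: ([Int],[Int])
onesGood = consLazy 1 onesGood    -- take 3 (fst onesGood) == [1,1,1]
onesBad  = consStrict 1 onesBad   -- forcing this diverges (<<loop>> in GHC)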
BTW I'm able to eliminate sharing without a compiler switch by using a unit argument (primes ()).
Yes, I can too. But it's easy to make a false step and trigger sharing. I can get a nice speedup (~15%, mostly due to much less garbage collecting) by doing the final merge in a function without unnecessarily wrapping the result in a pair (whose second component is ignored):

-- Doesn't need -fno-cse anymore,
-- but it needs -XScopedTypeVariables for the local type signatures
primes :: forall a. Integral a => () -> [a]
primes () = 2:3:5:7:11:13:calcPrimes 17 primes''
  where
    calcPrimes s cs = rollFrom s `minus` compos cs
    bootstrap = 17:19:23:29:31:37:calcPrimes 41 bootstrap
    primes'   = calcPrimes 17 bootstrap
    primes''  = calcPrimes 17 primes'

    pmults :: a -> ([a],[a])
    pmults p = case map (*p) (rollFrom p) of
                 (x:xs) -> ([x],xs)

    multip :: [a] -> [([a],[a])]
    multip ps = map pmults ps

    compos :: [a] -> [a]
    compos ps = case pairwise mergeSP (multip ps) of
                  ((a,b):cs) -> a ++ funMerge b (pairwise mergeSP cs)

    funMerge b (x:y:zs) = let (c,d) = mergeSP x y
                          in mfun b c d (pairwise mergeSP zs)

    mfun u@(x:xs) w@(y:ys) d l = case compare x y of
        LT -> x : mfun xs w d l
        EQ -> x : mfun xs ys d l
        GT -> y : mfun u ys d l
    mfun u [] d l = funMerge (merge u d) l

This uses a different folding structure again, which seems to give slightly better performance than the original tree-fold structure. In contrast to the VippyPrimes, it profits much from a larger allocation area: running with +RTS -A2M gives a >10% speedup for prime #10M/20M, +RTS -A8M nearly 20%. -A16M and -A32M buy a little more, but in that range at least, it's not much (it may be significant for larger targets). Still way slower than PQ, but the gap is narrowing.
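The code above assumes the helpers used throughout the thread. For reference, here are minimal definitions of pairwise, merge and minus consistent with how they are used (a stand-in for rollFrom is sketched further below); treat these as a reconstruction, not necessarily the exact code from earlier messages:

-- Merge adjacent elements pairwise; applied repeatedly by compos and
-- funMerge above, this produces the folding structure being described.
pairwise :: (a -> a -> a) -> [a] -> [a]
pairwise f (x:y:ys) = f x y : pairwise f ys
pairwise _ xs       = xs

-- Ordered union of two increasing lists, dropping cross-duplicates.
merge :: Ord a => [a] -> [a] -> [a]
merge xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : merge xt ys
    EQ -> x : merge xt yt
    GT -> y : merge xs yt
merge xs ys = xs ++ ys

-- Ordered difference of two increasing lists.
minus :: Ord a => [a] -> [a] -> [a]
minus xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : minus xt ys
    EQ ->     minus xt yt
    GT ->     minus xs yt
minus xs _ = xs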
mtwprimes () = 2:3:5:7:primes
  where primes = doPrimes 121 primes

doPrimes n prs = let (h,t) = span (< n) $ rollFrom 11
                 in h ++ t `diff` comps prs
doPrimes2 n prs = let (h,t) = span (< n) $ rollFrom (12-1)
                  in h ++ t `diff` comps prs

mtw2primes () = 2:3:5:7:primes
  where primes  = doPrimes 26 primes2
        primes2 = doPrimes2 121 primes2
Using 'splitAt 26' in place of 'span (< 121)' didn't work though.
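For completeness: these definitions lean on rollFrom, the 2-3-5-7 wheel from earlier in the thread (diff and comps play the roles of minus and compos above). Here is a deliberately naive, semantically equivalent stand-in, useful for testing; the real rollFrom rolls the 48 gaps of the 210-wheel instead of trial-dividing:

-- Numbers >= n that are coprime to 2, 3, 5 and 7. For n = 11 this
-- enumerates exactly the candidates the 2-3-5-7 wheel produces.
rollFrom :: Integral a => a -> [a]
rollFrom n = [ k | k <- [n..], all (\p -> k `mod` p /= 0) [2,3,5,7] ]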
How about them wheels? :)
Well, what about them?