
Minor errors in copying/memory:
initsHO = map ($[]) . scanl q id
where
q f x = f . (x :)
On Sun, Jul 20, 2014 at 2:03 AM, David Feuer
As previously discussed, Data.List.inits has terrible performance: something as simple as
print $ length $ inits [100000]
will take an absurdly long time. The simplest decent improvement seems to be
initsHO = map ($[]) . scanl q [] where q f x = f . (q :)
A slight transformation of this by Andrew Seniuk (rasfar on #haskell) gives the slightly faster
initsR = map reverse . scanl (flip (:)) []
As Henning Thielemann and Edward Kmett pointed out, these perform poorly when calculating things like `map head $ drop 100000 $ inits [100050]` because the costs of the reverses or compositions are monolithic and unshared. Edward Kmett pointed out that this could probably be fixed by using a queue (although he isn't sure it's worth doing so). As it turns out, it can, and with only a trivial penalty in other cases. The following uses a stripped-down partial implementation of Okasaki's banker's queue (see the attached file for a version with explanatory comments as well as the benchmark code):
import Data.Bits (popCount) import Data.Word (Word)
data Queue a = Queue {-# UNPACK #-} !Word [a] [a]
queue :: Word -> [a] -> [a] -> Queue a queue lp1 f r | popCount lp1 /= 1 = Queue lp1 f r | otherwise = Queue lp1 (f ++ reverse r) []
emptyQ :: Queue a emptyQ = Queue 1 [] []
snocQ :: Queue a -> a -> Queue a snocQ (Queue lp1 f r) x = queue (lp1 + 1) f (x:r)
toListQ :: Queue a -> [a] toListQ (Queue _ f r) = f ++ reverse r
initsQ :: [a] -> [[a]] initsQ = map toListQ . scanl snocQ emptyQ
This implementation appears to take care of the performance wrinkle in initsR and initsHO, while being only very slightly slower in other contexts. It's longer, certainly, but it's not long.
I have attached the Criterion report produced by the attached program comparing initsR to initsQ. Previous tests definitively showed that initsHO is a small constant factor worse than initsR, and Data.List.inits is incomparably worse, so I limited this round to initsR and initsQ.
I therefore propose that we should replace the current implementation of inits with initsQ, the queue-based one, unless someone comes up with something faster and/or simpler very soon.