New subject: Formal proposal: replace Data.List.inits

20 Jul 2014

      Minor errors in copying/memory:

initsHO = map ($[]) . scanl q id
 where
   q f x = f . (x :)

On Sun, Jul 20, 2014 at 2:03 AM, David Feuer  wrote:
...
As previously discussed, Data.List.inits has terrible performance:
something as simple as
print $ length $ inits [100000]
will take an absurdly long time. The simplest decent improvement seems to be
initsHO = map ($[]) . scanl q []
 where
   q f x = f . (q :)
A slight transformation of this by Andrew Seniuk (rasfar on #haskell)
gives the slightly faster
initsR = map reverse . scanl (flip (:)) []
As Henning Thielemann and Edward Kmett pointed out, these perform
poorly when calculating things like `map head $ drop 100000 $ inits
[100050]` because the costs of the reverses or compositions are
monolithic and unshared. Edward Kmett pointed out that this could
probably be fixed by using a queue (although he isn't sure it's worth
doing so). As it turns out, it can, and with only a trivial penalty in
other cases. The following uses a stripped-down partial implementation
of Okasaki's banker's queue (see the attached file for a version with
explanatory comments as well as the benchmark code):
import Data.Bits (popCount)
import Data.Word (Word)
data Queue a = Queue {-# UNPACK #-} !Word [a] [a]
queue :: Word -> [a] -> [a] -> Queue a
queue lp1 f r
  | popCount lp1 /= 1 = Queue lp1 f r
  | otherwise         = Queue lp1 (f ++ reverse r) []
emptyQ :: Queue a
emptyQ = Queue 1 [] []
snocQ :: Queue a -> a -> Queue a
snocQ (Queue lp1 f r) x = queue (lp1 + 1) f (x:r)
toListQ :: Queue a -> [a]
toListQ (Queue _ f r) = f ++ reverse r
initsQ :: [a] -> [[a]]
initsQ = map toListQ . scanl snocQ emptyQ
This implementation appears to take care of the performance wrinkle in
initsR and initsHO, while being only very slightly slower in other
contexts. It's longer, certainly, but it's not long.
I have attached the Criterion report produced by the attached program
comparing initsR to initsQ. Previous tests definitively showed that
initsHO is a small constant factor worse than initsR, and
Data.List.inits is incomparably worse, so I limited this round to
initsR and initsQ.
I therefore propose that we should replace the current implementation
of inits with initsQ, the queue-based one, unless someone comes up
with something faster and/or simpler very soon.

Re: Formal proposal: replace Data.List.inits

David Feuer

Alexander Berntsen

David Feuer

Bertram Felgenhauer

David Feuer

Bertram Felgenhauer

David Feuer

David Feuer

David Feuer

David Feuer

Joachim Breitner

tags

participants (4)