New subject: Speeding up Data.List.inits

19 Jul 2014


      I did a little more benchmarking, and it turns out that the overhead
to use Okasaki-style persistently efficient banker's queues is not
terribly severe because these queues are simple and therefore fast
(it's around 1.5 times as slow at its very worst, but that may have
been a fluke--the performance generally seems comparable). It does
indeed give far, far better performance when looking at the heads of
many of the lists at the end of the result, as Edward Kmett predicted.
I don't know if these queues are available in a standard library, but
the following is a full implementation of initsQ, including what it
needs of the queue implementation:

data Queue a = Queue
  {front :: [a],
   lenF :: Int,
   rear :: [a],
   lenR ::Int} deriving (Show)

empty = Queue {front=[],lenF=0,rear=[],lenR=0}

queue f lF r lR
  | lR <= lF = Queue f lF r lR
  | otherwise    = Queue {front = f ++ reverse r,
                          lenF = lF + lR,
                          rear = [],
                          lenR = 0}

(Queue f lF r lR) |> x = queue f lF (x:r) (lR+1)

toListQ (Queue f _ r _) = f ++ reverse r

initsQ xs = map toListQ (scanl (|>) empty xs)


On Sat, Jul 19, 2014 at 5:25 AM, Edward Kmett  wrote:
...
The half-baked thought was that reaching the kth list is O(k), and reversing _each list_ is O(k), but with a smarter structure you might share some of the work of construction across multiple lists. e.g. if you had an inits like function that returned a list of queues rather than a list of lists you could actually do less work overall setting up the inits asymptotically, sharing the work. If you only take a constant prefix off each queue then this can change the overhead from O(n^2) to = O(kn).
A reversal or difference list requires you to pay separately for each list.
I we are in "violent agreement", however, as I indicated in my comment that the constant factors almost definitely make it a non-starter for something like Data.List. ;)
-Edward
On Sat, Jul 19, 2014 at 4:30 AM, David Feuer  wrote:
...
Edward Kmett, reaching the kth list is O(k). Reversing it is also O(k). I don't know that anything as fancy as something by Okasaki could improve matters—such data structures tend to have large constant factors. Something much worse than that seems to be wrong with the Data.List version at the moment.
On Sat, Jul 19, 2014 at 4:25 AM, Edward Kmett  wrote:
...
Good catch, the common inits and tails version of subsequences makes pretty heavy use of that, as do many of the uses of inits where you use it to get k-element substrings.
I'd be curious if an okasaki-style catenable output restricted deque could recover both efficient head access and reasonably efficient appends, but the overhead of that is probably way out of line with the performance at stake!
-Edward
On Sat, Jul 19, 2014 at 4:14 AM, Henning Thielemann  wrote:
...
Am 19.07.2014 09:51, schrieb David Feuer:
...
Background: I was looking at the code for Data.List.inits in
base-4.7.0.0 (function renamed for clarity):
initsDL                   :: [a] -> [[a]]
initsDL xs                =  [] : case xs of
                                   []      -> []
                                   x : xs' -> map (x :) (initsDL xs')
The recursive maps made me suspicious. I decided to try writing a
different version:
initsHO xs = map ($ []) (scanl q id xs)
   where
     q f x = f . (x:)
I mentioned this on #haskell and noted that it would be a nice exercise
to write it with less fancy function footwork using map reverse. rasfar
responded (modulo naming) with
initsR xs = map reverse (scanl q [] xs)
   where
     q acc x = x : acc
The 'map' gives efficient access to the first elements of the sub-lists, thus it seems that initsDL is faster than initsR when you access only prefixes of the sub-lists. E.g.
Prelude> head $ last $ inits [0..10000000::Int]
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries

Re: Speeding up Data.List.inits

David Feuer

David Feuer

tags

participants (1)