
On Fri, Jan 17, 2020 at 07:08:37PM -0500, Viktor Dukhovni wrote:
An alternative implementation of splitWeight is lazier, producing the first element of the initial chunk in constant time even when the target weight is large and calls for billions of elements (on a 64-bit system):
    splitWeight' :: Int -> (a -> Int) -> [a] -> ([a], [a])
    splitWeight' target weight xs = go 0 xs
      where
        go _ [] = ([], [])
        go n (h:ts)
            | let w = n + weight h
            , w < target = first (h:) $ go w ts
            | otherwise  = ([h], ts)

        -- Define directly, or import from Data.Bifunctor
        first f ~(a, b) = (f a, b)
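As a quick check of the laziness claim, here is a self-contained sketch (the definitions are repeated so it compiles on its own; the sample inputs are my own): even with a target of a billion and an infinite input list, the first elements of the initial chunk come out immediately.

```haskell
splitWeight' :: Int -> (a -> Int) -> [a] -> ([a], [a])
splitWeight' target weight xs = go 0 xs
  where
    go _ [] = ([], [])
    go n (h:ts)
        | let w = n + weight h
        , w < target = first (h:) $ go w ts
        | otherwise  = ([h], ts)

-- Lazy in the pair, which is what makes the head available early.
first :: (a -> b) -> (a, c) -> (b, c)
first f ~(a, b) = (f a, b)

main :: IO ()
main = do
    -- Billion-element target, infinite input: the head of the
    -- initial chunk is still produced in constant time.
    let (chunk, _rest) = splitWeight' 1000000000 (const 1) [1 :: Int ..]
    print (take 3 chunk)        -- [1,2,3]
    -- A small finite example of the split itself:
    print (splitWeight' 3 (const 1) [1..5 :: Int])  -- ([1,2,3],[4,5])
```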
And with the below all-in-one version:

    chunks :: Int -> (a -> Int) -> [a] -> [[a]]
    chunks target weight xs = go 0 xs
      where
        go _ [] = []
        go n (h:ts)
            | null ts = [h] : []
            | let w = n + weight h
            , w < target = cons1 h $ go w ts
            | otherwise = [h] : go 0 ts
        cons1 h ~(a:b) = (h : a) : b

a strict fold of the generated list of chunks, compiled with optimization, runs in:

  * constant space, regardless of the input list length
  * ~constant space, as the chunk size varies over 7 orders of magnitude
  * ~constant time, as the chunk size varies over 7 orders of magnitude

The folds I'm testing look like:

    foldl' (\a l -> a + foldl' (+) 0 l) 0 $
        chunks 1_000_000 (const 1) [1..100_000_000]

(with -XNumericUnderscores for readable large decimals).

For substantially smaller chunks (size 1 or 100 instead of 1_000_000), this runs less than a factor of 2 slower than the version using my first splitWeight, which benefits from tail-call optimization and does not significantly suffer from having to reverse the constructed chunks.

The lazier splitWeight' would mostly be useful in other contexts, where one might not consume most of each constructed chunk; it is slower when the result is consumed strictly. Of course, if one bothers with weighted chunking, it seems likely that the subsequent processing will be strict.

-- 
    Viktor.
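For concreteness, a small self-contained sketch of chunks on toy inputs (the definition is repeated verbatim from above; the sample inputs and expected results are my own):

```haskell
chunks :: Int -> (a -> Int) -> [a] -> [[a]]
chunks target weight xs = go 0 xs
  where
    go _ [] = []
    go n (h:ts)
        | null ts = [h] : []
        | let w = n + weight h
        , w < target = cons1 h $ go w ts
        | otherwise = [h] : go 0 ts
    -- Lazily prepend h to the first (possibly not yet forced) chunk.
    cons1 h ~(a:b) = (h : a) : b

main :: IO ()
main = do
    -- Unit weights: a chunk closes once its running weight reaches 3.
    print $ chunks 3 (const 1) [1..7 :: Int]     -- [[1,2,3],[4,5,6],[7]]
    -- Element values as weights: a chunk ends with the element that
    -- reaches the target; the final input element always joins the
    -- current chunk (the `null ts` guard).
    print $ chunks 10 id [4, 4, 4, 9, 2 :: Int]  -- [[4,4,4],[9,2]]
```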