Oops, that was sloppy. Yes, your version does get the job done without allocating. The corrected test is attached to this email.

The lens interface does look quite full featured! And it's nice to see that it consistently includes '_' variants. I cite that as additional evidence for the norm ;-).

Personally, I still want traverseWithKey_ for convenience, especially because the solution you used is non-obvious. I imagine many Data.Map users would not come up with it (as the rest of us on this thread didn't).

Best,

-Ryan

-----------------------------

import Control.Applicative

import Control.Monad

import Control.DeepSeq

import Control.Exception

import GHC.Stats

import qualified Data.Map.Strict as M

import Data.Time.Clock

import Data.Monoid

import System.Mem

import System.Environment

main :: IO ()

main = do

args <- getArgs

let size = case args of

[] -> 1000000::Int

[n] -> read n

let m0 = M.fromList (map (\i -> (i,i)) [1..size])

let fn 500000 v = putStrLn "Got it!"

fn _ _ = return ()

-- let fn i v = putStrLn$"fn: "++show(i,v)

st <- getCurrentTime

evaluate$ rnf m0

en <- getCurrentTime

performGC

s1 <- getGCStats

putStrLn$"Constructed map in "++show (diffUTCTime en st)++"\n "++ show s1++"\n"

------------------------------------------------------------

-- Regular traverseWithKey uses 48MB

-- traverseWithKey_ uses 200K of allocation:

st <- getCurrentTime

M.traverseWithKey_ fn m0

en <- getCurrentTime

performGC

s2 <- getGCStats

putStrLn$"[traverseWithKey_] Consumed map in "++show (diffUTCTime en st)++"\n "++ show s2++"\n"

putStrLn$"Bytes allocated during consume: "++show (bytesAllocated s2 - bytesAllocated s1)

------------------------------------------------------------

-- foldrWithKey uses 32MB allocation:

st <- getCurrentTime

M.foldrWithKey (\k a -> (fn k a >>)) (return ()) m0

en <- getCurrentTime

performGC

s3 <- getGCStats

putStrLn$"[foldrWithKey] Consumed map in "++show (diffUTCTime en st)++"\n "++ show s3++"\n"

putStrLn$"Bytes allocated during consume: "++show (bytesAllocated s3 - bytesAllocated s2)

------------------------------------------------------------

-- An alternate version was proposed by Shachaf Ben-Kiki:

st <- getCurrentTime

traverseWithKey_ fn m0

en <- getCurrentTime

performGC

s4 <- getGCStats

putStrLn$"[alternate traverseWithKey_] Consumed map in "++show (diffUTCTime en st)++"\n "++ show s4++"\n"

putStrLn$"Bytes allocated during consume: "++show (bytesAllocated s4 - bytesAllocated s3)

return ()

foldMapWithKey :: Monoid r => (k -> a -> r) -> M.Map k a -> r

foldMapWithKey f = getConst . M.traverseWithKey (\k x -> Const (f k x))

-- Since the Applicative used is Const (newtype Const m a = Const m), the

-- structure is never built up.

--(b) You can derive traverseWithKey_ from foldMapWithKey, e.g. as follows:

newtype Traverse_ f = Traverse_ { runTraverse_ :: f () }

instance Applicative f => Monoid (Traverse_ f) where

mempty = Traverse_ (pure ())

Traverse_ a `mappend` Traverse_ b = Traverse_ (a *> b)

traverseWithKey_ :: Applicative f => (k -> a -> f ()) -> M.Map k a -> f ()

traverseWithKey_ f = runTraverse_ .

foldMapWithKey (\k x -> Traverse_ (void (f k x)))

On Mon, Jul 29, 2013 at 3:08 PM, Shachaf Ben-Kiki <shachaf@gmail.com> wrote:

On Mon, Jul 29, 2013 at 11:47 AM, Ryan Newton <rrnewton@gmail.com> wrote:
> Shachaf, I checked and Milan's commits that improve traverseWithKey were
> already incorporated when I ran my tests above. The extra speedup is good
> but doesn't change the O(1) vs. O(N) allocation situation.

Wait, are you saying you couldn't get it to work in constant memory at
all without modifying containers? I didn't actually look at the
benchmarks in much detail before. I just looked at the code -- it
looks like in your latest posted benchmark you don't actually use the
alternate traverseWithKey_ anywhere -- instead you use foldrWithKey
twice (the perils of not compiling with -Wall!). I just ran the
benchmark with the alternate version and it looks like it uses
constant memory.

Maybe I'm misunderstanding. I'm not against adding this to containers,
but it should be clear with such a patch whether it's being added for
necessity or convenience. It looks to me like it's the latter.

(By the way: lens also provides this function, with the name
"itraverse_" (i for indexed). I tried it and it looks like it also
uses constant memory.)

Shachaf