Ok, here's a little test to confirm what happens when you try to use foldrWithKey for this.

--------------------------------------------------------
import Control.DeepSeq
import GHC.Stats
import qualified Data.Map.Strict as M
import Data.Time.Clock
import Control.Exception
import System.Mem

main :: IO ()
main = do
  t0 <- getCurrentTime
  let m0 = M.fromList (map (\i -> (i,i)) [1..1000000::Int])
  evaluate$ rnf m0 
  t1 <- getCurrentTime
  performGC
  s1 <- getGCStats  
  putStrLn$"Constructed map in "++show (diffUTCTime t1 t0)++"\n "++ show s1++"\n"
  let fn 500000 v = putStrLn "Got it!"
      fn _      _ = return ()
  
  M.foldrWithKey (\k a -> (fn k a >>)) (return ()) m0
  t2 <- getCurrentTime
  performGC
  s2 <- getGCStats 
  putStrLn$"Consumed map in "++show (diffUTCTime t2 t0)++"\n "++ show s2++"\n"
  putStrLn$"Bytes allocated during consume:  "++show (bytesAllocated s2 - bytesAllocated s1)
  return ()
--------------------------------------------------------  

It's also at this Gist:  https://gist.github.com/rrnewton/5912513#file-maptest-hs

And here is the loop ("go10") generated "fn":
https://gist.github.com/rrnewton/5912513#file-maptest-ddump-simple-L214

Ok, empirically, in -O2, the consumption phase allocates 32MB additional data (for a 1 million element map), and in -O0 it allocates 200MB.  Here's the recursive case of the loop:


((\ (eta_B1 :: GHC.Prim.State# GHC.Prim.RealWorld) ->
case x1_s2rr of _ {
__DEFAULT ->
((go10_r2uL z'_a10K r_a10S)
`cast` (<GHC.Types.NTCo:IO <()>>
~#
(GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld, () #))))
eta_B1;

Ok, so I didn't yet look at the STG where there's a clear allocation story.  So, actually, I'm not sure if this recursive case forces GHC to build some O(1M) first class representation of the IO action here, since eta is abstract?  At least, I'm assuming that's what the 32Mb-200Mb allocated is (though 32MB is actually rather skimpy for such a thing... it would have to spend only 4 words per closure...)




On Tue, Jul 2, 2013 at 3:32 PM, Ryan Newton <rrnewton@gmail.com> wrote:
Hi all,

Thanks for the responses.  I want to go through and make sure I understand these.

--------------------------------------------------------
First, Henning, won't both of these allocate in proportion to the size of the map?

    Map.foldrWithKey (\k a -> f k a >>) (return ())
    Foldable.sequence_ . Map.mapWithKey f

In particular, will the compiler be able to avoid allocating when building up that large monadic computation in the foldrWithKey?

--------------------------------------------------------
Edward said to use foldMapWithKey, which hasn't been released yet, but it sounds like it will be.

   https://github.com/ekmett/containers/commit/40187f32a43689ff02ca2b97465aa4fcd9f9d150

Even then, I might argue it is somewhat non-obvious to the programmers how to use this to do a non-allocating "for-each".  For example, I am not totally clear on what will happen with that tree of mappend calls -- will it allocate thunks?  Further, IO is not a monoid, so am I to create an instance of "Monoid (IO ())" in order to use foldMapWithKey to iterate over the Map?

--------------------------------------------------------
On Tue, Jul 2, 2013 at 10:29 AM, Shachaf Ben-Kiki <shachaf@gmail.com> wrote:
> Is there a reason you couldn't implement this just as well using traverseWithKey, à la
http://hackage.haskell.org/packages/archive/lens/3.9.0.2/doc/html/Control-Lens-Fold.html#v:traverseOf_

That function looks more overloaded than the traverse in Data.Map that I'm referring to, e.g. here:
  http://www.haskell.org/ghc/docs/latest/html/libraries/containers/Data-Map-Strict.html#g:13

I'm afraid I don't understand the proposal then -- is it to use lens somehow?  For the traversal I need to do over a Data.Map.Map, I need to fix 't' to be IO or Par or whatever I'm working with, so that the (k -> a -> t b) function I'm passing in can do the effects I need.  

To be specific I'm proposing providing these variants:

   traverseWithKey :: Applicative t => (k -> a -> t b) -> Map k a -> t (Map k b)
   traverseWithKey_ :: Applicative t => (k -> a -> t ()) -> Map k a -> t ()

And without exposing the latter natively, I still don't understand how to trick the former into not allocating, if that's the proposal.

   -Ryan


On Tue, Jul 2, 2013 at 2:54 PM, Edward Kmett <ekmett@gmail.com> wrote:
On Tue, Jul 2, 2013 at 2:45 PM, Henning Thielemann <lemming@henning-thielemann.de> wrote:
  Foldable.sequence_ . Map.mapWithKey f 
 
which looks more elegant.
 
This has the unfortunate consequence that it builds an entire new map strictly before sequencing it, due to the fact that Data.Map is spine-strict. =(

-Edward