In particular, will the compiler be able to avoid allocating when building up that large monadic computation
in the foldrWithKey?

Since it is a foldr, the first action can be run without knowing the following ones. That is, at no time all actions must be allocated.

Hi all,

Well, to test it out I went ahead and made a patch to containers, here:

   https://github.com/rrnewton/containers/commit/9b1a913c923fd9409932434894cca21cd4e313de

Henning, while I agree with you that GHC shouldn't truly *need* to generate a first class representation of that 10M-step monadic action... it remains the case that the updated test (attached to the end of this email), shows the traverseWithKey_ version allocating only 200K, whereas the foldrWithKey allocates 32M.

Further, I argue that the traverseWithKey_ version is clearer to programmers that don't yet grok the first class nature of monadic actions (e.g. a fold to build up one big monadic action).  And again, if we are not sure of the performance implications of that in this thread, I imagine many Haskell programmers would not be.

So unless there's a strong counterargument, I propose the above patch for acceptance.

   -Ryan

P.S. Using the HEAD version of cabal the allocation for -O0 foldrWithKey actually went up from 200M to 300M.

Appendix: updated code below and at this URL:
https://gist.github.com/rrnewton/5912513#file-maptest-hs
--------------------------------------------
import Control.DeepSeq
import GHC.Stats
import qualified Data.Map.Strict as M
import Data.Time.Clock
import Control.Exception
import System.Mem

main :: IO ()
main = do
  t0 <- getCurrentTime
  let m0 = M.fromList (map (\i -> (i,i)) [1..1000000::Int])
  evaluate$ rnf m0 
  t1 <- getCurrentTime
  performGC
  s1 <- getGCStats  
  putStrLn$"Constructed map in "++show (diffUTCTime t1 t0)++"\n "++ show s1++"\n"
  let fn 500000 v = putStrLn "Got it!"
      fn _      _ = return ()

  -- Regular traverseWithKey uses 48MB
  -- traverseWithKey_ usse 200K of allocation:
  M.traverseWithKey_ fn m0
  t2 <- getCurrentTime
  performGC
  s2 <- getGCStats 
  putStrLn$"[traverseWithKey_] Consumed map in "++show (diffUTCTime t2 t1)++"\n "++ show s2++"\n"
  putStrLn$"Bytes allocated during consume:  "++show (bytesAllocated s2 - bytesAllocated s1)

  -- foldrWithKey uses 32MB allocation:
  M.foldrWithKey (\k a -> (fn k a >>)) (return ()) m0
  t3 <- getCurrentTime
  performGC
  s3 <- getGCStats 
  putStrLn$"[foldrWithKey] Consumed map in "++show (diffUTCTime t3 t2)++"\n "++ show s3++"\n"
  putStrLn$"Bytes allocated during consume:  "++show (bytesAllocated s3 - bytesAllocated s2)
  return ()