a faster, accumulating mapM (was Re: [Haskell-cafe] mapM vs mapM_ performance)

24 Apr 2008

      Hello Luke and other Haskellers,

Thanks for the reply, but as I noted before, the amount of memory
allocated (and resident) is roughly the same.  Anyhow it's definitely
not a GC issue because I wrote an accumulating version of mapM and got
close to mapM_ 's performance.

In the code below, main1 is mapM_, main2 is the current mapM
(basicallly sequence . map), map3 is a hand-coded accumulating
parameter version, mapM2 is the accumulating parameter mapM and main4
uses mapM2.  The timings I get are about 15, 175, 20 and 20 seconds
for main1, main2, main3 and main4 respectively.  main2 uses about 2%
less memory than main3 or main4 on this particular run, though I don't
know if that is true generally.

Unless someone can see a reason why mapM2 is not as good as mapM, can
I suggest replacing the implementation of mapM by the implementation
of mapM2.  A 10x speedup seems to be a bigger deal than GCing 2% more
memory.

best regards, Ben

module Main where

import System.IO (openFile, IOMode(..), hPutStr)

testlst = let ls = [(i, [(j, (fromIntegral j)::Float) | j <-
[1..5]::[Int]]) | i <- [1..500000]::[Int]]
          in ls

main = do
  h <- openFile "bardump" WriteMode
  mapM_ ((hPutStr h) . show) testlst

main2 = do
  h <- openFile "bardump2" WriteMode
  result <- mapM ((hPutStr h) . show) testlst
  print $ length result

main3 = do
  h <- openFile "bardump3" WriteMode
  result <- dump h testlst []
  print $ length result
    where dump h (x:xs) accum = do
            hPutStr h $ show x
            dump h xs $ ():accum
          dump _ [] accum = return accum

mapM2 :: Monad m => (a -> m b) -> [a] -> m [b]
{-# INLINE mapM2 #-}
mapM2 fn lst = mapM2accum fn lst []
    where mapM2accum _ [] accum = return accum
          mapM2accum fn (x:xs) accum = do
            r <- fn x
            mapM2accum fn xs (r:accum)

main4 = do
  h <- openFile "bardump2" WriteMode
  result <- mapM2 ((hPutStr h) . show) testlst
  print $ length result

On Thu, Apr 24, 2008 at 1:37 AM, Luke Palmer  wrote:
...
On Tue, Apr 22, 2008 at 11:32 AM, Ben  wrote:
...
Hello Haskellers,
I'm running ghc 6.8.2 on vista 64.  Consider the following program,
 which is compiled with -02 -prof -auto-all:
module Main where
import System.IO (openFile, IOMode(..), hPutStr)
testlst = let ls = [(i, [(j, (fromIntegral j)::Float) | j <-
 [1..5]::[Int]]) | i <- [1..500000]::[Int]]
          in ls
main2 = do
  h <- openFile "bardump" WriteMode
  mapM_ ((hPutStr h) . show) testlst
main = do
  h <- openFile "bardump2" WriteMode
  mapM ((hPutStr h) . show) testlst
  return ()
main and main2 are different in only that mapM_ versus mapM_ are used.
  But the mapM version runs about 20x slower!  I'm running with +RTS -p
 -hc -RTS and I see that the amount of memory allocated is about the
 same, and I think the resident memory is about the same too.  But the
 mapM_ version runs in about 8.7 seconds, and the mapM version takes
 167 seconds.
My first guess is that the garbage collector is not running at all in
 the mapM_ version, but is working it's ass off in the mapM version
 cleaning up the list that will never be used.
...
You may ask, why use mapM if you're discarding the values?
 Unfortunately in my real app I need the values, which are more
 interesting than IO ().
If you need the values, then you've got to pay that price I suppose.
 If you need the values, I'm going to take a stab that in your real app
 you use a lot of memory because of this (because presumably you're
 keeping the values around), whereas you're just seeing a speed hit on
 this small test program.
Luke