
The performance of mapM appears to be supralinear in the length of the list it is mapping on. Does it need to be this way? As a comparison, both mapM_ and map are linear in the length of the list. To wit: travis@PW:~/Documents/insurer$ ghci GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi-1.0 ... linking ... done. Prelude> :set +s Prelude> :set +t Prelude> :m Data.List Data.Maybe Prelude Data.List Data.Maybe> foldl' (+) 0 $ fromJust $ mapM (\x -> Just x) [1..500000] 125000250000 it :: Integer (2.23 secs, 112875864 bytes) Prelude Data.List Data.Maybe> foldl' (+) 0 $ fromJust $ mapM (\x -> Just x) [1..1000000] 500000500000 it :: Integer (6.86 secs, 214002204 bytes) Prelude Data.List Data.Maybe> foldl' (+) 0 $ fromJust $ mapM (\x -> Just x) [1..2000000] 2000001000000 it :: Integer (24.39 secs, 429299748 bytes) Prelude Data.List Data.Maybe> foldl' (+) 0 $ map (\x -> x - 1) [1..1000000] 499999500000 it :: Integer (0.77 secs, 171213436 bytes) Prelude Data.List Data.Maybe> foldl' (+) 0 $ map (\x -> x - 1) [1..10000000] 49999995000000 it :: Integer (7.42 secs, 1723399472 bytes) Prelude Data.List Data.Maybe> foldl' (+) 0 $ map (\x -> x - 1) [1..40000000] 799999980000000 it :: Integer (30.46 secs, 6894835952 bytes) Prelude Data.List Data.Maybe> mapM_ (\x -> Just x) [1..1000000] Just () it :: Maybe () (0.42 secs, 82761248 bytes) Prelude Data.List Data.Maybe> mapM_ (\x -> Just x) [1..10000000] Just () it :: Maybe () (3.87 secs, 808012660 bytes) Prelude Data.List Data.Maybe> mapM_ (\x -> Just x) [1..100000000] Just () it :: Maybe () (38.40 secs, 8054769564 bytes) Prelude Data.List Data.Maybe>

On Wednesday 07 September 2011, 01:01:08, Travis Erdman wrote:
The performance of mapM appears to be supralinear in the length of the list it is mapping on.
Hmm. Could reproduce with 6.12.3 and 7.0.4, but not with 7.2.1.
Does it need to be this way?
Apparently it doesn't, and it seems to be fixed now.

Travis Erdman
The performance of mapM appears to be supralinear in the length of the list it is mapping on. Does it need to be this way? As a comparison, both mapM_ and map are linear in the length of the list.
It needs to be this way in most monads. It's not a problem of mapM itself, but of its definition in the particular monad. In general it's a bad idea to use mapM over IO. For [] it will eat lots of memory quickly and by its mere definition there is nothing you can do about that. mapM_ is linear, because it can throw away the results, so no complicated accumulation occurs. map is usually linear, because used properly it will be optimized away leaving just a loop, which doesn't produce any data structures in memory and is just run element by element. Greets, Ertugrul -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://ertes.de/

* Ertugrul Soeylemez
In general it's a bad idea to use mapM over IO.
Could you explain why? Thanks. -- Roman I. Cheplyaka :: http://ro-che.info/

On Fri, 2011-09-09 at 00:41 +0200, Roman Cheplyaka wrote:
* Ertugrul Soeylemez
[2011-09-07 16:20:03+0200] In general it's a bad idea to use mapM over IO.
Could you explain why?
Thanks.
Hmm. Isn't it explained by next sentence ("For [] it will eat lots of memory quickly and by its mere definition there is nothing you can do about that.")? Regards

On Friday 09 September 2011, 00:41:11, Roman Cheplyaka wrote:
* Ertugrul Soeylemez
[2011-09-07 16:20:03+0200] In general it's a bad idea to use mapM over IO.
Could you explain why?
Take it with a grain of salt, there's nothing necessarily wrong with using mapM over IO on short lists. The problem is that IO's semantics imply that nothing can be made available before the entire list has been consumed and a large thunk is built on the way. Thus for longish lists there's a serious risk of stack overflows (or even heap exhaustion if you mapM the right [wrong] functions). The same applies to replicateM, and to other monads with a (>>=) which isn't lazy enough.

Roman Cheplyaka
In general it's a bad idea to use mapM over IO.
Could you explain why?
Most applications don't require loading the entire result into memory, so a combinator like foldM is more appropriate. You should use mapM over IO only, when the list is short, or when there is really no way around loading everything into memory. Greets, Ertugrul -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://ertes.de/
participants (5)
-
Daniel Fischer
-
Ertugrul Soeylemez
-
Maciej Marcin Piechotka
-
Roman Cheplyaka
-
Travis Erdman