
11.06.2012, 14:17, "Malcolm Wallace"
> that there are no side-effects
There are: PRNG state is updated for RealWorld, that's why monadic replicateM is used. You can add something like print $ (VU.!) e 500000 after e is bound and still get 0.057 sec with the do-less version. This roughly matches the performance claimed by the mwc-random package and seems reasonable, since modern hardware shouldn't have any problem generating twenty million random variates in a second with one execution thread.

Your note on laziness would be correct in a case like

------ 8< ------
import qualified Data.Vector.Unboxed as VU

import Data.Functor

import System.Random.MWC
import System.Random.MWC.Distributions (standard)

count = 100000000

main = do
  g <- create
  e <- return $ VU.replicate count (212.8506 :: Double)
  return ()
------ >8 -------

where the unused `e` is truly left unevaluated (you could force it by matching with `!e`, for example).

Profiling indicates that random number sampling really occurs for both of the original versions with `replicateM` and, as expected, takes most of the time:

        Mon Jun 11 14:24 2012 Time and Allocation Profiling Report  (Final)

           slow-mwc-vector +RTS -p -RTS

        total time  =        5.45 secs   (5453 ticks @ 1000 us, 1 processor)
        total alloc = 3,568,827,856 bytes  (excludes profiling overheads)

COST CENTRE    MODULE                           %time %alloc

uniform2       System.Random.MWC                 45.0   53.7
uniformWord32  System.Random.MWC                 31.3   31.5
standard.loop  System.Random.MWC.Distributions    4.1    1.1
uniform1       System.Random.MWC                  3.9    4.5
nextIndex      System.Random.MWC                  3.6    1.4
uniform        System.Random.MWC                  2.8    3.3
uniform        System.Random.MWC                  2.5    1.4
wordsToDouble  System.Random.MWC                  2.1    0.5

I could drop do notation and go with the simpler version if I wanted just a vector of variates.
But in reality I want a vector of tuples with random components:

------ 8< ------
import qualified Data.Vector.Unboxed as VU

import Control.Monad

import System.Random.MWC
import System.Random.MWC.Distributions (standard)

count = 1000000

main = do
  g <- create
  e <- VU.replicateM count $ do
         v1 <- standard g
         v2 <- standard g
         v3 <- standard g
         return (v1, v2, v3)
  return ()
------ >8 -------

which runs for the same 11.412 seconds. Since three times more variates are generated and the run time stays the same, this suggests that some optimizations of the vector package may interfere with mwc-random. Can this be the case? This becomes quite a bottleneck in my application.

On the other hand, mwc-random has the `normal` function implemented as follows:

------ 8< ------
normal m s gen = do
  x <- standard gen
  return $! m + s * x
------ >8 -------

which again uses an explicit `do`. Both standard and normal are marked with INLINE.

Now if I try to write

------ 8< ------
  e <- VU.replicateM count $ normal 0 1 g
------ >8 -------

in my test case, I get, as expected, horrible performance of 11 seconds, even though I'm not using do myself.
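
For comparison, one variant that might be worth timing against the tuple test above builds the triples applicatively, without an explicit `do`. This is only a sketch: `triple` and `sampleTriples` are names made up for illustration (they are not part of mwc-random or vector), and I make no claim here about whether it avoids the slowdown. Printing one element at the end, as suggested earlier for the single-variate case, makes sure the vector is actually demanded.

```haskell
import qualified Data.Vector.Unboxed as VU

-- Only needed on GHC older than 7.10, where these are not in the Prelude.
import Control.Applicative ((<$>), (<*>))

import System.Random.MWC (GenIO, create)
import System.Random.MWC.Distributions (standard)

-- Hypothetical helper: one triple of standard normal variates,
-- built applicatively instead of with an explicit do block.
triple :: GenIO -> IO (Double, Double, Double)
triple g = (,,) <$> standard g <*> standard g <*> standard g

-- Hypothetical helper: fill an unboxed vector with n such triples.
sampleTriples :: Int -> GenIO -> IO (VU.Vector (Double, Double, Double))
sampleTriples n g = VU.replicateM n (triple g)

main :: IO ()
main = do
  g <- create
  e <- sampleTriples 1000000 g
  -- Using one element ensures the vector is not left unevaluated.
  print (e VU.! 500000)
```

If this version and the `do` version differ in run time, that would point at how the monadic action passed to `replicateM` gets optimized rather than at the generator itself.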