
Am Samstag 01 August 2009 20:50:10 schrieb Sterling Clover:
On Aug 1, 2009, at 2:06 PM, Paul Moore wrote:
Is the issue with random numbers just the implementation, or is it something inherent in the non-pure nature of a random number generator that makes it hard for Haskell to implement efficiently? If the latter, that probably makes Haskell a somewhat poor choice for simulation-type programs.
If you view a PRNG as a function from the seed to the sequence of generated numbers or as a function state -> (bitpattern, newstate), PRNGs are pure (at least, I know no counterexample), so it's not inherently inefficient in Haskell, though it's probably still faster in C. One thing that makes StdGen slow is splittability, as Sterling points out below. For a simulation programme where you don't need splittability, choose a different PRNG.
Well, I'm not sure of the details, but in your original implementation, you're performing IO to pull the seed out of a ref at every iteration. Pekka Karjalainen's doesn't do that, which probably helps with the speedup. Along with that, Haskell has a fairly slow random implementation. As I recall however, this is partially because it hasn't received a great deal of optimization, but mainly because the algorithm itself fulfills some rather strong properties -- in particular it must be fairly statistically robust, and it must provide a "split" function which produces generators that are independently robust [1]. This limits algorithm choice quite a bit.
For other random numbers, with different properties (faster, but with tradeoffs in robustness, or ability to split, or both), you can check hackage for at least mersenne-random and gsl-random.
I didn't get much speedup with System.Random.Mersenne.Pure64 (might be because I have a 32-bit system and Word64 goes through foreign calls, if that is still the case), but GSL.Random.Gen reduced the time by a factor of over 5. It forces you back into IO and is a little less convenient, but if speed is a concern, it's a price worth to pay. --------------------------------------- module Main (main) where import GSL.Random.Gen import qualified Data.Map as Map import Data.Map (Map) import Data.List import System.IO.Unsafe import System.Time import Data.Word dice :: RNG -> Int -> Int -> IO Int dice _ 0 n = return 0 dice rng m n = do total <- dice rng (m - 1) n roll <- fmap (+1) $ getUniformInt rng n return (total + roll) simulate _ 0 _ _ = return [] simulate rng count m n = unsafeInterleaveIO $ do val <- dice rng m n tl <- simulate rng (count-1) m n return (val:tl) histogram :: Ord a => [a] -> [(a,Int)] histogram = Map.assocs . foldl' f Map.empty where f m k = Map.insertWith' (+) k 1 m simulation rng = do lst <- simulate rng 1000000 3 6 return (histogram lst) main = do rng <- newRNG mt19937 sd <- getTimeSeed setSeed rng sd -- omit seeding for reproducible results s <- simulation rng putStrLn (show s) getTimeSeed :: IO Word64 getTimeSeed = do TOD a b <- getClockTime return . fromInteger $ 10^6*a + b `quot` (10^6) --------------------------------------------------
There may be others that I don't recall.
Cheers, Sterl.