
Hi Alex, You wrote:
I'm new to Haskell, and don't quite understand how IO and lazy evaluation work together yet. In order to improve my understanding, I thought I'd try to knock together a translation of a fairly simple problem I can code up in C... Now, I have two questions. The easy question is, how can I make this more idiomatic? ...The second question is performance-related. The Haskell code above overflows the stack when run with an argument of 1000000
These are great questions, and a really fun example! Let me focus back on your original two questions. When you are just playing around with an algorithm, it is not so idiomatic in Haskell to write a program that takes a command-line argument and prints to stdout like in C. Instead, you would more likely write a small module containing some functions. Then you load them into GHCi or Hugs and play with them, and reload as you make changes. So I might write something simple like this into RandomPi.hs: module RandomPi where import System.Random randMax, unitRadius :: Int randMax = floor $ sqrt $ fromIntegral (maxBound `div` 2 :: Int) unitRadius = randMax * randMax calcPi :: RandomGen g => g -> Int -> (Int, Double) calcPi g total = (count, fromIntegral count / fromIntegral total * 4) where count = length $ filter isInCircle randPairs isInCircle (x, y) = x*x + y*y < unitRadius randPairs = take total $ pairs $ randomRs (0, randMax) g -- Group a list into pairs pairs :: [a] -> [(a,a)] pairs (x:x':xs) = (x, x') : pairs xs pairs _ = [] I think this pretty faithfully translates your ideas from C into Haskell. (I was a little more careful than you were regarding MAX_RAND, but that has nothing to do with Haskell. In fact, I'm not sure your C code would work using GNU libc, because you'd get unit_radius = 1 I think due to int overflow.) If you really, really, want a command-line/stdout thing, you would then add in stuff like this to do the required plumbing work: import System.Environment import System.Exit randomPiMain :: IO () randomPiMain = do args <- getArgs when (length args /= 2) $ usage args let total = read $ args !! 1 :: Int g <- newStdGen let (count, myPi) = calcPi g total putStrLn $ "Count: " ++ show count putStrLn $ "Total: " ++ show total putStrLn $ show myPi usage :: [String] -> IO () usage (progname:_) = do putStrLn $ concat ["usage: ", progname, " iteration_count"] exitWith $ ExitFailure 1 Then you would create a separate file containing only two lines, and compile it: import RandomPi main = RandomPiMain So the answer to your first question is that you isolate out the IO stuff - usually not very much - and write the rest of your program entirely pure. Then the IO stuff you either do by hand at the prompt, or you write separate functions for it. In this case, the only IO is: o Reading 'total' o Printing the answer o Giving the usage message o Getting an initial random generator You second question, about blowing the stack, is settled in the above the same way that the previous posters suggested. Represent your big iteration as a list. Try to stick to fairly simple operations on the list. Then the compiler will figure out how to keep things lazy and also release memory as it goes along. Another point (where I disagree a bit with some previous posters): I think that it is not very idiomatic to use split on the random generator here. For several reasons, I think split should usually be avoided, except for things like sharing a generator across multiple threads. Instead, I just pull pairs of random numbers off of a single stream. There is still a problem here - my code (and also all of the previous posters, I think) clobbers the random generator, rendering it unusable for future calculations. In this case that is probably not a problem, but it is a bad habit to get into, and not very polite. But now we become entangled with your second question. Because it is tricky to generate a huge list of random numbers lazily, while still keeping track of the last value of the generator, without blowing the stack. I will leave others to post a traditional solution to that. I avoid that problem entirely by using yet another Haskell idiom - I *always* (OK, almost always) do random calculations inside a lazy State monad. You don't need to understand the theory of monads to understand the following code. You can see that it just generates a big list of random pairs and uses it. The type says to quietly keep the current value of the random generator as state in the background. The idiom "State $ randomR range" is a trick that makes this possible. module RandomPi where import System.Random import Control.Monad.State randMax, unitRadius :: Int randMax = floor $ sqrt $ fromIntegral (maxBound `div` 2 :: Int) unitRadius = randMax * randMax calcPiM :: RandomGen g => Int -> State g (Int, Double) calcPiM total = do randPairs <- replicateM total $ randPairM (0, randMax) let count = length $ filter isInCircle randPairs return (count, fromIntegral count / fromIntegral total * 4) where isInCircle (x, y) = x*x + y*y < unitRadius -- Generate a pair of random numbers randPairM :: (RandomGen g, Random a) => (a, a) -> State g (a, a) randPairM range = do x <- State $ randomR range y <- State $ randomR range return (x, y) To use this at the GHCi or Hugs command prompt, or in randomPiMain, you write: g <- newStdGen evalState (calcPiM 100000) g Hope this helps, Yitz