
The `mwc-random` package solves this specific problem by providing a function that creates a vector of random integers. The `vector` package solves this in general by letting you use mutation to store intermediate results in order to avoid a space leak. On 08/26/2013 01:46 AM, Niklas Hambüchen wrote:
On #haskell we recently had a discussion about the following:
import System.Random
list <- replicateM 1000000 randomIO :: IO [Int]
I would think that this gives us a list of a million random Ints. In fact, this is what happens in ghci. But with ghc we get:
Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize -RTS' to increase it.
This is because sequence is implemented as
sequence (m:ms) = do x <- m xs <- sequence ms return (x:xs)
and uses stack space when used on some [IO a].
From a theoretical side, this is an implementation detail. From the software engineering side this disastrous because the code is
* obviously correct by itself * the first thing people would come up with * not exaggerating: a million elements is not much * used a lot of places: mapM, replicateM are *everywhere*
and yet it will kill our programs, crash our airplanes, and give no helpful information where the problem occurred.
Effectively, sequence is a partial function.
(Note: We are not trying to obtain a lazy list of random numbers, use any kind of streaming or the likes. We want the list in memory and use it.)
We noticed that this problem did not happen if sequence were implemented with a difference list.
What do you think about this? Should we "fix" functions like this, probably trading off a small performance hit, or accept that idiomatic Haskell code can crash at any time?
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries