
Hello,

I need to be able to use strict bytestrings to efficiently build a lazy bytestring, so I'm using putByteString in Data.Binary. But I also need random numbers, so I'm using mwc-random. I end up in the "IO Put" monad, and it's giving me some issues.

To build a random document, I need a random length, and a collection of random words. So I have

    docLength :: IO Int
    word :: IO Put

Oh, also

    putSpace :: Put

My first attempt:

    doc :: IO Put
    doc = docLength >>= go
      where
        go 1 = word
        go n = word >> return putSpace >> go (n-1)

Unfortunately, with this approach, you end up with a one-word document. I think this makes sense because of the monad laws, but I haven't checked it.

Second attempt:

    doc :: IO Put
    doc = docLength >>= go
      where
        go 1 = word
        go n = do
          w <- word
          ws <- go (n-1)
          return (w >> putSpace >> ws)

This one actually works, but it holds onto everything in memory instead of outputting as it goes. If docLength tends to be large, this leads to big problems.

Oh, yes, and my main is currently

    main = L.writeFile "out.txt" =<< fmap runPut doc

This needs to be lazier so disk writing can start sooner, and to avoid eating up tons of memory. Any ideas?

Thanks!
Chad

Chad Scherrer
> Second attempt:
>
>     doc :: IO Put
>     doc = docLength >>= go
>       where
>         go 1 = word
>         go n = do
>           w <- word
>           ws <- go (n-1)
>           return (w >> putSpace >> ws)
>
> This one actually works, but it holds onto everything in memory instead of outputting as it goes. If docLength tends to be large, this leads to big problems.
Sorry to answer my own post, but I've got a kludgy work-around now. I tried a WriterT approach, and also building my own "PutT" like Data.Binary.Put, neither with any luck. Instead I just changed the types to

    word, doc :: IO ()

and had it write to standard out as it goes. Not nearly as elegant, but at least it works now.

Thanks,
Chad
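The work-around described above can be sketched roughly as follows. This is only an illustration under stated assumptions: writeDoc is a hypothetical name, the word generator is assumed to return a strict ByteString rather than a Put, and the output action is passed in as a parameter so the same code can write to standard out or anywhere else.

```haskell
import Control.Monad (replicateM_)
import qualified Data.ByteString.Char8 as B8

-- Emit n words separated by single spaces, handing each chunk to 'sink'
-- as soon as it is generated, so nothing accumulates in memory.
-- 'writeDoc', 'sink' and 'genWord' are hypothetical names for this sketch.
writeDoc :: (B8.ByteString -> IO ()) -> IO B8.ByteString -> Int -> IO ()
writeDoc sink genWord n
  | n <= 0 = return ()
  | otherwise = do
      -- First word has no leading space.
      genWord >>= sink
      -- Every later word is preceded by one space.
      replicateM_ (n - 1) (sink (B8.pack " ") >> genWord >>= sink)
```

With B8.putStr as the sink, for example writeDoc B8.putStr someWord 1000 for some generator someWord, each word is written to standard out right after it is generated.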

On Wed, Sep 15, 2010 at 12:45 AM, Chad Scherrer wrote:
> Hello,
>
> I need to be able to use strict bytestrings to efficiently build a lazy bytestring, so I'm using putByteString in Data.Binary. But I also need random numbers, so I'm using mwc-random. I end up in the "IO Put" monad, and it's giving me some issues.
>
> To build a random document, I need a random length, and a collection of random words. So I have
>
>     docLength :: IO Int
>     word :: IO Put
>
> Oh, also
>
>     putSpace :: Put
>
> My first attempt:
>
>     doc :: IO Put
>     doc = docLength >>= go
>       where
>         go 1 = word
>         go n = word >> return putSpace >> go (n-1)
I think you misunderstand, here, what return does, or possibly >>. This function generates docLength random words, but discards all of them except for the last one. That's what the >> operator does: run the IO involved in the left action, but discard the result before running the right action.

The IO action 'return x' doesn't do any IO, so 'return x >> a' does nothing, discards x, and then does a, i.e.

    return x >> a = a
> Unfortunately, with this approach, you end up with a one-word document. I think this makes sense because of the monad laws, but I haven't checked it.
Yes, the above equation is required to hold for any monad. It is a consequence of the law that 'return x >>= f = f x', since 'm >> k' is defined as 'm >>= \_ -> k'.
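A minimal sketch of why the first attempt keeps only the last word. It assumes a hypothetical stand-in, nextWord (not from the original post), that returns a numbered label each time it runs, so the discarded results are easy to see:

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical stand-in for 'word': each call returns the next label
-- ("w0", "w1", ...) and bumps the counter.
nextWord :: IORef Int -> IO String
nextWord ref = atomicModifyIORef' ref (\n -> (n + 1, "w" ++ show n))

-- Same shape as the first attempt. Every 'nextWord' action runs, but (>>)
-- throws away its result, and 'return " "' contributes nothing at all.
firstAttempt :: IORef Int -> Int -> IO String
firstAttempt ref 1 = nextWord ref
firstAttempt ref n = nextWord ref >> return " " >> firstAttempt ref (n - 1)
```

Starting from a counter of 0, firstAttempt ref 3 runs nextWord three times but returns only "w2", the result of the final action.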
> Second attempt:
>
>     doc :: IO Put
>     doc = docLength >>= go
>       where
>         go 1 = word
>         go n = do
>           w <- word
>           ws <- go (n-1)
>           return (w >> putSpace >> ws)
>
> This one actually works, but it holds onto everything in memory instead of outputting as it goes. If docLength tends to be large, this leads to big problems.
Here you're using the >> from the Put monad, which appends lazy ByteStrings rather than sequencing IO actions. The problem is that the ordering of IO is strict, which means that 'doc' must generate all the random words before it returns, i.e. it must be completely done before L.writeFile gets a look-in.

It turns out the problem you're trying to solve isn't actually simple at all. Some of the best approaches to efficient incremental IO are quite involved - e.g. Iteratees. But your case could be made a great deal easier if you used a pure PRNG instead of one requiring IO. If you could make word a pure function, something like

    word :: StdGen -> (StdGen, Put)

(which is more or less the same as word :: State StdGen Put), then you'd be able to use it lazily and safely.
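A rough sketch of the pure-generator idea, under several assumptions. To stay self-contained it uses a toy linear congruential generator, Gen, as a stand-in for StdGen or any other pure PRNG. The names Gen, next, vocab and renderDoc are hypothetical, and the document length is passed in directly rather than drawn at random.

```haskell
import Data.Binary.Put (Put, putByteString, putWord8, runPut)
import Data.Bits (shiftR)
import qualified Data.ByteString.Char8 as B8
import qualified Data.ByteString.Lazy as L
import Data.Word (Word64)

-- Toy pure generator (an assumption, standing in for StdGen).
newtype Gen = Gen Word64

-- Advance the state and return its high bits as the random value.
next :: Gen -> (Gen, Word64)
next (Gen s) = (Gen s', s' `shiftR` 33)
  where s' = s * 6364136223846793005 + 1442695040888963407

-- Hypothetical vocabulary to draw words from.
vocab :: [B8.ByteString]
vocab = map B8.pack ["lorem", "ipsum", "dolor", "sit", "amet"]

putSpace :: Put
putSpace = putWord8 32

-- Pure counterpart of 'word', in the shape suggested above.
word :: Gen -> (Gen, Put)
word g = (g', putByteString (vocab !! i))
  where
    (g', r) = next g
    i = fromIntegral (r `mod` fromIntegral (length vocab))

-- n words separated by spaces (at least one word is always emitted).
-- The recursive tail is an ordinary lazy value, not an IO action.
doc :: Int -> Gen -> Put
doc n g
  | n <= 1 = w
  | otherwise = w >> putSpace >> doc (n - 1) g'
  where (g', w) = word g

-- Run the pure document builder from a seed.
renderDoc :: Int -> Word64 -> L.ByteString
renderDoc n seed = runPut (doc n (Gen seed))
```

Something like L.writeFile "out.txt" (renderDoc 1000000 42) would then take the place of the original main. Since doc is pure and runPut returns a lazy ByteString, the writer should be able to consume chunks as they are produced instead of waiting for the whole document.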
participants (2)
- Ben Millwood
- Chad Scherrer