
On 7/26/06, SevenThunders
Bulat Ziganshin-2 wrote:
Hello Matthew,
Sunday, July 23, 2006, 10:35:41 AM, you wrote:
sequence $ [ reffill b s | s <- [0..(fi temits)-1], b <- [0..(fi
nc)-1]]
Now thats interesting. I can see that this function is more appropriate since I do not need to retrieve data from the IO monad, but what I don't understand is why it's actually faster. I will give it a try and test it on a large set to see if things change.
let's see at their (possible) definitions:
sequence [] = return [] sequence (action:actions) = do x <- action xs <- sequence actions return (x:xs)
sequence_ [] = return () sequence_ (action:actions) = do action sequence_ actions
sequence_ is TAIL-RECURSIVE function. moreover, when it's inlined, the result is what just all the action in list are sequentially performed without creating any intermediate data structures. so, using sequence_ in above example is equivalent to implementing this cycle by hand.
but when you use sequence, result of each action is saved and list of all results is built. this list requires 12 bytes per element, so you got 600 mb of garbage and much slower execution
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Well lo and behold I also need decent performance when doing a lot of IO actions that do not discard their arguments. For example suppose I want to generate a huge array of random numbers.
Well, why would you want a huge array of random numbers? In Haskell there are two big ways to gain efficiency, strictness and laziness. In this case I think laziness is a big candidate (the "huge array" part gives it away). Also there is no reason generating random numbers should be in the IO monad - in fact the built-ins for random numbers are mainly regular pure functions. You only need IO in the very beginning to get a seed. So basically, try to use the extra modularity options that Haskell gives you to separate things that *need* to be IO and things that can just be regular lazy structures (like an infinite list of random numbers occupying memory the size of a pointer). In your case you'd use newStdGen in the IO monad to get a generator, and then produce a "huge" (e.g. infinite) list of random numbers using that (in a pure non-side-effect way), where only the numbers actually needed will ever be computed. If however you do need to do actual IO actions and you find it takes too much space, you can add some extra laziness using unsafeIntereaveIO. It basically just postpones an action until its result is needed. This is what readFile uses, btw, to get lazy IO. I think that's the way to go (rather than unsafePerformIO) because it's still IO so you don't end up in the situation where you have a pure function returning different results everytime you call it, and you still get laziness. Be careful, though, that your particular IO action won't behave strangely if it gets called at some random time or not at all. /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862