Strange performance effects with unsafePerformIO

Hello,

we see strange performance behaviour when we use unsafePerformIO, at least with GHC 6.12.3 and 7.0.1.

Please consider the example program at the end of this post. With the original code the execution time is about 26 seconds, while uncommenting one (or both) of the commented-out lines shrinks it to about 0.01 seconds on our machine.

Is there an explanation for this effect?

Regards,
Bjoern

-- ---------------
module Main where

import System.IO.Unsafe

traverse [] = return ()
-- traverse (_:xs) = traverse xs
traverse (_:xs) = traverse xs >> return ()

makeList 0 = []
-- makeList n = () : (makeList (n - 1))
makeList n = () : (unsafePerformIO . return) (makeList (n - 1))

main = traverse $ makeList (10^5)

unsafePerformIO traverses the stack to perform blackholing. It could
be that your code uses a deep stack and unsafePerformIO is repeatedly
traversing it. Just a guess, though.
2011/3/24 Björn Peemöller:
> [original message quoted in full]
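A note for anyone trying the posted program with a recent GHC: the Prelude has exported its own traverse since base 4.8 (GHC 7.10), so a top-level definition with that name is rejected as an ambiguous occurrence. The following is a minimal sketch of the same program that should compile today. The name walk, the Int type signatures, and the final putStrLn are assumptions added for illustration; they are not part of the original post.

```haskell
module Main where

import System.IO.Unsafe (unsafePerformIO)

-- Renamed from 'traverse' in the original post to avoid the clash with
-- Prelude.traverse (hypothetical name, not from the thread).
walk :: [()] -> IO ()
walk []     = return ()
walk (_:xs) = walk xs >> return ()   -- slow variant: the recursive call is not a tail call

-- Each tail of the list is wrapped in (unsafePerformIO . return),
-- exactly as in the posted slow variant.
makeList :: Int -> [()]
makeList 0 = []
makeList n = () : (unsafePerformIO . return) (makeList (n - 1))

main :: IO ()
main = do
  walk (makeList (10 ^ 5))
  putStrLn "done"
```

Swapping either definition for its commented-out alternative from the original post gives the fast variants discussed in the thread.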

2011/3/25 Thomas Schilling:
> unsafePerformIO traverses the stack to perform blackholing. It could be that your code uses a deep stack and unsafePerformIO is repeatedly traversing it. Just a guess, though.

Sounds reasonable. Here is a variant of the program without intermediate lists.

import System.IO.Unsafe

main = run (10^5)

run 0 = return ()
run n = (unsafePerformIO . return) (run (n - 1)) >> return ()

I think it does not do much more than produce a large stack and (like the original program) is much faster if the unsafe-return combination or the final return (which probably prohibits tail-call optimization) is removed.

Sebastian

On 25/03/2011 08:56, Sebastian Fischer wrote:
> Sounds reasonable. Here is a variant of the program without intermediate lists.
>
> import System.IO.Unsafe
>
> main = run (10^5)
>
> run 0 = return ()
> run n = (unsafePerformIO . return) (run (n - 1)) >> return ()
>
> I think it does not do much more than producing a large stack and (like the original program) is much faster if the unsafe-return combination or the final return (which probably prohibits tail-call optimization) is removed.

Incidentally, this will be faster with GHC 7.2, because we implemented chunked stacks, so unsafePerformIO never has to traverse more than 32k of stack (you can tweak the chunk size with an RTS option). This is still quite a lot of overhead, but at least it is bounded.

The example above runs in 1.45s for me with current HEAD, and I gave up waiting with 7.0.

Cheers,
Simon
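The timings reported in this thread can be checked with a small harness along the following lines. It is only a sketch: the names runSlow, runPlain, runTail and timed are hypothetical, the Int signatures are an added assumption, and getCPUTime measures CPU time rather than wall-clock time. The three functions correspond to Sebastian's program, the same loop without the unsafePerformIO wrapper, and the same loop without the trailing return.

```haskell
module Main where

import System.CPUTime (getCPUTime)
import System.IO.Unsafe (unsafePerformIO)

-- Sebastian's variant: wrapper plus trailing 'return ()', so the
-- recursive call is not in tail position.
runSlow :: Int -> IO ()
runSlow 0 = return ()
runSlow n = (unsafePerformIO . return) (runSlow (n - 1)) >> return ()

-- Same recursion with the unsafePerformIO wrapper removed.
runPlain :: Int -> IO ()
runPlain 0 = return ()
runPlain n = runPlain (n - 1) >> return ()

-- Same wrapper, but with the trailing 'return ()' removed.
runTail :: Int -> IO ()
runTail 0 = return ()
runTail n = (unsafePerformIO . return) (runTail (n - 1))

-- Run an action and print the CPU time it took, in milliseconds.
-- getCPUTime reports picoseconds, hence the division by 10^9.
timed :: String -> IO () -> IO ()
timed label act = do
  t0 <- getCPUTime
  act
  t1 <- getCPUTime
  putStrLn (label ++ ": " ++ show ((t1 - t0) `div` 1000000000) ++ " ms")

main :: IO ()
main = do
  let n = 10 ^ 5
  timed "wrapper + trailing return" (runSlow n)
  timed "no wrapper"                (runPlain n)
  timed "no trailing return"        (runTail n)
```

The absolute numbers will depend on the GHC version and optimisation flags. On compilers with chunked stacks, the chunk size Simon mentions is, as far as the current GHC documentation goes, set with the -kc RTS option (for example +RTS -kc4k on a binary linked with -rtsopts), which gives one way to see how the slow variant reacts to it.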

Simon Marlow wrote:
> Incidentally this will be faster with GHC 7.2, because we implemented chunked stacks, so unsafePerformIO never has to traverse more than 32k of stack (you can tweak the chunk size with an RTS option). This is still quite a lot of overhead, but at least it is bounded.
>
> The example above runs in 1.45s for me with current HEAD, and I gave up waiting with 7.0.

Thank you all for your explanations; blackholing indeed seems to be the cause of the slowdown. Is there any documentation available about the blackholing process? Maybe we can find a hint on how to change our code to avoid the problem.

Regards,
Bjoern

On 07/04/2011 13:51, Björn Peemöller wrote:
> Thank you all for your explanations,
> the blackholing indeed seems to be the cause for the slowdown. Is there any documentation available about the blackholing process?

Unfortunately no. But the main point is that unsafePerformIO needs to traverse the stack down to the most recent thunk evaluation, so the bad case happens when you're in a non-tail-recursive loop with no intervening thunk evaluations.

> Maybe we can find a hint on how to change our code to avoid the problem.

If you don't mind your unsafePerformIO being performed multiple times when running in parallel, then you can use unsafeDupablePerformIO to avoid the overhead. Apart from that, there's really no way around it.

Cheers,
Simon
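As an illustration of Simon's suggestion, here is Sebastian's variant with unsafePerformIO swapped for unsafeDupablePerformIO. This is a sketch under assumptions: the Int signature and the final putStrLn are additions, and the import relies on current versions of base exporting unsafeDupablePerformIO from System.IO.Unsafe (on older releases it may have to be imported from GHC.IO instead).

```haskell
module Main where

import System.IO.Unsafe (unsafeDupablePerformIO)

-- Sebastian's program with the wrapper replaced. The wrapped action may
-- now be performed more than once if several threads evaluate it in
-- parallel, which is harmless here because the action is just 'return'.
run :: Int -> IO ()
run 0 = return ()
run n = (unsafeDupablePerformIO . return) (run (n - 1)) >> return ()

main :: IO ()
main = do
  run (10 ^ 5)
  putStrLn "done"
```

Whether the trade-off is acceptable depends on the real action being wrapped: it is only safe when performing that action twice has no observable effect.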
participants (4)
- Björn Peemöller
- Sebastian Fischer
- Simon Marlow
- Thomas Schilling