
So, the Hint library was recently updated and I was editing Mueval to run with (i386) 6.10, when I discovered that for some reason, DoS'ing expressions were succeeding in rendering my machine unusable. Eventually, I discovered that my watchdog thread didn't seem to be running. But with +RTS -N2 -RTS all my tests did pass!

Here's a simple example of what I mean; the following is basically a very lightly adapted version of the actual Mueval code:

----------------
import Control.Concurrent (forkIO, killThread, myThreadId, threadDelay, throwTo, yield, ThreadId)
import System.Posix.Signals (sigXCPU, installHandler, Handler(CatchOnce))
import Control.OldException (Exception(ErrorCall))

main :: IO ()
main = do tid ThreadId -> IO ()
watchDog tout tid = do
    installHandler sigXCPU
        (CatchOnce $ throwTo tid $ ErrorCall "Time limit exceeded.") Nothing
    forkIO $ do
        threadDelay (tout * 100000)
        -- Time's up. It's a good day to die.
        throwTo tid (ErrorCall "Time limit exceeded")
        yield -- give the other thread a chance
        killThread tid -- Die now, srsly.
        error "Time expired"
    return () -- Never reached. Either we error out in
              -- watchDog, or the evaluation thread finishes.
--------------

Now, from the looks of it, this should always error out with "Time limit exceeded." And the threading is done via forkIO, so it shouldn't matter how it's compiled (be it threaded or no) or run; nor do we need to worry about optimizations, since x has no sensible value for any input - it's always bottom.

But the results are very different:

gwern@craft:31542~>=ghc -fforce-recomp -O0 example.hs && ./a.out
^C^C
gwern@craft:31532~>=ghc -fforce-recomp -O0 -threaded example.hs && ./a.out
^C^C
gwern@craft:31536~>=ghc -threaded -fforce-recomp -O0 -threaded example.hs && ./a.out +RTS -N1 -RTS
[a minute later]
^C^C^C^C^C^C
gwern@craft:31540~>=ghc -threaded -fforce-recomp -O0 -threaded example.hs && ./a.out +RTS -N2 -RTS
a.out: Time limit exceeded
gwern@craft:31543~>=ghc -fforce-recomp -O2 example.hs && ./a.out
a.out: Time limit exceeded
gwern@craft:31544~>=ghc -threaded -fforce-recomp -O2 example.hs && ./a.out
a.out: Time limit exceeded
gwern@craft:31545~>=ghc -threaded -fforce-recomp -O2 example.hs && ./a.out +RTS -N1 -RTS
a.out: Time limit exceeded
gwern@craft:31546~>=ghc -threaded -fforce-recomp -O2 example.hs && ./a.out +RTS -N2 -RTS
a.out: Time limit exceeded

So it seems that without optimizations, or without explicit/multiple OS threads, the watchdog thread never gets called! I'm not any sort of expert on parallelism or GHC, but this seems like a bad thing to me.

One final note: I suspect there are further ways this can manifest. When compiled with -O2, example.hs terminates (as it should). But Mueval does in fact set -O2 as a ghc-option:, and yet I still found it looping.

So, does anyone know whether this is an already known bug or whether it's something else?

--
gwern

Gwern Branwen wrote:
> So, the Hint library was recently updated and I was editing Mueval to run with (i386) 6.10, when I discovered that for some reason, DoS'ing expressions were succeeding in rendering my machine unusable. Eventually, I discovered that my watchdog thread didn't seem to be running. But with +RTS -N2 -RTS all my tests did pass!
>
> Here's a simple example of what I mean; the following is basically a very lightly adapted version of the actual Mueval code:
>
> [...]
This particular example illustrates a bug in 6.10.1 that we've since fixed:

http://hackage.haskell.org/trac/ghc/ticket/2783

However in general you can still write expressions that don't allocate anything (e.g. nfib 1000), and your watchdog thread won't get a chance to run unless there's a spare CPU available (+RTS -N2).

Cheers,
Simon
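To make the point about non-allocating expressions concrete, here is a minimal sketch of a watchdog built on `System.Timeout.timeout` from base. The definition of `nfib` and the helper name `withLimit` are assumptions for illustration; the thread only mentions `nfib` by name and Mueval's real watchdog is the `forkIO` code quoted earlier. The small input below finishes well inside the limit. Going by the explanation above, a call such as `withLimit 1000000 (nfib 1000)` in optimized compiled code might never return on a single capability, because the timeout exception can only be delivered when the evaluating thread reaches a heap check.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- The classic nfib benchmark at type Int. This definition is an
-- assumption; the thread only refers to "nfib 1000" by name.
nfib :: Int -> Int
nfib n
  | n < 2     = 1
  | otherwise = nfib (n - 1) + nfib (n - 2) + 1

-- Evaluate a value to weak head normal form under a time limit given in
-- microseconds. Nothing means the limit expired first. (Hypothetical
-- helper name, not part of Mueval.)
withLimit :: Int -> a -> IO (Maybe a)
withLimit micros x = timeout micros (evaluate x)

main :: IO ()
main = do
  -- Small input: completes long before the one-second limit.
  r <- withLimit 1000000 (nfib 10)
  print r -- Just 177
```

The same caveat applies to any in-process watchdog: whether it uses `timeout`, `throwTo`, or `killThread`, it relies on the target thread reaching a point where the runtime can interrupt it.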

On Mon, Dec 15, 2008 at 9:00 AM, Simon Marlow wrote:
> This particular example illustrates a bug in 6.10.1 that we've since fixed:

OK, that's good...

> However in general you can still write expressions that don't allocate anything (e.g. nfib 1000), and your watchdog thread won't get a chance to run unless there's a spare CPU available (+RTS -N2).
>
> Cheers, Simon

But that's bad. What are my options here? Will this threads-not-running issue be fixed in the next release? Since it worked fine in 6.8 as far as I could tell, that makes me think that it must not be anything completely fundamental and unfixable.

--
gwern

Gwern Branwen wrote:
> On Mon, Dec 15, 2008 at 9:00 AM, Simon Marlow wrote:
>> This particular example illustrates a bug in 6.10.1 that we've since fixed:
>
> OK, that's good...
>
>> However in general you can still write expressions that don't allocate anything (e.g. nfib 1000), and your watchdog thread won't get a chance to run unless there's a spare CPU available (+RTS -N2).
>>
>> Cheers, Simon
>
> But that's bad. What are my options here? Will this threads-not-running issue be fixed in the next release? Since it worked fine in 6.8 as far as I could tell, that makes me think that it must not be anything completely fundamental and unfixable.

I'm afraid the underlying problem is one that GHC has always had: we can't preempt threads that aren't allocating. It's not easily fixable; we would have to inject dummy heap checks into every non-allocating loop, which would seriously hurt performance for those tight loops.

In general you can't rely on being able to kill a thread. However, this only applies to compiled code; interpreted code should always be preemptable, even if it isn't allocating.

Cheers,
Simon
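Since a thread cannot be relied on to die, one option that is not spelled out in this thread is to move the untrusted evaluation into a separate OS process and let the parent enforce the limit from outside. The sketch below is only an illustration under stated assumptions: `runWithLimit` is a hypothetical helper, the `process` package is assumed to be available, and the commands `true` and `sleep` are assumed to exist on a POSIX system. It polls the child rather than blocking on it, so the parent's watchdog loop keeps returning to the scheduler.

```haskell
import Control.Concurrent (threadDelay)
import System.Exit (ExitCode (..))
import System.Process
  ( ProcessHandle, getProcessExitCode, spawnProcess
  , terminateProcess, waitForProcess )

-- Run a command with at most `limitMicros` microseconds of wall-clock
-- time. Just code: the child exited by itself. Nothing: the limit
-- expired and the child was asked to terminate.
-- (Hypothetical helper; not part of Mueval or of the process package.)
runWithLimit :: Int -> FilePath -> [String] -> IO (Maybe ExitCode)
runWithLimit limitMicros cmd args = do
  ph <- spawnProcess cmd args
  poll ph limitMicros
  where
    step = 10000 -- polling interval in microseconds

    poll :: ProcessHandle -> Int -> IO (Maybe ExitCode)
    poll ph remaining = do
      status <- getProcessExitCode ph
      case status of
        Just code -> pure (Just code)
        Nothing
          | remaining <= 0 -> do
              -- On POSIX this sends SIGTERM; a child that ignores it
              -- would need a stronger signal, omitted from this sketch.
              terminateProcess ph
              _ <- waitForProcess ph
              pure Nothing
          | otherwise -> do
              threadDelay step
              poll ph (remaining - step)

main :: IO ()
main = do
  quick <- runWithLimit 2000000 "true" []     -- exits immediately
  slow  <- runWithLimit 200000 "sleep" ["5"]  -- exceeds the limit
  print (quick, slow)
```

The trade-off is process start-up cost and the need to pass the expression and its result across a process boundary, in exchange for a limit that does not depend on the evaluated code ever allocating.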

Hello Simon,

Wednesday, December 17, 2008, 4:05:48 PM, you wrote:
> I'm afraid the underlying problem is one that GHC has always had - that we can't preempt threads that aren't allocating. It's not easily fixable, we would have to inject dummy heap checks into every non-allocating loop, which would seriously hurt performance for those tight loops.

Just a technical note: if we unroll such loops and insert one check per 10 repetitions, it may be OK, although conditional execution may be a problem for such a solution.

--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com
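The proposal above is about checks the compiler would insert into generated code. As a rough source-level analogue only, and not how GHC implements heap checks, an IO loop can hand control back to the scheduler once every N iterations by calling `yield` from `Control.Concurrent`. The name `spin` and the interval are assumptions for illustration. The `rem` test on every iteration is the kind of conditional cost the note mentions.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Concurrent (yield)
import Control.Monad (when)
import System.Timeout (timeout)

-- Loop forever, returning to the scheduler once every `every`
-- iterations so that other threads (and pending asynchronous
-- exceptions) get a chance to run. (Hypothetical example.)
spin :: Int -> IO ()
spin every = go 1
  where
    go :: Int -> IO ()
    go !n = do
      when (n `rem` every == 0) yield
      go (n + 1)

main :: IO ()
main = do
  -- With the periodic yield, the timeout should be able to interrupt
  -- the loop, in which case the result is Nothing.
  r <- timeout 200000 (spin 10)
  print r
```

A larger interval lowers the per-iteration overhead but lengthens the worst-case delay before a watchdog can act, which is the same trade-off as choosing how far to unroll.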
participants (3)
- Bulat Ziganshin
- Gwern Branwen
- Simon Marlow