Re: Thread behavior in 7.8.3

30 Oct 2014

My understanding is that -fno-omit-yields is subtly different.  I think
that's for the case when a function loops without performing any heap
allocations, and thus would never yield even after the context switch
timeout.  In my case the looping function does perform heap allocations and
does eventually yield, just not until after the timeout.

Is that understanding correct?

(technically, doesn't it change to yielding after stack checks or something
like that?)
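
To make the distinction concrete, here is a minimal sketch of the kind of
loop I understand -fno-omit-yields to be aimed at.  The names `spin` and
the iteration count are made up for illustration, and whether the loop
really ends up allocation-free depends on the optimisation level, so
treat this as an assumption rather than a guaranteed reproduction.

```haskell
{-# OPTIONS_GHC -fno-omit-yields #-}
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Concurrent (forkIO, threadDelay)
import Data.Word (Word64)

-- Hypothetical tight loop.  With -O, GHC will typically compile this to a
-- loop over unboxed values that performs no heap allocation, so without
-- -fno-omit-yields it may never reach a point where it can be preempted.
spin :: Word64 -> Word64 -> Word64
spin !acc 0 = acc
spin !acc n = spin (acc + n) (n - 1)

main :: IO ()
main = do
  -- The forked thread runs the tight loop and prints its result.
  _ <- forkIO (print (spin 0 100000000))
  -- The main thread blocks briefly, then wants to run again.
  threadDelay 100000
  putStrLn "main thread got to run"
```

Removing the OPTIONS_GHC pragma and running with a single capability is
one way to compare the two behaviours.  A loop that does allocate, like
the one in my case, is a different situation.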

On Thu, Oct 30, 2014 at 8:24 AM, Edward Z. Yang wrote:
...
I don't think this is directly related to the problem, but if you have a
thread that isn't yielding, you can force it to yield by using
-fno-omit-yields on your code.  It won't help if the non-yielding code
is in a library, and it won't help if the problem was that you just
weren't setting timeouts finely enough (which sounds like what was
happening). FYI.
Edward
...
Excerpts from John Lato's message of 2014-10-29 17:19:46 -0700:
I guess I should explain what that flag does...
The GHC RTS maintains capabilities; the number of capabilities is
specified by the `+RTS -N` option.  Each capability is a virtual machine
that executes Haskell code, and maintains its own runqueue of threads to
process.
...
A capability will perform a context switch at the next heap block
allocation (every 4k of allocation) after the timer expires.  The timer
defaults to 20ms, and can be set by the -C flag.  Capabilities perform
context switches in other circumstances as well, such as when a thread
yields or blocks.
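
As a rough sketch of the "thread yields" case, a long-running loop can
call `yield` itself every so often instead of waiting for the timer.
The names `cooperativeSum` and `yieldEvery`, and the batch size of 1000,
are hypothetical choices for illustration only.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Concurrent (forkIO, yield)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (when)

-- Hypothetical batch size: steps to run between voluntary yields.
yieldEvery :: Int
yieldEvery = 1000

-- Sum 1..n, calling yield every yieldEvery steps so that other threads on
-- the same capability get a chance to run without waiting for the timer.
cooperativeSum :: Int -> IO Int
cooperativeSum n = go 0 1
  where
    go !acc i
      | i > n = return acc
      | otherwise = do
          when (i `mod` yieldEvery == 0) yield
          go (acc + i) (i + 1)

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkIO (cooperativeSum 1000000 >>= putMVar done)
  takeMVar done >>= print
```

This only helps for code you control; it does nothing for a
non-yielding loop inside a library.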
My guess is that either the context switching logic changed in ghc-7.8,
or possibly your code used to trigger a switch via some other mechanism
(stack overflow or something maybe?), but is optimized differently now
so instead it needs to wait for the timer to expire.
The problem we had was that a time-sensitive thread was getting scheduled
on the same capability as a long-running non-yielding thread, so the
time-sensitive thread had to wait for a context switch timeout (even
though there were free cores available!).  I expect even with -N4 you'll
still see occasional delays (perhaps <5% of calls).
We've solved our problem with judicious use of `forkOn`, but that won't
help at N1.
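
For reference, a minimal sketch of the `forkOn` approach, not our actual
code.  The capability numbers and the names `latencyCap` and `bulkCap`
are assumptions for illustration; the workload is a stand-in.

```haskell
module Main where

import Control.Concurrent (forkOn, getNumCapabilities, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Hypothetical placement policy: capability 0 is kept for the
-- time-sensitive thread, long-running work goes to capability 1.
-- forkOn interprets the number modulo the number of capabilities, so
-- with a single capability both threads still share capability 0.
latencyCap, bulkCap :: Int
latencyCap = 0
bulkCap = 1

main :: IO ()
main = do
  caps <- getNumCapabilities
  putStrLn ("capabilities: " ++ show caps)
  bulkDone <- newEmptyMVar
  fastDone <- newEmptyMVar
  -- Long-running worker, pinned away from the time-sensitive thread.
  _ <- forkOn bulkCap $ do
    total <- evaluate (sum [1 .. 10000000 :: Int])
    putMVar bulkDone total
  -- Time-sensitive thread, pinned to its own capability.
  _ <- forkOn latencyCap $ do
    placement <- threadCapability =<< myThreadId
    putMVar fastDone placement
  (cap, pinned) <- takeMVar fastDone
  putStrLn ("time-sensitive thread on capability " ++ show cap
            ++ ", pinned: " ++ show pinned)
  total <- takeMVar bulkDone
  putStrLn ("bulk result: " ++ show total)
```

To actually get two capabilities this needs to be built with -threaded
and run with something like `+RTS -N2`.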
We did see this behavior in 7.6, but it's definitely worse in 7.8.
Incidentally, has there been any interest in a work-stealing scheduler?
There was a discussion from about 2 years ago, in which Simon Marlow
noted it might be tricky, but it would definitely help in situations
like this.
John L.
On Thu, Oct 30, 2014 at 8:02 AM, Michael Jones wrote:
...
John,
Adding -C0.005 makes it much better. Using -C0.001 makes it behave more
like -N4.
Thanks. This saves my project, as I need to deploy on a single core Atom
and was stuck.
Mike
On Oct 29, 2014, at 5:12 PM, John Lato wrote:
By any chance do the delays get shorter if you run your program with
`+RTS -C0.005` ?  If so, I suspect you're having a problem very similar
to one that we had with ghc-7.8 (7.6 too, but it's worse on ghc-7.8 for
some reason), involving possible misbehavior of the thread scheduler.
On Wed, Oct 29, 2014 at 2:18 PM, Michael Jones wrote:
...
I have a general question about thread behavior in 7.8.3 vs 7.6.X
I moved from 7.6 to 7.8 and my application behaves very differently. I
have three threads, an application thread that plots data with wxhaskell
or sends it over a network (depends on settings), a thread doing usb
bulk writes, and a thread doing usb bulk reads. Data is moved around
with TChan, and TVar is used for coordination.
When the application was compiled with 7.6, my stream of usb traffic was
smooth. With 7.8, there are lots of delays where nothing seems to be
running. These delays are up to 40ms, whereas with 7.6 delays were 1ms
or so.
When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs
fine without -N2/4.
The program is compiled -O2 with profiling. The -N2/4 version uses more
memory, but in both cases with 7.8 and with 7.6 there is no space leak.
I tried to compile and use -ls so I could take a look with threadscope,
but the application hangs and writes no data to the file. The CPU fans
run wild like it is in an infinite loop. It at least pops an unpainted
wxhaskell window, so it got partially running.
One of my libraries uses option -fsimpl-tick-factor=200 to get around
the compiler.
What do I need to know about changes to threading and event logging
between 7.6 and 7.8? Is there some general documentation somewhere that
might help?
I am on Ubuntu 14.04 LTS. I downloaded the 7.8 tool chain tar ball and
installed myself, after removing 7.6 with apt-get.
Any hints appreciated.
Mike