The blocked on black hole message is very suspicious. It means that thread 7 is blocked waiting for another thread to evaluate a thunk. But in this case, it's thread 7 that created that thunk and is supposed to be doing the evaluating. This is some evidence that Gregory's theory is correct and your encode function loops somewhere.
Gregory,The options in the 7.8.3 user guide says in the -Msize option that by default the heap is unlimited. I have several applications, and they all have messages like:7fddc7bcd700: cap 2: waking up thread 7 on cap 27fddc7bcd700: cap 2: thread 4 stopped (yielding)7fddcaad6740: cap 2: running thread 7 (ThreadRunGHC)7fddcaad6740: cap 2: thread 7 stopped (heap overflow)7fddcaad6740: cap 2: requesting parallel GC7fddc5ffe700: cap 0: starting GC7fddc57fd700: cap 1: starting GC7fdda77fe700: cap 3: starting GC7fddcaad6740: cap 2: starting GCI assumed that when the heap ran out of space, it caused a GC, or it enlarged the heap. The programs that have these messages run for very long periods of time, and when I heap profile them, they use about 500KM to 1MB over long periods of time, and are quite stable.As a test, I ran the hang application with profiling to see if memory jumps up before or after the hang.What I notice is the app moves along using about 800KB, then there is a spike to 2MB at the hang. So I believe you, but I am confused about the RTS behavior and how I can have all these overflow messages in a normal application and how to tell the difference between these routine messages vs a real heap problem.So, I dug deeper into the log. A normal execution for sending a command looks like:7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)7f99e6ffd700: cap 0: requesting parallel GC7f99e6ffd700: cap 0: starting GC7f99e6ffd700: cap 0: GC working7f99e6ffd700: cap 0: GC idle7f99e6ffd700: cap 0: GC done7f99e6ffd700: cap 0: GC idle7f99e6ffd700: cap 0: GC done7f99e6ffd700: cap 0: all caps stopped for GC7f99e6ffd700: cap 0: finished GC7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)7f99e6ffd700: cap 0: sendCommand7f99e6ffd700: cap 0: sendCommand: encoded7f99e6ffd700: cap 0: sendCommand: size 47f99e6ffd700: cap 0: sendCommand: unpacked7f99e6ffd700: cap 0: Sending command of size 47f99e6ffd700: cap 0: Sending command of size "\NUL\EOT"7f99e6ffd700: cap 0: sendCommand: sent7f99e6ffd700: cap 0: sendCommand: flushed7f99e6ffd700: cap 0: thread 7 stopped (blocked on an MVar)7f99e6ffd700: cap 0: running thread 2 (ThreadRunGHC)7f99e6ffd700: cap 0: thread 2 stopped (yielding)7f99e6ffd700: cap 0: running thread 45 (ThreadRunGHC)7f99e6ffd700: cap 0: fetchTelemetryServer7f99e6ffd700: cap 0: fetchTelemetryServer: got lockThe thread is run, overflows, GC, runs, then blocks on an MVAr.For a the hang case:7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)7f99e6ffd700: cap 0: sendCommand7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)7f99e6ffd700: cap 0: requesting parallel GC7f99e6ffd700: cap 0: starting GC7f99e6ffd700: cap 0: GC working7f99e6ffd700: cap 0: GC idle7f99e6ffd700: cap 0: GC done7f99e6ffd700: cap 0: GC idle7f99e6ffd700: cap 0: GC done7f99e6ffd700: cap 0: all caps stopped for GC7f99e6ffd700: cap 0: finished GC7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)7f9a05362a40: cap 0: thread 1408 stopped (yielding)7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)7f99e6ffd700: cap 0: requesting parallel GC7f99e6ffd700: cap 0: starting GC7f99e6ffd700: cap 0: GC working7f99e6ffd700: cap 0: GC idle7f99e6ffd700: cap 0: GC done7f99e6ffd700: cap 0: GC idle7f99e6ffd700: cap 0: GC done7f99e6ffd700: cap 0: all caps stopped for GC7f99e6ffd700: cap 0: finished GC7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)7f99e6ffd700: cap 0: thread 7 stopped (yielding)7f99e6ffd700: cap 0: running thread 2 (ThreadRunGHC)7f99e6ffd700: cap 0: thread 2 stopped (yielding)7f99e6ffd700: cap 0: running thread 45 (ThreadRunGHC)7f99e6ffd700: cap 0: fetchTelemetryServer7f99e6ffd700: cap 0: fetchTelemetryServer: got lock...7f99e6ffd700: cap 0: fetchTelemetryServer: got lock7f99e6ffd700: cap 0: fetchTelemetryServer: unlock7f99e6ffd700: cap 0: fetchTelemetryServer7f99e6ffd700: cap 0: fetchTelemetryServer: got lock7f99e6ffd700: cap 0: fetchTelemetryServer: unlock7f99e6ffd700: cap 0: fetchTelemetryServer7f99e6ffd700: cap 0: thread 45 stopped (yielding)7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)7f9a05362a40: cap 0: thread 1408 stopped (suspended while making a foreign call)7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)7f9a05362a40: cap 0: thread 1408 stopped (suspended while making a foreign call)7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)7f99e6ffd700: cap 0: thread 7 stopped (blocked on black hole owned by thread 7)7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)7f9a05362a40: cap 0: thread 1408 stopped (suspended while making a foreign call)run, overflow, GC, run, overflow, GC, run, yield, other stuff, run blocked on black holeAnd that is the last activity for thread 7.I found a few links about black holes and such, so I’ll go off and read those and try to learn what the RTS is doing and why it can hang on a black hole. But if you have some hits, let me know.Mike_______________________________________________On Nov 11, 2014, at 4:01 PM, Gregory Collins <greg@gregorycollins.net> wrote:On Tue, Nov 11, 2014 at 2:06 PM, Michael Jones <mike@proclivis.com> wrote:Those are all over the log even when it runs properly. So I assume the runtime is resizing the heap or something.
No, it means you're exhausting the heap (maybe the runtime stack for the thread running "encode"), probably because "encode" is infinite-looping. I think Occam's razor applies here, check that any recursion you're doing is actually reducing the recursive argument. Perhaps you could post the code (e.g. http://gist.github.com/)?
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users