
Thanks for looking, Adrian. It'd be great if someone was able to find out more about what's going on here. Bandwidth at GHC HQ is always tight, so the more precisely someone can pinpoint what's happening, the faster we can fix it. Joel has done a lot by making a repro case. Maybe others can help to narrow it down?

Simon

| -----Original Message-----
| From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Adrian Hey
| Sent: 29 December 2005 13:03
| To: haskell-cafe@haskell.org
| Subject: [Haskell-cafe] Joels Time Leak
|
| Hello,
|
| I haven't followed everything that's happened on the Binary IO
| thread, but has anybody else actually tried Joel's code?
|
| http://wagerlabs.com/timeleak.tgz
|
| I can reproduce the problem (ghc/Linux), but can't explain it. It
| seems very strange that frigging about with an otherwise irrelevant
| (AFAICT) MVar fixes the problem.
|
| Regards
| --
| Adrian Hey
| _______________________________________________
| Haskell-Cafe mailing list
| Haskell-Cafe@haskell.org
| http://www.haskell.org/mailman/listinfo/haskell-cafe

Well, there's actually a more interesting problem hidden in here too.
The issue with it taking too long seems to be basically as Joel said: only one of the threads can take that MVar at a time. Even if the time for which it's held is fairly short, a thread that runs faster than the others tries to take the MVar more often, which means it runs a higher risk of being blocked and slowed down, letting other threads take its place. It essentially just forces the scheduler to be more fair.
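That contention pattern can be sketched in isolation like this (a minimal, self-contained illustration, not Joel's actual code; all names here are invented):

```haskell
import Control.Concurrent
import Control.Monad (forM_, replicateM_)

-- Every worker serialises on one shared MVar, so a thread that gets
-- through its work quickly just queues up on the lock more often.
main :: IO ()
main = do
  lock <- newMVar ()                     -- the single shared MVar
  done <- newEmptyMVar
  forM_ [1 .. 4 :: Int] $ \_ ->
    forkIO $ do
      replicateM_ 1000 $
        withMVar lock $ \_ -> return ()  -- only one holder at a time
      putMVar done ()
  replicateM_ 4 (takeMVar done)          -- wait for every worker
  putStrLn "all workers finished"
```

Blocked takers of an MVar are woken in FIFO order, which is what gives the serialising effect its fairness flavour.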
The more interesting issue shows up when one replaces the forkIO in Util.hs, line 36, with a forkOS and compiles with -threaded. You'll get no alerts for quite some time, and then, just when you think it's working:
unstuff: user error (ORANGE ALERT: 0s, 4s, SrvServerInfo, ix1: 6, size: 49722)
unstuff: internal error: scavenge_stack: weird activation record found
on stack: 63280
Please report this as a bug to glasgow-haskell-bugs@haskell.org,
or http://www.sourceforge.net/projects/ghc/
cale@zaphod[~/timeleak]$
This seems to happen consistently on at least 3 platforms (Linux, OpenBSD, Windows), sometimes with a red alert rather than an orange one. I filed a bug in Trac for it:
(http://cvs.haskell.org/trac/ghc/ticket/641)
- Cale
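The forkIO-to-forkOS swap Cale describes can be shown in a standalone sketch (Util.hs and its line 36 are in Joel's tarball and not reproduced here; this program just contrasts the two primitives):

```haskell
-- Build with: ghc -threaded Fork.hs
import Control.Concurrent

main :: IO ()
main = do
  v <- newEmptyMVar
  -- forkIO creates a lightweight Haskell thread multiplexed by the RTS;
  -- forkOS binds the new thread to its own fresh OS thread.  Fall back
  -- to forkIO when the RTS lacks -threaded, since forkOS errors there.
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (putMVar v "forked thread ran")
  takeMVar v >>= putStrLn
```

With -threaded, the fork on line 36 becomes a bound thread, which is what tickled the scavenge_stack crash above.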

On Thu, Dec 29, 2005 at 03:34:10PM -0500, Cale Gibbard wrote:
> The issue with it taking too long seems to basically be as Joel said, only one of the threads can take that MVar at a time. Even if the time that it's taken is fairly short, if one is running faster than the others, it tries to take the MVar more often, which means that it runs a higher risk of being blocked and slowed down, letting other threads take its place. It essentially just forces the scheduler to be more fair.
I get results that confirm scheduler unfairness. I have numbered the threads, and every thread prints its number before starting "read". When there are ORANGE or RED alerts, the output looks odd - below is the result of "sort -n | uniq -c":

     53 1
     53 2
    ...
     21 46
     21 47
    ...
      8 109
      8 110
      3 111
    ...
      2 998
      2 999
      2 1000

So thread number 1 managed to run at least 52 or 53 "read"s, but thread number 1000 only 1 or 2 "read"s.

Best regards
Tomasz

--
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland
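Tomasz's instrumentation presumably looked something like this sketch (the counters stand in for "print my number, then read a packet"; thread and read counts are made up, and in this toy version every thread completes all its reads, so the counts come out equal):

```haskell
import Control.Concurrent
import Control.Monad (forM_, replicateM_)
import Data.IORef
import Data.List (group, sort)

nThreads, nReads :: Int
nThreads = 5
nReads   = 3

main :: IO ()
main = do
  events <- newIORef ([] :: [Int])   -- one entry per "read" attempt
  done   <- newEmptyMVar
  forM_ [1 .. nThreads] $ \i ->
    forkIO $ do
      replicateM_ nReads $
        -- stands in for: print thread number, then do the read
        atomicModifyIORef' events (\xs -> (i : xs, ()))
      putMVar done ()
  replicateM_ nThreads (takeMVar done)
  evs <- readIORef events
  -- the in-process equivalent of piping the log through sort -n | uniq -c
  forM_ (group (sort evs)) $ \g ->
    putStrLn (show (length g) ++ " " ++ show (head g))
```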

Hello Tomasz,

Friday, December 30, 2005, 12:36:37 AM, you wrote:

TZ> I get results that confirm scheduler unfairness. I have numbered the
TZ> So thread number 1 managed to run at least 52 or 53 "read"s, but thread
TZ> number 1000 only 1 or 2 "read"s.

It may just be because of slowness in thread creation. Try to count only from the moment when the 1000'th thread prints something.

--
Best regards,
Bulat                            mailto:bulatz@HotPOP.com

On Thu, Dec 29, 2005 at 10:36:37PM +0100, Tomasz Zielonka wrote:
> On Thu, Dec 29, 2005 at 03:34:10PM -0500, Cale Gibbard wrote:
> > The issue with it taking too long seems to basically be as Joel said, only one of the threads can take that MVar at a time. [...]
>
> I get results that confirm scheduler unfairness.
I am taking it back. It seems that these results were caused by threads starting at different moments. I made every thread wait till all threads are created, and now I can't reproduce it anymore.

Best regards
Tomasz
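One way to make every thread wait until all threads are created is a start gate built from an MVar (a sketch of the idea, not Tomasz's actual change; the thread count is invented):

```haskell
import Control.Concurrent
import Control.Monad (forM_, replicateM_)

nThreads :: Int
nThreads = 8

main :: IO ()
main = do
  startGate <- newEmptyMVar          -- empty until all threads exist
  done      <- newEmptyMVar
  forM_ [1 .. nThreads] $ \i ->
    forkIO $ do
      readMVar startGate             -- block until the gate opens
      -- ... the thread's real work would start from here ...
      putMVar done i
  putMVar startGate ()               -- all threads created: open the gate
  replicateM_ nThreads (takeMVar done)
  putStrLn "all threads started together"
```

readMVar leaves the gate full, so one putMVar releases every waiter at once.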
participants (4)

- Bulat Ziganshin
- Cale Gibbard
- Simon Peyton-Jones
- Tomasz Zielonka