I/O manager: relying solely upon kqueue is not a safe way to go

I found that HEAD stopped working on MacOS X 10.5.8 after the parallel I/O manager got merged. The stage-2 compiler builds successfully (including Language.Haskell.TH.Syntax, contrary to the report by Kazu Yamamoto) but the resulting binary is very unstable, especially for ghci:
% inplace/bin/ghc-stage2 --interactive
GHCi, version 7.7.20130313: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude>
<stdin>: hGetChar: failed (Operation not supported)
So I took a dtruss log and found it was kevent(2) that returned ENOTSUP. GHC.Event.KQueue was just registering stdin for EVFILT_READ, whose type was of course tty, and then kevent(2) said "tty is not supported". Didn't the old I/O manager do the same thing? Why was it working then?
After a hard investigation, I concluded that the old I/O manager was not really working. It just looked fine but in fact wasn't. Here's an explanation: If a fd to be registered is unsupported by kqueue, kevent(2) returns -1 iff no incoming event buffer is passed along with it. Otherwise it returns successfully with an incoming kevent whose "flags" is EV_ERROR and whose "data" contains an errno. The I/O manager had always been passing a non-empty event buffer until the commit e5f5cfcd, while it wasn't (and still isn't) checking whether a received event in fact represents an error. That is, the KQueue backend asks the kernel to monitor stdin's readability. The kernel then immediately delivers an event saying ENOTSUP. The KQueue backend thinks "Hey, the stdin is now readable!" so it invokes the callback associated with the fd. The thread which called "threadWaitRead" is now awakened and performs a supposedly non-blocking read on the fd, which in fact blocks but works anyway.
However, the situation has changed since the commit e5f5cfcd. The I/O manager now registers fds without passing an incoming event buffer, so kevent(2) no longer successfully delivers an error event; instead it directly returns -1 with errno set to ENOTSUP, hence the "Operation not supported" exception.
Darwin's kqueue has supported tty since MacOS X 10.7 (Lion), but I heard it still doesn't support some important devices like /dev/random. FreeBSD's kqueue has some difficulties too. There is no doubt that kqueue is a great mechanism for sockets, but IMHO it's not something to use for all kinds of file I/O. Should we then try kqueue(2) first and fall back to poll(2) if it fails? Sadly no. Darwin's poll(2) is broken too, and select(2) is the only reliable method: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#OS_X_AND_DARWIN_BUGS
I wrote a small program to test if a given stdin is supported by kqueue, poll and select: https://gist.github.com/phonohawk/5169980#file-kqueue-poll-select-cpp
MacOS X 10.5.8 is hopelessly broken. We can't use anything other than select(2) for tty and other devices:
https://gist.github.com/phonohawk/5169980#file-powerpc-apple-darwin9-8-0-txt
FreeBSD 8.0 does support tty but not /dev/random. I don't know about the latest FreeBSD 9.1: https://gist.github.com/phonohawk/5169980#file-i386-unknown-freebsd8-0-txt
NetBSD 6.99.17 works perfectly here:
https://gist.github.com/phonohawk/5169980#file-x86_64-unknown-netbsd6-99-17-...
Just for reference, Linux 2.6.16 surely doesn't have kqueue(2) but it supports poll(2)ing on devices:
https://gist.github.com/phonohawk/5169980#file-x86_64-unknown-linux2-6-16-tx...
I accordingly suggest that we should have some means to run two independent I/O managers on BSD-like systems: KQueue for sockets, pipes and regular file reads, and Select for any other devices and regular file writes. Note also that no implementations of kqueue support monitoring writability of regular file writes, which also endangers the current use of the kqueue-based I/O manager, namely "threadWaitWrite".
Any ideas?
Thanks,
PHO
_______________________________________________________
- PHO - http://cielonegro.org/
OpenPGP public key: 1024D/1A86EF72
Fpr: 5F3E 5B5F 535C CE27 8254 4D1A 14E7 9CA7 1A86 EF72
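The two failure modes described in this message can be sketched in C roughly as follows. This is only an illustration under stated assumptions: kqueue_register_read is a hypothetical helper name, it is neither GHC code nor the linked test program, and on systems without kqueue(2) (such as Linux) it simply reports ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#define HAVE_KQUEUE_SKETCH 1
#endif

/*
 * Hypothetical helper: try to register fd for EVFILT_READ and report the
 * outcome as 0 (accepted) or an errno value (rejected).
 * Returns ENOSYS on platforms where kqueue(2) is not available.
 */
int kqueue_register_read(int fd)
{
    assert(fd >= 0);
#ifdef HAVE_KQUEUE_SKETCH
    struct kevent change, result;
    struct timespec no_wait = { 0, 0 };
    int kq, n, err = 0;

    kq = kqueue();
    if (kq < 0)
        return errno;

    EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, 0);

    /* A one-slot event buffer is passed, so a rejected registration may come
     * back as an EV_ERROR event rather than as a -1 return value. */
    n = kevent(kq, &change, 1, &result, 1, &no_wait);
    if (n < 0)
        err = errno;                 /* failure reported through errno */
    else if (n > 0 && (result.flags & EV_ERROR) && result.data != 0)
        err = (int)result.data;      /* failure reported as an event */

    close(kq);
    return err;
#else
    return ENOSYS;
#endif
}
```

A caller that checks both branches can tell "this fd is readable" apart from "this fd cannot be monitored", which is the check the message says was missing.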

First, thanks for the very detailed investigation. I've CC:ed Andreas and
Kazu who have the most knowledge of the I/O manager nowadays.
-- Johan
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Hi PHO,
Thanks for the detailed message. I will look at your report closely and get
back to you soon.
-Andi

Hello PHO, First of all, thank you for your hard work! Your test program is really useful for me.
I found the HEAD stopped working on MacOS X 10.5.8 since the parallel I/O manager got merged to HEAD. Stage-2 compiler successfully builds (including Language.Haskell.TH.Syntax contrary to the report by Kazu Yamamoto) but the resulting binary is very unstable especially for ghci:
I'm building GHC head every day on Linux, Mac and FreeBSD because I'm in charge of the parallel IO manager. And I suspect that the instability of building GHC head on Mac is due to the parallel IO manager and a broken implementation of pipe on Mac. I know of four issues on Mac:
1) http://hackage.haskell.org/trac/ghc/ticket/7651 GHC head already includes a workaround for this. I guess you know this because you are watching this ticket.
2) Building failure on my Mac
3) Instability of "ghc-stage2 --interactive" on your Mac.
4) http://hackage.haskell.org/trac/ghc/ticket/7715
So I took a dtruss log and found it was kevent(2) that returned ENOTSUP. GHC.Event.KQueue was just registering the stdin for EVFILT_READ, whose type was of course tty, and then kevent(2) said "tty is not supported". Didn't the old I/O manager do the same thing? Why was it working then?
I guess your MacOS is too old. My Mac is 10.8.3 whose kevent() does support tty:
----
% ./kqueue-poll-select
Type of stdin: tty
Checking if kqueue(2) works... Great. kevent(2) successfully handled your stdin.
Checking if poll(2) works... Good. poll(2) successfully handled your stdin.
Checking if select(2) works... OK. select(2) successfully handled your stdin.
----
So, I guess this is not the source of bug 1) and 2).
The Darwin's kqueue has started supporting tty since MacOS X 10.7 (Lion), but I heard it still doesn't support some important devices like /dev/random.
/dev/random is not supported yet.
----
./kqueue-poll-select < /dev/random
Type of stdin: device
Checking if kqueue(2) works... kevent(2) ifself succeeded but delivered us an EV_ERROR. This is how the old I/O manager was fooled because it wasn't (and the newer one still don't) check this situation: Invalid argument
kevent(2) expectedly failed now. This time we didn't pass a non-NULL incoming event buffer. This is how the new I/O manager fails on this platform.: Invalid argument
Checking if poll(2) works... poll(2) itself succeeded but delivered us a POLLNVAL event. It can't handle your stdin.
Checking if select(2) works... OK. select(2) successfully handled your stdin.
----
FreeBSD 8.0 does support tty but not /dev/random. I don't know what about the latest FreeBSD 9.1: https://gist.github.com/phonohawk/5169980#file-i386-unknown-freebsd8-0-txt
According to my test, FreeBSD 9.0 behaves the same.
I accordingly suggest that we should have some means to run two independent I/O managers on BSD-like systems: KQueue for sockets, pipes and regular file reads, and Select for any other devices and regular file writes. Note also that no implementations of kqueue support monitoring writability of regular file writes, which also endangers the current use of kqueue-based I/O manager namely "threadWaitWrite".
I would like to simplify this rule. What about this?
- KQueue/EPoll -- sockets only
- Select on Mac/Poll -- other things
--Kazu
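The simplified rule above could be sketched in C along these lines, assuming the decision is made from fstat(2) alone. The names fd_kind, classify_fd and wants_event_backend are hypothetical and chosen only for this illustration; they do not come from GHC or from the test program.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations (S_ISSOCK, isatty) */
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical classification, loosely mirroring the "Type of stdin" line
 * printed by the test program discussed in this thread. */
enum fd_kind { KIND_SOCKET, KIND_TTY, KIND_DEVICE, KIND_REGULAR, KIND_OTHER, KIND_INVALID };

enum fd_kind classify_fd(int fd)
{
    struct stat st;

    assert(fd >= 0);
    if (fstat(fd, &st) != 0)
        return KIND_INVALID;          /* e.g. EBADF for a closed descriptor */
    if (S_ISSOCK(st.st_mode))
        return KIND_SOCKET;
    if (S_ISCHR(st.st_mode))
        return isatty(fd) ? KIND_TTY : KIND_DEVICE;
    if (S_ISREG(st.st_mode))
        return KIND_REGULAR;
    return KIND_OTHER;                /* pipes, directories, block devices, ... */
}

/* The proposed rule: only sockets go to kqueue/epoll; everything else goes
 * to the select/poll side. Returns 1 for the event backend, 0 otherwise. */
int wants_event_backend(enum fd_kind kind)
{
    return kind == KIND_SOCKET;
}
```

The appeal of this rule is that it needs no per-platform table of which device types kqueue happens to accept; the cost is that pipes and ttys lose the event backend even where it would work.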

Hi PHO,
On Fri, Mar 15, 2013 at 3:54 PM, PHO
I wrote a small program to test if a given stdin is supported by kqueue, poll and select: https://gist.github.com/phonohawk/5169980#file-kqueue-poll-select-cpp
MacOS X 10.5.8 is hopelessly broken. We can't use anything other than select(2) for tty and other devices:
https://gist.github.com/phonohawk/5169980#file-powerpc-apple-darwin9-8-0-txt
I just ran your program on OS X 10.8.2. It works fine with tty
$ ./a.out
Type of stdin: tty
Checking if kqueue(2) works... Great. kevent(2) successfully handled your stdin.
Checking if poll(2) works... Good. poll(2) successfully handled your stdin.
Checking if select(2) works... OK. select(2) successfully handled your stdin.
and fails with /dev/random, as you report:
$ ./a.out < /dev/random
Type of stdin: device
Checking if kqueue(2) works... kevent(2) ifself succeeded but delivered us an EV_ERROR. This is how the old I/O manager was fooled because it wasn't (and the newer one still don't) check this situation: Invalid argument
kevent(2) expectedly failed now. This time we didn't pass a non-NULL incoming event buffer. This is how the new I/O manager fails on this platform.: Invalid argument
Checking if poll(2) works... poll(2) itself succeeded but delivered us a POLLNVAL event. It can't handle your stdin.
Checking if select(2) works... OK. select(2) successfully handled your stdin.
Note also that no implementations of kqueue support monitoring writability of regular file writes, which also endangers the current use of kqueue-based I/O manager namely "threadWaitWrite".
Are you sure about this? Do you have any example program demonstrating this? I just took your kqueue-poll-select.cpp program and tested registering a file for a write event (i.e. changing EVFILT_READ to EVFILT_WRITE, POLLIN to POLLOUT, and using writefds rather than readfds) and it works fine on a file:
$ ./a.out < foo
Type of stdin: regular
Checking if kqueue(2) works... Great. kevent(2) successfully handled your stdin.
Checking if poll(2) works... Good. poll(2) successfully handled your stdin.
Checking if select(2) works... OK. select(2) successfully handled your stdin.
-Andi
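For readers who want to repeat the poll(2) and select(2) half of this write-readiness check without the gist, a rough C sketch might look like the following. The function names are hypothetical, the kqueue part is left out so that the code also builds on systems without it, and a temporary file from tmpfile() stands in for the file "foo" used above.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations (fileno, poll) */
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Returns 1 if poll(2) accepts fd for the requested events, 0 if it answers
 * with POLLNVAL, and -1 if poll(2) itself fails. */
int poll_accepts(int fd, short events)
{
    struct pollfd p;

    assert(fd >= 0);
    p.fd = fd;
    p.events = events;
    p.revents = 0;
    if (poll(&p, 1, 0) < 0)
        return -1;
    return (p.revents & POLLNVAL) ? 0 : 1;
}

/* Returns 1 if select(2) accepts fd in its write set, 0 otherwise
 * (including when fd does not fit into an fd_set). */
int select_accepts_write(int fd)
{
    fd_set wfds;
    struct timeval no_wait = { 0, 0 };

    assert(fd >= 0);
    if (fd >= FD_SETSIZE)
        return 0;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    return select(fd + 1, NULL, &wfds, NULL, &no_wait) >= 0 ? 1 : 0;
}

/* Opens a scratch regular file to probe; returns its descriptor or -1. */
int scratch_regular_fd(void)
{
    FILE *f = tmpfile();
    return f ? fileno(f) : -1;
}
```

Calling poll_accepts(fd, POLLOUT) and select_accepts_write(fd) on the descriptor from scratch_regular_fd() corresponds to the POLLOUT and writefds substitutions described in the message.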

From: Andreas Voellmy
Note also that no implementations of kqueue
support monitoring writability of regular file writes, which also endangers the current use of kqueue-based I/O manager namely "threadWaitWrite".
Are you sure about this? Do you have any example program demonstrating this? I just took your kqueue-poll-select.cpp program and tested registering a file for a write event (i.e. changing EVFILT_READ to EVFILT_WRITE, POLLIN to POLLOUT, and use writefds rather than readfds) and it works fine on a file:
Sorry, I just read that in the man pages on all platforms, but it turned out that most implementations actually support that. Here's my updated program: https://gist.github.com/phonohawk/5169980#file-kqueue-poll-select-cpp
NetBSD 6.99.17 is the only platform that prohibits waiting on regular file writes: https://gist.github.com/phonohawk/5169980#file-x86_64-unknown-netbsd6-99-17-...
_______________________________________________________
- PHO - http://cielonegro.org/
OpenPGP public key: 1024D/1A86EF72
Fpr: 5F3E 5B5F 535C CE27 8254 4D1A 14E7 9CA7 1A86 EF72

On Fri, Mar 15, 2013 at 3:54 PM, PHO
I found the HEAD stopped working on MacOS X 10.5.8 since the parallel I/O manager got merged to HEAD. Stage-2 compiler successfully builds (including Language.Haskell.TH.Syntax contrary to the report by Kazu Yamamoto) but the resulting binary is very unstable especially for ghci:
% inplace/bin/ghc-stage2 --interactive GHCi, version 7.7.20130313: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Prelude> <stdin>: hGetChar: failed (Operation not supported)
So I took a dtruss log and found it was kevent(2) that returned ENOTSUP. GHC.Event.KQueue was just registering the stdin for EVFILT_READ, whose type was of course tty, and then kevent(2) said "tty is not supported". Didn't the old I/O manager do the same thing? Why was it working then?
After a hard investigation, I concluded that the old I/O manager was not really working. It just looked fine but in fact wasn't. Here's an explanation: If a fd to be registered is unsupported by kqueue, kevent(2) returns -1 iff no incoming event buffer is passed together. Otherwise it successfully returns with an incoming kevent whose "flags" is EV_ERROR and "data" contains an errno. The I/O manager has always been passing a non-empty event buffer until the commit e5f5cfcd, while it wasn't (and still isn't) checking if a received event in fact represents an error. That is, the KQueue backend asks the kernel to monitor the stdin's readability. The kernel then immediately delivers an event saying ENOTSUP. The KQueue backend thinks "Hey, the stdin is now readable!" so it invokes a callback associated with the fd. The thread which called "threadWaitRead" is now awakened and performs a supposedly non-blocking read on the fd, which in fact blocks but works anyway.
However the situation has changed since the commit e5f5cfcd. The I/O manager now registers fds without passing an incoming event buffer, so kevent(2) no longer successfully delivers an error event instead it directly returns -1 with errno set to ENOTSUP, hence the "Operation not supported" exception.
One thing we can easily do is have the new IO manager pass in an incoming event buffer so we can distinguish this case and treat it exactly as the old IO manager did. Then this exception would not occur and the waiting thread would just continue to retry the read until it succeeded. This is inefficient, but is no worse than the old IO manager.
Note that there is nothing about the IO manager that would cause the awakened thread to make a blocking read call - that is determined entirely by how the thread performs the read. For example, if you take a look at the code in the network package, you will see that whenever a socket is created, the socket is put in non-blocking mode. Then the code to receive from a socket does a recv() which is now non-blocking and calls threadWaitRead if that would block.
Going beyond this immediate fix, we can try to really tackle the problem. The simplest and arguably safest approach is probably to just use select for everything (on OS X). That would have the downside of limiting the number of files that programs can wait on to 1024 per capability.
A better approach would be to try to register with kqueue and then, if that doesn't work, register the file with an IO manager thread that is using select for the backend. We can probably reuse the IO manager thread that is watching timers for this purpose. With the parallel IO manager, we no longer use it to wait on files, but we certainly could do that. That would save us from adding more threads. By only failing over to the manager-thread-using-select-backend if kqueue fails, we don't need to maintain a list of file types that kqueue works for, which might be a pain to maintain reliably.
-Andi
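The non-blocking read pattern described above (put the descriptor in non-blocking mode, attempt the read, and wait only when it would block) can be sketched in C as below. This is an illustration, not the network package's code: wait_readable is a hypothetical stand-in for threadWaitRead that blocks in select(2), and its FD_SETSIZE check reflects the descriptor limit mentioned for a select-only approach.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations (fcntl, ssize_t) */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd in non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical stand-in for threadWaitRead: block in select(2) until fd is
 * reported readable. Returns 0 on success, -1 on error. */
int wait_readable(int fd)
{
    fd_set rfds;

    assert(fd >= 0);
    if (fd >= FD_SETSIZE) {          /* select(2) cannot watch this fd */
        errno = EINVAL;
        return -1;
    }
    for (;;) {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        if (select(fd + 1, &rfds, NULL, NULL, NULL) >= 0)
            return 0;
        if (errno != EINTR)
            return -1;
    }
}

/* Try the read first; wait and retry only when it would block. */
ssize_t read_when_ready(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        if (wait_readable(fd) < 0)
            return -1;
    }
}
```

With this structure a spurious wake-up costs one extra read attempt instead of a blocked thread, which matches the "inefficient, but no worse than the old IO manager" trade-off described in the message.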

I started to look into fixing this issue, but HEAD no longer compiles for
me. Here is the build error I get (on os x 10.8.2):
$ "inplace/bin/ghc-stage1" -static -H32m -O -package-name
ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/.
-ilibraries/ghc-prim/dist-install/build
-ilibraries/ghc-prim/dist-install/build/autogen
-Ilibraries/ghc-prim/dist-install/build
-Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/.
-optP-include
-optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package
rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash
-XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples
-XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts
-dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir
libraries/ghc-prim/dist-install/build -stubdir
libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c
libraries/ghc-prim/./GHC/IntWord64.hs -o
libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno
libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o
"inplace/bin/ghc-stage1"
-static -H32m -O -package-name ghc-prim-0.3.1.0 -hide-all-packages -i
-ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build
-ilibraries/ghc-prim/dist-install/build/autogen
-Ilibraries/ghc-prim/dist-install/build
-Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/.
-optP-include
-optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package
rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash
-XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples
-XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts
-dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir
libraries/ghc-prim/dist-install/build -stubdir
libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c
libraries/ghc-prim/./GHC/IntWord64.hs -o
libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno
libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o
/var/folders/_c/4n2x0zfx7mx5gk_46pdxn3pm0000gn/T/ghc66530_0/ghc66530_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2)

I created a ticket
http://hackage.haskell.org/trac/ghc/attachment/ticket/7773/
for the problem reported by PHO.
On Sat, Mar 16, 2013 at 5:07 PM, Andreas Voellmy
I started to look into fixing this issue, but HEAD no longer compiles for me. Here is the build error I get (on os x 10.8.2):
$ "inplace/bin/ghc-stage1" -static -H32m -O -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. -optP-include -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash -XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples -XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts -dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c libraries/ghc-prim/./GHC/IntWord64.hs -o libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o"inplace/bin/ghc-stage1" -static -H32m -O -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. 
-optP-include -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash -XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples -XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts -dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c libraries/ghc-prim/./GHC/IntWord64.hs -o libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o /var/folders/_c/4n2x0zfx7mx5gk_46pdxn3pm0000gn/T/ghc66530_0/ghc66530_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2)
On Sat, Mar 16, 2013 at 11:08 AM, Andreas Voellmy <andreas.voellmy@gmail.com> wrote:
On Fri, Mar 15, 2013 at 3:54 PM, PHO wrote:
I found the HEAD stopped working on MacOS X 10.5.8 since the parallel I/O manager got merged to HEAD. Stage-2 compiler successfully builds (including Language.Haskell.TH.Syntax contrary to the report by Kazu Yamamoto) but the resulting binary is very unstable especially for ghci:
% inplace/bin/ghc-stage2 --interactive
GHCi, version 7.7.20130313: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> <stdin>: hGetChar: failed (Operation not supported)
So I took a dtruss log and found it was kevent(2) that returned ENOTSUP. GHC.Event.KQueue was just registering the stdin for EVFILT_READ, whose type was of course tty, and then kevent(2) said "tty is not supported". Didn't the old I/O manager do the same thing? Why was it working then?
After a hard investigation, I concluded that the old I/O manager was not really working. It just looked fine but in fact wasn't. Here's an explanation: If a fd to be registered is unsupported by kqueue, kevent(2) returns -1 iff no incoming event buffer is passed together. Otherwise it successfully returns with an incoming kevent whose "flags" is EV_ERROR and "data" contains an errno. The I/O manager has always been passing a non-empty event buffer until the commit e5f5cfcd, while it wasn't (and still isn't) checking if a received event in fact represents an error. That is, the KQueue backend asks the kernel to monitor the stdin's readability. The kernel then immediately delivers an event saying ENOTSUP. The KQueue backend thinks "Hey, the stdin is now readable!" so it invokes a callback associated with the fd. The thread which called "threadWaitRead" is now awakened and performs a supposedly non-blocking read on the fd, which in fact blocks but works anyway.
However, the situation has changed since commit e5f5cfcd. The I/O manager now registers fds without passing an incoming event buffer, so kevent(2) no longer delivers an error event; instead it directly returns -1 with errno set to ENOTSUP, hence the "Operation not supported" exception.
One thing we can easily do is have the new IO manager pass in an incoming event buffer so we can distinguish this case and treat it exactly as the old IO manager did. Then this exception would not occur and the waiting thread would just continue to retry the read until it succeeded. This is inefficient, but is no worse than the old IO manager.
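To make that distinction concrete, here is a minimal Python sketch of the check a backend would need once an event buffer is passed in again. It is not GHC's code: the name `classify_kevent` and the tuple it returns are made up for illustration, and the fallback value 0x4000 is the EV_ERROR constant from BSD-derived <sys/event.h> headers, assumed here only so the sketch also runs on systems without kqueue.

```python
import errno
import select

# select.KQ_EV_ERROR exists only where Python was built with kqueue support.
# 0x4000 is EV_ERROR in BSD-derived <sys/event.h>; it is an assumed fallback
# so that this sketch can run elsewhere too.
EV_ERROR = getattr(select, "KQ_EV_ERROR", 0x4000)


def classify_kevent(flags, data):
    """Interpret one event that kevent(2) returned in the event buffer.

    Hypothetical helper: returns ("error", errno_value) if the kernel
    reported a failed registration, and ("ready", data) otherwise.
    """
    if flags & EV_ERROR:
        # With EV_ERROR set, `data` carries an errno, not a readiness count.
        return ("error", data)
    return ("ready", data)


# An unsupported fd shows up as an error event instead of a readable fd.
unsupported = classify_kevent(EV_ERROR, errno.ENOTSUP)
readable = classify_kevent(0, 12)
```

A backend that skips this check treats the ENOTSUP event as "readable", which is the behavior described for the old I/O manager above.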
Note that there is nothing about the IO manager that would cause the awakened thread to make a blocking read call - that is determined entirely by how the thread performs the read. For example, if you take a look at the code in the network package, you will see that whenever a socket is created, the socket is put in non-blocking mode. Then the code to receive from a socket does a recv() which is now non-blocking and calls threadWaitRead if that would block.
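The pattern described for the network package can be sketched in a few lines of Python. This is only an analogy, not the Haskell code: `wait_readable` is a hypothetical stand-in for threadWaitRead (implemented here with select), and `recv_nonblocking` and `demo` are illustrative names.

```python
import select
import socket


def wait_readable(sock):
    """Stand-in for threadWaitRead: block until `sock` is reported readable."""
    select.select([sock], [], [])


def recv_nonblocking(sock, nbytes):
    """Receive from a non-blocking socket, waiting only when recv would block."""
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: nothing to read yet, so wait and then retry.
            wait_readable(sock)


def demo():
    """Send four bytes over a socket pair and read them back."""
    a, b = socket.socketpair()
    try:
        a.setblocking(False)  # done once, when the socket is created
        b.sendall(b"ping")
        return recv_nonblocking(a, 4)
    finally:
        a.close()
        b.close()
```

Because the socket is non-blocking, a spurious wake-up costs only one extra trip around the loop. If the fd had been left in blocking mode, the same wake-up would park the thread inside recv itself.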
Going beyond this immediate fix, we can try to really tackle the problem. The simplest and arguably safest approach is probably to just use select for everything (on os x). That would have the downside of limiting the number of files that programs can wait on to 1024 per capability.
A better approach would be to try to register with kqueue and then if it doesn't work, register it with an IO manager thread that is using select for the backend. We can probably reuse the IO manager thread that is watching timers for this purpose. With the parallel IO manager, we no longer use it to wait on files, but we certainly could do that. That would save us from adding more threads. By only failing over to the manager-thread-using-select-backend if kqueue fails, we don't need to maintain a list of file types that kqueue works for, which might be a pain to maintain reliably.
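A rough Python sketch of that failover, under stated assumptions: the `Registry` class, `make_kqueue_register` and the "kqueue"/"select" return strings are all invented for illustration and say nothing about how GHC.Event would structure it. The kqueue path uses Python's `select.kqueue` and `select.kevent`, which exist only on BSD-like systems, so on other platforms the helper returns a function that always fails.

```python
import errno
import select


def make_kqueue_register():
    """Return a callable(fd) that registers fd for reads with kqueue.

    Where kqueue is unavailable, the returned callable always raises OSError.
    """
    if not hasattr(select, "kqueue"):
        def unsupported(fd):
            raise OSError(errno.ENOTSUP, "kqueue is not available")
        return unsupported

    kq = select.kqueue()

    def register(fd):
        ev = select.kevent(fd, filter=select.KQ_FILTER_READ,
                           flags=select.KQ_EV_ADD)
        # max_events=0 means no event buffer, so a failed registration
        # surfaces as OSError rather than as an EV_ERROR event.
        kq.control([ev], 0, 0)

    return register


class Registry:
    """Tracks which fds kqueue accepted and which fell back to select."""

    def __init__(self, kqueue_register):
        self._kqueue_register = kqueue_register
        self.kqueue_fds = set()
        self.select_fds = set()

    def register_read(self, fd):
        try:
            self._kqueue_register(fd)
        except OSError:
            # kqueue refused this fd: hand it to the select-based manager.
            self.select_fds.add(fd)
            return "select"
        self.kqueue_fds.add(fd)
        return "kqueue"
```

The point of the design is visible in `register_read`: there is no table of supported file types, only the kernel's answer for the fd at hand.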
-Andi

Hi,
I started to look into fixing this issue, but HEAD no longer compiles for me. Here is the build error I get (on os x 10.8.2):
The same problem happens on my Mac since yesterday.

Tracing the compilation of the file with "dtruss" changes its behavior: the compilation finishes. But running "make" afterwards fails with the same problem on another file.

--Kazu

I just built HEAD successfully on OS X 10.8.3 (`xcodebuild -version`
reports "Xcode 4.6.1; Build version 4H512") with the traditional routine. I
think this was due to some recent fallout from enabling the dynamic way by
default, and Ian fixed it.
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
-- Regards, Austin

I tried again, after pulling, but the same thing happens. The output of
./configure is:
----------------------------------------------------------------------
Configure completed successfully.
Building GHC version : 7.7.20130317
Build platform : i386-apple-darwin
Host platform : i386-apple-darwin
Target platform : i386-apple-darwin
Bootstrapping using : /usr/bin/ghc
which is version : 7.4.2
Using llvm-gcc : /usr/bin/gcc
which is version : 4.2.1
Building a cross compiler : NO
ld : /usr/bin/ld
Happy : /usr/bin/happy (1.18.10)
Alex : /usr/bin/alex (3.0.2)
Perl : /usr/bin/perl
dblatex :
xsltproc : /usr/bin/xsltproc
Using LLVM tools
llc : /usr/local/bin/llc
opt : /usr/local/bin/opt
HsColour was not found; documentation will not contain source links
Building DocBook HTML documentation : NO
Building DocBook PS documentation : NO
Building DocBook PDF documentation : NO
and the build error:
"inplace/bin/ghc-stage1" -static -H32m -O -package-name
ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/.
-ilibraries/ghc-prim/dist-install/build
-ilibraries/ghc-prim/dist-install/build/autogen
-Ilibraries/ghc-prim/dist-install/build
-Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/.
-optP-include
-optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package
rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash
-XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples
-XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts
-dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir
libraries/ghc-prim/dist-install/build -stubdir
libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c
libraries/ghc-prim/./GHC/IntWord64.hs -o
libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno
libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o
/var/folders/_c/4n2x0zfx7mx5gk_46pdxn3pm0000gn/T/ghc41160_0/ghc41160_1.split__2.s:unknown:missing
indirect symbols for section (__DATA,__la_sym_ptr2)
make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/IntWord64.o] Error 1
make[1]: *** Deleting file
`libraries/ghc-prim/dist-install/build/GHC/IntWord64.o'
make: *** [all] Error 2

Information about my xcode and platform:
$ xcodebuild -version
Xcode 4.6
Build version 4H127
$ uname -a
Darwin a.local 12.2.1 Darwin Kernel Version 12.2.1: Thu Oct 18 16:32:48 PDT
2012; root:xnu-2050.20.9~2/RELEASE_X86_64 x86_64
I'll update my xcode tools and retry the GHC build.

On Sun, Mar 17, 2013 at 01:06:10PM -0400, Andreas Voellmy wrote:
and the build error:
"inplace/bin/ghc-stage1" -static -H32m -O -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/.
[...]
libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o /var/folders/_c/4n2x0zfx7mx5gk_46pdxn3pm0000gn/T/ghc41160_0/ghc41160_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2) make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/IntWord64.o] Error 1 make[1]: *** Deleting file `libraries/ghc-prim/dist-install/build/GHC/IntWord64.o' make: *** [all] Error 2
Can you "mkdir tmp", rerun the command with "-keep-tmp-files -tmpdir tmp" and send me the temporary files please?

Thanks
Ian

Hello,
and the build error:
"inplace/bin/ghc-stage1" -static -H32m -O -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/. [...] libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o /var/folders/_c/4n2x0zfx7mx5gk_46pdxn3pm0000gn/T/ghc41160_0/ghc41160_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2) make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/IntWord64.o] Error 1 make[1]: *** Deleting file `libraries/ghc-prim/dist-install/build/GHC/IntWord64.o' make: *** [all] Error 2
Can you "mkdir tmp", rerun the command with "-keep-tmp-files -tmpdir tmp" and send me the temporary files please?
This problem happens if a HS file is compiled through "make". If I compile it by copy-pasting the "inplace/bin/ghc-stage1" command, it works.

I said that "dtruss" changes the behavior, but it turned out that "dtruss" does not matter.

Should I add "-keep-tmp-files -tmpdir tmp" to a makefile? If so, please give me a patch.

--Kazu

On Mon, Mar 18, 2013 at 11:28:17AM +0900, Kazu Yamamoto wrote:
"inplace/bin/ghc-stage1" -static -H32m -O -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/. [...] libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o /var/folders/_c/4n2x0zfx7mx5gk_46pdxn3pm0000gn/T/ghc41160_0/ghc41160_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2) make[1]: *** [libraries/ghc-prim/dist-install/build/GHC/IntWord64.o] Error 1 make[1]: *** Deleting file `libraries/ghc-prim/dist-install/build/GHC/IntWord64.o' make: *** [all] Error 2
Can you "mkdir tmp", rerun the command with "-keep-tmp-files -tmpdir tmp" and send me the temporary files please?
This problem happens if a HS file is compiled through "make". If I compiled it by copy-pasting "inplace/bin/ghc-stage1", it works.
I said that "dtruss" changes the behavior but it appeared that "dtruss" does not matter.
Should I put "-keep-tmp-files -tmpdir tmp" to a makefile? If so, please give me a patch.
If you put
    SRC_HC_OPTS += -keep-tmp-files -tmpdir tmp
in mk/build.mk or mk/validate.mk (depending on whether or not you are validating) then it will be used for all compilations.

Or you can make it more specific, e.g.
    libraries/ghc-prim_dist-install_EXTRA_HC_OPTS += -keep-tmp-files -tmpdir tmp
will use it only for ghc-prim.

Could you also send me your complete mk/build.mk and mk/validate.mk, and the commands you're using to compile GHC, please?

Thanks
Ian

If you put SRC_HC_OPTS += -keep-tmp-files -tmpdir tmp in mk/build.mk or mk/validate.mk (depending on whether or not you are validating) then it will be used for all compilations.
I did this for "mk/build.mk" but nothing changed.

"inplace/bin/ghc-stage1" -static -H32m -O -keep-tmp-files -tmpdir tmp -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. -optP-include -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash -XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples -XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts -dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c libraries/ghc-prim/./GHC/IntWord64.hs -o libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o
tmp/ghc63821_0/ghc63821_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2)
Could you also send me your complete mk/build.mk and mk/validate.mk, and the commands you're using to compile GHC, please?
I had not created "mk/build.mk" until now.

--Kazu

On Wed, Mar 20, 2013 at 02:03:21PM +0900, Kazu Yamamoto wrote:
If you put SRC_HC_OPTS += -keep-tmp-files -tmpdir tmp in mk/build.mk or mk/validate.mk (depending on whether or not you are validating) then it will be used for all compilations.
I did this for "mk/build.mk" but nothing changed.
"inplace/bin/ghc-stage1" -static -H32m -O -keep-tmp-files -tmpdir tmp -package-name ghc-prim-0.3.1.0 -hide-all-packages -i -ilibraries/ghc-prim/. -ilibraries/ghc-prim/dist-install/build -ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/dist-install/build -Ilibraries/ghc-prim/dist-install/build/autogen -Ilibraries/ghc-prim/. -optP-include -optPlibraries/ghc-prim/dist-install/build/autogen/cabal_macros.h -package rts-1.0 -split-objs -package-name ghc-prim -XHaskell98 -XCPP -XMagicHash -XForeignFunctionInterface -XUnliftedFFITypes -XUnboxedTuples -XEmptyDataDecls -XNoImplicitPrelude -O2 -no-user-package-db -rtsopts -dynamic-too -odir libraries/ghc-prim/dist-install/build -hidir libraries/ghc-prim/dist-install/build -stubdir libraries/ghc-prim/dist-install/build -hisuf hi -osuf o -hcsuf hc -c libraries/ghc-prim/./GHC/IntWord64.hs -o libraries/ghc-prim/dist-install/build/GHC/IntWord64.o -dyno libraries/ghc-prim/dist-install/build/GHC/IntWord64.dyn_o tmp/ghc63821_0/ghc63821_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2)
Can you send me all the files in tmp/ghc63821_0 please?
Could you also send me your complete mk/build.mk and mk/validate.mk, and the commands you're using to compile GHC, please?
I don't create "mk/build.mk" so far.
And you're just running "make", with no -j flag or anything?

Thanks
Ian

--
Ian Lynagh, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Hi Ian,
tmp/ghc63821_0/ghc63821_1.split__2.s:unknown:missing indirect symbols for section (__DATA,__la_sym_ptr2)
I believe I've fixed this now; please let me know if you still have problems.
Build finished on Mac and "ghc" works if executed inplace:

----
% ./inplace/bin/ghc-stage2 --interactive
GHCi, version 7.7.20130323: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
[Prelude]
----

And "otool -L ghc-stage2" says that all necessary libraries are linked.

However, "make install" fails:

----
Installing library in /ghc-head/lib/ghc-7.7.20130323/haskell2010-1.1.1.0
"/ghc-head/lib/ghc-7.7.20130323/bin/ghc-pkg" --force --global-package-db "/ghc-head/lib/ghc-7.7.20130323/package.conf.d" update rts/package.conf.install
dyld: Library not loaded: @loader_path/../terminfo-0.3.2.5/libHSterminfo-0.3.2.5-ghc7.7.20130323.dylib
Referenced from: /ghc-head/lib/ghc-7.7.20130323/bin/ghc-pkg
Reason: image not found
make[1]: *** [install_packages] Trace/BPT trap: 5
make: *** [install] Error 2
----

When I executed the installed "ghci", the following error was displayed.

----
dyld: Library not loaded: @loader_path/../haskeline-0.7.0.4/libHShaskeline-0.7.0.4-ghc7.7.20130323.dylib
Referenced from: /ghc-head/lib/ghc-7.7.20130323/bin/ghc
Reason: image not found
zsh: trace trap ghci
----

--Kazu

On Sun, Mar 24, 2013 at 08:40:34PM +0900, Kazu Yamamoto wrote:
However, "make install" fails:
---- Installing library in /ghc-head/lib/ghc-7.7.20130323/haskell2010-1.1.1.0 "/ghc-head/lib/ghc-7.7.20130323/bin/ghc-pkg" --force --global-package-db "/ghc-head/lib/ghc-7.7.20130323/package.conf.d" update rts/package.conf.install dyld: Library not loaded: @loader_path/../terminfo-0.3.2.5/libHSterminfo-0.3.2.5-ghc7.7.20130323.dylib Referenced from: /ghc-head/lib/ghc-7.7.20130323/bin/ghc-pkg Reason: image not found make[1]: *** [install_packages] Trace/BPT trap: 5 make: *** [install] Error 2
I can't reproduce this. Does /ghc-head/lib/ghc-7.7.20130323/terminfo-0.3.2.5/libHSterminfo-0.3.2.5-ghc7.7.20130323.dylib exist?

Thanks
Ian

From: Andreas Voellmy
A better approach would be to try to register with kqueue and then if it doesn't work, register it with an IO manager thread that is using select for the backend. We can probably reuse the IO manager thread that is watching timers for this purpose. With the parallel IO manager, we no longer use it to wait on files, but we certainly could do that. That would save us from adding more threads. By only failing over to the manager-thread-using-select-backend if kqueue fails, we don't need to maintain a list of file types that kqueue works for, which might be a pain to maintain reliably.
Yeah, I think that is the best approach.

_______________________________________________________
- PHO - http://cielonegro.org/
OpenPGP public key: 1024D/1A86EF72
Fpr: 5F3E 5B5F 535C CE27 8254 4D1A 14E7 9CA7 1A86 EF72

On 15/03/13 12:54, PHO wrote:
I found the HEAD stopped working on MacOS X 10.5.8 since the parallel I/O manager got merged to HEAD. Stage-2 compiler successfully builds (including Language.Haskell.TH.Syntax contrary to the report by Kazu Yamamoto) but the resulting binary is very unstable especially for ghci:
% inplace/bin/ghc-stage2 --interactive
GHCi, version 7.7.20130313: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> <stdin>: hGetChar: failed (Operation not supported)
So I took a dtruss log and found it was kevent(2) that returned ENOTSUP. GHC.Event.KQueue was just registering the stdin for EVFILT_READ, whose type was of course tty, and then kevent(2) said "tty is not supported". Didn't the old I/O manager do the same thing? Why was it working then?
After a hard investigation, I concluded that the old I/O manager was not really working. It just looked fine but in fact wasn't. Here's an explanation: If a fd to be registered is unsupported by kqueue, kevent(2) returns -1 iff no incoming event buffer is passed together. Otherwise it successfully returns with an incoming kevent whose "flags" is EV_ERROR and "data" contains an errno. The I/O manager has always been passing a non-empty event buffer until the commit e5f5cfcd, while it wasn't (and still isn't) checking if a received event in fact represents an error. That is, the KQueue backend asks the kernel to monitor the stdin's readability. The kernel then immediately delivers an event saying ENOTSUP. The KQueue backend thinks "Hey, the stdin is now readable!" so it invokes a callback associated with the fd. The thread which called "threadWaitRead" is now awakened and performs a supposedly non-blocking read on the fd, which in fact blocks but works anyway.
Interesting. I think this may explain what I saw in this ticket:

http://hackage.haskell.org/trac/ghc/ticket/4245#comment:22

Cheers,
Simon
participants (7): Andreas Voellmy, Austin Seipp, Ian Lynagh, Johan Tibell, Kazu Yamamoto, PHO, Simon Marlow