strace breaks cabal - how to find the problem?

On my obscure configuration (GHC 6.10.1, with pkgenv activated, on Fedora Linux rawhide running on a VirtualBox x86 VM with hardware virtualisation enabled), running strace on cabal causes it to misbehave, as described below.

I don't know whether this is due to a bug in cabal, the GHC runtime, strace, the Linux kernel, VirtualBox, or my processor's virtualization support. Not sure how to proceed to debug this. Perhaps someone else could try this on their configuration and report what they find?

This command was run in an untarred copy of hs-bibutils 0.1 (not that it matters):

strace -f -e trace=file cabal install --extra-include-dirs=../bibutils_4.1/lib/ 2>&1|cat >cabal.strace

Cabal fails to determine the version of ghc-pkg (or sometimes ghc), and stops, complaining that it can't verify that the version of ghc(-pkg) is the required one. ghc-pkg is execve'd, but something goes wrong - not sure what.

--
Robin

Robin Green wrote:
> On my obscure configuration (GHC 6.10.1, with pkgenv activated, on Fedora Linux rawhide running on a VirtualBox x86 VM with hardware virtualisation enabled), running strace on cabal causes it to misbehave, as described below.
>
> I don't know whether this is due to a bug in cabal, the GHC runtime, strace, the Linux kernel, VirtualBox, or my processor's virtualization support. Not sure how to proceed to debug this. Perhaps someone else could try this on their configuration and report what they find?
>
> This command was run in an untarred copy of hs-bibutils 0.1 (not that it matters):
>
> strace -f -e trace=file cabal install --extra-include-dirs=../bibutils_4.1/lib/ 2>&1|cat >cabal.strace
>
> Cabal fails to determine the version of ghc-pkg (or sometimes ghc), and stops, complaining that it can't verify that the version of ghc(-pkg) is the required one. ghc-pkg is execve'd, but something goes wrong - not sure what.

This sounds slightly familiar. The System.Process library uses vfork() on Unix systems, which as it turns out helps to avoid some race conditions. However, while debugging something in this area recently (using strace) I remember seeing different behaviour when running under strace. I suspect that strace is doing something to vfork().

Perhaps we should bite the bullet and use fork(), and fix the race conditions properly. As I recall, what was happening was that the fork() took so long that it got interrupted by the timer signal (1/50 secs), and restarted, ad infinitum. So in order to use fork() we have to disable timer signals (for the whole process? what if multiple threads are doing fork()? sigh.).

Cheers,
Simon
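
The "disable timer signals around fork()" idea can be sketched in C as follows. This is only an illustration of the technique, not what System.Process or the GHC runtime does: the helper names `fork_without_timer_signals` and `run_and_wait` are hypothetical, and the sketch assumes the interval timer is delivered as SIGALRM or SIGVTALRM. It uses sigprocmask(), which is only well defined in a single-threaded process; a threaded program would need pthread_sigmask() instead, and even then the mask covers only the calling thread, which is exactly the open question raised above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: call fork() with the (assumed) timer signals
 * blocked, so a pending tick cannot interrupt the system call.
 * The previous signal mask is restored in both parent and child. */
static pid_t fork_without_timer_signals(void)
{
    sigset_t timer_sigs, saved;
    sigemptyset(&timer_sigs);
    sigaddset(&timer_sigs, SIGALRM);   /* assumption: ITIMER_REAL tick    */
    sigaddset(&timer_sigs, SIGVTALRM); /* assumption: ITIMER_VIRTUAL tick */

    if (sigprocmask(SIG_BLOCK, &timer_sigs, &saved) != 0)
        return -1;

    pid_t pid = fork();
    int saved_errno = errno;           /* keep fork()'s errno intact */

    /* Runs in the parent and, on success, in the child as well. */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    errno = saved_errno;
    return pid;
}

/* Hypothetical helper: run argv[0] with arguments argv and return its
 * exit status, 127 if it could not be executed, or -1 on other errors. */
static int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork_without_timer_signals();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                    /* exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)            /* retry only if interrupted */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Blocking rather than ignoring the signals means a tick that arrives during the fork() stays pending and is delivered once the mask is restored, so the timer is delayed rather than lost.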

On 2009 Mar 25, at 9:12, Simon Marlow wrote:
> Robin Green wrote:
>
> This sounds slightly familiar. The System.Process library uses vfork() on Unix systems, which as it turns out helps to avoid some race conditions. However, while debugging something in this area recently (using strace) I remember seeing different behaviour when running under strace. I suspect that strace is doing something to vfork().

strace and vfork don't mix; vfork uses a hack that won't work right when tracing is enabled.

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
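
One way to narrow down which layer is at fault might be to take cabal and the GHC runtime out of the picture with a small C program that does nothing but vfork() and exec a command. The sketch below is such a reproduction under stated assumptions: `run_with_vfork` is a hypothetical helper, and the child restricts itself to execvp() and _exit(), which is all a vfork() child may safely do.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0] via vfork() + execvp() and return
 * its exit status, 127 if it could not be executed, or -1 on errors. */
static int run_with_vfork(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: only exec or _exit is safe here, because the parent's
         * memory is shared until one of them happens. */
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Parent: resumes once the child has exec'd or exited. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)            /* retry only if interrupted */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling this from a `main` that runs, say, `ghc-pkg --version`, and comparing the result with and without `strace -f`, would show whether the misbehaviour appears with plain vfork() alone or only when the GHC runtime is involved.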