[GHC] #10298: Infinite loop when shared libraries are unavailable

#10298: Infinite loop when shared libraries are unavailable
-------------------------------------+-------------------------------------
Reporter: snoyberg | Owner: simonmar
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 7.10.1
Keywords: | Operating System: Linux
Architecture: x86_64 (amd64) | Type of failure: Runtime crash
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Revisions: |
-------------------------------------+-------------------------------------
Originally discussed at:
https://groups.google.com/d/msg/haskell-cafe/5ZTv5mCG_HI/hBJ-VkdpxdoJ.
Note that this was originally discussed as a static linking and Docker
issue, but in fact affects dynamically linked executables without any
containerization.
I've put together the following script that reproduces my problem:
{{{
cat > hello.hs <

Comment (by rwbarton):
I was able to reproduce the loop, but rather than copying `/usr/lib/x86_64-linux-gnu/gconv/` into the chroot, I tried copying in `/usr/lib/locale/` instead, and that was also sufficient to let the program run normally.
It's to be expected that a program built with GHC needs those locale files, since String IO is locale-aware. Of course an infinite loop is not so easy to debug, and it'd be nice to have an error message. In fact the error status from `iconv_open` is being correctly checked, and converted to an exception, which is then caught and displayed by the default exception handler. The trouble is that the display of exceptions is also locale-aware...
Curiously even an empty `main = return ()` triggers this behavior with 7.8.4, but it runs successfully on 7.10.1. I couldn't figure out why, perhaps some change in the IO manager?
I don't have any good ideas about how to improve this situation. Maybe try to set up the locale for IO at some point, catch the exception if it fails and `barf()` rather than using regular IO to display the exception. But when exactly?
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:1
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
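The failure mode described in comment 1 can be modelled in a few lines. This is a simplified sketch, not GHC's actual handler: `brokenPutStrLn`, `naiveReport` and `depthLimit` are hypothetical names, and the always-failing printer only stands in for locale-aware output whose `iconv_open` call cannot find its data files. It shows why reporting an exception through the same failing channel never makes progress.

```haskell
import Control.Exception (IOException, catch, throwIO)

-- Hypothetical stand-in for locale-aware output: it always fails, the way
-- String IO does when iconv_open cannot load its conversion data.
brokenPutStrLn :: String -> IO ()
brokenPutStrLn _ = throwIO (userError "iconv_open failed")

-- Artificial bound so that this sketch terminates; the real loop has none.
depthLimit :: Int
depthLimit = 1000

-- Naive reporter: a failure while reporting is itself reported the same way.
-- Returns how many nested attempts were made before giving up.
naiveReport :: Int -> String -> IO Int
naiveReport depth msg
  | depth >= depthLimit = return depth
  | otherwise =
      (brokenPutStrLn msg >> return depth)
        `catch` \e -> naiveReport (depth + 1) (show (e :: IOException))
```

Calling `naiveReport 0 "boom"` only stops because of the artificial `depthLimit`; without it, each attempt raises a fresh exception and the handler recurses, which is consistent with the CPU and memory growth in the original report.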

Comment (by rwbarton):
Actually there are more hidden traps. The `IOError` that is raised from the failure of `iconv_open` is produced by `errnoToIOError`, which is defined as
{{{
errnoToIOError loc errno maybeHdl maybeName = unsafePerformIO $ do
    str <- strerror errno >>= peekCString
    return (IOError maybeHdl errType loc str (Just errno') maybeName)
    where
    -- ...
}}}
This means that matching on the `IOError` constructor will recursively raise another exception, since `peekCString` uses the user's locale. In `GHC.TopHandler`, `real_handler` matches on the `IOError` constructor to decide how to exit. So a possible fix is
{{{
diff --git a/libraries/base/GHC/TopHandler.hs b/libraries/base/GHC/TopHandler.hs
index d7c0038..5d4094a 100644
--- a/libraries/base/GHC/TopHandler.hs
+++ b/libraries/base/GHC/TopHandler.hs
@@ -157,14 +157,25 @@ real_handler exit se = do
            Just (ExitFailure n) -> exit n

            -- EPIPE errors received for stdout are ignored (#2699)
-           _ -> case fromException se of
+           _ -> catch (case fromException se of
                 Just IOError{ ioe_type = ResourceVanished,
                               ioe_errno = Just ioe,
                               ioe_handle = Just hdl }
                    | Errno ioe == ePIPE, hdl == stdout -> exit 0
                 _ -> do reportError se
                         exit 1
-
+                ) (disasterHandler exit)
+
+-- don't use errorBelch() directly, because we cannot call varargs functions
+-- using the FFI.
+foreign import ccall unsafe "HsBase.h errorBelch2"
+   errorBelch :: CString -> CString -> IO ()
+
+disasterHandler :: (Int -> IO a) -> IOError -> IO a
+disasterHandler exit _ =
+  withCAString "%s" $ \fmt ->
+    withCAString "encountered an exception while trying to report an exception" $ \msg ->
+      errorBelch fmt msg >> exit 1

 -- try to flush stdout/stderr, but don't worry if we fail
 -- (these handles might have errors, and we don't want to go into
}}}
though it feels a bit artificial.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:2
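The trap in comment 2 is that the error value is itself a lazy `unsafePerformIO` thunk, so merely pattern matching on it inside a handler can throw again. The following is a minimal sketch of that shape and of the "catch around the handler" idea from the diff. It is not the code from `GHC.TopHandler`: `Report`, `decodeMessage`, `mkReport`, `classify` and `exitCodeFor` are hypothetical names, and the exit codes are arbitrary.

```haskell
import Control.Exception (SomeException, catch, evaluate)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical error record standing in for IOError.
data Report = Report { reportCode :: Int, reportText :: String }

-- Hypothetical decoder that always fails, as peekCString would when the
-- locale converter cannot be opened.
decodeMessage :: Int -> IO String
decodeMessage _ = ioError (userError "no converter available")

-- Same shape as errnoToIOError: the whole record is one lazy thunk, so the
-- failure is deferred until the constructor is first inspected.
mkReport :: Int -> Report
mkReport n = unsafePerformIO $ do
  str <- decodeMessage n
  return (Report n str)

-- Decide an exit code by matching on the constructor, which forces the thunk.
classify :: Report -> Int
classify (Report { reportCode = 32 }) = 0
classify _ = 1

-- Guarded version: a second exception raised while inspecting the report is
-- caught by a last-resort handler that does no further decoding.
exitCodeFor :: Report -> IO Int
exitCodeFor r = evaluate (classify r) `catch` lastResort
  where
    lastResort :: SomeException -> IO Int
    lastResort _ = return 2
```

Here `exitCodeFor (mkReport 32)` takes the last-resort path rather than the `0` branch, because forcing the record raises the deferred decoding failure. The key property of the fallback is that it does not touch the failing value again.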

Changes (by trommler):
 * cc: trommler (added)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:3

Comment (by rwbarton):
I suppose another approach would be to store the result of `strerror` in a new field of the IOError as a ByteString, and display it using non-locale-aware IO (currently we do a round-trip through String, which should be the identity). Then put the field with the String version behind its own unsafePerformIO.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:4
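The idea in comment 4, keeping the `strerror` text as raw bytes and writing it without any locale-dependent conversion, might look roughly like this. It is only a sketch under stated assumptions: it uses the `bytestring` package rather than a new `IOError` field, and `c_strerror`, `rawErrorMessage` and `reportRaw` are hypothetical names. Note that C's `strerror` is not thread-safe, which a real implementation would have to address.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Foreign.C.String (CString)
import Foreign.C.Types (CInt (..))
import System.IO (Handle)

-- Direct binding to the C library's strerror.
foreign import ccall unsafe "string.h strerror"
  c_strerror :: CInt -> IO CString

-- Copy the message as raw bytes; no text decoding (and hence no iconv) occurs.
rawErrorMessage :: CInt -> IO B.ByteString
rawErrorMessage errno = c_strerror errno >>= B.packCString

-- Write the bytes as they are, bypassing the handle's text encoding.
reportRaw :: Handle -> B.ByteString -> IO ()
reportRaw h msg = B.hPut h msg >> B.hPut h (BC.singleton '\n')
```

For example, `rawErrorMessage 2 >>= reportRaw stderr` (with `stderr` from `System.IO`) would print the C library's message for errno 2 without going through `String`.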

Description changed by simonpj:
Old description:
Originally discussed at: https://groups.google.com/d/msg/haskell-cafe/5ZTv5mCG_HI/hBJ-VkdpxdoJ. Note that this was originally discussed as a static linking and Docker issue, but in fact affects dynamically linked executables without any containerization.
I've put together the following script that reproduces my problem:
{{{
cat > hello.hs <
rm -rf tmp
mkdir tmp
cp hello tmp

mkdir -p tmp/usr/lib/x86_64-linux-gnu
cp /usr/lib/x86_64-linux-gnu/libgmp.so.10 tmp/usr/lib/x86_64-linux-gnu

mkdir -p tmp/lib/x86_64-linux-gnu
cp \
    /lib/x86_64-linux-gnu/libm.so.6 \
    /lib/x86_64-linux-gnu/librt.so.1 \
    /lib/x86_64-linux-gnu/libdl.so.2 \
    /lib/x86_64-linux-gnu/libc.so.6 \
    /lib/x86_64-linux-gnu/libpthread.so.0 \
    tmp/lib/x86_64-linux-gnu

mkdir -p tmp/lib64
cp /lib64/ld-linux-x86-64.so.2 tmp/lib64

#mkdir -p tmp/usr/lib/x86_64-linux-gnu/gconv/
#cp \
#    /usr/lib/x86_64-linux-gnu/gconv/UTF-32.so \
#    /usr/lib/x86_64-linux-gnu/gconv/gconv-modules \
#    tmp/usr/lib/x86_64-linux-gnu/gconv

sudo chroot tmp /hello
}}}
If I uncomment the block that copies the gconv files, the program runs as expected. However, without those files copied, the program burns CPU and consumes memory until killed by the OS. I ran strace on a similar executable, and got the results at:
https://gist.github.com/snoyberg/095efb17e36acc1d6360
Note that this problem also occurs with statically linked executables when some of the other dynamically linked libraries are not available in the chroot environment.
Expected behavior: ideally, the gconv files and other shared libraries would not need to be present at all, especially when statically linked. Barring that, it would be much better if the RTS could produce a meaningful error message about the missing file. Note that strace does demonstrate that an open system call is failing, e.g.:
open("/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = -1 ENOENT (No such file or directory)
Reproduced on GHC 7.8.4 and 7.10.1, on Ubuntu 14.04 64-bit.
New description:
Originally discussed at:
https://groups.google.com/d/msg/haskell-cafe/5ZTv5mCG_HI/hBJ-VkdpxdoJ.
Note that this was originally discussed as a static linking and Docker
issue, but in fact affects dynamically linked executables without any
containerization.
Other examples of the same bug: #7695, #8977, #8928
I've put together the following script that reproduces my problem:
{{{
cat > hello.hs <

Comment (by thoughtpolice):
Oh, also, we should definitely have a `Note` explaining exactly what's going on here, just in case we change it later.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:8

Comment (by hsyl20):
I have an additional fix to propose for this bug: handle Unicode to ASCII conversions in GHC without Iconv:
{{{
diff --git a/libraries/base/GHC/IO/Encoding.hs b/libraries/base/GHC/IO/Encoding.hs
index 31683b4..c67f317 100644
--- a/libraries/base/GHC/IO/Encoding.hs
+++ b/libraries/base/GHC/IO/Encoding.hs
@@ -243,6 +243,7 @@ mkTextEncoding' cfm enc = case [toUpper c | c <- enc, c /= '-'] of
     "UTF32"   -> return $ UTF32.mkUTF32 cfm
     "UTF32LE" -> return $ UTF32.mkUTF32le cfm
     "UTF32BE" -> return $ UTF32.mkUTF32be cfm
+    "ANSI_X3.41968" -> return char8 -- match "ANSI_X3.4-1968" (ASCII)
 #if defined(mingw32_HOST_OS)
     'C':'P':n | [(cp,"")] <- reads n -> return $ CodePage.mkCodePageEncoding cfm cp
     _ -> unknownEncodingErr (enc ++ codingFailureModeSuffix cfm)
}}}
It seems that static binaries fall back to ASCII even if the current locale is UTF-8. ASCII is identified with the string "ANSI_X3.4-1968" on my system (Linux 4.0, glibc 2.21). Maybe we should match other possible aliases?
I tested this patch with a single static binary in an initramfs and it works fine now. It should fix #7695 too (single static binary in a chrooted environment).
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:9
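The patch in comment 9 matches `"ANSI_X3.41968"` without its hyphen because `mkTextEncoding'` upper-cases the name and drops every `-` before the `case`. The sketch below reproduces only that normalisation step and shows how further aliases could be matched, as the comment suggests. `normalizeEncodingName` and `isAsciiAlias` are hypothetical helpers, and the alias list beyond `ANSI_X3.4-1968` is an assumption, not something taken from the patch.

```haskell
import Data.Char (toUpper)

-- Same normalisation as the scrutinee of mkTextEncoding': upper-case the name
-- and remove every hyphen.
normalizeEncodingName :: String -> String
normalizeEncodingName enc = [toUpper c | c <- enc, c /= '-']

-- Hypothetical alias check. Only "ANSI_X3.4-1968" comes from the patch; the
-- other two names are assumed aliases added for illustration.
isAsciiAlias :: String -> Bool
isAsciiAlias enc =
  normalizeEncodingName enc `elem` ["ANSI_X3.41968", "USASCII", "ASCII"]
```

With this normalisation, `"ANSI_X3.4-1968"` and `"us-ascii"` both count as ASCII, while `"UTF-8"` normalises to `"UTF8"` and does not.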

Changes (by thoughtpolice):
 * status: new => patch
 * differential: => Phab:D898
 * milestone: => 7.10.2
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:10

Comment (by Austin Seipp

Changes (by thoughtpolice):
 * status: patch => closed
 * resolution: => duplicate
 * related: => #7695
Comment:
Fixed; I'm closing this as a duplicate of #7695, which I'll move to `merge` for 7.10.2.
Unfortunately, I couldn't think of a way to introduce a reliable test here without chroot/sudo, which is unsuitable for the testsuite. But I can confirm it fixes the above program: in fact, it runs successfully now, thanks to the extra patch from Sylvain, and if `iconv` fails, an exception should be thrown properly.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10298#comment:12

Comment (by Ben Gamari