[GHC] #15338: ghc-pkg misbehaves after possible miscompilation on m68k and sh4

#15338: ghc-pkg misbehaves after possible miscompilation on m68k and sh4
-------------------------------------+-------------------------------------
Reporter: glaubitz | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 8.4.3
Keywords: | Operating System: Linux
Architecture: m68k | Type of failure: Incorrect result at runtime
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
We have observed in Debian that GHC can produce an erratically behaving
version of ghc-pkg, which causes problems when its output is piped to
another program.

Example:

{{{
make_setup_recipe
Running ghc --make Setup.hs -o debian/hlibrary.setup
[1 of 1] Compiling Main ( Setup.hs, Setup.o )
Linking debian/hlibrary.setup ...
. /usr/share/haskell-devscripts/Dh_Haskell.sh && \
configure_recipe
Running debian/hlibrary.setup configure --ghc -v2 --package-db=/var/lib/ghc/package.conf.d --prefix=/usr --libdir=/usr/lib/haskell-packages/ghc/lib --libexecdir=/usr/lib --builddir=dist-ghc --ghc-option=-optl-Wl\,-z\,relro --haddockdir=/usr/lib/ghc-doc/haddock/base-prelude-/ --datasubdir=base-prelude --htmldir=/usr/share/doc/libghc-base-prelude-doc/html/ --enable-library-profiling
Configuring base-prelude-1.2.1...
Warning: cannot determine version of /usr/bin/ghc-pkg :
""
hlibrary.setup: The program 'ghc-pkg' is required but the version of
/usr/bin/ghc-pkg could not be determined.
}}}

(Full log: https://buildd.debian.org/status/fetch.php?pkg=haskell-base-prelude&arch=m68k&ver=1.2.1-1&stamp=1530588494&raw=0)

Examining the behavior of the affected ghc-pkg binary on m68k shows what's
going on:

{{{
root@pacman:~# uname -a
Linux pacman 4.16.0-2-m68k #1 Debian 4.16.16-1 (2018-06-19) m68k GNU/Linux
root@pacman:~# ghc-pkg --version
GHC package manager version 8.2.2
root@pacman:~# ghc-pkg --version | cat
root@pacman:~#
}}}

Comparing that to an x86_64 machine shows that piping to cat should
actually output something:

{{{
glaubitz@epyc:~$ uname -a
Linux epyc 4.16.0-2-amd64 #1 SMP Debian 4.16.12-1 (2018-05-27) x86_64 GNU/Linux
glaubitz@epyc:~$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.2.2
glaubitz@epyc:~$ ghc --version | cat
The Glorious Glasgow Haskell Compilation System, version 8.2.2
glaubitz@epyc:~$
}}}

Further investigation shows that the problem is partially resolved when
building GHC with reduced optimization:

{{{
echo "SRC_HC_OPTS += -O0 -H64m" >> mk/build.mk
echo "GhcStage1HcOpts = -O" >> mk/build.mk
echo "GhcStage2HcOpts = -O0" >> mk/build.mk
echo "GhcLibHcOpts = -O" >> mk/build.mk
}}}

The above issue goes away, but ghc-pkg itself still shows some strange
behavior:

{{{
make_setup_recipe
Running ghc --make Setup.hs -o debian/hlibrary.setup
[1 of 1] Compiling Main ( Setup.hs, Setup.o )
Linking debian/hlibrary.setup ...
. /usr/share/haskell-devscripts/Dh_Haskell.sh && \
configure_recipe
Running debian/hlibrary.setup configure --ghc -v2 --package-db=/var/lib/ghc/package.conf.d --prefix=/usr --libdir=/usr/lib/haskell-packages/ghc/lib --libexecdir=/usr/lib --builddir=dist-ghc --ghc-option=-optl-Wl\,-z\,relro --haddockdir=/usr/lib/ghc-doc/haddock/microlens-ghc-0.4.8.0/ --datasubdir=microlens-ghc --htmldir=/usr/share/doc/libghc-microlens-ghc-doc/html/ --enable-library-profiling
Configuring microlens-ghc-0.4.8.0...
hlibrary.setup: ghc-pkg dump failed: dieVerbatim: user error (hlibrary.setup:
'/usr/bin/ghc-pkg' exited with an error:
ghc-pkg: <stdout>: commitBuffer: invalid argument (invalid character)
)
}}}

(Full log: https://buildd.debian.org/status/fetch.php?pkg=haskell-microlens-ghc&arch=m68k&ver=0.4.8.0-2&stamp=1529960552&raw=0)

To reproduce the issue, GHC can be tested using QEMU, which has pretty good
support for the m68k target these days.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Changes (by slyfox):

 * cc: slyfox (added)

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:1

Comment (by slyfox):

I think I can reproduce it in qemu-sh4 on ghc-HEAD. Will look at it.

{{{
$ file utils/ghc-pkg/dist-install/build/tmp/ghc-pkg
utils/ghc-pkg/dist-install/build/tmp/ghc-pkg: ELF 32-bit LSB executable, Renesas SH, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, not stripped
$ LD_LIBRARY_PATH="$(pwd)/rts/dist/build:"$(for d in $(find -name '*.so' | fgrep dist-install); do dirname "$(pwd)/$d"; done | tr '$\n' ':') utils/ghc-pkg/dist-install/build/tmp/ghc-pkg --version
GHC package manager version 8.7.20180704
$ LD_LIBRARY_PATH="$(pwd)/rts/dist/build:"$(for d in $(find -name '*.so' | fgrep dist-install); do dirname "$(pwd)/$d"; done | tr '$\n' ':') utils/ghc-pkg/dist-install/build/tmp/ghc-pkg --version | cat
}}}

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:2

Comment (by slyfox):

Here is the minimal reproducer:

{{{#!hs
-- a.hs
import System.IO

main :: IO ()
main = do
    hSetBuffering stdout (BlockBuffering Nothing)
    hPutStrLn stdout "hello"
}}}

{{{
$ ghc --make a.hs -O1 -dynamic
$ ./a
<no output>
}}}

Confirmed locally that:
 - broken: '''sh4''' and '''m68k'''
 - seem to work: '''x86_64''', '''mipsn32''', '''powerpc''', '''powerpc64''', '''sparc'''

I have not got to the bottom of the bug yet, but I am on the way there.
Somehow flushStdHandles fails to flush stdout when (and only when) it is in
fully buffered mode.

A few observations:
 - the trigger needs both '''-O1''' and '''-dynamic'''
 - both '''sh4''' and '''m68k''' are arches with 2-byte instruction width

I'll keep debugging.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:3
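The reproducer relies on a general property of block buffering: written bytes stay in an in-process buffer until something flushes the handle, so a flush that never happens loses the output entirely. This is presumably also why the failure only shows up behind `| cat`, since GHC's stdout is line-buffered on a terminal and block-buffered otherwise. The C sketch below shows the same property with stdio on a temporary file. It is an analogy that assumes a POSIX system, not GHC's implementation, and `bytes_on_disk` and `buffered_write_demo` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the file behind `f` as the kernel currently sees it, or -1. */
static long bytes_on_disk(FILE *f) {
    struct stat st;
    if (fstat(fileno(f), &st) != 0) {
        return -1;
    }
    return (long)st.st_size;
}

/* Writes "hello\n" to a fully buffered temporary file and reports how many
 * bytes were visible to the kernel before and after an explicit flush.
 * Returns 0 on success, -1 if the temporary file could not be set up. */
int buffered_write_demo(long *before_flush, long *after_flush) {
    static char buf[4096];
    FILE *f = tmpfile();
    if (f == NULL) {
        return -1;
    }
    /* Counterpart of hSetBuffering h (BlockBuffering ...) in the reproducer. */
    if (setvbuf(f, buf, _IOFBF, sizeof buf) != 0) {
        fclose(f);
        return -1;
    }
    fputs("hello\n", f);               /* lands in buf only */
    *before_flush = bytes_on_disk(f);
    fflush(f);                         /* now the bytes reach the file */
    *after_flush = bytes_on_disk(f);
    fclose(f);
    return 0;
}
```

Calling `buffered_write_demo` from a small `main` and printing the two counts shows that nothing is visible before the flush. In the bug above the flush does run, but on a different object from the one that holds the data.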

Comment (by slyfox):

Replying to [comment:3 slyfox]:
> I'll keep debugging.
I think I understand the breakage mechanics now:
The example above breaks because the '''stdout''' Haskell closure is
evaluated twice:
- once in the test binary
- once in the shared library
which creates two objects. '''hFlush''' is called only on the second
(shared library) object because it is called from there.
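The effect of flushing the wrong object can be modelled in a single C file. This is a simulation under stated assumptions, not GHC's runtime: `handle`, `fd_sink`, `lib_stdout`, `bin_stdout` and both `run_*` functions are hypothetical names, and the two structs merely stand in for the two evaluated copies of the stdout closure.

```c
#include <stddef.h>
#include <string.h>

/* A buffered handle: bytes written but not yet flushed. */
typedef struct {
    char   pending[64];
    size_t used;
} handle;

static char   fd_sink[64];   /* stands in for file descriptor 1 */
static size_t fd_len;

static handle lib_stdout;    /* copy the shared library keeps using */
static handle bin_stdout;    /* copy the executable ends up with */

static void handle_write(handle *h, const char *s) {
    size_t n = strlen(s);
    if (n > sizeof h->pending - h->used) {
        n = sizeof h->pending - h->used;
    }
    memcpy(h->pending + h->used, s, n);
    h->used += n;
}

static void handle_flush(handle *h) {
    size_t n = h->used;
    if (n > sizeof fd_sink - fd_len) {
        n = sizeof fd_sink - fd_len;
    }
    memcpy(fd_sink + fd_len, h->pending, n);
    fd_len += n;
    h->used = 0;
}

/* Broken case: the program writes through its own copy, while the exit-time
 * flush runs inside the library and sees only the library's copy.
 * Returns the number of bytes that reached the sink. */
size_t run_with_duplicate(void) {
    fd_len = 0;
    lib_stdout.used = 0;
    bin_stdout.used = 0;
    handle_write(&bin_stdout, "hello\n");
    handle_flush(&lib_stdout);
    return fd_len;
}

/* Healthy case: writer and flusher agree on a single object. */
size_t run_with_single_object(void) {
    fd_len = 0;
    lib_stdout.used = 0;
    handle_write(&lib_stdout, "hello\n");
    handle_flush(&lib_stdout);
    return fd_len;
}
```

In the duplicated case no byte reaches the sink, which matches the empty output seen when the reproducer runs in fully buffered mode.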
Duplication happens because '''base_GHCziIOziHandleziFD_stdout_closure'''
(the closure behind the '''System.IO.stdout''' object) is copied by the
linker's '''COPY''' relocation:
{{{
$ objdump -R -D bug/ghc-pkg
00415248

Comment (by slyfox):

Replying to [comment:4 slyfox]:
> Normally '''COPY''' relocations are used only for immutable ('''const''' in C land) data. But '''_closure'''s are mutable. I'll double-check generated C code and file a toolchain bug upstream.
James (jrtc27) pointed out that '''COPY''' relocations are fine for
mutable data as long as the shared library allows interposition of the
symbol.
James also noted that GHC uses '''-Bsymbolic''', which forbids symbol
interposition and binds global symbols to the library's own definition.
'''-Bsymbolic''' is set in the '''GHC''' driver:
http://git.haskell.org/ghc.git/blob/HEAD:/compiler/main/SysTools.hs#l550
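The difference between an interposable reference and a '''-Bsymbolic''' one can be simulated in a single C file. The sketch below is only a model of what the dynamic loader and linker do, not real relocation processing: `got_things`, `apply_copy_relocation`, `shlib_f_interposable`, `shlib_f_symbolic` and `library_view_after_store` are hypothetical names, while `things`, `shlib_f` and the stored value 45 follow the reproducer and output discussed in this thread.

```c
#include <string.h>

/* Definition that lives inside the "shared library". */
static int lib_things[4] = { 99, 98, 97, 96 };

/* Space the "executable" reserves as the target of the COPY relocation. */
static int bin_things[4];

/* Pointer the library goes through when the symbol may be interposed;
 * it plays the role of a GOT entry. */
static int *got_things = lib_things;

/* Model of a COPY relocation: copy the initial value into the executable
 * and point interposable references at that copy. */
static void apply_copy_relocation(void) {
    memcpy(bin_things, lib_things, sizeof bin_things);
    got_things = bin_things;
}

/* Library built without -Bsymbolic: the access goes through the GOT. */
static int shlib_f_interposable(int i) {
    return got_things[i];
}

/* Library built with -Bsymbolic: the access is bound to its own definition. */
static int shlib_f_symbolic(int i) {
    return lib_things[i];
}

/* The executable stores 45 into its things[0], then asks the library what
 * it sees. `symbolic` selects which kind of library is simulated. */
int library_view_after_store(int symbolic) {
    static const int initial[4] = { 99, 98, 97, 96 };
    memcpy(lib_things, initial, sizeof lib_things);
    got_things = lib_things;
    apply_copy_relocation();
    bin_things[0] = 45;
    return symbolic ? shlib_f_symbolic(0) : shlib_f_interposable(0);
}
```

With the interposable variant the library observes the store made by the executable. With the symbolic variant it keeps reading its own copy, which is the disagreement the thread describes.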
A smaller C-only reproducer that illustrates the problem is the
following:
{{{#!c
/* lib.c: */
int things[] = { 99,98,97,96 };
int shlib_f(int i) {
    return things[i];
}
}}}
{{{#!c
/* bin.c */
#include

Comment (by jrtc27):

Replying to [comment:5 slyfox]:
> Replying to [comment:4 slyfox]:
> > Normally '''COPY''' relocations are used only for immutable ('''const''' in C land) data. But '''_closure'''s are mutable. I'll double-check generated C code and file a toolchain bug upstream.
>
> James (jrtc27) pointed out that '''COPY''' relocations are fine for mutable data as long as shared library allows interposition of the symbol. James also noted that GHC uses '''-Bsymbolic''' which forbids symbol interposition and binds global symbols to library's definition.
Even for immutable data it can prove problematic if you perform address comparisons, but you're right that mutable data can lead to more problems.
> {{{
> $ cross=x86_64-pc-linux-gnu- emulator= ./a.sh
> target: x86_64-pc-linux-gnu-; emulator=; no-pie:
> main (before store): things[0] = 99
> shlib (before store): things[0] = 99
> main (after store): things[0] = 45
> shlib (after store): things[0] = 99
> target: x86_64-pc-linux-gnu-; emulator=; pie:
> main (before store): things[0] = 99
> shlib (before store): things[0] = 99
> main (after store): things[0] = 45
> shlib (after store): things[0] = 99
> }}}
>
> Note: the value of '''things[0]''' disagrees between binary and library copy.
Actually x86_64 is more unusual here in that PIE is *also* broken by this. Most architectures will fall back on the normal GOT mechanisms for PIE, but on x86_64 RIP-relative addressing is available, giving a cheaper position-independent way to access globals (and of course it's still fine for PIE because we can still use COPY relocations). If you were to run that for i386 or a traditional RISC architecture, you would see that PIE did not exhibit the bug.

Surely the way forward is to ditch `-Bsymbolic`, and ensure that whatever non-PIC patterns exist (as mentioned by https://ghc.haskell.org/trac/ghc/wiki/Commentary/PositionIndependentCode#Lin...) in the DSOs produced are fixed? Presumably the only non-PIC bits are whatever would work for PIE (otherwise how on earth can you have a variable image base) and must be PC-relative? For code that should be fine (you could even use `-Bsymbolic-functions` and I believe it would work), and for data there can't be many instances. In fact, I think `-Bsymbolic` might well work just fine for NCG as it already tries to avoid COPY relocations (does it always successfully avoid them?), in which case we could just drop `-Bsymbolic` when compiling via C (GCC should just do the right thing if you give it valid C).

Thoughts?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:6

Changes (by jrtc27):

 * cc: jrtc27 (added)

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:7

Comment (by bgamari):

Indeed we had similar [https://sourceware.org/bugzilla/show_bug.cgi?id=16177 issues] on ARM resulting in #4210.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:8

Changes (by slyfox):

 * differential: => Phab:D4959

Comment:

Sent https://phabricator.haskell.org/D4959 for review (remove -Bsymbolic from UNREG targets first).

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:9

Comment (by Sergei Trofimovich

Comment (by glaubitz):

Some packages like haskell-ansi-wl-pprint still seem to have issues though:

{{{
Running debian/hlibrary.setup configure --ghc -v2 --package-db=/var/lib/ghc/package.conf.d --prefix=/usr --libdir=/usr/lib/haskell-packages/ghc/lib --libexecdir=/usr/lib --builddir=dist-ghc --ghc-option=-optl-Wl\,-z\,relro --haddockdir=/usr/lib/ghc-doc/haddock/ansi-wl-pprint-0.6.8.2/ --datasubdir=ansi-wl-pprint --htmldir=/usr/share/doc/libghc-ansi-wl-pprint-doc/html/ --enable-library-profiling
Configuring ansi-wl-pprint-0.6.8.2...
hlibrary.setup: failed to parse output of 'ghc-pkg dump'
make: *** [/usr/share/cdbs/1/class/hlibrary.mk:142: configure-ghc-stamp] Error 1
dpkg-buildpackage: error: debian/rules build-arch subprocess returned exit status 2
}}}

This is with the patch applied to GHC 8.2.2 and with optimization set to -O0.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:11

Comment (by glaubitz):

OK, it looks like just applying the patch, without changing the optimization level, addresses the build issue with haskell-ansi-wl-pprint.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15338#comment:12