Completely reproducible Haskell builds

I am trying to get the ghc-7.0.3 build procedure down to a byte-identical rebuild on Linux-amd64. I solved one source of variability: ar embedding timestamps into .a files (HOWTO at the end). Now I am looking to eliminate variations in ELF64 executables.

To make things easy, I am going to demonstrate with the unlit binary. I start at the point where make leaves off.

% cp utils/unlit/dist/build/tmp/unlit{,.1}
% "/usr/bin/ghc" -o utils/unlit/dist/build/tmp/unlit -O -H64m \
    -package-conf libraries/bootstrapping.conf \
    -i -iutils/unlit/. -iutils/unlit/dist/build -iutils/unlit/dist/build/autogen \
    -Iutils/unlit/dist/build -Iutils/unlit/dist/build/autogen \
    -no-user-package-conf -rtsopts \
    -odir utils/unlit/dist/build -hidir utils/unlit/dist/build -stubdir utils/unlit/dist/build \
    -hisuf hi -osuf o -hcsuf hc -no-auto-link-packages -no-hs-main \
    utils/unlit/dist/build/unlit.o
% sha1sum utils/unlit/dist/build/tmp/unlit{,.1}
6f679d9dd9a9ea84a68be99369c9f1dc72ba41f0 utils/unlit/dist/build/tmp/unlit
beed059e09c9429c3b74ea613d5be30c6c17ac3c utils/unlit/dist/build/tmp/unlit.1
% ls -l utils/unlit/dist/build/tmp/unlit{,.1}
-rwxr-x--- 1 gnezdo eng 15112 May 25 21:53 utils/unlit/dist/build/tmp/unlit
-rwxr-x--- 1 gnezdo eng 15112 May 25 21:53 utils/unlit/dist/build/tmp/unlit.1
% for i in utils/unlit/dist/build/tmp/unlit{,.1}; do readelf -a $i > $i.elf; done
% diff utils/unlit/dist/build/tmp/unlit{,.1}.elf
250c250
< 27: 0000000000000000 0 FILE LOCAL DEFAULT ABS ghc27965_0.c
---
> 27: 0000000000000000 0 FILE LOCAL DEFAULT ABS ghc20499_0.c

Looks like there is a temporary file name baked into the ELF file. Indeed, running with -v reveals:

*** C Compiler:
/usr/bin/gcc -c /tmp/ghc28016_0/ghc28016_0.c -o /tmp/ghc28016_0/ghc28016_0.o -I/usr/lib/ghc-7.0.3/include -fno-stack-protector
*** Linker:
/usr/bin/gcc -v -o utils/unlit/dist/build/tmp/unlit -fno-stack-protector utils/unlit/dist/build/unlit.o /tmp/ghc28016_0/ghc28016_0.o

Digging a bit into the sources reveals mkExtraCObj in DriverPipeline.hs, which calls newTempName.

The best option for dealing with this seems to be using gcc's ability to accept input from a pipe. I know I could make this work on a Posix system. Yet I suspect getting it to work on Windows would be overly onerous.

Next best idea is to make GHC use repeatable temporary .c & .o file names for each invocation. There is already a unique temporary directory where all the temporary files are created. This suggests I do not need to worry about adversarial races. So GHC just needs to avoid racing with itself. I see a couple of options:

1) newTempName should create a new subdirectory for each call and then return a fixed name inside of this (so /tmp/ghc28016_0/ghc28016_0.c above would become /tmp/ghc28016_0/0/dummy.c)
2) mkExtraCObj could compute some hash function of its xs argument (C program text) and then create a file named, e.g. /tmp/ghc28016_0/38eb8d8eb0abe9c828ba60983e2a97f7a069ec41.c

Which of these two looks better? Other ideas?

Would people be open to accepting a patch along these lines if I were to write one?

The steps to make ar not include the timestamps were:

1) Add "AR_OPTS = qD" to build.mk. This takes care of most .a files.
2) Set AR_FLAGS=qD in the environment and dummy out ranlib (create a no-op script called ranlib on your PATH ahead of the real ranlib). This takes care of the libffi build.

Thanks
Greg
--
nest.cx is Gmail hosted, use PGP for anything private. Key:
http://tinyurl.com/ho8qg
Fingerprint: 5E2B 2D0E 1E03 2046 BEC3 4D50 0B15 42BD 8DF5 A1B0
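The ranlib workaround from step 2 of the ar notes can be sketched roughly as follows. This is only an illustration, not the exact setup used for the build above: the shim directory created with mktemp is an assumption, and any directory that comes first on PATH would do. The D modifier asks GNU ar to operate in deterministic mode, so it writes zeroed timestamps and owner IDs into the archive.

```shell
#!/bin/sh
# Sketch of the ranlib workaround. The shim directory is a hypothetical
# choice; any directory placed ahead of the real ranlib on PATH works.
shim_dir=$(mktemp -d)

# A ranlib that does nothing, so it cannot rewrite the archive.
cat > "$shim_dir/ranlib" <<'EOF'
#!/bin/sh
exit 0
EOF
chmod +x "$shim_dir/ranlib"

# Put the shim ahead of the real ranlib and ask ar for deterministic mode
# (q = quick append, D = zero timestamps and owner IDs).
PATH="$shim_dir:$PATH"
AR_FLAGS=qD
export PATH AR_FLAGS
```

With this environment in place, a sub-build that honours AR_FLAGS and looks up ranlib on PATH should pick up both settings. The "AR_OPTS = qD" line in build.mk is still needed for the rest of the tree.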

On 26/05/2011 17:15, Greg Steuck wrote:
> I am trying to get the ghc-7.0.3 build procedure down to a byte-identical rebuild on Linux-amd64.
This is likely to be very fragile with GHC at the moment, due to some non-deterministic behaviour in its internals. I have tried to eliminate as much as I can, but there are still a couple of known sources, and possibly more unknown ones. See the last bullet point here: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/RecompilationAv... Having said that I think it's a good idea to fix any sources of binary differences as you're doing.
> The best option for dealing with this seems to be using gcc's ability to accept input from a pipe. I know I could make this work on a Posix system. Yet I suspect getting it to work on Windows would be overly onerous.
>
> Next best idea is to make GHC use repeatable temporary .c & .o file names for each invocation. There is already a unique temporary directory where all the temporary files are created. This suggests I do not need to worry about adversarial races. So GHC just needs to avoid racing with itself. I see a couple of options:
>
> 1) newTempName should create a new subdirectory for each call and then return a fixed name inside of this (so /tmp/ghc28016_0/ghc28016_0.c above would become /tmp/ghc28016_0/0/dummy.c)
> 2) mkExtraCObj could compute some hash function of its xs argument (C program text) and then create a file named, e.g. /tmp/ghc28016_0/38eb8d8eb0abe9c828ba60983e2a97f7a069ec41.c
>
> Which of these two looks better? Other ideas?
The first is easier, and would be fine with me.
> Would people be open to accepting a patch along these lines if I were to write one?
Sure. But as I mentioned above, this might not be enough, or at least you might still get random differences from time to time.

Cheers,
Simon
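The two naming options discussed in this exchange can be illustrated with a small shell sketch. This is not GHC code and not a proposed patch: new_temp_fixed and new_temp_hashed are hypothetical helper names, and the real change would live in Haskell around newTempName and mkExtraCObj. The sketch only shows why either scheme yields a stable basename, which is the part that ends up in the FILE symbol according to the readelf output in the original message.

```shell
#!/bin/sh
# Option 1: claim a fresh numbered subdirectory per call and return a
# fixed file name inside it. $1 is the per-process temporary directory.
new_temp_fixed() {
    tmpdir=$1
    n=0
    [ -d "$tmpdir" ] && [ -w "$tmpdir" ] || return 1
    # mkdir without -p fails if the directory already exists, so each
    # call ends up with the next free number.
    until mkdir "$tmpdir/$n" 2>/dev/null; do
        n=$((n + 1))
    done
    printf '%s\n' "$tmpdir/$n/dummy.c"
}

# Option 2: name the file after a SHA-1 hash of the C program text ($2).
new_temp_hashed() {
    tmpdir=$1
    text=$2
    hash=$(printf '%s' "$text" | sha1sum | cut -d' ' -f1)
    printf '%s\n' "$tmpdir/$hash.c"
}
```

In option 1 the basename is always dummy.c regardless of the process ID in the directory name. In option 2 the basename depends only on the generated C text, so identical inputs give identical names.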

On Fri, May 27, 2011 at 04:35:17PM +0100, Simon Marlow wrote:
>> Next best idea is to make GHC use repeatable temporary .c & .o file names for each invocation. There is already a unique temporary directory where all the temporary files are created. This suggests I do not need to worry about adversarial races. So GHC just needs to avoid racing with itself. I see a couple of options:
>>
>> 1) newTempName should create a new subdirectory for each call and then return a fixed name inside of this (so /tmp/ghc28016_0/ghc28016_0.c above would become /tmp/ghc28016_0/0/dummy.c)
>> 2) mkExtraCObj could compute some hash function of its xs argument (C program text) and then create a file named, e.g. /tmp/ghc28016_0/38eb8d8eb0abe9c828ba60983e2a97f7a069ec41.c
>>
>> Which of these two looks better? Other ideas?
>
> The first is easier, and would be fine with me.
An alternative could be to just strip the single `symbol' off the object file (using something like strip -N ...). I didn't yet test this for real.

Ciao,
Kili

On Fri, May 27, 2011 at 9:59 AM, Matthias Kilian wrote:
> An alternative could be to just strip the single `symbol' off the object file (using something like strip -N ...). I didn't yet test this for real.
This has to go with -optl=-Wl,--build-id=none. Otherwise ld produces a .note.gnu.build-id ELF section. In the words of my wiser colleague:

  The reason build-ids differ is that build-id is a checksum of "important ELF bits" (at link time), and the symbol table counts.

I suppose one could strip off the .note.gnu.build-id as well, but avoiding differences in the first place is a better strategy.

Thanks
Greg
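A rough sketch of the strip-plus-build-id approach, using gcc directly rather than going through GHC. The helper names build_id and link_reproducibly are hypothetical, and the sketch assumes GNU ld, readelf and strip. As noted in the thread, the strip -N step was suggested without being tested, so whether it alone yields byte-identical output is an open question.

```shell
#!/bin/sh
# Sketch only; the files passed in stand in for GHC's temporaries.

# Print the build-id note of an ELF file, or nothing when there is none.
build_id() {
    readelf -n "$1" 2>/dev/null | sed -n 's/^ *Build ID: *//p'
}

# Link one C file with the build-id note suppressed, then remove the
# FILE symbol that records the (temporary) source file name.
link_reproducibly() {
    src=$1
    out=$2
    gcc -o "$out" "$src" -Wl,--build-id=none || return 1
    strip -N "$(basename "$src")" "$out"
}
```

For example, after link_reproducibly /tmp/ghc28016_0/ghc28016_0.c unlit, running build_id unlit should print nothing. Comparing sha1sum output across two such links with different temporary names would then show whether any other differences remain.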
participants (3): Greg Steuck, Matthias Kilian, Simon Marlow