Non-deterministic package IDs are really bad in 7.10

Hi,
I'd like to bring some attention to ticket #4012 about non-determinism. As many of you may know, the Nix package manager distributes binaries through its binary caches. Binaries are shared as long as the hash of some of their inputs matches: this means that two builds can have the same input hash while, thanks to #4012, their actual contents differ. You end up with machines that have some packages built locally and some built elsewhere; due to non-determinism, the GHC package IDs don't line up and everything is broken.
The situation was pretty bad in 7.8.4 in the presence of parallel builds, so we switched those off. Joachim's a477e8118137b7483d0a7680c1fd337a007a023b helped a great deal there and we were hopeful for 7.10. Now that 7.10.1 is out and people have been using and testing it, the situation turns out to be really bad: we get multiple reports daily from people whose packages end up broken, and our only advice is to do what we did back in the 7.8 days, which is to purge GHC and either rebuild everything locally or fetch everything from a machine that has already built it all, as long as the two don't mix. This is not really acceptable.
See comment 76 on #4012 for an example of a rather simple file that you can compile right now with nothing extra but -O and get a different interface hash.
This e-mail is just to raise awareness that there is a serious problem. If people are thinking about cutting 7.10.2 or whatever, I would consider part of this ticket to be a blocker, as it makes using GHC reliably while benefiting from binary caches pretty much impossible.
Of course there's the ‘why don't you fix it yourself’ question. I certainly plan to, but I will not have time to do so for a couple more weeks. For all I know right now, the fix for comment 76 might be easy, and someone looking for something to hack on might have the time to get to it before I do.
Thanks
--
Mateusz K.
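To make the failure mode described above concrete, here is a minimal Python sketch of an input-addressed binary cache. It is a toy model under stated assumptions, not how Nix or GHC actually compute anything: `digest`, `input_hash`, `build`, and the `nondeterminism` argument are hypothetical names introduced for illustration, and the "package ID" is simply a hash over the build result.

```python
import hashlib


def digest(*parts):
    """Short hex digest over the given strings (illustrative only)."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p.encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()[:12]


def input_hash(name, source, dep_input_hashes):
    """Cache key: depends only on the inputs, never on what the build produced."""
    return digest(name, source, *sorted(dep_input_hashes))


def build(name, source, dep_ids, nondeterminism):
    """Toy build step.

    `nondeterminism` is a hypothetical stand-in for whatever varies between
    two builds of identical inputs; the real causes are tracked in #4012.
    """
    package_id = digest(name, source, nondeterminism, *sorted(dep_ids))
    return {"name": name, "id": package_id, "built_against": list(dep_ids)}


LIB_SOURCE = "module Lib where ..."

# Two machines build the same library from the same source.
key_remote = input_hash("lib", LIB_SOURCE, [])
key_local = input_hash("lib", LIB_SOURCE, [])
lib_remote = build("lib", LIB_SOURCE, [], nondeterminism="machine A")
lib_local = build("lib", LIB_SOURCE, [], nondeterminism="machine B")

# The cache serves an application that was built against the remote library.
app_remote = build("app", "import Lib", [lib_remote["id"]], nondeterminism="machine A")

# Same cache key, yet the resulting package IDs differ.
same_key = key_remote == key_local
ids_differ = lib_remote["id"] != lib_local["id"]

# Mixing the locally built library with the cached application leaves the
# application pointing at a package ID that is not installed.
installed = {lib_local["id"]}
unresolved = [d for d in app_remote["built_against"] if d not in installed]

print("same cache key:", same_key)
print("package IDs differ:", ids_differ)
print("unresolved dependencies of app:", len(unresolved))
```

In this model the cache treats both builds as interchangeable because their keys match, which is why mixing locally built and downloaded packages breaks while using either source exclusively keeps working.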

Currently the priority of #4012 is normal; shouldn't it be at least high?
Also, the milestone is 7.12.1; should it be 7.10.2?
On Sun, May 10, 2015 at 3:39 PM, Mateusz Kowalczyk wrote: [...]
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I'd like to add a +1 to getting this fixed. It bites us at work, where exact binary reproducibility of builds is strongly desirable.
Regards,
Malcolm
On 14 May 2015, at 17:38, George Colpitts wrote:
Currently the priority of #4012 is normal; shouldn't it be at least high?
Also, the milestone is 7.12.1; should it be 7.10.2?

Hi,
On Sunday, 10.05.2015, at 19:39 +0100, Mateusz Kowalczyk wrote:
I'd like to bring some attention to ticket #4012 about non-determinism. As many of you may know, the Nix package manager distributes binaries through its binary caches. Binaries are shared as long as the hash of some of their inputs matches: this means that two builds can have the same input hash while, thanks to #4012, their actual contents differ. You end up with machines that have some packages built locally and some built elsewhere; due to non-determinism, the GHC package IDs don't line up and everything is broken.
Is NixOS sensitive to changes in the build directory? Debian is, and since 7.8 the build path creeps into the interface files and affects the hash: https://bugs.debian.org/785282
But you probably are not; otherwise you’d have complained when 7.8 came out, and not now :-)
Greetings,
Joachim
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org
participants (4)
- George Colpitts
- Joachim Breitner
- Malcolm Wallace
- Mateusz Kowalczyk