[GHC] #14769: The RecompBecause [TH] check is not resume-build-safe

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Incorrect result Unknown/Multiple | at runtime Test Case: | Blocked By: Blocking: | Related Tickets: #481 Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- `ghc --make` aims to make compilation behave correctly and produce up-to- date, no matter whether it completes or is interrupted (e.g. with Ctrl+C) and later resumed At no point should interrupting a build and running it again produce "less correct" outputs than running it to the end; specifically interrupting shouldn't result in necessary build steps to be "forgotten". However, this seems to be the case with the `[TH]` check introduced in #481. Minimal test case with 3 files: https://github.com/nh2/th-recomp-test Good behaviour when running without interrupting: {{{ % touch A.hs && ~/.stack/programs/x86_64-linux/ghc-8.0.1/bin/ghc --make Main.hs [1 of 3] Compiling A ( A.hs, A.o ) [2 of 3] Compiling B ( B.hs, B.o ) [TH] [3 of 3] Compiling Main ( Main.hs, Main.o ) [TH] Linking Main ... }}} Bad behaviour when interrupting and running again: {{{ % touch A.hs; timeout 0.1 ~/.stack/programs/x86_64-linux/ghc-8.0.1/bin/ghc --make Main.hs; ~/.stack/programs/x86_64-linux/ghc-8.0.1/bin/ghc --make Main.hs [1 of 3] Compiling A ( A.hs, A.o ) [2 of 3] Compiling B ( B.hs, B.o ) [TH] Linking Main ... }}} As you can see, when interrupting, the step `Compiling Main ( Main.hs, Main.o ) [TH]` went missing. This suggests to me that either: * the `[TH]`s are unnecessary in the first place (unlikely and I buy into the reason explained by #481 that changed values may be picked up by TH in higher-level modules) * the resumed `ghc --make` invocation may potentially produce wrong / not- up-to-date outputs (most likely) * re-running somehow retrieves the information that in fact recompiling `Main` is not necessary for a correct build (if this is the case, then I'd really like to know how that works and we should document it, likely also try to use that way to avoid recompiling `Main` in the uninterrupted run) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): The relevant bit of code is this in HscMain: {{{ -- If the module used TH splices when it was last -- compiled, then the recompilation check is not -- accurate enough (#481) and we must ignore -- it. However, if the module is stable (none of -- the modules it depends on, directly or -- indirectly, changed), then we *can* skip -- recompilation. This is why the SourceModified -- type contains SourceUnmodifiedAndStable, and -- it's pretty important: otherwise ghc --make -- would always recompile TH modules, even if -- nothing at all has changed. Stability is just -- the same check that make is doing for us in -- one-shot mode. case m_tc_result of Nothing | mi_used_th iface && not stable -> compile mb_old_hash (RecompBecause "TH") _ -> skip iface }}} I think the `&& not stable` is wrong. After interrupting the first compilation, the second compilation finds some modules to be stable and this overrides the TH check. Unfortunately if we just delete this condition, then TH will cause a *lot* of recompilation. Just re-running `ghc --make` on something you just recompiled will recompile all the TH modules and everything that depends on them. The stable check was always a bit of a hack though, to get around the fact that we don't have the information we really need, which is "did the object file change". Perhaps we just need to be more clever and store hashes of object files as dependencies when we're using TH. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2):
we don't have the information we really need, which is "did the object file change". Perhaps we just need to be more clever and store hashes of object files as dependencies when we're using TH
@simonmar Exactly. I thought the same, and implemented that a couple days ago as a work-in-progress: https://github.com/ghc/ghc/compare/ghc-8.2.1-release...nh2:ghc-8.2.1 -release-nonrecursive-recompBecause-TH?expand=1 Would be nice if you could have a look, I can then work towards finishing that. In my preliminary experiments, this reduced the occurrance of the `[TH]` reason from 40 modules to 8 in a 150-module project. However, the big problem is still #12935, non-deterministic `.o` files. If they are nondeterministic, they will destroy object file hashing based recompilation avoidance. However, once it works, I think it will be great: With `-O0` I get (nondeterministically) a reduction from 40 modules to 1 module + relink. So I took a shot at that problem as well (making object files deterministic), taking a different approach than what was taken so far (instead of making uniques not appear in outputs, I'm trying to genrate them deterministically). In my small example it works up to a single difference in the final binary, but more work is needed. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

So I took a shot at that problem as well (making object files deterministic), taking a different approach than what was taken so far (instead of making uniques not appear in outputs, I'm trying to genrate
#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): Will take a look at the github branch (you wouldn't perhaps consider Phabricator instead? It's much easier to review code there) them deterministically) I'd be surprised if you can do that, we thought pretty hard about whether this was feasible and came to the conclusion that it probably wasn't, due to things like parallel builds, ghc --make needing to have unique supplies for multiple modules, and lazy I/O in the typechecker. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonmar): * cc: niteria (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): Replying to [comment:3 simonmar]:
you wouldn't perhaps consider Phabricator instead? It's much easier to review code there
I will when I have it working and have some non-hacky commits; until then I prefer Github because it is much easier to update multi-commit patch series there than on Phab.
I'd be surprised if you can do that, we thought pretty hard about whether this was feasible and came to the conclusion that it probably wasn't, due to things like parallel builds, ghc --make needing to have unique supplies for multiple modules, and lazy I/O in the typechecker.
I'm not convinced either yet that it will work, and the `'r'` `UniqueSupply` in the global `--make` `HscEnv` is giving me some troubles at the current state, but so far I haven't found a reason why maintaining an independent "seed" for each module (as we know all modules ahead of time in a `--make` invocation) wouldn't cover at least the cases of multiple modules and `-j`. Regarding the lazy I/O in the typechecker, could you point me at where that happens? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar):
but so far I haven't found a reason why maintaining an independent "seed" for each module (as we know all modules ahead of time in a --make invocation) wouldn't cover at least the cases of multiple modules and -j.
Regarding the lazy I/O in the typechecker, could you point me at where
But what are those seeds? The unique supply for each module would need to generate uniques that are - different from those in every other module - the same each time you compile this module So it seems like the uniques would have to be qualifed by the module/package names somehow. Or did you have some other idea in mind? that happens? Look at `TcRnMonad.forkM_maybe`. If you solved the seed problem (above) then the same solution could be applied to unique supplies for imported modules which would also solve the lazy I/O problem, but it's still not clear to me how you do this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): Replying to [comment:6 simonmar]:
So it seems like the uniques would have to be qualifed by the module/package names somehow.
Yes. I use the "`mod_index`" number, like `1` in the output `[1 of 3] Compiling A`, by which I index an array of `UniqueSupply` instead of having only one global one. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): But wouldn't that run into problems if the index changes? If we add or remove imports, or if we compile a different subset of modules, or if we compile a module by itself using `ghc -c`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): Yes, that scheme lacks "stability" against some use cases. Most prominently, adding a new module and importing it, thus turning e.g. `[1 of 3]` into `[2 of 4]`, will have us end up with different uniques and thus different `.o` files and different binaries (and the benefit of reduced `[TH]` compilations is lost for such changes). But the scheme should still solve the most common use cases: * A from-scratch build using the same code and same compiler would result in byte-identical outputs * Resuming a `--make` (`-j`) build after merely touching a file or changing a comment would result in byte-identical outputs * Resuming a `--make` (`-j`) build after changing the code in a module will would result in only the `.o` for that module to change (plus potentially some inlining sites if `-O` is on); `.o` files that were recompiled because of `[TH]` that generate identical `.o` outputs will allow for the identical-.o-file-`[TH]`-avoidance you proposed above The first point will allow us to get byte-reproducible binaries ala Debian's "reproducible builds"; until now due to Unique nondeterminism, this wasn't possible even under guaranteed input control and stopping-the- system-time sandboxes, as Nix does it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): I'm very sceptical (in case you hadn't guessed!). @niteria spent a great deal of time removing cases of non-determinism from the compiler, so that we now have non-determinism at the ABI level. The way it's done currently is more robust than relying on unique determinism, because the compiler produces deterministic output even in the presence of unique non- determinism. Relying on unique determinism is quite fragile and easily broken, and we have no way to know when that happens - at least with the current scheme we have some protection in the APIs for things like `UniqFM` so that it's hard to accidentally introduce non-determinism. I'm really not keen on the idea of using the module index, because compiling a module individually with `-c` should produce exactly the same output as if it was compiled as part of a set of modules with `--make`. Why not instead extend the current mechanism to add non-determinism in the code generator? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Why not instead extend the current mechanism to add non-determinism in
#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nh2): Replying to [comment:10 simonmar]: the code generator? From my side it's just a limitation of how much time I can sink into it. Making uniques not surface at all in the generated code would be great and cover more use cases, like being able to switch between `-c` and `--make` as you say, but an approach like I suggest would already cover the most common use cases (e.g. libraries and binaries built by cabal, Stack, Linux distributions would be deterministic, as they all use `--make`, and `[TH]` recompilation would be reduced), while (to my knowledge) not regressing any existing functionality (switching between `--make` and `-c` would remain as determinism-destroying as they were so far). Also, the two approaches don't seem to be competing -- it appears to me that in an ideal world, GHC would have neither internal variables appear in outputs, nor use any form of module-spanning "impure" internal state driven by `unsafePerformIO` as it has now. Using the module index is a quickest-and-easiest way to map the modules into a compact range of integers for easy array indexing. If this turns out to not be relevant for performance, then I could imagine also using the fully qualified module name or something similar, which might address the `-c` vs `--make` issue.
Relying on unique determinism is quite fragile and easily broken, and we have no way to know when that happens - at least with the current scheme we have some protection in the APIs for things like `UniqFM` so that it's hard to accidentally introduce non-determinism.
I'd argue that removing the possibility to access an `unsafePerformIO`-based number generator at all would be a much stronger guarantee that non-determinism cannot be introduced, is that not accurate? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

I'd argue that removing the possibility to access an unsafePerformIO-
#14769: The RecompBecause [TH] check is not resume-build-safe -------------------------------------+------------------------------------- Reporter: nh2 | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #481 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): based number generator at all would be an even stronger guarantee that non-determinism cannot be introduced, is that not accurate? Certainly, but there is a reason we use uniques: they allow very cheap comparison. Recovering this in a language with no sense of identity of pure values is not easy and I don't believe we are willing to eat the compile time regressions that would likely come from dropping uniques from the compiler. Of course, if you can show a rework of a subset of the compiler that doesn't regress in performance while avoiding a unique source, we would be quite interested. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14769#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC