[GHC] #16253: Offer a shorthand for `--skip=_build/stage$n/compiler/.dependencies.mk`

#16253: Offer a shorthand for `--skip=_build/stage$n/compiler/.dependencies.mk`
-------------------------------------+-------------------------------------
           Reporter:  sgraf          |             Owner:  (none)
               Type:  task           |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Build System   |           Version:  8.6.3
  (Hadrian)                          |
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------

When working on a stage1 compiler, the slightest change in any of the files leads to rebuilding the dependency matrix, which takes 20-30s. That makes for a very disruptive edit-compile cycle.

Alp helped me on #ghc and found `--skip=_build/stage0/compiler/.dependencies.mk` as the right flag to skip dependency rebuilding. I wonder if we could hide that behind a nicer flag?

I think this should do similar things as `--freeze1`, only that we 'freeze' stage 0 and dependency building. The analogy is that we need a Hadrian equivalent of `make -C ghc 1`, just as we have `--freeze1` for `make -C ghc 2`.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253
GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Changes (by snowleopard):

 * cc: goldfire (added)

Comment:

Let me also add @goldfire here who was asking similar questions in this ticket: https://ghc.haskell.org/trac/ghc/ticket/16242

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:1

Comment (by simonpj):
When working on a stage1 compiler, the slightest change in any of the files leads to rebuilding the dependency matrix, which takes 20-30s time. That makes for a very disruptive edit-compile cycle.
This is mysterious to me. I thought that part of the wonderfulness of Shake and early cut-off was that all this repeated work is not done. So why is that not working?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:2

Comment (by snowleopard):

Simon: this is expected behaviour in this case. Dependency analysis of Haskell sources is performed on a per-package (not per-file) basis, in one go. Whenever a single source file is changed, we invoke `ghc -M` on the whole package, which can take a while for a large package. This is how Make works too, but it often disables the tracking mechanism, which may lead to incorrect build results but is fast.

Perhaps we could/should switch to dependency analysis on a per-file basis (as we do with C sources), which would directly address this particular ticket without introducing yet another way to disable tracking.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:3

Comment (by snowleopard):

Looking at the documentation of `ghc -M`, it looks like the current per-package approach is due to a limitation of GHC's dependency analysis. Quoting from https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/separate_com... #makefile-dependencies
In general, `ghc -M Foo` does the following. For each module `M` in the set `Foo` plus all its imports (transitively), it adds to the Makefile [...]
That is, GHC always does **transitive** dependency analysis, which means invoking it separately on each file would be rather inefficient (each time it will likely traverse almost the whole dependency graph). This is why Make and Hadrian choose to perform the analysis just once but for the whole package.

Perhaps, it's not too difficult to add a more fine-grain dependency analysis to GHC, i.e. produce only the list of immediate dependencies of a specified module.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:4
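
To make the distinction in the comment above concrete, here is a minimal sketch of immediate versus transitive dependency lookup over an import graph. It is only an illustration: the module names, the `ImportGraph` type and both functions are hypothetical and are not taken from GHC or Hadrian.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical model: each module maps to the modules it imports directly.
type Module = String
type ImportGraph = Map.Map Module [Module]

-- Immediate dependencies: a single lookup, independent of the graph size.
immediateDeps :: ImportGraph -> Module -> Set.Set Module
immediateDeps g m = Set.fromList (Map.findWithDefault [] m g)

-- Transitive dependencies: a worklist traversal that can end up
-- visiting most of the graph for a module near the top of it.
transitiveDeps :: ImportGraph -> Module -> Set.Set Module
transitiveDeps g m = go Set.empty (Map.findWithDefault [] m g)
  where
    go seen [] = seen
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise = go (Set.insert x seen) (Map.findWithDefault [] x g ++ xs)

-- A tiny made-up graph: A imports B and C, which both import D.
exampleGraph :: ImportGraph
exampleGraph = Map.fromList
  [ ("A", ["B", "C"])
  , ("B", ["D"])
  , ("C", ["D"])
  , ("D", [])
  ]
```

Calling `transitiveDeps` once per module repeats most of the traversal each time, which is the inefficiency described above; `immediateDeps` does constant work per module and leaves the transitive closure to the build system.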

Comment (by simonpj):
Perhaps, it's not too difficult to add a more fine-grain dependency analysis to GHC, i.e. produce only the list of immediate dependencies of a specified module.
I'm sure it would not be hard to have another flag so that `ghc -new-flag M.hs` would produce just the immediate dependencies of `M`. Would that solve the problem?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:5

Comment (by snowleopard):
Would that solve the problem?
Yes, switching to per-file dependencies in Hadrian would be easy if we had such a flag.

However, per-package vs per-file is a bit of a trade-off. Per-package analysis will likely be faster for the full build (you do the analysis only once instead of for each file separately), whereas per-file analysis will be faster for incremental builds. (It is likely that the performance loss for the full build when switching to the per-file approach will be negligible, but we'll need to check this.)

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:6

Changes (by snowleopard):

 * cc: NeilMitchell (added)

Comment:

Speaking of the trade-off, perhaps, this is a nice use case for Shake's "batching" feature: http://hackage.haskell.org/package/shake-0.17.4/docs/Development-Shake.html#v:batch

We could have a batching rule for dependency analysis: if multiple files need to be analysed, their analysis could be combined into a single GHC invocation. That would be quite cool.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:7

Comment (by simonpj):
Yes, switching to per-file dependencies in Hadrian would be easy if we had such a flag.
OK, let's do it! Can't be hard.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:8

Comment (by NeilMitchell):

Andrey, is the reason this is hideously expensive because `ghc -M` is hideously expensive? If not, then using oracles to cache the various parts of `dependencies.mk` would be the right solution.

I believe the way `-M` works is it builds a complete dependency tree, which is pretty expensive, and requires running all C preprocessors etc. Doing it in individual steps is likely to be hideously expensive.

The solution I've always used in the past is something like https://shakebuild.com/includes#generated-transitive-imports. Pros are that it's super fast, super granular, and allows you to import files that are themselves generated on demand. The con is that you have to write your own "spot an import" code. My experience is that's really hard *in general* but quite easy for any specific project with sane conventions.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:9

Comment (by simonpj):

Can you describe more explicitly "the solution you have used in the past" in our context? I think you are saying

 * Implement `ghc -some-new-flag M.hs` which runs CPP on `M.hs` (if necessary), parses the result in some simple-minded way, and spits out all of `M`'s direct imports.

This seems to be what your `usedHeaders` thing does. If we could do `need (usedHeaders "M.hs")` maybe we would never need to use `ghc -M` at all?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:10

Comment (by NeilMitchell):

In the past I've written a function that reads the file, and using fairly simplistic string matching guesses what it depends on, in the build system itself. It can avoid shelling out to GHC (hugely expensive on Windows, especially with corporate antivirus systems) and avoid running CPP. Generally most CPP doesn't impact which files are used, and even if it does, having a superset isn't a problem.

The kind of function I've used previously is on the order of:

{{{#!hs
[... extract_the_module_name x ... | x <- lines src, "import " `isPrefixOf` x]
}}}

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:11
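
For readers who want to see the comprehension above filled in, the following is one possible runnable version of such a conservative scanner. It is a sketch under assumptions, not the code Neil used: `importedModules` and `moduleName` are hypothetical names, only unindented `import` lines are recognised, CPP is ignored (so the result may be a superset), and package-qualified imports such as `import "pkg" M` are skipped.

```haskell
import Data.Char (isAlphaNum, isSpace)
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- Collect the module names mentioned on lines that start with "import ".
-- Purely textual: no CPP, no parsing, so it may over-approximate.
importedModules :: String -> [String]
importedModules src = mapMaybe moduleName (lines src)

-- Extract the module name from one import line, if there is one.
moduleName :: String -> Maybe String
moduleName line = do
  rest <- stripPrefix "import " line
  let afterSource    = dropKeyword "{-# SOURCE #-}" (trim rest)
      afterQualified = dropKeyword "qualified" afterSource
      name           = takeWhile isModuleChar afterQualified
  if null name then Nothing else Just name
  where
    trim = dropWhile isSpace
    -- Drop an optional leading keyword and the whitespace after it.
    dropKeyword kw s = maybe s trim (stripPrefix kw s)
    isModuleChar c = isAlphaNum c || c `elem` "._'"
```

Mapping the resulting module names to source files, and calling `need` on them, would still be up to the build system.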

Comment (by snowleopard):
Andrey, is the reason this is hideously expensive because `ghc -M` is hideously expensive?
There are ~500 Haskell files in the `compiler` directory, so global (i.e. per-package) dependency analysis can't be very fast, however efficiently it is implemented.
If not, then using oracles to cache the various parts of `dependencies.mk` would be the right solution.
This is what we do, but this doesn't solve the problem: right now, if you edit a single Haskell file in `compiler`, we will rerun `ghc -M` on the whole set of ~500 package files. Yes, oracles will helpfully cut the changes from propagating further, but this single `ghc -M` invocation will be slow.

I think the only solution is to have a way (e.g. a new GHC flag) to run dependency analysis on a single file, without transitive exploration of all its dependencies.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:12

Comment (by snowleopard):

Neil: Doing conservative dependency analysis directly from within Hadrian is an option too!

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:13

Comment (by NeilMitchell):
I think the only solution is to have a way (e.g. a new GHC flag) to run dependency analysis on a single file, without transitive exploration of all its dependencies.
That sounds reasonable. It would certainly be more robust than the strategy I describe. It won't help if you have deeply nested CPP includes (you'd still rescan them each time), but I suspect that's negligible for GHC. As you say, if that flag can take multiple files at once, you could batch it, which would be a good performance improvement.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:14

Comment (by alpmestan):

I can take a shot at implementing this `-M`-on-a-diet flag idea, and then use it in Hadrian wherever it would save us work.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:15

Comment (by alpmestan):

Self-note: the relevant code lives in https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/DriverMkDepend..... The simplest approach is probably to refine that `GhcMode` to be either transitive or not. When transitive, we'd take the current code path, when not, we'd take the new one that just looks at and reports the immediate dependencies. The former would still be exposed under `-M` and the new one under some other flag (any suggestion is welcome).

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:16

Comment (by snowleopard):

Thanks for taking this up, Alp! We could name the new flag `-M1` -- this would suggest that only dependencies at "depth" 1 are reported.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16253#comment:17