Discovery of source dependencies without --make

(cross-posted from [haskell-cafe])

Hi,

I've got a problem in our – admittedly complex – build process. We're running an automated grading system for student-submitted homework exercises. The compilation proceeds in stages:

1) Compile a library file, marked as trustworthy
2) Compile student submissions using Safe Haskell, able to access a whitelisted set of packages and the previously compiled library files
3) Compile the test suite
4) Link everything together into an executable

The invocation is similar to this:

    ghc -c -outputdir "$OUT" -XTrustworthy Library.hs
    ghc -c -outputdir "$OUT" -i"$OUT" -XSafe "$SUBMISSION"
    ghc -c -outputdir "$OUT" -i"$OUT" Test_Suite.hs
    ghc -outputdir "$OUT" -i"$OUT" -o "$OUT/runner"

The second stage works well under the assumption that there's just one single submitted file. If there's more than one, I need to specify them in topological order wrt their module dependencies. Consider two simple modules:

A.hs
> module A where
>
> import B
> import Library

B.hs
> module B where

Invoking GHC with "A.hs" "B.hs" will fail:

    A.hs:3:1: Failed to load interface for ‘B’

My naive attempt at solving that problem was to just insert "--make" into the compiler flags. No luck, because according to [1, §4.7.3], "--make" will only look at source files and will ignore the "Library.hi" file from the previous compilation stage. If I give the path to "Library.hs" as well, GHC insists on recompiling it with changed flags ("-XSafe" instead of "-XTrustworthy").

Reading a bit further, I discovered the '-M' flag. However, not only does it produce Makefile-formatted output, it also ignores .hi files.

Is there any way to get the dependency discovery of '--make' without the rest?

I know I could probably make it work if I made a package out of the first build stage, but I really want to avoid that in order to not increase the build complexity even further.

Cheers
Lars

[1] https://downloads.haskell.org/~ghc/7.6.3/docs/html/users_guide/separate-comp...

Dear Lars,

On Tuesday, 25.11.2014, at 10:36 +0100, Lars Hupel wrote:
> The invocation is similar to this:
>
>     ghc -c -outputdir "$OUT" -XTrustworthy Library.hs
>     ghc -c -outputdir "$OUT" -i"$OUT" -XSafe "$SUBMISSION"
>     ghc -c -outputdir "$OUT" -i"$OUT" Test_Suite.hs
>     ghc -outputdir "$OUT" -i"$OUT" -o "$OUT/runner"

the only reason you do these in individual steps is that you need to pass different flags, or are there other reasons?

Have you tried putting the pragma {-# LANGUAGE Safe #-} as the first line into the submission file? I’m not sure how safe that actually is, but at least

    {-# LANGUAGE Safe #-}
    {-# LANGUAGE Trustworthy #-}
    module Foo where
    import Unsafe.Coerce

is rejected by 7.6.3. This way, it could work with a single invocation of --make.

Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org

> the only reason you do these in individual steps is that you need to pass different flags, or are there other reasons?

Not only that – we also want to be able to distinguish compilation errors between "caused by students" and "we broke something". Under the assumption that compilation for library files always succeeds, this is not an issue for the first two stages.

> Have you tried putting the pragma {-# LANGUAGE Safe #-} as the first line into the submission file? I’m not sure how safe that actually is, but at least
>
>     {-# LANGUAGE Safe #-}
>     {-# LANGUAGE Trustworthy #-}
>     module Foo where
>     import Unsafe.Coerce
>
> is rejected by 7.6.3. This way, it could work with a single invocation of --make.

I'll let others comment on how safe that is since it does look a little bit fragile. It could work, though.

Cheers
Lars

On 25/11/14 12:29, Joachim Breitner wrote:
> Dear Lars,
>
> On Tuesday, 25.11.2014, at 10:36 +0100, Lars Hupel wrote:
> > The invocation is similar to this:
> >
> >     ghc -c -outputdir "$OUT" -XTrustworthy Library.hs
> >     ghc -c -outputdir "$OUT" -i"$OUT" -XSafe "$SUBMISSION"
> >     ghc -c -outputdir "$OUT" -i"$OUT" Test_Suite.hs
> >     ghc -outputdir "$OUT" -i"$OUT" -o "$OUT/runner"
>
> the only reason you do these in individual steps is that you need to pass different flags, or are there other reasons?
>
> Have you tried putting the pragma {-# LANGUAGE Safe #-} as the first line into the submission file? I’m not sure how safe that actually is, but at least
>
>     {-# LANGUAGE Safe #-}
>     {-# LANGUAGE Trustworthy #-}
>     module Foo where
>     import Unsafe.Coerce
>
> is rejected by 7.6.3. This way, it could work with a single invocation of --make.

The only problem I see with that is that error message locations will be a bit off, since the file being compiled is different from the file submitted. But since we're in the hacks territory anyway, this could be fixed up with a simple regex :-)

Roman
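
An editorial sketch of the location problem: rather than rewriting GHC's messages afterwards, the injected header can be followed by a LINE pragma so that diagnostics keep pointing at the student's own file and line numbers. The function name wrapSubmission is hypothetical, and the sketch assumes GHC accepts a LINE pragma ahead of the module header (as it does for preprocessor output); it has not been checked against the grading setup described in this thread.

```haskell
-- | Prepend a Safe pragma to a submission, then reset the reported
-- location so that the first line of the original source is still
-- reported as line 1 of the original file.
-- (Hypothetical helper, not part of any tool mentioned in the thread.)
wrapSubmission :: FilePath -> String -> String
wrapSubmission originalName source =
  unlines
    [ "{-# LANGUAGE Safe #-}"
      -- 'show' adds the surrounding quotes and escapes odd characters.
    , "{-# LINE 1 " ++ show originalName ++ " #-}"
    ]
    ++ source

main :: IO ()
main = putStr (wrapSubmission "A.hs" "module A where\nimport B\n")
```

The wrapped text would be written to a temporary file and compiled in place of the submission; whether that is acceptable from a Safe Haskell point of view is a separate question.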

> The only problem I see with that is that error message locations will be a bit off, since the file being compiled is different from the file submitted. But since we're in the hacks territory anyway, this could be fixed up with a simple regex :-)

... or line pragmas :-)

I'm currently investigating another route though. I wrote a simple program which parses some Haskell files with "haskell-src-exts" (we actually even need "hse-cpp"), builds a graph with all known dependencies, uses "topSort" from "Data.Graph" to come up with a proper ordering and prints the source files in that order. That approach feels a lot safer to me. It can be used like:

    ghc -c $(cabal exec topoSort "${files[@]}")

(Never mind the obvious unsafe file name handling.)

The downside is that parse errors won't be reported by GHC, but by our preprocessing tool.

(I've attached the tool, but keep in mind it's just a quick proof of concept.)

Cheers
Lars
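
The attached tool is not reproduced in the archive. The following is an editorial sketch of the same idea, not the original program: it swaps the haskell-src-exts parser for a naive line scanner (so CPP, comments, package-qualified imports and unusual layout are not handled) and keeps the "topSort" step from "Data.Graph". The names importsOf and compileOrder are hypothetical, and the module name of each file is assumed to be supplied by the caller.

```haskell
import Data.Char (isSpace)
import Data.Graph (graphFromEdges, topSort)
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- | Naive import scanner: only recognises lines that begin with the
-- keyword "import" in column one. A real parser is needed for anything
-- beyond simple files.
importsOf :: String -> [String]
importsOf = mapMaybe importLine . lines
  where
    importLine l = case stripPrefix "import" l of
      Just rest@(c : _) | isSpace c ->
        case dropWhile (`elem` ["qualified", "safe"]) (words rest) of
          (m : _) -> Just (takeWhile (/= '(') m)  -- "Data.List(sort)" -> "Data.List"
          []      -> Nothing
      _ -> Nothing

-- | Order (file, module name, source text) triples so that each file
-- comes after the files it imports. graphFromEdges ignores edges to
-- keys that are not nodes, so imports such as Library, which are not
-- among the inputs, simply drop out of the graph.
compileOrder :: [(FilePath, String, String)] -> [FilePath]
compileOrder inputs =
  -- An edge runs from importer to imported module, so topSort lists
  -- importers first; reversing puts dependencies first.
  reverse [file | v <- topSort graph, let (file, _, _) = fromVertex v]
  where
    (graph, fromVertex, _) =
      graphFromEdges [(file, name, importsOf src) | (file, name, src) <- inputs]

main :: IO ()
main =
  mapM_ putStrLn $
    compileOrder
      [ ("A.hs", "A", "module A where\nimport B\nimport Library\n")
      , ("B.hs", "B", "module B where\n")
      ]
```

For the two modules from the first message this prints B.hs before A.hs. Cyclic imports are not detected here; topSort still returns some order in that case.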

Is -M perhaps what you’ve been looking for?

https://downloads.haskell.org/~ghc/7.8.3/docs/html/users_guide/separate-comp...

-g

On November 27, 2014 at 5:32:01 AM, Lars Hupel (lars@hupel.info) wrote:
> > The only problem I see with that is that error message locations will be a bit off, since the file being compiled is different from the file submitted. But since we're in the hacks territory anyway, this could be fixed up with a simple regex :-)
>
> ... or line pragmas :-)
>
> I'm currently investigating another route though. I wrote a simple program which parses some Haskell files with "haskell-src-exts" (we actually even need "hse-cpp"), builds a graph with all known dependencies, uses "topSort" from "Data.Graph" to come up with a proper ordering and prints the source files in that order. That approach feels a lot safer to me. It can be used like:
>
>     ghc -c $(cabal exec topoSort "${files[@]}")
>
> (Never mind the obvious unsafe file name handling.)
>
> The downside is that parse errors won't be reported by GHC, but by our preprocessing tool.
>
> (I've attached the tool, but keep in mind it's just a quick proof of concept.)
>
> Cheers
> Lars
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Hi Gershom,

> Is -M perhaps what you’ve been looking for?

sadly, no. Firstly, it behaves in the same way as "--make" (i.e. only looks at source files) and secondly, it produces a Makefile as output.

(I'd be happy though to use the GHC API if somebody could tell me whether/where this functionality is exposed.)

Cheers
Lars

I have only been skimming this thread, but would it be worth writing a tight specification of what exactly you want? Your original message said only "Is there any way to get the dependency discovery of '--make' without the rest", but I really don't know what that means. Nor do I know what you mean by "it only looks at source files".

Rather than explain by deltas from something else, it might be easier just to write down precisely what you seek.

Andrey Mokhov also wants a way to take a single Haskell module, and discover its immediate imports, a kind of non-recursive version of -M. So

    ghc -M-one-shot Foo.hs

would print out the list of Haskell modules that Foo imports. (There are doubtless complications to do with CPP too, but that's the general idea.) I don't know if that is what you want too.

Simon

| -----Original Message-----
| From: Glasgow-haskell-users [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Lars Hupel
| Sent: 28 November 2014 08:53
| To: Gershom B
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Discovery of source dependencies without --make
|
| Hi Gershom,
|
| > Is -M perhaps what you’ve been looking for?
|
| sadly, no. Firstly, it behaves in the same way as "--make" (i.e. only
| looks at source files) and secondly, it produces a Makefile as output.
|
| (I'd be happy though to use the GHC API if somebody could tell me
| whether/where this functionality is exposed.)
|
| Cheers
| Lars

> Rather than explain by deltas from something else, it might be easier just to write down precisely what you seek.

Let's say the hypothetical feature is selected via the GHC flag "--topo-sort". It would add a step before regular compilation and wouldn't affect any other flag:

    ghc -c --topo-sort fileA.hs fileB.hs ...

This would first read in the specified source files and look at their module headers and import statements. It would build a graph of module dependencies _between_ the specified source files (ignoring circular dependencies), perform a topological sort on that graph, and proceed with compiling the source files in that order.

As a consequence, if there is an order in which these modules can be successfully compiled, "--topo-sort" would choose such an order. In that sense, the above invocation would be equivalent to

    ghc -c fileB.hs fileA.hs ...

(using some permutation of the original order specified)

Another consequence is that any invocation of GHC in the form of

    ghc -c flags... sources.hs...

with arbitrary flags would still work as usual when adding "--topo-sort".

Quoting from the user manual:

> In your program, you import a module Foo by saying import Foo. In --make mode or GHCi, GHC will look for a source file for Foo and arrange to compile it first. Without --make, GHC will look for the interface file for Foo, which should have been created by an earlier compilation of Foo.

The hypothetical "--topo-sort" flag would behave in the latter way, i.e. it would not require a source file for an unknown dependency. Hence, "--topo-sort" and "--make" would be conflicting options.

I hope that clears up things a bit.

Cheers
Lars
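
As an editorial illustration of the ordering step in this specification (not of any existing GHC feature), here is a small sketch that takes module names with their import lists, keeps only the edges between the listed modules, and orders them with "stronglyConnComp" from "Data.Graph". The name topoSortModules is hypothetical. One deliberate deviation, stated as an assumption: where the description says circular dependencies are ignored, this sketch reports the offending cycle instead.

```haskell
import qualified Data.Graph as G
import qualified Data.Set as Set

-- | A module name together with the names of the modules it imports.
type Module = (String, [String])

-- | Right: the modules in an order in which each one follows its
-- dependencies. Left: the members of one import cycle.
topoSortModules :: [Module] -> Either [String] [String]
topoSortModules mods = traverse single (G.stronglyConnComp nodes)
  where
    known = Set.fromList (map fst mods)
    -- Keep only dependencies between the listed modules. Anything else
    -- (for example Library) is expected to be found as an interface file.
    -- stronglyConnComp would ignore unknown keys anyway; the filter just
    -- makes the restriction explicit.
    nodes =
      [ (name, name, filter (`Set.member` known) imps)
      | (name, imps) <- mods
      ]
    -- stronglyConnComp returns components dependencies-first, which is
    -- already the compile order.
    single (G.AcyclicSCC name) = Right name
    single (G.CyclicSCC names) = Left names

main :: IO ()
main = print (topoSortModules [("A", ["B", "Library"]), ("B", [])])
```

For the running example the result is Right ["B","A"], matching the permutation "fileB.hs fileA.hs" above.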

| Let's say the hypothetical feature is selected via the GHC flag
| "--topo-sort". It would add a step before regular compilation and
| wouldn't affect any other flag:
|
| ghc -c --topo-sort fileA.hs fileB.hs ...
|
| This would first read in the specified source files and look at their
| module headers and import statements. It would build a graph of module
| dependencies _between_ the specified source files (ignoring circular
| dependencies), perform a topological sort on that graph, and proceed
| with compiling the source files in that order.

Interesting (and quite different from what I anticipated, so it's a good thing you wrote it down!).

How does that differ from ghc --make? The only difference I can see is that
  - Modules that --make might find, but not listed on
    the command line, would not be compiled by --topo-sort

Simon

| -----Original Message-----
| From: Lars Hupel [mailto:lars@hupel.info]
| Sent: 28 November 2014 14:42
| To: Simon Peyton Jones
| Cc: glasgow-haskell-users@haskell.org; Andrey Mokhov
| Subject: Re: Discovery of source dependencies without --make
|
| > Rather than explain by deltas from something else, it might be
| > easier just to write down precisely what you seek.
|
| Let's say the hypothetical feature is selected via the GHC flag
| "--topo-sort". It would add a step before regular compilation and
| wouldn't affect any other flag:
|
| ghc -c --topo-sort fileA.hs fileB.hs ...
|
| This would first read in the specified source files and look at their
| module headers and import statements. It would build a graph of module
| dependencies _between_ the specified source files (ignoring circular
| dependencies), perform a topological sort on that graph, and proceed
| with compiling the source files in that order.
|
| As a consequence, if there is an order in which these modules can be
| successfully compiled, "--topo-sort" would choose such an order. In
| that sense, the above invocation would be equivalent to
|
| ghc -c fileB.hs fileA.hs ...
|
| (using some permutation of the original order specified)
|
| Another consequence is that any invocation of GHC in the form of
|
| ghc -c flags... sources.hs...
|
| with arbitrary flags would still work as usual when adding
| "--topo-sort".
|
| Quoting from the user manual:
|
| > In your program, you import a module Foo by saying import Foo. In
| > --make mode or GHCi, GHC will look for a source file for Foo and
| > arrange to compile it first. Without --make, GHC will look for the
| > interface file for Foo, which should have been created by an earlier
| > compilation of Foo.
|
| The hypothetical "--topo-sort" flag would behave in the latter way,
| i.e. it would not require a source file for an unknown dependency.
| Hence, "--topo-sort" and "--make" would be conflicting options.
|
| I hope that clears up things a bit.
|
| Cheers
| Lars

> How does that differ from ghc --make? The only difference I can see is that
>   - Modules that --make might find, but not listed on
>     the command line, would not be compiled by --topo-sort

"--make" always requires a full view on all sources. That is, any imports which cannot be resolved from the package database are assumed to exist as source files. Imagine the following situation:

A.hs
> module A where
>
> import B
> import Library

B.hs
> module B where

If I compile these two with "--make", it also needs "Library.hs" as input. If I compile them without "--make", it just needs the interface ("Library.hi") as additional input.

Concretely, assuming that only "path/Library.hi" exists, this fails:

    ghc -c -ipath --make A.hs B.hs

In contrast to that, I propose that this should work:

    ghc -c -ipath --topo-sort A.hs B.hs

I suppose that if --make found Foo.hi, but no Foo.hs, it could simply use the Foo.hi. That would do strictly more than now. I don't know if there would be any disadvantages.

Augmenting --make's semantics sounds better to me than inventing a new compilation mode.

Maybe a feature request ticket. Then if people like it maybe someone can implement it.

Simon

| -----Original Message-----
| From: Glasgow-haskell-users [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Lars Hupel
| Sent: 28 November 2014 16:26
| To: Simon Peyton Jones
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Discovery of source dependencies without --make
|
| > How does that differ from ghc --make? The only difference I can see
| > is that
| >   - Modules that --make might find, but not listed on
| >     the command line, would not be compiled by --topo-sort
|
| "--make" always requires a full view on all sources. That is, any
| imports which cannot be resolved from the package database are assumed
| to exist as source files. Imagine the following situation:
|
| A.hs
| > module A where
| >
| > import B
| > import Library
|
| B.hs
| > module B where
|
| If I compile these two with "--make", it also needs "Library.hs" as
| input. If I compile them without "--make", it just needs the interface
| ("Library.hi") as additional input.
|
| Concretely, assuming that only "path/Library.hi" exists, this fails:
|
| ghc -c -ipath --make A.hs B.hs
|
| In contrast to that, I propose that this should work:
|
| ghc -c -ipath --topo-sort A.hs B.hs
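
The fallback suggested in this message (use Foo.hi when no Foo.hs is found) can be written down as a lookup rule. The following is an editorial sketch of that rule only, not of how GHC's module finder is implemented: the type Resolved and the functions modulePath and resolve are hypothetical, file existence is passed in as a predicate so the rule stays pure, and details such as .lhs files, -hidir and package lookup are left out.

```haskell
-- | Outcome of looking up one imported module on the search path.
data Resolved
  = FromSource FilePath     -- ^ a source file exists: compile it first
  | FromInterface FilePath  -- ^ only an interface exists: use it as is
  | NotFound
  deriving (Eq, Show)

-- | Turn a dotted module name into a relative path without extension.
modulePath :: String -> FilePath
modulePath = map (\c -> if c == '.' then '/' else c)

-- | Prefer a source file anywhere on the search path; fall back to an
-- interface file only if no source file is found.
resolve :: (FilePath -> Bool) -> [FilePath] -> String -> Resolved
resolve exists searchPath modName =
  case (candidates ".hs", candidates ".hi") of
    (src : _, _)    -> FromSource src
    ([], iface : _) -> FromInterface iface
    ([], [])        -> NotFound
  where
    candidates ext =
      filter exists [dir ++ "/" ++ modulePath modName ++ ext | dir <- searchPath]

main :: IO ()
main = do
  -- Pretend file system for the example in this thread: sources for A
  -- and B, but only an interface file for Library.
  let files = ["./A.hs", "./B.hs", "path/Library.hi"]
      exists = (`elem` files)
  mapM_ (print . resolve exists [".", "path"]) ["A", "B", "Library", "Missing"]
```

Under this rule the imports of A resolve to a source file for B and to the interface file for Library, which is the behaviour asked for in the thread.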

> I suppose that if --make found Foo.hi, but no Foo.hs, it could simply use the Foo.hi. That would do strictly more than now. I don't know if there would be any disadvantages.
>
> Augmenting --make's semantics sounds better to me than inventing a new compilation mode.

Sure, that sounds even better.

> Maybe a feature request ticket. Then if people like it maybe someone can implement it.

Done: https://ghc.haskell.org/trac/ghc/ticket/9846

I have no idea how difficult this is going to be – if someone tells me it's easy, I'll have a stab at it.

Thanks for your help!

Cheers
Lars

On Fri, Nov 28, 2014 at 3:41 PM, Lars Hupel wrote:
> Let's say the hypothetical feature is selected via the GHC flag "--topo-sort". It would add a step before regular compilation and wouldn't affect any other flag:
>
>     ghc -c --topo-sort fileA.hs fileB.hs ...
>
> This would first read in the specified source files and look at their module headers and import statements. It would build a graph of module dependencies _between_ the specified source files (ignoring circular dependencies), perform a topological sort on that graph, and proceed with compiling the source files in that order.

GHC 8 will have support for Frontend plugins. Frontend plugins enable you to write plugins to replace GHC major modes.

E.g. instead of saying

    ghc --make A B C

you can now say:

    ghc --frontend TopoSort A B C

You still have to implement TopoSort.hs yourself, using the GHC API to compile A B C in topological order, but some of the plumbing is taken care of by the Frontend plugin infrastructure already.

Take a look at this commit, especially the user's guide section and the test case: https://github.com/ghc/ghc/commit/a3c2a26b3af034f09c960b2dad38f73be7e3a655.

I missed context, but if you just want the topological graph, depanal will give you a module graph which you can then topsort with topSortModuleGraph (all in GhcMake). Then you can do what you want with the result. You will obviously need accurate targets but frontend plugins and guessTarget will get you most of the way there.

Edward

Excerpts from Thomas Miedema's message of 2015-12-13 16:12:39 -0800:
> On Fri, Nov 28, 2014 at 3:41 PM, Lars Hupel wrote:
> > Let's say the hypothetical feature is selected via the GHC flag "--topo-sort". It would add a step before regular compilation and wouldn't affect any other flag:
> >
> >     ghc -c --topo-sort fileA.hs fileB.hs ...
> >
> > This would first read in the specified source files and look at their module headers and import statements. It would build a graph of module dependencies _between_ the specified source files (ignoring circular dependencies), perform a topological sort on that graph, and proceed with compiling the source files in that order.
>
> GHC 8 will have support for Frontend plugins. Frontend plugins enable you to write plugins to replace GHC major modes.
>
> E.g. instead of saying
>
>     ghc --make A B C
>
> you can now say:
>
>     ghc --frontend TopoSort A B C
>
> You still have to implement TopoSort.hs yourself, using the GHC API to compile A B C in topological order, but some of the plumbing is taken care of by the Frontend plugin infrastructure already.
>
> Take a look at this commit, especially the user's guide section and the test case: https://github.com/ghc/ghc/commit/a3c2a26b3af034f09c960b2dad38f73be7e3a655.

Here's what I use:
http://ofb.net/~elaforge/shake/Shake/HsDeps.hs
It's a very dumb but fast parser that figures out dependencies. I use
it for my shake build system. Not sure if it's useful, but there it
is if it helps.
If you have a complicated build setup and you're not using shake,
maybe you should consider it :)

On Fri, Nov 28, 2014 at 12:52 AM, Lars Hupel wrote:
> Hi Gershom,
>
> > Is -M perhaps what you’ve been looking for?
>
> sadly, no. Firstly, it behaves in the same way as "--make" (i.e. only looks at source files) and secondly, it produces a Makefile as output.
>
> (I'd be happy though to use the GHC API if somebody could tell me whether/where this functionality is exposed.)
>
> Cheers
> Lars

participants (8)
- Edward Z. Yang
- Evan Laforge
- Gershom B
- Joachim Breitner
- Lars Hupel
- Roman Cheplyaka
- Simon Peyton Jones
- Thomas Miedema