
On Wed, Apr 13, 2011 at 11:44:33AM +0100, Simon Marlow wrote:
On 09/04/2011 04:32, Evan Laforge wrote:
I've found ghc's cyclic import error to be rather confusing, and I finally took the time to understand why. Here's an example:
Module imports form a cycle for modules: Cmd.Cmd (./Cmd/Cmd.hs) imports: Perform.Midi.Instrument Instrument.MidiDb Instrument.Db Perform.Midi.Instrument (./Perform/Midi/Instrument.hs) imports: Cmd.Cmd Instrument.MidiDb (./Instrument/MidiDb.hs) imports: Perform.Midi.Instrument Instrument.Db (./Instrument/Db.hs) imports: Instrument.Search Instrument.MidiDb Perform.Midi.Instrument Instrument.Search (./Instrument/Search.hs) imports: Instrument.MidiDb Perform.Midi.Instrument
It seems to be in a strange order and mentions extraneous modules. I would find this much easier to read:
Perform.Midi.Instrument -> Cmd.Cmd -> Instrument.MidiDb -> Perform.Midi.Instrument
So the algorithm that GHC uses is this:
- do a strongly connected component analysis - build until we hit a cycle - then, report the error for the cycle
Now, the modules in the cycle are a strongly connected component: every module is reachable from every other module by following imports. We report all the modules in the strongly connected component and their imports, but omit imports of modules outside the cycle.
Instead, the order goes Cmd.Cmd -> Instrument.MidiDb, and then goes backwards to Perform.Midi.Instrument -> Cmd.Cmd. Then it goes forward again to Instrument.MidiDb -> Perform.Midi.Instrument. So the order makes you jump around if you want to trace the import chain. The duplicated module that joins the cycle is not visually highlighted. Whats more, it further confuses the eye by merging in multiple loops. I suppose it could be useful to include additional loops, but I would find it easier to read if they were included on their own line, such as:
Cmd.Cmd -> Instrument.Db -> Instrument.Search -> Perform.Midi.Instrument -> Cmd.Cmd
However, I think probably the shortest loop is the most interesting one, and if there are multiple shortest loops, simply picking one randomly is likely to be sufficient.
Picking the shortest one sounds reasonable. However, there could be a single edge which if removed will break all the loops, and they might want to know which it is.
If you want to play with this, the code is very localised: it is a few lines in GhcMake.cyclicModuleError
http://hackage.haskell.org/trac/ghc/browser/compiler/main/GhcMake.hs#L1446
Hi Simon and Evan, Thanks for bringing up a problem, and providing useful information about it, in a way that is understandable enough for a newcomer to have an opinion about it! So, here is my newcomer's opinion: The error displays a strongly connected graph, with one or more cycles, but labels it a "cycle". As you both point out, having all the information about the strongly connected modules is very useful. Labeling it a cycle, however, gives the reader an expectation that is bound to be confounded. Perhaps the following change would be sufficient? --- tmp/GhcMake.hs~ 2011-04-14 09:46:02.177298318 -0700 +++ tmp/GhcMake.hs 2011-04-14 09:52:25.121290827 -0700 @@ -1460,7 +1460,8 @@ cyclicModuleErr :: [ModSummary] -> SDoc cyclicModuleErr ms - = hang (ptext (sLit "Module imports form a cycle for modules:")) + = hang (ptext (sLit "Module imports form a strongly connected graph, with one or more cycles, for these modules:")) 2 (vcat (map show_one ms)) where mods_in_cycle = map ms_mod_name ms -Bryan P.S. I'm not sure what the accepted method for formatting/wrapping string literals is, so I left it as an exercise. :)