
Hi, I have a question about cyclic inclusions. It appears from the Haskell 98 report that mutually recursive modules are allowed; however, GHC complains about any Haskell project that has cyclic inclusions (implicit or explicit). Am I right in thinking that this is a GHC limitation? http://www.haskell.org/onlinereport/modules.html Kind regards, Chris.

cmb21:
Hi,
I have a question about cyclic inclusions. It appears from the Haskell 98 report that mutually recursive modules are allowed; however, GHC complains about any Haskell project that has cyclic inclusions (implicit or explicit). Am I right in thinking that this is a GHC limitation?
GHC provides a mechanism to support mutually recursive modules, but you must break the cycle manually, via a boot file. http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation...
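To make the mechanism concrete, here is a minimal sketch (the module names Even/Odd and the functions are invented for illustration, not taken from the thread): one module in the cycle gets a small hand-written .hs-boot file listing only the types the other module needs, and the import of that module is marked with a SOURCE pragma, as described on the user's guide page above.

-- Odd.hs-boot: a hand-written stand-in for Odd's interface.
module Odd where
isOdd :: Int -> Bool

-- Even.hs: imports the boot interface, not Odd itself.
module Even where
import {-# SOURCE #-} Odd
isEven :: Int -> Bool
isEven 0 = True
isEven n = isOdd (n - 1)

-- Odd.hs: an ordinary import of Even; by now there is no cycle left to break.
module Odd where
import Even
isOdd :: Int -> Bool
isOdd 0 = False
isOdd n = isEven (n - 1)

Building with ghc --make (or via Cabal) handles the ordering automatically: Odd.hs-boot first, then Even.hs, then Odd.hs.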

Hi Don,
GHC provides a mechanism to support mutually recursive modules, but you must break the cycle manually, via a boot file.
http://www.haskell.org/ghc/docs/latest/html/users_guide/separate-compilation...
Yes, I saw that, thanks! I guess this is because it's hard to compile a mutually recursive module... Chris.

G'day all.
Quoting "C.M.Brown"
Yes, I saw that, thanks! I guess this is because it's hard to compile a mutually recursive module...
It's because you don't need to declare the types of exported definitions. Consider this highly artificial example:

module A where
import B
f (x,y) = g (x,'A')

module B where
import A
g (x,y) = f (True,y)

To infer the types of f and g, you need to analyse both modules together. And yes, some people think that this is a bug in the specification.

Cheers, Andrew Bromage
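For concreteness, joint inference over the two modules gives f and g the same type: g calls f with True and f forwards its first argument to g, so both first tuple components must be Bool; f passes 'A' to g and g forwards its second argument to f, so both second components must be Char; the result type is never constrained, so it stays polymorphic. Neither fact is visible from one module alone, and that type is exactly what a hand-written boot file has to supply before GHC can compile the cycle. A sketch of the resolved example, following the same boot-file pattern as above (the file layout and the explicit signatures are mine):

-- B.hs-boot: states g's type, the one fact that cannot be read off B alone.
module B where
g :: (Bool, Char) -> a

-- A.hs: compiled against B's boot interface.
module A where
import {-# SOURCE #-} B
f :: (Bool, Char) -> a   -- the type joint inference would produce
f (x,y) = g (x,'A')

-- B.hs: compiled last; GHC checks that g's type matches B.hs-boot.
module B where
import A
g :: (Bool, Char) -> a
g (x,y) = f (True,y)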

Andrew, Thanks very much for your response. It was very helpful; this makes a lot of sense! Regards, Chris. On Mon, 11 Aug 2008 ajb@spamcop.net wrote:
G'day all.
Quoting "C.M.Brown"
Yes, I saw that, thanks! I guess this is because it's hard to compile a mutually recursive module...
It's because you don't need to declare the types of exported definitions.
Consider, this highly artificial example:
module A where
import B
f (x,y) = g (x,'A')
module B where
import A
g (x,y) = f (True,y)
To infer the types of f and g, you need to analyse both modules together.
And yes, some people think that this is a bug in the specification.
Cheers, Andrew Bromage

On 12 Aug 2008, at 11:59, C.M.Brown wrote:
Andrew,
Thanks very much for your response. It was very helpful; this makes a lot of sense!
And yes, some people think that this is a bug in the specification.
I'm not sure that it does make a lot of sense -- we allow (mutually) recursive functions, even though they come with an efficiency penalty. Why should we not allow (mutually) recursive modules, even though they too come with an efficiency penalty? This is even an example where the efficiency loss is *only* at compile time, and only happens once, so it's a somewhat better situation than allowing mutually recursive functions.

I'd say it falls very heavily into the ghc-bug category, not the spec-bug category (even if there are reasons for the bug existing in ghc).

Bob

I'm not sure that it does make a lot of sense -- we allow (mutually) recursive functions, even though they come with an efficiency penalty. Why should we not allow (mutually) recursive modules, even though they too come with an efficiency penalty? This is even an example where the efficiency loss is *only* at compile time, and only happens once, so it's a somewhat better situation than allowing mutually recursive functions.
I'd say it falls very heavily into the ghc-bug category, not the spec-bug category (even if there are reasons for the bug existing in ghc).
Perhaps it would be better for GHC to allow compilation of cyclic inclusions via a flag? -fcyclic or something? Or, to do it by default unless a -fno-cyclic flag is raised? It does seem strange that the only way to compile cyclic modules is to hack together a build using hi-boot files. Regards, Chris.

G'day all.
Quoting Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually) recursive functions, even though they come with an efficiency penalty. Why should we not allow (mutually) recursive modules, even though they too come with an efficiency penalty?
The problem is not mutually recursive modules. Plenty of statically typed languages support mutually recursive modules.

The problem is that it's impossible in general to say what the "interface" of a module is by examining the module alone. This is a very unusual property as real-world programming languages go.

You could fix this by, for example, requiring that all symbols exported from a module have an explicit type annotation. Or, if you think that's not lazy enough, it could be an error to use an imported symbol that doesn't have an explicit type annotation. You could even formalise .hi-boot files if you were truly desperate.

If the Haskell spec requires that multiple modules be analysed simultaneously (or multiple times to fixpoint), then that's a bug in the spec. Separate compilation is too important.

Cheers, Andrew Bromage

On Tue, 12 Aug 2008, ajb@spamcop.net wrote:
G'day all.
Quoting Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually) recursive functions, even though they come with an efficiency penalty. Why should we not allow (mutually) recursive modules, even though they too come with an efficiency penalty?
The problem is not mutually recursive modules. Plenty of statically typed languages support mutually recursive modules.
The problem is that it's impossible in general to say what the "interface" of a module is by examining the module alone. This is a very unusual property as real-world programming languages go.
You could fix this by, for example, requiring that all symbols exported from a module have an explicit type annotation.
As far as I know the real difficulties come from mutually recursive class definitions.

G'day all.
Quoting Henning Thielemann
As far as I know the real difficulties come from mutually recursive class definitions.
I wouldn't be surprised, because that's a more blatant instance of the same problem. With classes and instances, there is no way to specify whether or not they are exported, which makes it that much harder to identify what the interface of a module is.

Cheers, Andrew Bromage

On 12 Aug 2008, at 16:01, ajb@spamcop.net wrote:
G'day all.
Quoting Thomas Davie
I'm not sure that it does make a lot of sense -- we allow (mutually) recursive functions, even though they come with an efficiency penalty. Why should we not allow (mutually) recursive modules, even though they too come with an efficiency penalty?
The problem is not mutually recursive modules. Plenty of statically typed languages support mutually recursive modules.
The problem is that it's impossible in general to say what the "interface" of a module is by examining the module alone. This is a very unusual property as real-world programming languages go.
You could fix this by, for example, requiring that all symbols exported from a module have an explicit type annotation. Or, if you think that's not lazy enough, it could be an error to use an imported symbol that doesn't have an explicit type annotation. You could even formalise .hi-boot files if you were truly desperate.
If the Haskell spec requires that multiple modules be analysed simultaneously (or multiple times to fixpoint), then that's a bug in the spec. Separate compilation is too important.
Why is separate compilation important?

I can understand your point about a module on its own not being analyzable, and that that's an odd property -- on the other hand, the rapidly emerging "atomic" unit in Haskell is the package, so I see no reason why modules within one package shouldn't depend on each other.

Finally, as Chris suggests, if separate compilation is important to you, why not have a flag in ghc, -frequire-hi-boot or something?

Bob

G'day all.
Quoting Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this question. Two people have asked it now.

The simplest answer is that unless your program fits in cache, it takes longer to compile two modules concurrently than it takes to compile them separately. This is even more true if one of those modules does not actually need recompilation.

The longer answer is that in the Real World, programmer time is far, far more precious than computer time. Every second that the programmer waits for the computer to finish some task, there is a priority inversion. It's therefore highly desirable that jobs done by computer are as fast as possible or, failing that, as predictable as possible, so the programmer knows to do something else while waiting.

Separate compilation puts a predictable upper bound on the amount of recompilation that has to be done given some set of changes. True, so does requiring recompilation of a package as a whole, but Haskell development would then be notorious for its sluggishness.
I can understand your point about a module on its own not being analyzable, and that that's an odd property -- on the other hand, the rapidly emerging "atomic" unit in Haskell is the package, so I see no reason why modules within one package shouldn't depend on each other.
Implicit in my working definition of "module" is that a module has a well-defined interface. Am I the only one who has that understanding? (Well, me and David Parnas, at any rate.) For the record, I have no problem with modules depending on each other, so long as they only depend on their well-defined interfaces.
Finally, as Chris suggests, if separate compilation is important to you, why not have a flag in ghc, -frequire-hi-boot or something?
Well, if I wanted separate header files, and the inevitable multiple-maintenance headaches associated with them, I'd program in C. Except for mutually recursive modules, GHC can and does generate header files automatically, so I don't see why my time should be wasted doing the job of a compiler.

If something is preventing the compiler from doing that job, then that something should be fixed.

Cheers, Andrew Bromage

On 13 Aug 2008, at 05:06, ajb@spamcop.net wrote:
G'day all.
Quoting Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this question. Two people have asked it now.
The simplest answer is that unless your program fits in cache, it takes longer to compile two modules concurrently than it takes to compile them separately. This is even more true if one of those modules does not actually need recompilation.
The longer answer is that in the Real World, programmer time is far, far more precious than computer time. Every second that the programmer waits for the computer to finish some task, there is a priority inversion.
Really? So you're using YHC then? It, after all, compiles *much* faster than GHC, but produces slower binaries. To be honest, ghc compiles things so fast (at least on any of my systems) that I couldn't care less if it took 10 times as long (I would however like some added convenience for that time spent)
It's therefore highly desirable that jobs done by computer are as fast as possible or, failing that, as predictable as possible, so the programmer knows to do something else while waiting.
Separate compilation puts a predictable upper bound on the amount of recompilation that has to be done given some set of changes. True, so does requiring recompilation of a package as a whole, but Haskell development would then be notorious for its sluggishness.
It does? If I compile a module on which lots of other modules depend, I have to do lots of recompilation... If I compile a module which is in a cyclic dependency group, I have to do lots of recompilation. I don't see that there's a difference here.
I can understand your point about a module on its own not being analyzable, and that that's an odd property -- on the other hand, the rapidly emerging "atomic" unit in Haskell is the package, so I see no reason why modules within one package shouldn't depend on each other.
Implicit in my working definition of "module" is that a module has a well-defined interface. Am I the only one who has that understanding? (Well, me and David Parnas, at any rate.)
For the record, I have no problem with modules depending on each other, so long as they only depend on their well-defined interfaces.
That's a fair point about programming style, otoh, I don't think it's a reason to stop users from using cyclic dependencies.
Finally, as Chris suggests, if separate compilation is important to you, why not have a flag in ghc, -frequire-hi-boot or something?
Well, if I wanted separate header files, and the inevitable multiple-maintenance headaches associated with them, I'd program in C. Except for mutually recursive modules, GHC can and does generate header files automatically, so I don't see why my time should be wasted doing the job of a compiler.
If something is preventing the compiler from doing that job, then that something should be fixed.
Something *does* prevent the compiler doing that job -- the fact that ghc can't deal with cyclic module includes without an hi-boot file. This is *exactly* my point -- I don't see why my time should be wasted doing the job of the compiler just because I happen to have a cyclic dependency that the compiler could quite happily sort out by making my compile time in this situation slightly longer. If I *really* think I'm going to save more time by writing an hi-boot file, then I would be able to turn on the option! Bob

On Wed, 13 Aug 2008, Thomas Davie wrote:
On 13 Aug 2008, at 05:06, ajb@spamcop.net wrote:
Quoting Thomas Davie
Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this question. Two people have asked it now.
Really? So you're using YHC then? It, after all, compiles *much* faster than GHC, but produces slower binaries. To be honest, ghc compiles things so fast (at least on any of my systems) that I couldn't care less if it took 10 times as long (I would however like some added convenience for that time spent)
It's the ubiquitous "computers are fast enough today" argument. I don't buy it. We don't have compile time to waste. There will always be computers that are much slower and have less memory than the current consumer desktop computers, and there are always tasks that a computer can do instead of doing slowed-down compilation. I'm glad that we have overcome C's way of concatenating all header files together before starting compilation.

On 13 Aug 2008, at 11:10, Henning Thielemann wrote:
On Wed, 13 Aug 2008, Thomas Davie wrote:
On 13 Aug 2008, at 05:06, ajb@spamcop.net wrote:
Quoting Thomas Davie: Why is separate compilation important?
I'm a little shocked that anyone on this list should have to ask this question. Two people have asked it now.
Really? So you're using YHC then? It, after all, compiles *much* faster than GHC, but produces slower binaries. To be honest, ghc compiles things so fast (at least on any of my systems) that I couldn't care less if it took 10 times as long (I would however like some added convenience for that time spent)
It's the ubiquitous "computers are fast enough today" argument. I don't buy it. We don't have compile time to waste. There will always be computers that are much slower and have less memory than the current consumer desktop computers, and there are always tasks that a computer can do instead of doing slowed-down compilation. I'm glad that we have overcome C's way of concatenating all header files together before starting compilation.
If you don't want to be slowed down by the compiler, why aren't you writing machine code directly? The point is that time saved with convenience (often) comes at the cost of time spent while compiling. Adding an option to ghc to allow you to not waste time writing hi-boot files allows me to make an informed decision about whether it will take me longer to (a) figure out *how* to write an hi-boot file and (b) actually do the writing, or whether it'll take less time/effort to just let the compiler do it for me. Bob

G'day all.
Quoting Thomas Davie
To be honest, ghc compiles things so fast (at least on any of my systems) that I couldn't care less if it took 10 times as long (I would however like some added convenience for that time spent)
Have you ever compiled GHC itself? Just curious what you'd think about a 10x speed hit there. If it helps, think about the lifetime of a program. If you assume that a program grows linearly over time, and that recompilations occur at a roughly constant rate, it follows that the time spent recompiling is O(n^2). Constant factors matter.
If I compile a module on which lots of other modules depend, I have to do lots of recompilation... If I compile a module which is in a cyclic dependency group, I have to do lots of recompilation. I don't see that there's a difference here.
If you only change the implementation of a module, not its interface, you don't need to recompile anything that imports it. (At least, this is true at -O0, which if you care about fast recompilation because you're deep in development, you're probably doing.)
That's a fair point about programming style, otoh, I don't think it's a reason to stop users from using cyclic dependencies.
As previously noted, cyclic dependencies alone aren't the problem. Cheers, Andrew Bromage
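Spelling out the O(n^2) arithmetic above, as a back-of-the-envelope reading under exactly the two assumptions Andrew states (linear growth, constant recompilation rate): if the code base at time t has size roughly c*t and a recompile costs time proportional to that size, then the k-th of the recompiles, happening at rate r, costs about c*k/r, and over a project lifetime T there are about r*T of them, so the total is

    sum_{k=1}^{rT} c*k/r  ~  (c*r/2) * T^2  =  O(T^2).

The per-compile constant therefore multiplies a quadratically growing total, which is why constant factors matter.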

On 13 Aug 2008, at 13:18, ajb@spamcop.net wrote:
G'day all.
Quoting Thomas Davie
To be honest, ghc compiles things so fast (at least on any of my systems) that I couldn't care less if it took 10 times as long (I would however like some added convenience for that time spent)
Have you ever compiled GHC itself? Just curious what you'd think about a 10x speed hit there.
On the machine I'm sat at now, it takes 20 minutes. Secondly, you would *only* pay the speed penalty when you had cyclic includes, and you didn't use the flag to use an hi-boot file, so it would continue to take 20 minutes.
If it helps, think about the lifetime of a program. If you assume that a program grows linearly over time, and that recompilations occur at a roughly constant rate, it follows that the time spent recompiling is O(n^2). Constant factors matter.
But the penalty would be nowhere near that big -- unless you happened to have *all* your modules depend on *all* other modules.
If I compile a module on which lots of other modules depend, I have to do lots of recompilation... If I compile a module which is in a cyclic dependency group, I have to do lots of recompilation. I don't see that there's a difference here.
If you only change the implementation of a module, not its interface, you don't need to recompile anything that imports it. (At least, this is true at -O0, which if you care about fast recompilation because you're deep in development, you're probably doing.)
Well, if this turned out to take a long time, then I'd probably switch to writing an hi-boot file... Something that I wouldn't have to waste time doing at all if it happened that ghc was still "fast enough" without providing one. Bob

On 2008-08-13, ajb@spamcop.net wrote:
If you only change the implementation of a module, not its interface, you don't need to recompile anything that imports it. (At least, this is true at -O0, which if you care about fast recompilation because you're deep in development, you're probably doing.)
This is only true if the interface can be tracked separately from the implementation. Which, despite the flaws, C's header files can be coaxed into doing. -- Aaron Denney -><-

Andrew, [...]
For the record, I have no problem with modules depending on each other, so long as they only depend on their well-defined interfaces.
Finally, as chris suggests, if separate compilation is important to you, why not have a flag in ghc -frequire-hi-boot or something?
Well, if I wanted separate header files, and the inevitable multiple-maintenance headaches associated with them, I'd program in C. Except for mutually recursive modules, GHC can and does generate header files automatically, so I don't see why my time should be wasted doing the job of a compiler.
If something is preventing the compiler from doing that job, then that something should be fixed.
But isn't this exactly the point I was trying to make!? The whole point, to me, in functional programming, is that we shouldn't have to worry about the underlying implementation. What you've listed above are the restrictions of a particular compiler implementation, all of which, if needed, could be controlled by a flag if compilation performance is so important. Regards, Chris.

G'day all.
Quoting "C.M.Brown"
But isn't this exactly the point I was trying to make!? The whole point, to me, in functional programming, is that we shouldn't have to worry about the underlying implementation.
It is not exposing an underlying implementation detail to mandate that modules should have well-defined interfaces. If anything, it's enforcing good programming practice. Cheers, Andrew Bromage

Andrew,
But isn't this exactly the point I was trying to make!? The whole point, to me, in functional programming, is that we shouldn't have to worry about the underlying implementation.
It is not exposing an underlying implementation detail to mandate that modules should have well-defined interfaces. If anything, it's enforcing good programming practice.
I agree absolutely that having well-defined interfaces is a good thing. I wasn't actually referring to that; I apologise for not being clear. However I saw no real argument for not having cyclic inclusions. You say we shouldn't have to spend time writing hi-boot files, and yet you also think that GHC should not do it automatically. So we have to restrict all programmers to never writing cyclic inclusions? :) Kind Regards, Chris.

G'day.
Quoting "C.M.Brown"
However I saw no real argument for not having cyclic inclusions. You say we shouldn't have to spend time writing hi-boot files, and yet you also think that GHC should not do it automatically. So we have to restrict all programmers to never writing cyclic inclusions? :)
GHC generates .hi files for most modules automatically. The only reason why hi-boot files are needed for cyclic imports is because of the possibility that you can't generate a .hi file from the module alone. If you could do that, then you could support cyclic imports without needing hi-boot files. Cheers, Andrew Bromage

Hello,
The Haskell'98 report does not specify if/how recursive modules should work. I wrote a paper a long time ago that formalizes and implements this feature (http://www.purely-functional.net/yav/publications/modules98.pdf).

I very much doubt that separate compilation is much of a problem in practice, because you only need to compile together the modules that are _recursive_ with one another, and usually these tend to be fairly small. Figuring out the interface of the modules is a bit trickier in some pathological cases involving module re-exports, but this is already the case with non-recursive modules.

One real technical problem that I remember was implementing "defaulting", which is specified in terms of a single module. With recursive modules, one could get mutually recursive functions from different modules, in which case it is not clear which set of defaulting rules to apply or how to combine them.

Hope that this helps,
-Iavor
On Wed, Aug 13, 2008 at 4:30 AM, ajb@spamcop.net wrote:
G'day.
Quoting "C.M.Brown"
However I saw no real argument for not having cyclic inclusions. You say we shouldn't have to spend time writing hi-boot files, and yet you also think that GHC should not do it automatically. So we have to restrict all programmers to never writing cyclic inclusions? :)
GHC generates .hi files for most modules automatically. The only reason why hi-boot files are needed for cyclic imports is because of the possibility that you can't generate a .hi file from the module alone. If you could do that, then you could support cyclic imports without needing hi-boot files.
Cheers, Andrew Bromage
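To make the defaulting point concrete, here is a minimal single-module sketch (the module name and bindings are invented, and this compiles as an ordinary Haskell 98 module): default declarations are resolved per module, and the open question Iavor raises is which module's declaration should govern a recursive binding group whose members are split across two modules with different default lists.

-- DefaultingDemo.hs
module DefaultingDemo where

-- Defaulting (Haskell 98 report, section 4.3.4) is per module: ambiguous
-- numeric constraints here use this list instead of the standard
-- default (Integer, Double).
default (Int, Double)

-- The type of (2 ^ 20) is ambiguous -- it is only ever shown -- so the
-- declaration above resolves it to Int.
bufferNote :: String
bufferNote = "buffer size: " ++ show (2 ^ 20)

main :: IO ()
main = putStrLn bufferNote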

The problem is not mutually recursive modules. Plenty of statically typed languages support mutually recursive modules.
The problem is that it's impossible in general to say what the "interface" of a module is by examining the module alone. This is a very unusual property as real-world programming languages go.
You could fix this by, for example, requiring that all symbols exported from a module have an explicit type annotation. Or, if you think that's not lazy enough, it could be an error to use an imported symbol that doesn't have an explicit type annotation. You could even formalise .hi-boot files if you were truly desperate.
I don't really see this as being any kind of real issue at all. Surely all GHC needs to do is to concatenate all the modules together, alpha-reduce the import/export relations and do a compile/type check over the concatenated module.
If the Haskell spec requires that multiple modules be analysed simultaneously (or multiple times to fixpoint), then that's a bug in the spec. Separate compilation is too important.
Why is it a bug in the spec exactly? And why is separate compilation so important? Chris.
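As a sketch of what that would mean for Andrew's earlier A/B example (the merged module name and the renamed identifiers are invented): once the two definitions sit in one compilation unit, ordinary inference handles the recursive group with no annotations at all.

-- Merged.hs: A and B concatenated, imports dropped, bindings renamed apart.
module Merged where

-- was A.f
a_f (x,y) = b_g (x,'A')

-- was B.g
b_g (x,y) = a_f (True,y)

-- GHC infers a_f, b_g :: (Bool, Char) -> t, the same types that the
-- separate modules would need a boot file to state.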

C.M.Brown wrote:
I don't really see this as being any kind of real issue at all. Surely all GHC needs to do is to concatenate all the modules together, alpha-reduce the import/export relations and do a compile/type check over the concatenated module.
FWIW, I agree (in principle -- I haven't looked at the GHC implementation enough to see whether its internal representations could easily do type inference involving symbols in multiple modules, and GHC HQ understandably isn't interested in implementing it themselves: http://hackage.haskell.org/trac/ghc/ticket/1409 ). There is a bit of a question in my mind about what to do with .hi/.o files for the combined blob.

There's also a scalability / memory-use issue, which is also the usual argument for separate compilation: if you make really big modules (whether manually or programmatically), compiling them takes lots of memory and time -- significantly more total time and maximum memory than you need via separate compilation. I consider this a bug. (Well, I don't know if what I said is true -- it's just an excuse that's often given for why GHC compiling things together is a bad idea. And there are several compile-time performance bugs in Trac.) Admittedly, it's a challenging problem to automatically segment a large module so you only need to do a little work at once.

Essentially, the module system partly has the purpose of telling the compiler how not to try to optimize, even if it would be beneficial; and thereby you get reasonable compilation times. This would be inexcusable in my mind if the same sort of module segmentation didn't also help *humans* understand the contents of the modules :-). I wonder if there can be some sort of system with pragmas that accomplishes the same purpose of guiding the compiler... probably not for Haskell so much as in general.

-Isaac
participants (8): Aaron Denney, ajb@spamcop.net, C.M.Brown, Don Stewart, Henning Thielemann, Iavor Diatchki, Isaac Dupree, Thomas Davie