Adding type signature changes semantics (was [Haskell-cafe] Lazy in either argument?)

I'm forwarding this to ghc-users since nobody answered on
haskell-cafe, and I'm still curious... (The original message is at
http://www.nabble.com/Lazy-in-either-argument--t4133855.html)
---------- Forwarded message ----------
From: Tim Chevalier
The non-termination is (probably) due to the fact that you can have uninterruptible threads in ghc. If a thread never allocates it will never be preempted. :(
To elaborate on that, the different behavior between the two versions of Dan's code, one with and one without a type signature, happens because f compiles like so if the type signature isn't given (this is the STG code):

    f_ri5 = \u [] let-no-escape { f1_sPY = NO_CCS[] \u [] f1_sPY; } in f1_sPY;
    SRT(f_ri5): []

and like so if the type signature is given:

    f_ri5 = \u srt:(0,*bitmap*) [] f_ri5;
    SRT(f_ri5): [f_ri5]

If you look at the resulting asm code, the second version of f_ri5 compiles to a loop that allocates on each iteration, whereas the body of the let in the first version of f_ri5 compiles to just:

    sPY_info: jmp sPY_info

(Adding f to the export list so that its SRT is empty doesn't change anything, btw.) This is all with -Onot.

So I find this a little confusing. Why is f = let f_1 = f_1 in f_1 compiled so differently from f = f? It seems like f = f should also compile to a loop that doesn't allocate anything. And from the user's perspective, it seems somewhat strange that adding or removing a type signature changes the semantics of their code (although I guess you could make the argument that if you're dealing with threads and nonterminating code, all bets are off). Can someone better acquainted with the code generator than me explain the rationale behind this?

Cheers,
Tim

--
Tim Chevalier * catamorphism.org * Often in error, never in doubt
"Religion is just a fancy word for the Stockholm Syndrome." -- lj user="pure_agnostic"
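
[For context, here is a minimal sketch of the kind of program under discussion. This is a reconstruction, not Dan's original code (which is at the nabble link above), and the names and details are illustrative: one forked thread spins on f, another should terminate and print. Whether the second message ever appears depends on whether the spinning thread can be preempted, which is exactly what the presence or absence of the type signature ends up controlling at -Onot.

    import Control.Concurrent (forkIO, threadDelay)

    -- With no signature, f is inferred at type forall a. a; the other
    -- variant discussed in the thread adds "f :: Bool" instead.
    f = f

    main :: IO ()
    main = do
      _ <- forkIO (f `seq` putStrLn "looping thread finished")  -- never prints
      _ <- forkIO (putStrLn "other thread ran")                 -- prints only if the spinner is preempted
      threadDelay 1000000   -- give the threads a second to run
]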

On 8/2/07, Tim Chevalier wrote:
I'm forwarding this to ghc-users since nobody answered on haskell-cafe, and I'm still curious... (The original message is at http://www.nabble.com/Lazy-in-either-argument--t4133855.html)
Replying to myself again, I figured out later that what's really going on in this example is that when you write:

    f = f

(with no type signature) it desugars to:

    f = \@a -> let f' = f' in f'

whereas if you write:

    f :: Bool
    f = f

the code is left more or less as is (if you're compiling with -Onot). Then the let-body in the unannotated version gets compiled to a loop that doesn't allocate, whereas (and I don't quite follow why) the definition in the annotated version gets compiled to a loop that allocates a new closure on each iteration. Thus, in the example with two threads, the second version will be pre-empted to run the other thread that does terminate, whereas the first version will never be pre-empted.

This makes sense at least if you consider the desugaring of polymorphic definitions separately from the semantics of threads... it's only when they interact that something strange happens. Maybe it's an extreme corner case, but it still seems weird to me that adding or removing a type signature should change the semantics of code like this. It seems to me like the problem is that the transformation from (f = f) to (f = let f' = f' in f') isn't valid in the presence of threads, since in a multithreaded situation the two are observably different... but is that a bug in the desugarer, or a bug in GHC's concurrency semantics?

Cheers,
Tim

--
Tim Chevalier * chevalier@alum.wellesley.edu * Often in error, never in doubt

On Thu, Aug 02, 2007 at 06:26:36PM -0700, Tim Chevalier wrote:
On 8/2/07, Tim Chevalier wrote:
I'm forwarding this to ghc-users since nobody answered on haskell-cafe, and I'm still curious... (The original message is at http://www.nabble.com/Lazy-in-either-argument--t4133855.html)
Replying to myself again, I figured out later that what's really going on in this example is that when you write:

    f = f

(with no type signature) it desugars to:

    f = \@a -> let f' = f' in f'

whereas if you write:

    f :: Bool
    f = f

the code is left more or less as is (if you're compiling with -Onot). Then the let-body in the unannotated version gets compiled to a loop that doesn't allocate, whereas (and I don't quite follow why) the definition in the annotated version gets compiled to a loop that allocates a new closure on each iteration. Thus, in the example with two threads, the second version will be pre-empted to run the other thread that does terminate, whereas the first version will never be pre-empted.
This makes sense at least if you consider the desugaring of polymorphic definitions separately from the semantics of threads... it's only when they interact that something strange happens. Maybe it's an extreme corner case, but it still seems weird to me that adding or removing a type signature should change the semantics of code like this. It seems to me like the problem is that the transformation from (f = f) to (f = let f' = f' in f') isn't valid in the presence of threads, since in a multithreaded situation the two are observably different... but is that a bug in the desugarer, or a bug in GHC's concurrency semantics?
If anything I think it's a bug in the code generator/the runtime. We shouldn't be generating uninterruptible loops!

Option 1: Don't generate non-allocating loops. In all functions classed as loop-breakers by the occurrence analyser (occurAnal), generate heap checks even if there are no bytes involved.
Pro: Very simple.
Con: Neil won't be too happy ;) Adds heap checks to pure-arithmetic loops.

Option 2: Preempt non-allocating loops. Store somewhere the addresses of a sufficient subset of safe points. Note that the addresses of all info tables are a sufficient subset. On a timer interrupt, single-step the code until a safe point is reached.
Pro: Low overhead, and it would provide most of the infrastructure I'll need if I ever get around to trying to implement hardware heap checks in GHC.
Con: More complicated, especially if single-stepping turns out to be such high overhead that we wind up needing to implement some other scheme, like linking the bochs core.

Stefan
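
[To make the trade-off of Option 1 concrete, here is a hypothetical pure-arithmetic loop of the kind it would tax (not code from the thread): with optimisations on, GHC typically compiles this into a tight, non-allocating loop, so today it can spin without ever reaching a preemption point, and under Option 1 it would gain a heap check per iteration.

    -- A strict counting loop: with optimisation this usually becomes a
    -- non-allocating loop over unboxed Ints, i.e. exactly the kind of
    -- code that currently never reaches a preemption point.
    count :: Int -> Int -> Int
    count acc 0 = acc
    count acc n = count (acc + 1) (n - 1)

    main :: IO ()
    main = print (count 0 100000000)
]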

Stefan is right here.

- It's not surprising that with -Onot you get different code from different source programs, even if one can readily be transformed into the other. That's what -O does.

| If anything I think it's a bug in the code generator/the runtime. We
| shouldn't be generating uninterruptible loops!
|
| Option 1: Don't generate non-allocating loops.
|
| Option 2: Preempt non-allocating loops.

I would urge caution on (2). It's an absolute swamp, especially when you want to be portable across platforms, into which many have ventured but few have re-emerged. Those that have usually do so by adopting some variation on Option 1.

However, you can do (1) without allocating: you just need to insert a test into the loop that checks the "please yield" flag, which is set by the interrupt. The key thing is that the thread yields voluntarily and tidily rather than being forcibly pre-empted.

Simon
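
[At the source level, the same idea can be imitated by hand today (a sketch of the principle, not of what the code generator would emit): a tight loop that offers the scheduler a voluntary yield point on each trip, using Control.Concurrent.yield, instead of relying on an allocation to provide one.

    import Control.Concurrent (yield)

    -- A manually cooperative loop: each iteration yields tidily, so other
    -- threads can run even though the loop itself allocates next to nothing.
    spin :: Int -> IO ()
    spin 0 = return ()
    spin n = do
      yield           -- voluntary yield point, in the spirit of Simon's flag test
      spin (n - 1)
]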

On 8/3/07, Simon Peyton-Jones wrote:
Stefan is right here.
- It's not surprising that with -Onot you get different code from different source programs, even if one can readily be transformed into the other. That's what -O does.
Yes, but I found it surprising that just removing a type signature should result in markedly different code. Are there other known situations where that can happen?

Cheers,
Tim

--
Tim Chevalier * chevalier@alum.wellesley.edu * Often in error, never in doubt

Tim Chevalier wrote:
On 8/3/07, Simon Peyton-Jones wrote:
Stefan is right here.
- It's not surprising that with -Onot you get different code from different source programs, even if one can readily be transformed into the other. That's what -O does.
Yes, but I found it surprising that just removing a type signature should result in markedly different code. Are there other known situations where that can happen?
It is not _just_ removing a type signature, it is also changing the type from `Bool` to `forall a. a`. An explicit type signature of the latter would have produced the same results as no type signature, I believe.

The surprise is that an unconstrained type-variable being variable rather than instantiated to an arbitrary type, makes any difference (since it doesn't, normally, at runtime). I would guess the programs `Bool` and `a` are the same once optimizations are turned on? Maybe GHC could avoid the creation of type-lambdas that are unused (in some sense)... with -Onot... I'm dubious about that.

Inserting a preemption test in non-allocating loops seems like a good idea to me (I hate the invisible threat that my program might not thread as threading should work)... any idea how bad the performance impact could be (I guess the test could be specified to branch-predict that the loop wouldn't be interrupted), and whether there could be a pragma to disable that test in certain loops? Is -threaded versus not, relevant here?

Isaac
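
[For concreteness, the three variants Isaac is distinguishing might be written as follows (a sketch; the names f1/f2/f3 are ours, and only the first two correspond to the versions in Dan's example):

    f1 = f1          -- no signature: inferred type is forall a. a

    f2 :: Bool       -- the monomorphic signature from the example; at -Onot
    f2 = f2          -- this is the version that allocates and so can be preempted

    f3 :: a          -- implicitly quantified, i.e. forall a. a; Isaac expects
    f3 = f3          -- this to behave the same as f1
]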

On 8/3/07, Isaac Dupree wrote:
The surprise is that an unconstrained type-variable being variable rather than instantiated to an arbitrary type, makes any difference (since it doesn't, normally, at runtime).
Right... it's surprising because types aren't supposed to "matter" at runtime.
I would guess the programs `Bool` and `a` are the same once optimizations are turned on? Maybe GHC could avoid the creation of type-lambdas that are unused (in some sense)... with -Onot... I'm dubious about that.
Right, both programs result in the same runtime behavior if optimizations are turned on. (Which, of course, is another surprising thing: the program with the type signature omitted loops if compiled with -Onot and terminates if compiled with -O.)

Cheers,
Tim

--
Tim Chevalier * chevalier@alum.wellesley.edu * Often in error, never in doubt

On Fri, Aug 03, 2007 at 09:01:18PM -0300, Isaac Dupree wrote:
Tim Chevalier wrote:
On 8/3/07, Simon Peyton-Jones wrote:
Stefan is right here.
- It's not surprising that with -Onot you get different code from different source programs, even if one can readily be transformed into the other. That's what -O does.
Yes, but I found it surprising that just removing a type signature should result in markedly different code. Are there other known situations where that can happen?
It is not _just_ removing a type signature, it is also changing the type from `Bool` to `forall a. a`. An explicit type signature of the latter would have produced the same results as no type signature, I believe.

The surprise is that an unconstrained type-variable being variable rather than instantiated to an arbitrary type, makes any difference (since it doesn't, normally, at runtime). I would guess the programs `Bool` and `a` are the same once optimizations are turned on? Maybe GHC could avoid the creation of type-lambdas that are unused (in some sense)... with -Onot... I'm dubious about that.
It *does not change the designed semantics at all*. It *tickles a bug*. In the *absence of code generator bugs* it generates *semantically equivalent code*. Do you want inserting or removing type signatures to always tickle the same bugs? Seems like an awfully constraining design choice to me.
Inserting a preemption test in non-allocating loops seems like a good idea to me (I hate the invisible threat that my program might not thread as threading should work)... any idea how bad the performance impact could be (I guess the test could be specified to branch-predict that the loop wouldn't be interrupted), and whether there could be a pragma to disable that test in certain loops? Is -threaded versus not, relevant here?

Hello Isaac, Saturday, August 4, 2007, 4:01:18 AM, you wrote:
Inserting a preemption test in non-allocating loops seems like a good idea to me (I hate the invisible threat that my program might not thread as threading should work)... any idea how bad the performance impact could be
With loop unrolling it can be made smaller than any predefined value.

--
Best regards,
Bulat                          mailto:Bulat.Ziganshin@gmail.com
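
[A rough source-level rendering of the unrolling argument (hypothetical; the real change would live in the code generator, not in Haskell source): do the work in blocks and test for preemption once per block, so the per-iteration cost of the check shrinks as the block size (the unroll factor) grows.

    import Control.Concurrent (yield)
    import Control.Monad (replicateM_)

    blockSize :: Int
    blockSize = 64

    -- One preemption check per block of iterations: the amortised cost of
    -- the check is roughly 1/blockSize of a per-iteration check.
    loopWithYield :: Int -> IO ()
    loopWithYield 0 = return ()
    loopWithYield n = do
      let step = min blockSize n
      replicateM_ step (return ())   -- stands in for the unrolled loop body
      yield                          -- single yield point per block
      loopWithYield (n - step)
]
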
participants (5)
- Bulat Ziganshin
- Isaac Dupree
- Simon Peyton-Jones
- Stefan O'Rear
- Tim Chevalier