[GHC] #16065: Stack squeezing during context switches makes for non-determinism in allocations

#16065: Stack squeezing during context switches makes for non-determinism in allocations
-------------------------------------+-------------------------------------
           Reporter:  sgraf          |             Owner:  (none)
               Type:  task           |            Status:  new
           Priority:  normal         |         Milestone:  ⊥
          Component:  Runtime System |           Version:  8.6.3
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |         Test Case:
         Blocked By:                 |          Blocking:
    Related Tickets:  #4450, #8861   |  Differential Rev(s):
          Wiki Page:                 |
-------------------------------------+-------------------------------------
As #4450 and #8611 show, stack squeezing in the RTS makes allocation numbers between two different runs of the same binary non-deterministic, because its effect is depending on when context switches are bound to happen.

A short-term solution might be to deactivate stack squeezing for vulnerable benchmarks with `+RTS -Z` like in Phab:D5460, but IMO a more elegant solution would be to only deactivate stack squeezing in `threadPause` calls that happen due to context switches. Would you agree?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by simonmar):

Good catch, but I don't fully understand how stack squeezing affects allocation. As far as I'm aware it should just reduce the stack size, not increase it, so it shouldn't require any extra allocation.

I'm OK with putting in some fix for this to make the benchmarks stable, but I'd like to completely understand the cause first.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:1

Comment (by simonmar):

Ah, I just realised what causes the allocation: if we *don't* do stack squeezing, then the larger stack size means that we might overflow the stack in the future, which will entail allocating for the new stack chunk.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:2

Comment (by osa1):

> because its effect is depending on when context switches are bound to happen

Any ideas why -V0 doesn't fix this? It seems to have fixed the same problem in #4450.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:3

Comment (by sgraf):

Replying to [comment:3 osa1]:
> > because its effect is depending on when context switches are bound to happen
>
> Any ideas why -V0 doesn't fix this? It seems to have fixed the same problem in #4450.

Yes, it fixes that, too (https://ghc.haskell.org/trac/ghc/ticket/8611#comment:11). But it isn't enabled by default (and nor should it be).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:4

Comment (by osa1):

I'm trying to understand; if both `-V0` and `-Z` fix the non-deterministic allocations in some tests, why not enable `-V0` instead of `-Z` in those tests? I don't prefer one over the other, I'm just trying to understand.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:5

Comment (by sgraf):

No reason, I just thought that `-Z` was less invasive. But I don't really know the implications of `-V0`, so YMMV...

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:6

Comment (by simonmar):

To eliminate non-determinism, `-V0` by itself is preferable. Stack squeezing by itself is perfectly deterministic; the non-determinism arises when it is done at non-deterministic times, which is a result of time-based context switches.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:7

Old description:

> As #4450 and #8611 show, stack squeezing in the RTS makes allocation numbers between two different runs of the same binary non-deterministic, because its effect is depending on when context switches are bound to happen.
>
> A short-term solution might be to deactivate stack squeezing for vulnerable benchmarks with `+RTS -Z` like in Phab:D5460, but IMO a more elegant solution would be to only deactivate stack squeezing in `threadPause` calls that happen due to context switches. Would you agree?

New description:

As #4450 and #8611 show, stack squeezing in the RTS makes allocation numbers between two different runs of the same binary non-deterministic, because its effect is depending on when context switches are bound to happen.

A short-term solution might be to deactivate stack squeezing for vulnerable benchmarks with ~~`+RTS -Z`~~ `+RTS -V0` like in Phab:D5460, but IMO a more elegant solution would be to only deactivate stack squeezing in `threadPause` calls that happen due to context switches. Would you agree?

--

Comment (by sgraf):

Updated the patch for #8611 to reflect this.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:8

Comment (by simonmar):

I think we want to leave stack squeezing as it is. How would you make it deterministic? Just saying "only squeeze if we're about to switch to a different thread" would just move the problem to concurrent programs.

`+RTS -V0` is a solution that works for both single-threaded and concurrent programs, because it forces the context switches to happen at deterministic times.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:9

Changes (by sgraf):

 * status:  new => closed
 * resolution:  => invalid

Comment:

I agree. Let's fix nofib by doing `+RTS -V0` for all single-threaded benchmarks!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:10

#16065: Don't do stack squeezing during context switches in single-threaded programs to guarantee determinism in allocations

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16065#comment:11
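
For readers who want to check the conclusion of this thread on their own programs, the following is a minimal sketch, not something taken from the ticket or from nofib. It reads the RTS allocation counter around a workload so that the figure can be compared across repeated runs with and without `+RTS -V0`. The `workload` function and the file name `Alloc.hs` are hypothetical placeholders; `GHC.Stats` only reports data when the program is started with `+RTS -T`, and whether a given workload is actually sensitive to stack squeezing depends on its stack behaviour.

```haskell
-- Alloc.hs (hypothetical file name)
-- Build:  ghc -O2 -rtsopts Alloc.hs
-- Run:    ./Alloc +RTS -T        (default timer-driven context switches)
--         ./Alloc +RTS -T -V0    (RTS clock disabled)
module Main (main, workload) where

import Control.Monad (unless)
import Data.List (foldl')
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Exit (die)
import System.Mem (performGC)

-- Hypothetical stand-in for a benchmark; any allocating computation would do.
workload :: Int -> Int
workload n = foldl' (+) 0 (map (* 2) [1 .. n])

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  unless enabled $ die "RTS statistics are disabled; run with +RTS -T"
  -- Force a collection first so the counters are up to date when read.
  performGC
  before <- allocated_bytes <$> getRTSStats
  print (workload 1000000)
  performGC
  after <- allocated_bytes <$> getRTSStats
  putStrLn ("bytes allocated by workload: " ++ show (after - before))
```

Running the binary several times in each configuration and comparing the reported byte counts is one way to see whether the timer is a source of variation for that particular program.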