The curious case of #367: Infinite loops can hang Concurrent Haskell

Hi there!

While working on an NCG, I eventually came across #367[0], which is about GHC producing code that looks similar to this:

```
label:
  [non-branch-instructions]*
  branch-instruction label
```

so essentially an uninterruptible loop. The solution for GHC to produce code that can be interrupted is to pass -fno-omit-yields.

So far so good. Out of curiosity, I added a small piece of code to my NCG to detect this and complain if code like the above was generated[1].

Three weeks ago, I kind of maneuvered myself into a memory blow-up corner, and then life happened, but this weekend I managed to find some time to revert some of the memory blow-up and continue working on the NCG. Turns out I can build a stage2 "quick" flavour of the NCG without dynamic support just fine. I never saw the deadlock detection code fire.

Now I did leave the test suite running last night, and when looking through the results, there were quite a few failures. Curiously, a lot of them were due to ghc missing dynamic support (doh!). But quite a few also failed due to the deadlock detection:

T12485, hs_try_putmvar003, ds-wildcard, ds001, read029, T2817, tc011, tc021, T4524

So, my question then is this: are we fine with ghc generating this code? Or, if we are not, do we want to figure out if we can eliminate it? Issue 367 goes into quite a bit of detail about why this is tricky to handle generally.

Or should we add -fno-omit-yields to the test cases? The ultimate option is to just turn off the detection, and I'm fine with doing so. However, I'd rather ask whether anyone sees value in detecting this.

Cheers,
Moritz

--
[0]: https://gitlab.haskell.org/ghc/ghc/-/issues/367
[1]: https://gitlab.haskell.org/ghc/ghc/-/blob/46fba2c91e1c4d23d46fa2d9b18dcd000c...
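The pattern described in this message (a label, zero or more non-branch instructions, then a branch back to the same label) can be sketched as a small check over basic blocks. This is only an illustration under stated assumptions: the `Instr` and `Block` types and the names `isTightLoop` and `tightLoops` are hypothetical and are not GHC's actual NCG data types, nor the code behind link [1].

```haskell
-- Toy instruction set. These constructors are hypothetical and are not
-- GHC's NCG instruction types.
data Instr
  = Mov String String   -- register move: mov dst, src
  | Branch String       -- unconditional branch to a label
  | CondBranch String   -- conditional branch to a label
  | Ret                 -- return from the current procedure
  deriving (Eq, Show)

-- A basic block: a label followed by its instructions.
data Block = Block { blockLabel :: String, blockInstrs :: [Instr] }
  deriving (Eq, Show)

-- True for instructions that may transfer control elsewhere.
isControlFlow :: Instr -> Bool
isControlFlow (Mov _ _) = False
isControlFlow _         = True

-- A block is a tight loop if it is zero or more non-control-flow
-- instructions followed by exactly one unconditional branch to its own label.
isTightLoop :: Block -> Bool
isTightLoop (Block lbl instrs) =
  case span (not . isControlFlow) instrs of
    (_, [Branch target]) -> target == lbl
    _                    -> False

-- Labels of all blocks that match the pattern.
tightLoops :: [Block] -> [String]
tightLoops = map blockLabel . filter isTightLoop

-- Two self-looping blocks and one ordinary block, for experimentation.
exampleBlocks :: [Block]
exampleBlocks =
  [ Block "_cCO" [Branch "_cCO"]
  , Block "_czf" [Mov "x17" "x18", Branch "_czf"]
  , Block "_ok"  [Mov "x0" "x1", Ret]
  ]
```

Loading this in GHCi and evaluating `tightLoops exampleBlocks` reports the two self-looping labels. A check of this shape is purely syntactic, so it only sees loops confined to a single block.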

Moritz
I'm not getting this.
| So, my question then is this: are we fine with ghc generating this
| code? Or, if we are not, do we want to figure out if we can eliminate
| it?
What exactly is "this code" and "it"?
You could be asking
* Should we switch off -fomit-yields by default?
* Should we implement -fno-omit-yields in a cleverer way that generates less code?
Or you could be asking something else again.
Your deadlock-detection patch (which is presumably not in GHC) is very special-case: it detects some infinite loops, but only some. I'm not sure what role it plays in your thinking.
Simon

Hi Simon,
sure, I could have been a bit clearer:
Code we currently generate is:
```
_cCO:
bl _cCO
```
or
```
_czf:
mov x17, x18
bl _czf
```
and the question then becomes, do we want to investigate if we can
a) detect this is dead code
b) remove it in Cmm or higher, or flat out prevent it from being generated.
c) we don't care about producing this code, and hope the linker will
eliminate it.
Cheers,
Moritz

| and the question then becomes, do we want to investigate if we can
| a) detect this is dead code
| b) remove it in Cmm or higher, or flat out prevent it from being
| generated.
| c) we don't care about producing this code, and hope the linker will
| eliminate it.
I'm still puzzled. Why do you think _cCO is dead? What alternative code are you thinking we might generate?
S

On Mon, 17 Aug 2020 at 8:14 PM, Simon Peyton Jones wrote:
> | and the question then becomes, do we want to investigate if we can
> | a) detect this is dead code
> | b) remove it in Cmm or higher, or flat out prevent it from being
> | generated.
> | c) we don't care about producing this code, and hope the linker will
> | eliminate it.
> I'm still puzzled. Why do you think _cCO is dead? What alternative code are you thinking we might generate?

My question is earlier: why do we generate code that we will never get out of again? The generated code is effectively: while(true);.

This code does not have to be dead, and there may very well be reasons why we want to generate an infinite loop that can only be terminated from the outside. Maybe it’s just my naive expectation that the user more likely did not want to generate this code.

Once _cCO is entered, there is no way out for the application.

Cheers,
Moritz

| My question is earlier: why do we generate code that we will never get out of again?
Ah, well, if you say, for example
    f x = f x
then it seems reasonable to generate an infinite loop. I don’t know if that’s what’s happening here, but it seems reasonable in principle.
I’m unsure about what you are proposing to change.
Simon
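For readers who want to experiment with the flag mentioned earlier in the thread, here is a minimal sketch of a module that opts into -fno-omit-yields and runs a tight loop under a timeout. The names `countdown` and `runWithTimeout` are made up for illustration. The loop is deliberately bounded so that the sketch always terminates; swapping in `f x = f x` gives the truly infinite variant, which is only expected to be interruptible when yield points are kept.

```haskell
{-# OPTIONS_GHC -fno-omit-yields #-}
-- The pragma asks GHC to keep yield points (heap checks) in code that
-- does not allocate. It applies to this module only.

import Control.Exception (evaluate)
import System.Timeout (timeout)

-- A tight loop that typically does not allocate once optimised. It is
-- bounded so the example terminates for any input.
countdown :: Int -> Int
countdown n
  | n <= 0    = 0
  | otherwise = countdown (n - 1)

-- Run the loop with a time limit given in microseconds. The result is
-- Nothing if the limit expires before the loop finishes.
runWithTimeout :: Int -> Int -> IO (Maybe Int)
runWithTimeout micros n = timeout micros (evaluate (countdown n))
```

`timeout` works by throwing an asynchronous exception at the running thread, so it can only cut a loop short if that thread reaches a point where the runtime can interrupt it. That is the reason the flag matters for loops like `f x = f x`.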

I'll investigate why we end up generating the loops, and will report
back if I find anything that looks awfully off. I don't dispute that
there might be legitimate reasons to generate code like this. From a
user perspective, however, I'd be grateful if the compiler warned
me about this. Maybe it was my intention, but maybe it wasn't? Of
course, as this might only catch a subset of potential infinite loops,
it's not a comprehensive check.
I'll report back if I find anything in the tests that looks off.
Otherwise assume the tests do indeed intend to generate infinite
loops.
Cheers,
Moritz
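To make the "only a subset" caveat concrete, here is a toy source-level check that flags definitions of the exact shape `f x = f x`. Everything in it is hypothetical: the `Expr` and `Def` types and `isTrivialSelfLoop` are invented for this sketch and have nothing to do with GHC's Core, Cmm, or any existing warning.

```haskell
-- Toy expression language, invented for this sketch.
data Expr
  = Var String        -- variable reference
  | Lit Int           -- integer literal
  | App String [Expr] -- saturated call of a named function
  deriving (Eq, Show)

-- A definition: function name, parameter names, body.
data Def = Def String [String] Expr
  deriving (Eq, Show)

-- Flags `f x1 .. xn = f x1 .. xn`: the body is a direct self-call that
-- passes every parameter through unchanged.
isTrivialSelfLoop :: Def -> Bool
isTrivialSelfLoop (Def name params body) =
  body == App name (map Var params)

-- f x = f x
loopDef :: Def
loopDef = Def "f" ["x"] (App "f" [Var "x"])

-- g x = g (plus x 1): also never returns, but has a different shape.
missedDef :: Def
missedDef = Def "g" ["x"] (App "g" [App "plus" [Var "x", Lit 1]])
```

The check accepts `loopDef` and rejects `missedDef`, even though `g` never returns either. Any purely syntactic warning of this kind will miss such cases.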
participants (2)
- Moritz Angermann
- Simon Peyton Jones