Diagnosing excessive memory usage / crash when compiling - 9.8.1

Hi! I'm trying to upgrade our (large) codebase to use 9.8.1. (I'm on an M2.) When building with -O1, memory on the GHC process climbs until it reaches the limit of my machine (64G) and then crashes with a segfault. With -O0, that does not happen.
How would I go about diagnosing what's happening? Using RTS flags to limit the heap to 32G produced the same behavior, just faster.
Strangely, `-v5` does not produce any more output in the console (passed via cabal's `--ghc-options`). Maybe I'm doing it wrong?
Pointers to existing issues or documentation welcome! Thank you!
Justin
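P.S. For concreteness: by "RTS flags" I mean options for GHC's own runtime, passed along the lines of (capping GHC's heap at 32G):
```
cabal build --ghc-options="+RTS -M32G -RTS"
```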

Hi Justin,
From your description, it sounds to me like there's something in your source code that's causing the optimiser to generate too much code, which then causes the crash because of memory exhaustion (though I might be wrong about this).
In the past, when I've run into similar things, I've followed the following rough process to help find a minimal reproducer of the issue.
- Pass `-ddump-simpl -ddump-timings -ddump-to-file` to GHC (see the docs on these flags: https://downloads.haskell.org/ghc/latest/docs/users_guide/debugging.html). These will write some extra debugging information to either your `dist-newstyle` or `.stack-work` directory, depending on whether you use cabal or stack. For each source file they will create a `.dump-simpl` file that gives you the compiler's intermediate output, and a `.dump-timings` file that shows how long each phase of compilation took. (There's a sketch of the cabal invocation after this list.)
- The first step is to home in on the problematic module or modules. Maybe you already have a good idea of where in your build the compiler crashes. If not, you can use the `.dump-timings` files and/or a tool that summarises them, like https://github.com/codedownio/time-ghc-modules, to get a sense of where the problem lies.
- Once you've found your worst module, the next step is to determine what about that module is causing the issue. I find that often you can just look for which top-level identifiers in your `.dump-simpl` file are big. That gives a good idea of which part of your source code is to blame. Then I tend to delete everything that is irrelevant and check again. Incrementally you get something smaller and smaller, and in time you end up with something small enough to write up as a ticket.
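As a concrete sketch of the first step, the cabal invocation would be something along these lines (with stack, pass the same flags through its `--ghc-options`):
```
cabal build --ghc-options="-ddump-simpl -ddump-timings -ddump-to-file"
```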
I hope that helps. I've found this process to work quite well for hunting down issues where GHC's optimiser goes wrong, though it is a bit labour-intensive.
One last thing: you mention that you are on an M2. If it's easily doable for you, try to reproduce on x86_64, just to make sure it's not some bug specific to M2.
Cheers,
Teo

I did notice this in CI (which runs on x86_64 Linux machines), so at least it is not limited to M2.
Great tips! Much appreciated!

Using `-dshow-passes` is very helpful too. It shows the program size after
each pass of the compiler.
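For instance, added to the same `--ghc-options` as the dump flags above (a sketch, assuming cabal):
```
cabal build --ghc-options="-dshow-passes"
```
The per-pass size report is printed to stderr as compilation proceeds, so you can watch which pass blows up.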
Simon
On Thu, 15 Feb 2024 at 19:36, Teofil Camarasu
Hi Justin,
From your description, it sounds to me like there's something in your source code that's causing the optimiser to generate too much code, which then causes the crash because of memory exhaustion (though I might be wrong about this). In the past, when I've run into similar things. I've followed the following vague process to help find a minimal reproducer of the issue.
- pass `-ddump-simpl -ddump-timings -ddump-to-file` to GHC. (See here for docs on these flags: https://downloads.haskell.org/ghc/latest/docs/users_guide/debugging.html) These will write some extra debugging information to either your `dist-newstyle` or `.stack-work` directory depending on whether you use cabal or stack. They will create for each source file a `.dump-simpl` file that will give you the compiler's intermediate output. And a `.dump-timings` file that will show timings information about how long each phase of compilation took.
- The first step is to hone down on the problematic module or modules. Maybe you already have a good idea from where in your build the compiler crashes. But if not, you can use the `.dump-timings` files and/or a tool that summarises them like https://github.com/codedownio/time-ghc-modules. To get a sense of where the problem lies.
- Once you've found your worst module, the next step is to determine what about that module is causing the issue. I find that often you can just try to find what top level identifiers in your `.dump-simpl` file are big. This will give a good idea of which part of your source code is to blame. Then I tend to try to delete everything that is irrelevant, and check again. Incrementally you get something that is smaller and smaller, and in time you tend to end up with something that is small enough to write up as a ticket.
I hope that helps. I've found this process to work quite well for hunting down issues where GHC's optimiser goes wrong, but it is a bit of a labour intensive process.
One last thing. You mention that you are on M2. If it's easily doable for you, try to reproduce on x86_64 just to make sure it's not some bug specific to M2.
Cheers, Teo
On Thu, Feb 15, 2024 at 7:08 PM Justin Bailey
wrote: Hi!
I'm trying to upgrade our (large) codebase to use 9.8.1. (I'm on an M2).
When building with -01, memory on the GHC process climbs until it reaches the limit of my machine (64G) and then crashes with a segfault.
With -00, that does not happen.
How would I go about diagnosing what's happening? Using RTS flags to limit the heap to 32G produced the same behavior, just faster.
Strangely, `-v5` does not produce any more output in the console (passed via cabal's --ghc-options). Maybe I'm doing it wrong?
Pointers to existing issues or documentation welcome! Thank you!
Justin _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Well, after running with these flags, one of the `.dump-simpl` files
is 26 GB! That's also the module it seems to hang on, so pretty sure
something is going wrong there!
I was seeing output indicating GHC had allocated 146GB during some of
the passes!
```
*** Simplifier [xxx.AirTrafficControl.Types.ATCMessage]:
Result size of Simplifier iteration=1
= {terms: 9,134,
types: 49,937,
coercions: 388,802,399,
joins: 53/289}
Result size of Simplifier iteration=2
= {terms: 8,368,
types: 46,864,
coercions: 176,356,474,
joins: 25/200}
Result size of Simplifier
= {terms: 8,363,
types: 46,848,
coercions: 176,356,474,
joins: 25/200}
!!! Simplifier [xxx.AirTrafficControl.Types.ATCMessage]: finished in
294595.62 milliseconds, allocated 146497.087 megabytes
```
So anyways I'll continue whittling this down. This module does use a
lot of higher-kinded types and fancy stuff.

The high coercion sizes here suggest that this is some variant of #8095. Having another minimal reproducer is always useful.
Cheers,
- Ben

Sorry about that!
Maybe you have a giant data type with deriving(Generic)? GHC tends to
behave badly on those. And yes, you seem to have a lot of type-family
stuff going on! Usually we see 10k coercion sizes; you have 400k.
Quite a lot of improvements have happened in this area, which may (or may
not) help. Once you have whittled a bit, perhaps it'd be possible to test
with HEAD?
This was better with ... 9.6? 9.4?
Simon

I've narrowed this down to a pretty small example, and can show that,
as the number of fields in my data type increases, compilation takes
longer and longer (seems exponential).
For example, on my M2, with GHC 9.8, I get these timings & peak memory usage:
* 1 field - 2.5s, 198MB peak
* 2 fields - 7s, 1.0GB peak
* 3 fields - 26.8s, 4.6GB peak
* 4 fields - 82.9s, 14.5GB peak
For GHC 9.6, those stay pretty much flat up to 10 fields. I didn't
test past 4 with GHC 9.8.
The project does use `UndecidableInstances`, which worries me.
In any case, I reported a bug at
https://gitlab.haskell.org/ghc/ghc/-/issues/24462. Thanks for the help
narrowing the problem down!

Undecidable instances should never be the issue.
It looks like the example code you shared is using generic deriving, and then using some generic-deriving code gen on the barbies package classes FunctorB and ConstraintsB.
So it seems like, given that -O0 doesn't trigger the problem, some of the generic deriving code in barbies trips up GHC's optimizer.
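For anyone skimming the thread, here is a minimal sketch of the shape of code under discussion, assuming the barbies-style pattern described above (module and field names are hypothetical; the real reproducer is on #24462):
```haskell
{-# LANGUAGE DeriveAnyClass     #-}
{-# LANGUAGE DeriveGeneric      #-}
{-# LANGUAGE DerivingStrategies #-}

module ATCMessage where

import Barbies (ConstraintsB, FunctorB)
import GHC.Generics (Generic)

-- A higher-kinded record: each field is wrapped in a type
-- constructor `f`, and the barbies classes are derived generically
-- via Generic. Per the report above, each extra field made -O1
-- compile time and memory grow sharply on 9.8.1.
data Message f = Message
  { field1 :: f Int
  , field2 :: f Bool
  , field3 :: f Double
  , field4 :: f String
  }
  deriving stock (Generic)
  deriving anyclass (FunctorB, ConstraintsB)
```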
participants (5)
- Ben Gamari
- Carter Schonwald
- Justin Bailey
- Simon Peyton Jones
- Teofil Camarasu