ghc-prof-options and libraries on Hackage

Hi all,

I'd like to start a conversation around the ghc-prof-options option that can be used in Cabal-ized libraries. As you may know, this option specifies extra GHC options that are used when --enable-library-profiling is enabled. This is very convenient for local development, but I argue that it can be counter-productive for releases onto Hackage.

To provide an example, I'm currently working on a little game engine that uses JuicyPixels to load images. I have a problem in my code that needs optimizing, but the current state of things results in profiles that are very difficult to work with. JuicyPixels specifies -auto-all in its cabal file, which means I have no alternative but to profile JuicyPixels code. In this scenario, the bottleneck is actually within my FRP game loop and nothing to do with image loading! As a result, the profiles are fairly useless to me.

Roman Cheplyaka also points out that by doing this, profiles are skewed - -auto-all is not free, so we actually pay in runtime performance every time a library does this.

I would like to hear from others if we should consider managing this option a little more. My personal feeling is that this flag shouldn't be used unless in local development, so guarded by an off-by-default build flag. I think cabal should also warn authors who are using this flag, and encourage them to place a guard around this option.

Thoughts?

- ocharles
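
For concreteness, the kind of guard proposed here might look roughly like the following sketch; the package, module, and flag names are placeholders, and -fprof-auto is the newer spelling of -auto-all:

    name:             my-library
    version:          0.1.0.0
    build-type:       Simple
    cabal-version:    >=1.10

    -- Off by default, so Hackage builds of dependants get clean profiles;
    -- developers opt in with: cabal configure -fdev-prof --enable-library-profiling
    flag dev-prof
      description:    Add automatic cost centres for local profiling
      default:        False
      manual:         True

    library
      exposed-modules:  MyLibrary
      build-depends:    base <5
      default-language: Haskell2010
      if flag(dev-prof)
        ghc-prof-options: -fprof-auto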

Agreed. Cabal should warn.
On Sep 21, 2014 8:13 PM, "Oliver Charles" wrote:
Hi all,
I'd like to start a conversation around the ghc-prof-options option that can be used in Cabal-ized libraries. As you may know, this option specifies extra GHC options that are used when --enable-library-profiling is enabled. This is very convenient for local development, but I argue that it can be counter-productive for releases onto Hackage.
To provide an example, I'm currently working on a little game engine that uses JuicyPixels to load images. I have a problem in my code that needs optimizing, but the current state of things results in profiles that are very difficult to work with. JuicyPixels specifies -auto-all in its cabal file, which means I have no alternative but to profile JuicyPixels code. In this scenario, the bottleneck is actually within my FRP game loop and nothing to do with image loading! As a result, the profiles are fairly useless to me.
Roman Cheplyaka also points out that by doing this, profiles are skewed - -auto-all is not free, so we actually pay in runtime performance every time a library does this.
I would like to hear from others if we should consider managing this option a little more. My personal feeling is that this flag shouldn't be used unless in local development, so guarded by an off-by-default build flag. I think cabal should also warn authors who are using this flag, and encourage them to place a guard around this option.
Thoughts? - ocharles

To provide an example, I'm currently working on a little game engine that uses JuicyPixels to load images. I have a problem in my code that needs optimizing, but the current state of things results in profiles that are very difficult to work with. JuicyPixels specifies -auto-all in its cabal file, which means I have no alternative but to profile JuicyPixels code. In this scenario, the bottleneck is actually within my FRP game loop and nothing to do with image loading! As a result, the profiles are fairly useless to me.
While in this case the extra profiling data may be useless, it seems to me that in general you won't know a priori. I would prefer to have as much data available as possible, and then filter it.

HPC works sort of like this: it gathers coverage data for all of the modules compiled with -fhpc, but has command-line options to only report coverage for certain modules. Perhaps GHC's RTS could add a similar flag to only report profiling data from certain modules or functions?
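
For reference, the HPC workflow being compared to looks roughly like this; the program and module names are only placeholders:

    $ ghc -fhpc -O2 Main.hs -o game      # every module compiled with -fhpc is instrumented
    $ ./game                             # the run writes game.tix
    $ hpc report game.tix --include=Main               # report coverage for our module only
    $ hpc report game.tix --exclude=Codec.Picture.Png  # or filter out a dependency's module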

On 22 September 2014 06:52, Eric Seidel wrote:
To provide an example, I'm currently working on a little game engine that uses JuicyPixels to load images. I have a problem in my code that needs optimizing, but the current state of things results in profiles that are very difficult to work with. JuicyPixels specifies -auto-all in its cabal file, which means I have no alternative but to profile JuicyPixels code. In this scenario, the bottleneck is actually within my FRP game loop and nothing to do with image loading! As a result, the profiles are fairly useless to me.
While in this case the extra profiling data may be useless, it seems to me that in general you won't know a priori. I would prefer to have as much data available as possible, and then filter it.
Agreed: I've often had to do a "cabal unpack <foo>", edit the .cabal file to add the profiling options and re-install that package (which is admittedly easier/nicer now with sandboxes).

--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
http://IvanMiljenovic.wordpress.com
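
Roughly, that sandbox workflow looks like this; the package name is only an example:

    $ cabal sandbox init
    $ cabal unpack JuicyPixels
    $ # edit JuicyPixels-*/JuicyPixels.cabal by hand, e.g. drop or guard ghc-prof-options
    $ cabal sandbox add-source JuicyPixels-*/
    $ cabal install --enable-library-profiling JuicyPixels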

On 22/09/14 01:58, Ivan Lazar Miljenovic wrote:
On 22 September 2014 06:52, Eric Seidel wrote:
To provide an example, I'm currently working on a little game engine that uses JuicyPixels to load images. I have a problem in my code that needs optimizing, but the current state of things results in profiles that are very difficult to work with. JuicyPixels specifies -auto-all in its cabal file, which means I have no alternative but to profile JuicyPixels code. In this scenario, the bottleneck is actually within my FRP game loop and nothing to do with image loading! As a result, the profiles are fairly useless to me.
While in this case the extra profiling data may be useless, it seems to me that in general you won't know a priori. I would prefer to have as much data available as possible, and then filter it.
Agreed: I've often had to do a "cabal unpack <foo>", edit the .cabal file to add the profiling options and re-install that package (which is admittedly easier/nicer now with sandboxes).
Why would you need to edit the cabal file? Are you aware of --ghc-options?

Roman
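
For example, something along these lines; note that, as far as I know, --ghc-options is applied to every package cabal builds in that invocation, not just the one named:

    $ cabal install --enable-library-profiling --ghc-options="-fprof-auto" JuicyPixels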

On Mon, Sep 22, 2014 at 4:52 AM, Eric Seidel wrote:
To provide an example, I'm currently working on a little game engine that uses JuicyPixels to load images. I have a problem in my code that needs optimizing, but the current state of things results in profiles that are very difficult to work with. JuicyPixels specifies -auto-all in its cabal file, which means I have no alternative but to profile JuicyPixels code. In this scenario, the bottleneck is actually within my FRP game loop and nothing to do with image loading! As a result, the profiles are fairly useless to me.
While in this case the extra profiling data may be useless, it seems to me that in general you won't know a priori. I would prefer to have as much data available as possible, and then filter it.
HPC works sort of like this: it gathers coverage data for all of the modules compiled with -fhpc, but has command-line options to only report coverage for certain modules. Perhaps GHC's RTS could add a similar flag to only report profiling data from certain modules or functions?
I am sympathetic to this concern, but the main problem here is that -fprof-auto isn't free. It can interfere quite heavily with optimization passes, meaning that it ends up attributing much higher costs to certain functions than they would have in a normal compilation path, actively misleading users. I for one would have no confidence that profiles were pointing to the correct place if everything were built with -fprof-auto from the start.

It's much better to start with just a few high-level cost-centers and drill down from there. This can be a tedious process, but it's much more reliable. As an alternative, compiling libraries with -fprof-auto-exported is fairly reasonable IMHO.

John L.
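
A minimal sketch of the hand-placed cost-centre approach described above; the functions are invented for illustration, and only the two annotated expressions become cost centres, leaving the optimiser untouched elsewhere:

    module Main where

    -- Compile and run with profiling, e.g.:
    --   ghc -O2 -prof -rtsopts Main.hs && ./Main +RTS -p
    -- Only the two manual SCCs (plus CAFs) appear in Main.prof.
    step :: Int -> Int
    step n = {-# SCC "step" #-} sum [1 .. n]

    render :: Int -> Int
    render n = {-# SCC "render" #-} length (show [1 .. n])

    main :: IO ()
    main = print (step 2000000 + render 200000)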

On Mon, 2014-09-22 at 07:28 +0800, John Lato wrote:
compiling libraries with -fprof-auto-exported
That's what I was going to pipe up with. In my usage, having dependencies at -fprof-auto-exported seems to work quite well. On the rare occasion that I need *more*, then off to `cabal unpack` we go, but that's a better trade-off, I think.

AfC
Sydney

On Mon, Sep 22, 2014 at 07:28:41AM +0800, John Lato wrote:
As an alternative, compiling libraries with -fprof-auto-exported is fairly reasonable IMHO.
It would be nice to be able to override e.g. the ghc-prof-options; then you could just build everything in a cabal sandbox with the desired options.

Greetings,
Daniel
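
A sketch of the kind of override being asked for; the syntax below is purely hypothetical and not something cabal supported:

    -- hypothetical per-sandbox configuration, illustrative syntax only:
    -- every package built in the sandbox would get these profiling options
    package *
      ghc-prof-options: -fprof-auto-exported
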
participants (9)

- Andrew Cowie
- Daniel Trstenjak
- Eric Seidel
- Ivan Lazar Miljenovic
- Johan Tibell
- John Lato
- Michael Snoyman
- Oliver Charles
- Roman Cheplyaka