[GHC] #8472: Primitive string literals prevent optimization

#8472: Primitive string literals prevent optimization
------------------------------+--------------------------------------------
Reporter: akio | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Keywords: | Operating System: Linux
Architecture: x86_64 | Type of failure: Runtime performance bug
(amd64) | Test Case:
Difficulty: Unknown | Blocking:
Blocked By: | Related Tickets:
------------------------------+--------------------------------------------

Using an Addr# literal seems to result in less aggressive optimization. If I compile the attached program like this:

{{{
ghc -O2 -fforce-recomp -ddump-simpl addr.hs
}}}

the code is optimized nicely. Everything is inlined into {{{t}}}, intermediate pairs are eliminated, etc.

However, when I replace the Int# literals in the code with Addr# literals by defining the {{{ADDR}}} macro:

{{{
ghc -O2 -fforce-recomp -ddump-simpl -DADDR addr.hs
}}}

GHC now creates 2 extra top-level bindings, each of which allocates a pair.

I don't see why the presence of Addr# literals should prevent inlining, so I'm reporting a bug.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

Great example. It's an example of #2840.

The problem is that we get something like this:
{{{
f = let a::Addr# = "foo"# in \x -> blah

g y = ...(f e)...
}}}
We can't float the binding for `a` to the top level because Core doesn't allow top-level bindings of unlifted type. But by not floating it we prevent `f` being inlined, which is pretty terrible.

I think the solution is simply to '''allow top level bindings of form `a::Addr# = "foo"#`'''. That is:
 * The type is `Addr#`
 * The RHS is a string literal; in particular NOT a string computation

Things that would need doing:
 * Modify the test `isUnLiftedType ty` in `SetLevels.lvlMFE`, which stops unlifted things getting floated to top level.
 * Similarly `Simplify.bindingOk`.
 * Make `CmmLint` check the new invariant.
 * The STG->Cmm code generator would need to generate some suitable `CmmData` stuff.

This is a fairly easy job. Any volunteers?

Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:1
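Editorial note: the invariant proposed in comment:1 can be sketched as a predicate over a toy model of bindings. Everything here (`Ty`, `Expr`, `Bind`, `topLevelOk`) is hypothetical and invented for illustration; it is not GHC's Core representation or any real GHC API.

```haskell
module Main (main) where

-- Toy model only: hypothetical types, not GHC's Core data types.
data Ty = Lifted | AddrHash | OtherUnlifted
  deriving (Eq, Show)

data Expr
  = StringLit String   -- a primitive string literal such as "foo"#
  | Var String
  | App Expr Expr
  deriving (Eq, Show)

data Bind = Bind String Ty Expr   -- binder name, binder type, right-hand side
  deriving (Eq, Show)

-- | May this binding appear at top level under the proposed invariant?
-- Lifted bindings are always fine; an Addr# binding is fine only when its
-- right-hand side is a string literal (not a string computation).
topLevelOk :: Bind -> Bool
topLevelOk (Bind _ Lifted _)                = True
topLevelOk (Bind _ AddrHash (StringLit _))  = True
topLevelOk _                                = False

main :: IO ()
main = do
  print (topLevelOk (Bind "a" AddrHash (StringLit "foo")))
  print (topLevelOk (Bind "b" AddrHash (App (Var "plusAddr") (Var "a"))))
  print (topLevelOk (Bind "n" OtherUnlifted (Var "x")))
```

A lint pass in this toy setting would simply require `topLevelOk` to hold for every top-level binding.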

#8472: Primitive string literals prevent optimization

Changes (by xnyhps):

 * owner: => xnyhps

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:2

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

See also #11312

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:3

#8472: Primitive string literals prevent optimization

Changes (by mpickering):

 * keywords: => newcomer

Comment:

This seems well-specified for a newcomer.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:4

#8472: Primitive string literals prevent optimization

Comment (by gridaphobe):

While working on Phab:D1259 I came across another example of this issue (NB it requires my patch to trigger).

The idea in that patch is to avoid inlining `String` literals and share them as top-level values instead. In order to keep using the REWRITE rules, we pretend that `unpackCString#` is CONLIKE. This has a side-effect of making GHC float the unboxed string literal out into a separate let-binder, which then prevents us from floating the `unpackCString#` application to the top-level, as unboxed literals aren't allowed there.

Here's a simple example that triggers the behavior with the Phab:D1259 patch, but properly floats the `String`s on master.

{{{
module Foo where

draw xs = a ++ b ++ xs

[a,b] = ["aa", "bb"]
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:5
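Editorial note: comment:5 relies on the fact that GHC desugars a `String` literal into `unpackCString#` applied to a primitive `Addr#` literal. The sketch below writes that form out by hand for the two strings in the example. It is a simplification (separate bindings instead of the `[a,b]` pattern binding) and says nothing about what the Phab:D1259 patch itself produces.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (unpackCString#)

-- Hand-written stand-ins for the desugared String literals "aa" and "bb":
-- unpackCString# applied to a primitive (Addr#) string literal.
a, b :: String
a = unpackCString# "aa"#
b = unpackCString# "bb"#

draw :: String -> String
draw xs = a ++ b ++ xs

main :: IO ()
main = putStrLn (draw "cc")
```

The `"aa"#` and `"bb"#` subexpressions are the unboxed literals that, before this ticket was fixed, could not be bound at top level on their own.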

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

See also #12585

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:6

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

More interesting string-literal stuff:
 * #9577
 * #10922
 * #11312

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:7

#8472: Primitive string literals prevent optimization

Changes (by gridaphobe):

 * owner: xnyhps => gridaphobe

Comment:

@simonpj I'm working on this and have a patch nearly ready, but I'm a bit confused by your suggestion (in comment:1) to update `CmmLint`. It's not clear to me how to check this at the `Cmm` level: how do we determine what is top level?

Perhaps you meant to check the invariant in `CoreLint`? I'm having to tweak a few of the checks there anyway that expect all top-level binders to have lifted types.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:8

#8472: Primitive string literals prevent optimization

Comment (by akio):

Ack, I have also been working on this, but it seems like @gridaphobe's patches are closer to being ready (I haven't updated the core lint to check the new invariant).

Here is a link to my patch in case it contains anything useful:

https://github.com/takano-akio/ghc/commit/abd74a0d1fdde0e75d46bf1ff255ed4966a020ad

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:9

#8472: Primitive string literals prevent optimization

Comment (by gridaphobe):

@akio sorry about the duplicate work! It looks like our patches are very similar, though I introduce a new StgRhs constructor rather than your StgTopBinding type. I think your approach is a bit better, as I end up having to deal with the (im)possibility of let-binding a literal anywhere (it looks like Stg uses `case <lit> of <var> { __DEFAULT -> ... }` to bind literals, which makes sense given that they can't be lazy).

Does your patch validate? I just noticed this afternoon that although my patch works for the example and parts of nofib, it causes a linker error when building ghc-stage2, due to undefined symbols. I'm setting the label for the string literal a bit differently from you, so your patch might be fine.

My CoreLint patch is at https://github.com/ghc/ghc/compare/master...gridaphobe:T8472#diff-9ad7456ebf7fad38de8b24ddceb9bb3c. Do you want to submit your patch + my CoreLint pass, that ought to make for a complete patch :)

(I also notice that both of our patches could use a nice Note explaining why we want to bind string literals at the top level, especially since the logic is spread across multiple phases of the compiler)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:10
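Editorial note: the design choice discussed in comment:10, a dedicated top-level binding type rather than an extra right-hand-side constructor, can be sketched with toy types. The names `Binding`, `TopBinding`, `TopLifted`, `TopStringLit` and `stringLitBinders` are hypothetical simplifications made up for this sketch; neither patch's actual definitions are shown in this thread.

```haskell
module Main (main) where

-- Hypothetical, simplified types; GHC's real STG definitions are richer.
data Binding = Binding String String   -- binder name, opaque right-hand side
  deriving (Eq, Show)

-- A separate top-level type means a string-literal binding can only ever
-- occur at top level; nested lets keep using the ordinary Binding type.
data TopBinding
  = TopLifted Binding            -- an ordinary (lifted) top-level binding
  | TopStringLit String String   -- binder name and the literal's contents
  deriving (Eq, Show)

-- | Names of the top-level binders that are string literals.
stringLitBinders :: [TopBinding] -> [String]
stringLitBinders tops = [ name | TopStringLit name _ <- tops ]

main :: IO ()
main = print (stringLitBinders prog)
  where
    prog =
      [ TopStringLit "foo" "foo"
      , TopLifted (Binding "f" "\\x -> ... foo ...")
      ]
```

Because `TopStringLit` is not a constructor of `Binding`, the type checker rules out a let-bound string literal in this model, which is the case the `StgRhs`-constructor approach had to handle separately.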

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

Excellent work!

> Perhaps you meant to check the invariant in `CoreLint`?

Yes I did. Sorry for the confusion here.

> I introduce a new `StgRhs` constructor rather than your `StgTopBinding` type. I think your approach is a bit better,

I have not looked at the details, but since we are only talking about introducing a new ''top-level'' form, it makes sense for the data type to reflect that if it's convenient to do so.

> (I also notice that both of our patches could use a nice Note explaining why we want to bind string literals at the top level, especially since the logic is spread across multiple phases of the compiler)

YES PLEASE! Also document invariants in `CoreSyn`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:11

#8472: Primitive string literals prevent optimization

Comment (by akio):

Replying to [comment:10 gridaphobe]:

> Does your patch validate? I just noticed this afternoon that although my patch works for the example and parts of nofib, it causes a linker error when building ghc-stage2, due to undefined symbols. I'm setting the label for the string literal a bit differently from you, so your patch might be fine.

I don't get link errors, but I do get several failures when I run `make slow` in `testsuite/`. Some of the failures look harmless (the expected output has to be adjusted) but I'll need to look more carefully at other failures.

> My CoreLint patch is at https://github.com/ghc/ghc/compare/master...gridaphobe:T8472#diff-9ad7456ebf7fad38de8b24ddceb9bb3c. Do you want to submit your patch + my CoreLint pass, that ought to make for a complete patch :)

Thanks, I've taken your patch to CoreLint. I'm happy to sort out the remaining issues and submit a Differential revision, but I won't have much time to work on this until this weekend. Feel free to go ahead and do it yourself if you like. In any case, my work-in-progress branch can be found at https://github.com/takano-akio/ghc/commits/top-level-string.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:12

#8472: Primitive string literals prevent optimization

Changes (by gridaphobe):

 * status: new => patch
 * differential: => D2554

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:13

#8472: Primitive string literals prevent optimization

Changes (by akio):

 * differential: D2554 => Phab:D2554, Phab:D2605

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:14

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

Phab:D2554 is abandoned; let's remove it from the ticket since it is now irrelevant (I assume).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:15

#8472: Primitive string literals prevent optimization

Comment (by akio):

My patch (Phab:D2605) causes some compiler perf regressions. This happens because

{{{
f x = ... "foo"# ...
}}}

now gets transformed into

{{{
foo = "foo"#
f x = ... foo ...
}}}

and this means a larger code size.

For example, on `perf/compiler/T1969`, the `peak_megabytes_allocated` goes from 63 to 68 (an 8% increase), the code size (in terms of the `Term` component of `CoreStats`) goes from 12603 to 13271 (a 10% increase), and this is fully explained by the above effect, which increases the code size by 2 for each top-level string literal.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:16
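Editorial note: the arithmetic in comment:16 can be checked with a few lines of Haskell. The helper names below are made up for this sketch; the inputs are the figures quoted in the comment. With the `Term` counts exactly as quoted, the difference is 668 terms, or 334 floated literals at 2 terms each, and the relative increase works out to roughly 5.3% rather than 10%, so one of the quoted figures may have been mistyped. The peak-allocation change (63 to 68) is about 7.9%, consistent with the stated 8%.

```haskell
module Main (main) where

-- Each floated literal adds this many Core terms, per comment:16.
termsPerLiteral :: Int
termsPerLiteral = 2

-- | Absolute growth in the Term count.
termDelta :: Int -> Int -> Int
termDelta before after = after - before

-- | Number of floated string literals that would explain the growth.
impliedLiterals :: Int -> Int -> Int
impliedLiterals before after = termDelta before after `div` termsPerLiteral

-- | Relative growth, in percent.
percentIncrease :: Int -> Int -> Double
percentIncrease before after =
  100 * fromIntegral (after - before) / fromIntegral before

main :: IO ()
main = do
  print (termDelta 12603 13271)
  print (impliedLiterals 12603 13271)
  print (percentIncrease 12603 13271)
  print (percentIncrease 63 68)
```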

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

> this means a larger code size.

Why?? In both cases I'd ultimately expect to see a top-level machine-code label for a literal string, and a reference to that label in the compiled code. I see no reason for increased code size, or increased bytes-allocated.

Can you explain?

Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:17

#8472: Primitive string literals prevent optimization

Comment (by akio):

Replying to [comment:17 simonpj]:

> > this means a larger code size.
>
> Why?? In both cases I'd ultimately expect to see a top-level machine-code label for a literal string, and a reference to that label in the compiled code. I see no reason for increased code size, or increased bytes-allocated.

Sorry, I meant a larger size of the core. Presumably the compiler needs to use more memory to store a larger syntax tree.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:18

#8472: Primitive string literals prevent optimization

Comment (by simonpj):

Oh OK, now I understand. I think that's fine:
 * peak allocation is very vulnerable to the moment at which major gc runs.
 * "Term" component of core-stats is fine.

Thanks

Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:19

#8472: Primitive string literals prevent optimization
Comment (by Ben Gamari

#8472: Primitive string literals prevent optimization

Changes (by mpickering):

 * status: patch => closed
 * resolution: => fixed

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:21

#8472: Primitive string literals prevent optimization
Comment (by Ben Gamari

#8472: Primitive string literals prevent optimization

Changes (by bgamari):

 * keywords: newcomer => newcomer, strings

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:23

#8472: Primitive string literals prevent optimization

Comment (by bgamari):

The patch described in comment:22 is in the `master` branch and GHC 8.2.1 and is a temporary workaround for some of the breakage caused by this change. The general theme of breakage-due-to-ticks is tracked in #14123.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8472#comment:24