
#15113: Do not make CAFs from literal strings
-------------------------------------+-------------------------------------
        Reporter:  simonpj           |                Owner:  (none)
            Type:  bug               |               Status:  patch
        Priority:  normal            |            Milestone:  8.10.1
       Component:  Compiler          |              Version:  8.2.2
      Resolution:                    |             Keywords:  CAFs
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #16014            |  Differential Rev(s):  Phab:D4717
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Old description:
Currently (as I discovered in #15038), we get the following code for `GHC.Exception.Base.patError`:
{{{
lvl2_r3y3 :: [Char]
[GblId]
lvl2_r3y3 = unpackCString# lvl1_r3y2

-- RHS size: {terms: 7, types: 6, coercions: 2, joins: 0/0}
patError :: forall a. Addr# -> a
[GblId, Arity=1, Str=x, Unf=OtherCon []]
patError
  = \ (@ a_a2kh) (s_a1Pi :: Addr#) ->
      raise#
        @ SomeException
        @ 'LiftedRep
        @ a_a2kh
        (Control.Exception.Base.$fExceptionPatternMatchFail_$ctoException
           ((untangle s_a1Pi lvl2_r3y3)
            `cast` (Sym (Control.Exception.Base.N:PatternMatchFail[0])
                    :: (String :: *) ~R# (PatternMatchFail :: *))))
}}}
That stupid `lvl2_r3y3 :: String` is a CAF, and hence `patError` has CAF-refs, and hence so does any function that calls `patError`, and any function that calls them.
That's bad! Lots more CAF entries in SRTs, lots more work traversing those SRTs in the garbage collector. And for what? To share the work of unpacking a C string! This is nuts.
What to do?
* Somehow refrain from floating `unpackCString# lit` to top level, even if you could otherwise do so. But that seems very ad-hoc, and it makes the function bigger and less inlinable.
* Treat a top-level definition
{{{
x :: [Char]
x = unpackCString# y
}}}
  as NOT a CAF, and make it single-entry so that the thunk is not updated. Then every use of `x` will unpack the string afresh, which is probably a good idea anyhow.
I like this more. It would be implemented somewhere in the code generator.
New description:

Currently (as I discovered in #15038), we get the following code for `GHC.Exception.Base.patError`:
{{{
lvl2_r3y3 :: [Char]
[GblId]
lvl2_r3y3 = unpackCString# lvl1_r3y2

-- RHS size: {terms: 7, types: 6, coercions: 2, joins: 0/0}
patError :: forall a. Addr# -> a
[GblId, Arity=1, Str=x, Unf=OtherCon []]
patError
  = \ (@ a_a2kh) (s_a1Pi :: Addr#) ->
      raise#
        @ SomeException
        @ 'LiftedRep
        @ a_a2kh
        (Control.Exception.Base.$fExceptionPatternMatchFail_$ctoException
           ((untangle s_a1Pi lvl2_r3y3)
            `cast` (Sym (Control.Exception.Base.N:PatternMatchFail[0])
                    :: (String :: *) ~R# (PatternMatchFail :: *))))
}}}
That stupid `lvl2_r3y3 :: String` is a CAF, and hence `patError` has CAF-refs, and hence so does any function that calls `patError`, and any function that calls them.

That's bad! Lots more CAF entries in SRTs, lots more work traversing those SRTs in the garbage collector. And for what? To share the work of unpacking a C string! This is nuts.

What to do?

1. Somehow refrain from floating `unpackCString# lit` to top level, even if you could otherwise do so. But that seems very ad-hoc, and it makes the function bigger and less inlinable.

2. Treat a top-level definition
{{{
x :: [Char]
x = unpackCString# y
}}}
   as NOT a CAF, and make it single-entry so that the thunk is not updated. Then every use of `x` will unpack the string afresh, which is probably a good idea anyhow.

I like this more. It would be implemented somewhere in the code generator.

--

Comment (by simonpj):

Looking at #16014, I like alternative (2) from the Description better and better. If we spot
{{{
x = unpackCString# "blah"#
}}}
in the code generator, we could allocate a top-level closure with

* Info-pointer: `rtsUnpackString_info`
* One word of payload, a pointer to the literal string `"blah"#`.

Now we can hand-write the single blob of code (plus info table) `rtsUnpackString_info` to unpack the string. Easy!
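The two-word closure layout proposed above can be modelled in plain C. This is only an illustrative sketch and not GHC's actual RTS code: `Cons`, `InfoTable`, `Closure`, `unpack_entry`, `x_closure`, `list_length` and `list_free` are hypothetical names, a real info table carries much more than an entry pointer, and a real implementation would allocate on the Haskell heap rather than with `malloc`. Only the name `rtsUnpackString_info` is taken from the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Haskell [Char]: a singly linked list of chars. */
typedef struct Cons {
    char head;
    struct Cons *tail;
} Cons;

struct Closure;

/* Hypothetical, heavily simplified info table: just the shared entry code. */
typedef struct InfoTable {
    Cons *(*entry)(const struct Closure *self);
} InfoTable;

/* The proposed closure: an info pointer plus one word of payload. */
typedef struct Closure {
    const InfoTable *info;    /* word 1: points at rtsUnpackString_info     */
    const char *payload;      /* word 2: address of the NUL-terminated text */
} Closure;

/* The overhead per string literal is two words on common platforms. */
static_assert(sizeof(Closure) == 2 * sizeof(void *),
              "closure should occupy exactly two words");

/* Shared entry code: unpack the literal into a fresh list on every entry.
   The closure is never overwritten, so no result is cached or retained. */
Cons *unpack_entry(const Closure *self) {
    Cons *head = NULL;
    Cons **tailp = &head;
    for (const char *p = self->payload; *p != '\0'; ++p) {
        Cons *cell = malloc(sizeof *cell);
        if (cell == NULL) abort();
        cell->head = *p;
        cell->tail = NULL;
        *tailp = cell;
        tailp = &cell->tail;
    }
    return head;
}

/* One info table, shared by every string-literal closure. */
const InfoTable rtsUnpackString_info = { unpack_entry };

/* Model of what the code generator would emit for: x = unpackCString# "blah"# */
const Closure x_closure = { &rtsUnpackString_info, "blah" };

/* Helpers used only to inspect and release the model list. */
size_t list_length(const Cons *xs) {
    size_t n = 0;
    for (; xs != NULL; xs = xs->tail) ++n;
    return n;
}

void list_free(Cons *xs) {
    while (xs != NULL) {
        Cons *next = xs->tail;
        free(xs);
        xs = next;
    }
}
```

Entering the closure twice, as in `x_closure.info->entry(&x_closure)`, yields two independent lists in this model. That mirrors the "unpack afresh on every use" behaviour: since the closure is never updated, it never points at heap data, which is the property the proposal relies on to treat it as not a CAF.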
And the overhead per string is only two words (for the closure) rather than all the stuff described in #16014.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15113#comment:11
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler