[GHC] #11565: Restore code to handle '-fmax-worker-args' flag

#11565: Restore code to handle '-fmax-worker-args' flag -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- When i had a pass through DynFlags I've noticed dead code: https://phabricator.haskell.org/D1727 It was accidentally lost 3 years ago in major update changeset:0831a12ea2fc73c33652eeec1adc79fa19700578 The consensus is to try to put option handling back. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by slyfox): * keywords: => newcomer -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: newcomer Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by slyfox: @@ -4,1 +4,1 @@ - It was accidentally lost 3 years ago in major update + maxWorkerArgs handling was accidentally lost 3 years ago in major update New description: When i had a pass through DynFlags I've noticed dead code: https://phabricator.haskell.org/D1727 maxWorkerArgs handling was accidentally lost 3 years ago in major update changeset:0831a12ea2fc73c33652eeec1adc79fa19700578 The consensus is to try to put option handling back. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by thomie): * keywords: newcomer => * type: bug => feature request -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag
-------------------------------------+-------------------------------------
Reporter: slyfox | Owner:
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by slyfox):
Current motivating example to fix it is DynFlags example itself.
I was profiling perf build of GHC and noticed a function that pushes
whole DynFlags from stack to heap. This small piece of code emits
10 pages of mov instructions.
https://git.haskell.org/ghc.git/blob/HEAD:/compiler/nativeGen/AsmCodeGen.hs#...
{{{#!hs
1086 cmmExprNative :: ReferenceKind -> CmmExpr -> CmmOptM CmmExpr
1087 cmmExprNative referenceKind expr = do
1088 dflags <- getDynFlags
1089 let platform = targetPlatform dflags
1090 arch = platformArch platform
1091 case expr of
...
1106 CmmLit (CmmLabel lbl)
1107 -> do
1108 cmmMakeDynamicReference dflags referenceKind lbl
...
}}}
{{{
│ cmmExprNative :: ReferenceKind -> CmmExpr -> CmmOptM CmmExpr
│ cmmExprNative referenceKind expr = do
0,11 │ cmp $0x3,%rax
│ ↑ jb 3ceb930 ]
ww86_al2o
ww87_al2p ww88_al2q ww89_al2r ww90_al2s ww91_al2t
ww92_al2u
ww93_al2v ww94_al2w ww95_al2x ww96_al2y ww97_al2z
ww98_al2A
ww99_al2B ww100_al2C ww101_al2D ww102_al2E ww103_al2F
ww104_al2G
ww105_al2H ww106_al2I ww107_al2J ww108_al2K ww109_al2L
ww110_al2M
ww111_al2N ww112_al2O ww113_al2P ww114_al2Q ww115_al2R
ww116_al2S [Dmd=

#11565: Restore code to handle '-fmax-worker-args' flag
-------------------------------------+-------------------------------------
Reporter: slyfox | Owner:
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by slyfox):
And '''dumpIfSet_dyn''' (used across the GHC including AsmCodegen) is
exported as a 141-ary function (along with 5-ary function):
{{{#!hs
$ inplace/bin/ghc-stage1 --show-iface
compiler/stage2/build/ErrUtils.dyn_hi
...
31b85108354ff085ace45a61abe9a220
$wdumpIfSet_dyn ::
GhcMode
-> GhcLink
-> HscTarget
-> Settings
-> SigOf
-> Int
-> Int
-> Int
-> Int
...
-> SDoc
-> State# RealWorld
-> (# State# RealWorld, () #)
{- Arity: 140,
Strictness:
}}}
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:5
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler,
Inline: [0] -}
...
dumpIfSet_dyn :: DynFlags -> DumpFlag -> String -> SDoc -> IO ()
{- Arity: 5,
Strictness:
,
Unfolding: InlineRule (0, True, True)
dumpIfSet_dyn1
`cast`
(<DynFlags>_R
->_R <DumpFlag>_R
->_R <String>_R
->_R <SDoc>_R
->_R Sym (N:IO[0] <()>_R)) -}

#11565: Restore code to handle '-fmax-worker-args' flag
-------------------------------------+-------------------------------------
Reporter: slyfox | Owner:
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by slyfox):
Here comes minimal example for one direction: heap to stack.
The trigger is a function with many USED record fields: show in this case.
In case of DynFlags it's a full (or large) subset of fields used
in various GHC subsystems.
{{{#!hs
-- A.hs
module A(D) where
-- like DynFlgs in GHC
data D = D { f_00, f_01, f_02, f_03, f_04
, f_10, f_11, f_12, f_13, f_14
, f_20, f_21, f_22, f_23, f_24
, f_30, f_31, f_32, f_33, f_34
, f_40, f_41, f_42, f_43, f_44
, f_50, f_51, f_52, f_53, f_54
, g_00, g_01, g_02, g_03, g_04
, g_10, g_11, g_12, g_13, g_14
, g_20, g_21, g_22, g_23, g_24
, g_30, g_31, g_32, g_33, g_34
, g_40, g_41, g_42, g_43, g_44
, g_50, g_51, g_52, g_53, g_54
, h_00, h_01, h_02, h_03, h_04
, h_10, h_11, h_12, h_13, h_14
, h_20, h_21, h_22, h_23, h_24
, h_30, h_31, h_32, h_33, h_34
, h_40, h_41, h_42, h_43, h_44
, h_50, h_51, h_52, h_53, h_54
, i_00, i_01, i_02, i_03, i_04
, i_10, i_11, i_12, i_13, i_14
, i_20, i_21, i_22, i_23, i_24
, i_30, i_31, i_32, i_33, i_34
, i_40, i_41, i_42, i_43, i_44
, i_50, i_51, i_52, i_53, i_54 :: Int
} deriving Show
}}}
{{{#!hs
-- B.hs
module B (tiny_foo) where
import qualified A
tiny_foo :: A.D -> Bool
tiny_foo d = null (show d)
}}}
Let's look at the size of module B on '''-O0''' and '''-O1''' while A is
compiled -O2:
'''-O0''', no unboxing happens.
{{{#!hs
$ ghc -c -O2 A.hs && ghc -c -O0 B.hs -ddump-stg -fforce-recomp
compilation IS NOT required
==================== STG syntax: ====================
$trModule1_r3JD :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Str=DmdType, Unf=OtherCon []] =
NO_CCS GHC.Types.TrNameS! ["main"#];
$trModule2_r3P0 :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Str=DmdType, Unf=OtherCon []] =
NO_CCS GHC.Types.TrNameS! ["B"#];
B.$trModule :: GHC.Types.Module
[GblId, Caf=NoCafRefs, Str=DmdType, Unf=OtherCon []] =
NO_CCS GHC.Types.Module! [$trModule1_r3JD $trModule2_r3P0];
B.tiny_foo :: A.D -> GHC.Types.Bool
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []] =
\r srt:SRT:[r30 :-> A.$fShowD,
rAj :-> Data.Foldable.$fFoldable[]] [d_s3P4]
let {
sat_s3P5 [Occ=Once] :: [GHC.Types.Char]
[LclId, Str=DmdType] =
\u srt:SRT:[r30 :-> A.$fShowD] [] GHC.Show.show A.$fShowD
d_s3P4;
} in Data.Foldable.null Data.Foldable.$fFoldable[] sat_s3P5;
}}}
'''-O1''', unboxing hapened:
{{{#!hs
$ ghc -c -O2 A.hs && ghc -c -O1 B.hs -ddump-stg -fforce-recomp
compilation IS NOT required
==================== STG syntax: ====================
B.$trModule2 :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Str=DmdType m1, Unf=OtherCon []] =
NO_CCS GHC.Types.TrNameS! ["main"#];
B.$trModule1 :: GHC.Types.TrName
[GblId, Caf=NoCafRefs, Str=DmdType m1, Unf=OtherCon []] =
NO_CCS GHC.Types.TrNameS! ["B"#];
B.$trModule :: GHC.Types.Module
[GblId, Caf=NoCafRefs, Str=DmdType m, Unf=OtherCon []] =
NO_CCS GHC.Types.Module! [B.$trModule2 B.$trModule1];
B.$wtiny_foo [InlPrag=[0]]
:: GHC.Types.Int
-> GHC.Types.Int
-> GHC.Types.Int
...
-> GHC.Types.Int
-> GHC.Types.Bool
[GblId,
Arity=120,
Str=DmdType
,
Unf=OtherCon []] =
\r srt:SRT:[r47O :-> B.$wtiny_foo] [w_s49R]
case w_s49R of _ [Occ=Dead] {
A.D ww1_s49T [Occ=Once]
ww2_s49U [Occ=Once]
ww3_s49V [Occ=Once]
ww4_s49W [Occ=Once]
ww5_s49X [Occ=Once]
...
ww118_s4bM [Occ=Once]
ww119_s4bN [Occ=Once]
ww120_s4bO [Occ=Once] ->
B.$wtiny_foo
ww1_s49T
ww2_s49U
ww3_s49V
ww4_s49W
ww5_s49X
ww6_s49Y
ww7_s49Z
ww8_s4a0
ww9_s4a1
ww10_s4a2
ww11_s4a3
ww12_s4a4
ww13_s4a5
ww14_s4a6
ww15_s4a7
ww16_s4a8
}}}
This causes a lot of 'mov' instructions from heap to stack to be generated
at each callsite.
In this case it's 9 pages:
{{{#!hs
$ ghc -c -O2 A.hs && ghc -c -O1 B.hs -ddump-asm -fforce-recomp
...
movq %rbx,856(%rbp)
movq 880(%rbp),%rbx
movq %rbx,864(%rbp)
movq 888(%rbp),%rbx
movq %rbx,872(%rbp)
movq 896(%rbp),%rbx
movq %rbx,880(%rbp)
movq 904(%rbp),%rbx
movq %rbx,888(%rbp)
movq %rax,896(%rbp)
movq $GHC.Types.[]_closure+1,904(%rbp)
addq $-24,%rbp
jmp A.$w$cshowsPrec_info
}}}
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:6
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Yes, it's bad for worker/wrapper to generate a worker function with a vast number of arguments. Some limit in the worker/wrapper generator would be a Good Thing. Should not be too hard. Unlike the old days, we don't need to trim the strictness signature. In the old days, the strictness signature was used by importing modules to generate an appropriate wrapper; but now the wrapper is conveyed by an ordinary inlining. So there is just one place the choice is made, namely when generating the worker/wrapper split. I can advise if someone wants to try this Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag
-------------------------------------+-------------------------------------
Reporter: slyfox | Owner:
Type: feature request | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Sergei Trofimovich

#11565: Restore code to handle '-fmax-worker-args' flag -------------------------------------+------------------------------------- Reporter: slyfox | Owner: Type: feature request | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 7.10.3 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by slyfox): * status: new => closed * failure: None/Unknown => Runtime performance bug * resolution: => fixed * milestone: => 8.2.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11565#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11565: Restore code to handle '-fmax-worker-args' flag
-------------------------------------+-------------------------------------
Reporter: slyfox | Owner:
Type: feature request | Status: closed
Priority: normal | Milestone: 8.2.1
Component: Compiler | Version: 7.10.3
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Sergei Trofimovich
participants (1)
-
GHC