
#15192: Refactor of Coercion
-------------------------------------+-------------------------------------
        Reporter:  ningning          |                Owner:  ningning
            Type:  task              |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4747
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by ningning):

 I did some experiments recently. What I did:
 - add back {{{CoherenceCo}}} and related functions, such as
   {{{mkCoherenceLeftCo}}} (renaming it to {{{mkCoherenceLeftCo'}}}).
 - replace uses of functions defined in terms of {{{GRefl}}} with the
   corresponding functions defined in terms of {{{CoherenceCo}}}; for
   example, replace `mkCoherenceLeftCo` with `mkCoherenceLeftCo'`.
 - test allocation after each change.

 A general observation is that {{{mkCoherenceLeftCo'}}} allocates less than
 {{{mkCoherenceLeftCo}}} when {{{kind_co}}} is not reflexive (and similarly
 for {{{mkCoherenceRightCo}}}).

 {{{#!hs
 -- | Given @ty :: k1@, @co :: k1 ~ k2@, @co2 :: ty ~ ty'@,
 -- produces @co' :: (ty |> co) ~r ty'@
 mkCoherenceLeftCo r ty co co2
   | isGReflCo co = co2
   | otherwise    = (mkSymCo $ GRefl r ty (MCo co)) `mkTransCo` co2
   -- stores an extra r and ty

 mkCoherenceLeftCo' co (Refl _) = co
 mkCoherenceLeftCo' (CoherenceCo co1 co2) co3
   = CoherenceCo co1 (co2 `mkTransCo` co3)
 mkCoherenceLeftCo' co1 co2 = CoherenceCo co1 co2
 }}}

 In the test case {{{T9872d}}}, {{{TcFlatten.homogenise_result}}} is called
 340k+ times, which means {{{mkCoherenceLeftCo}}} is called 340k+ times.
 (Not exactly, since {{{mkCoherenceLeftCo}}} is now inlined by hand in
 {{{TcFlatten.homogenise_result}}}, but morally it is still true.)

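 The shape of the allocation difference can be sketched with a toy model.
 Everything below is a hypothetical stand-in, not GHC's real definitions:
 `Ty`, `Role`, `Co`, `viaGRefl`, `viaCoherence` and `nodes` are
 illustrative names, and counting constructor nodes is only a rough proxy
 for heap allocation. Under those assumptions, the {{{GRefl}}} route
 builds three fresh nodes (and also stores a role and a type), while the
 {{{CoherenceCo}}} route builds one.

```haskell
module Main where

-- Toy stand-ins (assumptions for illustration, not GHC's types).
data Role = Nominal | Representational deriving (Eq, Show)
newtype Ty = Ty String deriving (Eq, Show)

data Co
  = Refl Ty
  | Axiom String          -- opaque, non-reflexive coercion
  | SymCo Co
  | TransCo Co Co
  | GRefl Role Ty Co      -- stores a role and a type besides the kind coercion
  | CoherenceCo Co Co     -- stores only the two coercions
  deriving (Eq, Show)

isRefl :: Co -> Bool
isRefl (Refl _) = True
isRefl _        = False

-- Toy analogue of mkCoherenceLeftCo: goes through GRefl, Sym and Trans.
viaGRefl :: Role -> Ty -> Co -> Co -> Co
viaGRefl r ty kco co2
  | isRefl kco = co2
  | otherwise  = SymCo (GRefl r ty kco) `TransCo` co2

-- Toy analogue of mkCoherenceLeftCo': a single CoherenceCo node.
viaCoherence :: Co -> Co -> Co
viaCoherence co (Refl _)                 = co
viaCoherence (CoherenceCo co1 kco1) kco2 = CoherenceCo co1 (kco1 `TransCo` kco2)
viaCoherence co kco                      = CoherenceCo co kco

-- Number of Co constructor nodes in a coercion tree.
nodes :: Co -> Int
nodes (Refl _)          = 1
nodes (Axiom _)         = 1
nodes (SymCo c)         = 1 + nodes c
nodes (TransCo a b)     = 1 + nodes a + nodes b
nodes (GRefl _ _ k)     = 1 + nodes k
nodes (CoherenceCo c k) = 1 + nodes c + nodes k

main :: IO ()
main = do
  let ty   = Ty "a"
      kco  = Axiom "k1~k2"    -- non-reflexive kind coercion
      co2  = Axiom "ty~ty'"
      base = nodes kco + nodes co2   -- nodes that already existed
  print (nodes (viaGRefl Nominal ty kco co2) - base)  -- fresh nodes: 3
  print (nodes (viaCoherence co2 kco) - base)         -- fresh nodes: 1
```

 The toy ignores sharing, strictness and whatever the real smart
 constructors ({{{mkSymCo}}}, {{{mkTransCo}}}) simplify away, so it only
 indicates the direction of the effect, not its measured size.
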
 If I leave everything unchanged, except for using {{{mkCoherenceLeftCo'}}}
 instead of {{{mkCoherenceLeftCo}}} in {{{TcFlatten.homogenise_result}}},
 allocation in {{{T9872d}}} drops by ~2.5% and it passes the test as it
 stood before the regression. (I also tried replacing the suspicious use of
 {{{zipWith3}}} in {{{TcFlatten}}} with the original implementation using
 {{{zipWith}}}, but that saved little allocation.)

 I propose to update {{{Note [flatten_exact_fam_app_fully performance]}}} in
 {{{TcFlatten}}} to include the analysis.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15192#comment:24
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler