[GHC] #14955: Musings on manual type class desugaring

#14955: Musings on manual type class desugaring -------------------------------------+------------------------------------- Reporter: mpickering | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Keywords: SpecConstr | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I recently wrote a short post explaining why manual type class desugaring was different to actually writing a type class because of how they are optimised. http://mpickering.github.io/posts/2018-03-20-recordsvstypeclasses.html I implement 4 different equivalent programs which are all optimised differently. I paste the whole file below as it is not very big. Implementation 1 is in terms of a type class. Implementation 2 is in terms of explicit dictionary passing. Implementation 3 wraps a dictionary in a type class Implementation 4 wraps a dictionary in a type class with an additional dummy argument. Naively, a user would expect all 4 implementations to be as fast as each other. {{{ {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE AllowAmbiguousTypes #-} module Prop where import Prelude (Bool(..), (||), (&&)) -- Implementation 1 class Prop r where or :: r -> r -> r and :: r -> r -> r true :: r false :: r instance Prop Bool where or = (||) and = (&&) true = True false = False -- Implementation 2 data PropDict r = PropDict { dor :: r -> r -> r , dand :: r -> r -> r , dtrue :: r , dfalse :: r } boolDict = PropDict { dor = (||) , dand = (&&) , dtrue = True , dfalse = False } -- Implementation 3 class PropProxy r where propDict :: PropDict r instance PropProxy Bool where propDict = boolDict -- Implementation 4 class PropProxy2 r where propDict2 :: PropDict r dummy :: () instance PropProxy2 Bool where propDict2 = boolDict dummy = () ors :: Prop r => [r] -> r ors [] = true ors (o:os) = o `or` ors os {-# INLINABLE ors #-} dors :: PropDict r -> [r] -> r dors pd [] = dtrue pd dors pd (o:os) = dor pd o (dors pd os) pors :: PropProxy r => [r] -> r pors [] = dtrue propDict pors (o:os) = dor propDict o (pors os) {-# INLINABLE pors #-} porsProxy :: PropProxy2 r => [r] -> r porsProxy [] = dtrue propDict2 porsProxy (o:os) = dor propDict2 o (porsProxy os) {-# INLINABLE porsProxy #-} }}} Then using the 4 different implementations of `ors` in another module implementations 1 and 4 are fast whilst 2 and 3 are slow. https://github.com/mpickering/rtcwrao-benchmarks/blob/master/Prop2.hs {{{ benchmarking tc/Implementation 1 time 3.510 ms (3.509 ms .. 3.512 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 2.976 ms (2.886 ms .. 3.060 ms) std dev 241.1 μs (195.4 μs .. 293.1 μs) variance introduced by outliers: 51% (severely inflated) benchmarking tc/Implementation 2 time 25.05 ms (21.16 ms .. 30.43 ms) 0.912 R² (0.849 R² .. 0.984 R²) mean 19.18 ms (16.20 ms .. 21.45 ms) std dev 5.627 ms (4.710 ms .. 6.618 ms) variance introduced by outliers: 89% (severely inflated) benchmarking tc/Implementation 3 time 20.06 ms (15.33 ms .. 23.57 ms) 0.856 R² (0.755 R² .. 0.934 R²) mean 18.43 ms (16.92 ms .. 19.85 ms) std dev 3.490 ms (3.003 ms .. 4.076 ms) variance introduced by outliers: 74% (severely inflated) benchmarking tc/Implementation 4 time 3.498 ms (3.484 ms .. 3.513 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 3.016 ms (2.935 ms .. 3.083 ms) std dev 205.7 μs (162.6 μs .. 261.8 μs) variance introduced by outliers: 42% (moderately inflated) }}} I compiled the module with `-O2`. If I turn off `-fno-worker-wrapper` and `-fno-spec-constr` then implementation 3 is also fast. Implementation 2 is always slow. This ticket is querying what could be done to improve the robustness of these different refactorings. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14955 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14955: Musings on manual type class desugaring
-------------------------------------+-------------------------------------
Reporter: mpickering | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.2
Resolution: | Keywords: SpecConstr
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones

#14955: Musings on manual type class desugaring
-------------------------------------+-------------------------------------
Reporter: mpickering | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.2
Resolution: | Keywords: SpecConstr
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonpj):
(2) is slow becuase we don't (yet) do cross-module !SpecConstr. If `dors`
was `INLINEABLE` then in principle !SpecConstr could specialise it in
`Main`, in a similar way to `Specialise`. But that is a whole new thing.
(A very good thing, incidentally: that is Trac #10346.
(3) jolly well ought to be fast. That's a bug. Here is what is
happening.
The `INLINABLE` pragma on `pors` (for (3)) or `porsProxy` (for (4)) should
arrange that these functions can be speicalised in a separate module. But
in fact `pors` is worker/wrappered by the demand analyser! We get this in
the .hi file:
{{{
pors :: PropProxy r => [r] -> r
{- Arity: 2, HasNoCafRefs,
Strictness: , Inline: [0],
Unfolding: InlineRule (2, True, False)
(\ @ r (w :: PropProxy r) (w1 :: [r]) ->
case w `cast`
(N:PropProxy[0] <r>_N) of ww { PropDict ww1 ww2 ww3
ww4 ->
$wpors @ r ww1 ww3 w1 }) -}
porsProxy :: PropProxy2 r => [r] -> r
{- Arity: 2, HasNoCafRefs,
Strictness: , Inline:,
Unfolding(loop-breaker): <stable> (\ @ r
($dPropProxy2 :: PropProxy2 r)
(ds :: [r]) ->
case ds of wild {
[]
-> case propDict2
@ r
$dPropProxy2 of wild1
{ PropDict ds1 ds2 ds3 ds4 ->
ds3 }
: o os
-> case propDict2
@ r
$dPropProxy2 of wild1
{ PropDict ds1 ds2 ds3 ds4 ->
ds1 o (porsProxy @ r
$dPropProxy2 os) } }) -}
}}}
So `porsProxy` can be specialised in the calling file -- good. And so
can `pors` -- but the specialised version of its code is silly, just a
call to the un-specialised `$wpors`; still a higher order function,
which is bad bad bad.
But I fixed this more 6 years ago. See this Note in `WwLib`:
{{{
Note [Do not unpack class dictionaries]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If we have
f :: Ord a => [a] -> Int -> a
{-# INLINABLE f #-}
and we worker/wrapper f, we'll get a worker with an INLINABLE pragma
(see Note [Worker-wrapper for INLINABLE functions] in WorkWrap), which
can still be specialised by the type-class specialiser, something like
fw :: Ord a => [a] -> Int# -> a
BUT if f is strict in the Ord dictionary, we might unpack it, to get
fw :: (a->a->Bool) -> [a] -> Int# -> a
and the type-class specialiser can't specialise that. An example is
Trac #6056.
Moreover, dictionaries can have a lot of fields, so unpacking them can
increase closure sizes.
Conclusion: don't unpack dictionaries.
}}}
So why isn't this preventing the w/w for `pors`? Because the bit that
prevents the unpacking is here:
{{{
deepSplitProductType_maybe fam_envs ty
| let (co, ty1) = topNormaliseType_maybe fam_envs ty
`orElse` (mkRepReflCo ty, ty)
, Just (tc, tc_args) <- splitTyConApp_maybe ty1
, Just con <- isDataProductTyCon_maybe tc
, not (isClassTyCon tc) -- See Note [Do not unpack class dictionaries]
, let arg_tys = dataConInstArgTys con tc_args
strict_marks = dataConRepStrictness con
= Just (con, tc_args, zipEqual "dspt" arg_tys strict_marks, co)
}}}
Ha! We do `topNormaliseType_maybe` before checking `isClassTyCon`. And
when the class is a newtype, that `topNormaliseType_maybe` unwraps it.
It's easy to fix: just move the test earlier. Patch coming -- but maybe
after next week when I'm on holiday.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14955#comment:2
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#14955: Musings on manual type class desugaring
-------------------------------------+-------------------------------------
Reporter: mpickering | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.2
Resolution: | Keywords: SpecConstr
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones

#14955: Musings on manual type class desugaring -------------------------------------+------------------------------------- Reporter: mpickering | Owner: (none) Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: fixed | Keywords: SpecConstr Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * status: new => closed * resolution: => fixed Comment: Thanks for highlighting this, Matthew. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14955#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14955: Musings on manual type class desugaring
-------------------------------------+-------------------------------------
Reporter: mpickering | Owner: (none)
Type: bug | Status: closed
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.2
Resolution: fixed | Keywords: SpecConstr
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones
participants (1)
-
GHC