Help needed with optimization

23 Jan 2024

Hello GHC devs,

I'm trying to understand why my code is not being optimized in the way
I would expect.  I'm completely stuck and I think I need the advice of
an expert.

I'm writing an effect system on top of transformers.  The effect
system wraps monad transformers in a newtype that encodes the
composition structure of the transformers at the type level.  Because
it's a newtype, all of the class members are inherited directly from
the underlying type using coerce.  When I implement something using
this effect system I would expect to generate exactly the same code as
if I had written it using transformers directly.  However, it
generates significantly worse code, even in a very simple case.
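
For concreteness, here is a minimal sketch of the general pattern (not
the code linked below).  It assumes only `base` and `transformers`.  The
names `Wrapped`, `modifyW'`, `execW`, `sumDirect` and `sumWrapped` are
hypothetical, and the sketch leaves out the type-level index and the
singleton that the real effect system carries.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Trans.State.Strict (State, execState, modify')
import Data.Foldable (for_)

-- Hypothetical wrapper: a newtype around a transformers stack.
-- `deriving newtype` reuses the underlying instances via coercion,
-- so no new method implementations are written here.
newtype Wrapped s a = Wrapped (State s a)
  deriving newtype (Functor, Applicative, Monad)

-- Strict state update, lifted into the wrapper.
modifyW' :: (s -> s) -> Wrapped s ()
modifyW' f = Wrapped (modify' f)

-- Run a wrapped computation and return the final state.
execW :: Wrapped s a -> s -> s
execW (Wrapped m) = execState m

-- Summation loop written against transformers directly.
sumDirect :: Int -> Int
sumDirect n = execState (for_ [1 .. n] (\i -> modify' (+ i))) 0

-- The same loop, written against the newtype wrapper.
sumWrapped :: Int -> Int
sumWrapped n = execW (for_ [1 .. n] (\i -> modifyW' (+ i))) 0

main :: IO ()
main = do
  print (sumDirect 10)   -- prints 55
  print (sumWrapped 10)  -- prints 55
```

Compiling a file like this with `ghc -O2 -ddump-simpl -dsuppress-all`
dumps the Core for both functions so they can be compared side by
side.  I would expect the two loops to come out the same for a wrapper
this simple, though that is an expectation rather than a measured
result.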

Firstly, a case where I do get the same code.  All of these compile to
the constant 1.  Hooray!

https://github.com/tomjaguarpaw/ad/blob/cd0d876ddb448fe611515e8768dee66dc02e...

Secondly, a simple case where I do not get the same code.  `mySumMTL`
and `mySumNewtype` yield the same code, as expected.  After all,
`mySumNewtype` does exactly the same thing as `mySumMTL`, it's just
wrapped in some newtypes.  However, `mySumEff` yields worse code,
despite *also* being the same thing as `mySumMTL` just wrapped in some
newtypes.

https://github.com/tomjaguarpaw/ad/blob/cd0d876ddb448fe611515e8768dee66dc02e...

You can compare the generated loops at:

https://github.com/tomjaguarpaw/ad/blob/cd0d876ddb448fe611515e8768dee66dc02e...

Does anyone have a clue what's going wrong in the optimizer here?  I
don't think the singleton that I pass around to access the type level
index at runtime has anything to do with it.  That seems to be
optimized away by inlining.  Is the simplifier confused by all the
coercions?

Thanks for any light anyone may be able to shed,

Tom

Tom Ellis