
#14941: Switching direct type family application to EqPred (~) prevents
inlining in code using vector (10x slowdown)
-------------------------------------+-------------------------------------
        Reporter:  nh2               |                Owner:  davide
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by davide):

 == Regarding simple example

 For `f :: forall a. (a ~ Int) => a -> a`, the difference in performance is
 somewhat expected. This may be a different issue than the example given in
 the ticket description. In short, `a ~ Int` is a proof that type `a` is
 equal to type `Int`. In core, `a ~ Int` is a regular ''boxed'' GADT,
 meaning that it could be bottom, i.e. an invalid proof (this is the main
 mechanism behind
 [https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/glasgow_exts.... #deferring-type-errors-to-runtime -fdefer-type-errors]).
 Unboxing `a ~ b` corresponds to checking the proof, which is required to
 coerce the input binding from `a` to `Int`. Normally the `(a ~ Int)` would
 be optimized away (as described
 [http://dreixel.net/research/pdf/epdtecp.pdf here] in section 7.3), but
 that requires a worker/wrapper transformation that never happens. Removing
 `NOINLINE` allows `f` to be optimized across modules, which closes the
 performance gap.

 == Regarding original example

 Unlike my simple example, all the code is in one module, so I expect the
 equality proof `VG.Mutable v ~ vm` to be optimized away (again see
 [http://dreixel.net/research/pdf/epdtecp.pdf here] section 7.3).
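
 To make the two signature styles compared in this comment concrete, here
 is a minimal sketch that uses only `base`. It is not the code from the
 ticket: `Elem`, `fDirect`, `fEq`, `sumDirect` and `sumEq` are hypothetical
 names, and `Elem` merely stands in for a type family such as vector's
 `Mutable`. The `NOINLINE` pragma on `fEq` mirrors the simple example
 discussed above.

```haskell
{-# LANGUAGE TypeFamilies  #-}
{-# LANGUAGE TypeOperators #-}

-- Direct version: the argument type is written out.
fDirect :: Int -> Int
fDirect x = x + 1

-- Equality-constraint version: the function takes an extra (boxed)
-- proof that `a` is `Int`. NOINLINE keeps GHC from inlining it at call
-- sites, as in the simple example from the comment.
fEq :: (a ~ Int) => a -> a
fEq x = x + 1
{-# NOINLINE fEq #-}

-- Hypothetical stand-in for a type family like vector's `Mutable`.
type family Elem c
type instance Elem [a] = a

-- Direct type family application in the signature.
sumDirect :: [Int] -> Elem [Int]
sumDirect = sum

-- The same result type, introduced through an equality predicate.
sumEq :: (Elem [Int] ~ e) => [Int] -> e
sumEq = sum

main :: IO ()
main = do
  print (fDirect 41)
  print (fEq (41 :: Int))
  print (sumDirect [1, 2, 3])
  print (sumEq [1, 2, 3])
```

 Both variants of each function compute the same value; any difference
 would only show up in the generated Core, which can be inspected by
 compiling with `-O2 -ddump-simpl`.
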
 With GHC 8.2.2, when compiling the slow version, I see
 `selectVectorDestructive2` is specialized to `$sselectVectorDestructive2
 :: Int -> Vector Int -> MVector (PrimState IO) Int -> Int -> Int -> IO ()`
 (pass 2). This is good, but for some reason `myread` and `partitionLoop2`
 are not specialized even though they are used by
 `$sselectVectorDestructive2`:

 {{{#!haskell
 $sselectVectorDestructive2
   = ...
     let
       $dMVector = Data.Vector.Generic.Base.$p1Vector @Vector @Int Data.Vector.Unboxed.Base.$fVectorVectorInt
     in
     ...
     (Main.myread @IO @MVector @Int
        Control.Monad.Primitive.$fPrimMonadIO $dMVector GHC.Classes.$fOrdInt GHC.Show.$fShowInt
        v begin)
     ...
     (Main.partitionLoop2 @IO @MVector @Int
        Control.Monad.Primitive.$fPrimMonadIO $dMVector GHC.Classes.$fOrdInt GHC.Show.$fShowInt
        v begin pivot (GHC.Types.I# ...)
 }}}

 In the fast version, `myread` and `partitionLoop2` are specialized in this
 pass. I noticed 2 other differences:

 * The fast version floats `$dMVector` to a top-level binding.
 * The fast version specializes to `Mutable Vector (PrimState IO) Int`
   instead of `MVector (PrimState IO) Int`. Note that `Mutable` is a type
   family and `Mutable Vector = MVector`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14941#comment:8
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler