
Using "ghc -O2" while tuning a class instance for performance, I obtained a 13% speedup by applying the transformation
instance (Ord a, Num b) ⇒ Sum PSum a b where empty = empty insert = insert union = union unions = unions extractMin = extractMin fromList = fromList toList = toList map = map mapMaybe = mapMaybe
and defining the instance functions outside the instance declaration, rather than inside the instance declaration. Conceptually, I understand this as follows: After this transformation, none of the recursive calls have to go through the class dictionary. Is this a transformation that ghc could automatically apply while optimizing? It is clear at compile time that the recursive calls are from this instance. Every 13% helps. (My example is adapting a pairing heap to sums of terms, e.g. to define an algebra from a small category. So far, I can't beat Data.Map, but I'm not done tuning. Other examples might not show the same performance increase from this transformation.)