Inlining and generic programming

Hello all, First of all, I'm sorry that this email is so absurdly long. But it's not easy to explain the problem at hand, so I took a step-by-step approach. The executive summary is: GHC can do a *great *job with inlining, but often it doesn't, and I don't understand why. So I have some questions, which are highlighted in the text below. In general, any insights regarding inlining or improving the performance of generics are welcome. My final goal is to be able to state that generic functions (in particular using GHC.Generics) will have no runtime overhead whatsoever when compared to a handwritten type-specific version. The setting Generic programming is based on representing datatypes in a uniform way using a small set of representation types. Functions defined on those representation types can then be applied to all datatypes, because we can convert between datatypes and their representations. However, generic functions tend to be slower than their specialised counterparts, because they have to deal with the conversions. But clever inlining (together with other compiler optimisations) can completely remove this overhead. The problem I'm tackling is how to tell GHC exactly what it should in the particular case of optimisation of generic code. Simplified example I'll focus on the problem of optimising a non-trivial function for generic enumeration of terms. My experience shows that GHC does quite good at optimising simple functions, especially consumers (like generic equality). But producers are trickier. First, we'll need some auxiliary functions: -- | Interleave elements from two lists. Similar to (++), but swap left and
-- right arguments on every recursive application. -- -- From Mark Jones' talk at AFP2008 {-# NOINLINE (|||) #-} (|||) :: [a] -> [a] -> [a] [] ||| ys = ys (x:xs) ||| ys = x : ys ||| xs
-- | Diagonalization of nested lists. Ensure that some elements from every -- sublist will be included. Handles infinite sublists. -- -- From Mark Jones' talk at AFP2008 {-# NOINLINE diag #-} diag :: [[a]] -> [a] diag = concat . foldr skew [] . map (map (\x -> [x]))
skew :: [[a]] -> [[a]] -> [[a]] skew [] ys = ys skew (x:xs) ys = x : combine (++) xs ys
combine :: (a -> a -> a) -> [a] -> [a] -> [a] combine _ xs [] = xs combine _ [] ys = ys combine f (x:xs) (y:ys) = f x y : combine f xs ys
The particular implementation of these functions doesn't really matter. What's important is that we have a way to interleave lists (|||) and a way to diagonalise a matrix into a list (diag). We mark these functions as NOINLINE because inlining them will only make the core code more complicated (and may prevent rules from firing). Suppose we have a type of Peano natural numbers: data Nat = Ze | Su Nat deriving Eq
Implementing enumeration on this type is simple: enumNat :: [Nat]
enumNat = [Ze] ||| map Su enumNat
Now, a generic representation of Nat in terms of sums and products could look something like this: type RepNat = Either () Nat
That is, either a singleton (for the Ze case) or a Nat (for the Su case). Note that I am building a shallow representation, since at the leaves we have Nat, and not RepNat. This mimics the situation with current generic programming libraries (in particular GHC.Generics). We'll need a way to convert between RepNat and Nat: toNat :: RepNat -> Nat
toNat (Left ()) = Ze toNat (Right n) = Su n
fromNat :: Nat -> RepNat fromNat Ze = Left () fromNat (Su n) = Right n
(In fact, since we're only dealing with a generic producer we won't need the fromNat function.) To get an enumeration for RepNat, we first need to know how to enumerate units and sums: enumU :: [()]
enumU = [()]
enumEither :: [a] -> [b] -> [Either a b] enumEither ea eb = map Left ea ||| map Right eb
Now we can define an enumeration for RepNat: enumRepNat :: [RepNat]
enumRepNat = enumEither enumU enumNatFromRep
With the conversion function toNat, we can use enumRepNat to get an enumeration for Nat directly: enumNatFromRep :: [Nat]
enumNatFromRep = map toNat enumRepNat
First, convince yourself that enumNatFromRep and enumNat are equivalent functions: take 100 enumNat == take 100 enumNatFromRep
Now, what I want is that enumNatFromRep generates the same core code as enumNat. That should be possible; here are the necessary steps: map toNat enumRepNat
== { inline enumRepNat }
map toNat (enumEither enumU enumNatFromRep)
== { inline enumEither }
map toNat (map Left enumU ||| map Right enumNatFromRep)
== { inline enumU }
map toNat (map Left [()] ||| map Right enumNatFromRep)
== { inline map }
map toNat ([Left ()] ||| map Right enumNatFromRep)
== { free theorem (|||): forall f a b. map f (a ||| b) = map f a ||| map f b }
map toNat [Left ()] ||| map toNat (map Right enumNatFromRep)
== { inline map }
[toNat (Left ())] ||| map toNat (map Right enumNatFromRep)
== { definition of toNat (or inline toNat + case of constant) }
[Ze] ||| map toNat (map Right enumNatFromRep)
== { functor composition law: forall f g l. map f (map g l) = map (f . g) l }
[Ze] ||| map (toNat . Right) enumNatFromRep
== { definition of toNat (or inline toNat + case of constant) }
[Ze] ||| map Su enumNatFromRep
Now let's see what the compiler generates. I'm using GHC-7.4.1. Let's compile with -O1 and use -ddump-simpl to see the final simplifier output (core code) for enumNatFromRep: EnumAlone.enumNatFromRep :: [EnumAlone.Nat]
[GblId, Str=DmdType, Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False, ConLike=False, Cheap=False, Expandable=False, Guidance=IF_ARGS [] 30 0}] EnumAlone.enumNatFromRep = GHC.Base.map @ EnumAlone.RepNat @ EnumAlone.Nat EnumAlone.toNat EnumAlone.enumRepNat
EnumAlone.enumRepNat [Occ=LoopBreaker] :: [EnumAlone.RepNat] [GblId, Str=DmdType] EnumAlone.enumRepNat = EnumAlone.||| @ (Data.Either.Either () EnumAlone.Nat) lvl4_rvV lvl5_rvW
Ah, it didn't even inline enumRepNat because it made it a loop breaker. We certainly want to inline it, so let's add a pragma: {-# INLINE enumRepNat #-}
Recompiling, we get: EnumAlone.enumRepNat [InlPrag=INLINE (sat-args=0)]
:: [EnumAlone.RepNat] [GblId, Str=DmdType, Unf=Unf{Src=InlineStable, TopLvl=True, Arity=0, Value=False, ConLike=False, Cheap=False, Expandable=False, Guidance=ALWAYS_IF(unsat_ok=False,boring_ok=False) Tmpl= EnumAlone.enumEither @ () @ EnumAlone.Nat EnumAlone.enumU EnumAlone.enumNatFromRep}] EnumAlone.enumRepNat = EnumAlone.||| @ (Data.Either.Either () EnumAlone.Nat) lvl4_rvV lvl5_rvW
EnumAlone.enumNatFromRep [Occ=LoopBreaker] :: [EnumAlone.Nat] [GblId, Str=DmdType] EnumAlone.enumNatFromRep = GHC.Base.map @ EnumAlone.RepNat @ EnumAlone.Nat EnumAlone.toNat EnumAlone.enumRepNat
So no real difference in the generated code, other than the reassignment of loop breakers. For some reason enumRepNat still doesn't get inlined. *Question: why won't GHC inline enumRepNat, even when I tell it to do so with an INLINE pragma?* Well, let's inline it ourselves, then. We redefine enumNatFromRep to: enumNatFromRep = map toNat (enumEither enumU enumNatFromRep)
This however doesn't help much. We get the following core: lvl6_rw5 :: [Data.Either.Either () EnumAlone.Nat]
[GblId] lvl6_rw5 = EnumAlone.||| @ (Data.Either.Either () EnumAlone.Nat) lvl4_rw3 lvl5_rw4
EnumAlone.enumNatFromRep [Occ=LoopBreaker] :: [EnumAlone.Nat] [GblId, Str=DmdType] EnumAlone.enumNatFromRep = GHC.Base.map @ EnumAlone.RepNat @ EnumAlone.Nat EnumAlone.toNat lvl6_rw5
GHC is really keen on floating that (|||) out. Let's be very explicit about inlining: {-# INLINE toNat #-}
{-# INLINE enumU #-} {-# INLINE enumEither #-}
Also, maybe it's floating it out because it doesn't have anything else to do to it. Let's add the free theorem of (|||) as a rule: {-# RULES "ft |||" forall f a b. map f (a ||| b) = map f a ||| map f b #-}
We needed this in our manual derivation, so GHC should need it too. Recompiling, we see we've made some progress: lvl5_ryv :: [Data.Either.Either () EnumAlone.Nat]
[GblId] lvl5_ryv = GHC.Base.map @ EnumAlone.Nat @ (Data.Either.Either () EnumAlone.Nat) (Data.Either.Right @ () @ EnumAlone.Nat) EnumAlone.enumNatFromRep
lvl6_ryw :: [EnumAlone.Nat] [GblId] lvl6_ryw = GHC.Base.map @ (Data.Either.Either () EnumAlone.Nat) @ EnumAlone.Nat EnumAlone.toNat lvl5_ryv
EnumAlone.enumNatFromRep [Occ=LoopBreaker] :: [EnumAlone.Nat] [GblId, Str=DmdType] EnumAlone.enumNatFromRep = EnumAlone.||| @ EnumAlone.Nat lvl4_ryu lvl6_ryw
enumNatFromRep finally starts with (|||) directly. But its second argument, lvl6_ryw, is a map of lvl5_ryv, which is itself a map! At this stage I expected GHC to be aware of the fusion law for map, but it seems that it isn't. *Question: why is map fusion not happening automatically?* Let's add it as a rule: {-# RULES "map/map1" forall f g l. map f (map g l) = map (f . g) l #-}
And now we're in a much better situation: lvl3_ryD :: [Data.Either.Either () EnumAlone.Nat]
[GblId, Caf=NoCafRefs] lvl3_ryD = GHC.Types.: @ (Data.Either.Either () EnumAlone.Nat) EnumAlone.fromNat1 (GHC.Types.[] @ (Data.Either.Either () EnumAlone.Nat))
lvl4_ryE :: [EnumAlone.Nat] [GblId] lvl4_ryE = GHC.Base.map @ (Data.Either.Either () EnumAlone.Nat) @ EnumAlone.Nat EnumAlone.toNat lvl3_ryD
lvl5_ryF :: [EnumAlone.Nat] [GblId] lvl5_ryF = GHC.Base.map @ EnumAlone.Nat @ EnumAlone.Nat EnumAlone.Su EnumAlone.enumNatFromRep
EnumAlone.enumNatFromRep [Occ=LoopBreaker] :: [EnumAlone.Nat] [GblId, Str=DmdType] EnumAlone.enumNatFromRep = EnumAlone.||| @ EnumAlone.Nat lvl4_ryE lvl5_ryF
Note how toNat is entirely gone from the second part of the enumeration (lvl5_ryF). Strangely enough, the enumerator for Ze (lvl4_ryE) is still very complicated: map toNat ([Left ()]). Why doesn't GHC simplify this to just [Ze]? Apparently because GHC doesn't simplify map over a single element list. *Question: why doesn't GHC optimise map f [x] to [f x]?* Let's tell it to do so: {-# RULES "map/map2" forall f x. map f (x:[]) = (f x):[] #-}
Now we're finally where we wanted: lvl_ryA :: [EnumAlone.Nat]
[GblId, Caf=NoCafRefs] lvl_ryA = GHC.Types.: @ EnumAlone.Nat EnumAlone.Ze (GHC.Types.[] @ EnumAlone.Nat)
lvl3_ryD :: [EnumAlone.Nat] [GblId] lvl3_ryD = GHC.Base.map @ EnumAlone.Nat @ EnumAlone.Nat EnumAlone.Su EnumAlone.enumNatFromRep
EnumAlone.enumNatFromRep [Occ=LoopBreaker] :: [EnumAlone.Nat] [GblId, Str=DmdType] EnumAlone.enumNatFromRep = EnumAlone.||| @ EnumAlone.Nat lvl_ryA lvl3_ryD
This is what I wanted: no more representation types (Either or ()), and the code looks exactly like what is generated for the handwritten enumNat. More realistic generic programming Now let's see if we can transport this to the setting of a generic programming library. I'll use a bare-bones version of GHC.Generics: infixr 5 :+:
infixr 6 :*:
data U = U deriving (Show, Read) data a :+: b = L a | R b deriving (Show, Read) data a :*: b = a :*: b deriving (Show, Read) newtype Var a = Var a deriving (Show, Read) newtype Rec a = Rec a deriving (Show, Read)
class Representable a where type Rep a to :: Rep a -> a from :: a -> Rep a
Let's represent Nat in this library: instance Representable Nat where
type Rep Nat = U :+: (Rec Nat) from Ze = L U from (Su n) = R (Rec n) to (L U) = Ze to (R (Rec n)) = Su n
(Note, in particular, that we do not need INLINE pragmas on the from/to methods. This might just be because GHC thinks these are small and inlines them anyway, but in general we want to make sure they are inlined, so we typically use pragmas there.) Now we need to implement enumeration generically. We do this by giving an instance for each representation type: class GEnum' a where
genum' :: [a]
instance GEnum' U where {-# INLINE genum' #-} genum' = [U]
instance (GEnum a) => GEnum' (Rec a) where {-# INLINE genum' #-} genum' = map Rec genum
instance (GEnum a) => GEnum' (Var a) where {-# INLINE genum' #-} genum' = map Var genum
instance (GEnum' f, GEnum' g) => GEnum' (f :+: g) where {-# INLINE genum' #-} genum' = map L genum' ||| map R genum'
instance (GEnum' f, GEnum' g) => GEnum' (f :*: g) where {-# INLINE genum' #-} --genum' = diag [ [ x :*: y | y <- genum' ] | x <- genum' ] genum' = diag (map (\x -> map (\y -> x :*: y) genum') genum')
We explicitly tell GHC to inline each case, as before. Note that for products I'm not using the more natural list comprehension syntax because I don't quite understand how that gets translated into core. In the cases for Var and Rec we use genum from the GEnum class: class GEnum a where
genum :: [a] {-# INLINE genum #-} default genum :: (Representable a, GEnum' (Rep a)) => [a] genum = map to genum'
GEnum' is the class used for instantiating the generic representation types, and GEnum is used for user types. We use a default signature to provide a default method that can be used when we have a Representable instance for the type in question. This makes instantiating Nat very easy: instance GEnum Nat
Unfortunately, the core code generated in this situation (with the same RULES as before) is not nice at all: Main.$fGEnumNat_$cgenum [Occ=LoopBreaker] :: [Base.Nat]
[GblId, Str=DmdType] Main.$fGEnumNat_$cgenum = GHC.Base.map @ (Base.Rep Base.Nat) @ Base.Nat Base.$fRepresentableNat_$cto (lvl37_r79y `cast` (Sym (GEnum.NTCo:GEnum')
; (GEnum.GEnum' (Sym (Base.TFCo:R:RepNat)) ; GEnum.NTCo:GEnum' ) :: [Base.C Base.Nat_Ze_ Base.U Base.:+: Base.C Base.Nat_Su_ (Base.Rec Base.Nat)] ~# [Base.Rep Base.Nat]))
We see a map of the `to` function, which is definitely not what we want. Oddly enough, if we give an explicit definition of genum for Nat, with the inlined default... instance GEnum Nat where genum = map to genum'
... then we get the optimised code we want: lvl34_r79p :: [Base.Nat]
[GblId, Caf=NoCafRefs] lvl34_r79p = GHC.Types.: @ Base.Nat Base.Ze (GHC.Types.[] @ Base.Nat)
lvl35_r79q :: [Base.Nat] [GblId] lvl35_r79q = GHC.Base.map @ Base.Nat @ Base.Nat Base.Su Main.$fGEnumNat_$cgenum
Main.$fGEnumNat_$cgenum [Occ=LoopBreaker] :: [Base.Nat] [GblId, Str=DmdType] Main.$fGEnumNat_$cgenum = GEnum.||| @ Base.Nat lvl34_r79p lvl35_r79q
Again, no representation types, no `to`, just the same code that is generated for enumNat. Perfect. But we had to avoid using the default definition, which is a pity. *Question: why won't GHC inline the default method of a class, even when I have a pragma telling it to do so?* Let's look at one more datatype, because Nat does not use products. So let's consider trees: data Tree a = Leaf | Bin a (Tree a) (Tree a)
instance Representable (Tree a) where type Rep (Tree a) = U :+: (Var a :*: Rec (Tree a) :*: Rec (Tree a)) from (Bin x l r) = R (Var x :*: Rec l :*: Rec r) from Leaf = L U to (R (Var x :*: (Rec l) :*: (Rec r))) = Bin x l r to (L U) = Leaf
We give a GEnum instance using the same trick as before: instance GEnum (Tree Int) where genum = map to genum'
(For simplicity only for trees of integers.) The generated code for trees is unfortunately not as nice: a2_r79M
:: [Base.Rec (Base.Tree GHC.Types.Int) Base.:*: Base.Rec (Base.Tree GHC.Types.Int)] [GblId, Str=DmdType] a2_r79M = GEnum.diag @ (Base.Rec (Base.Tree GHC.Types.Int) Base.:*: Base.Rec (Base.Tree GHC.Types.Int)) lvl8_r79L
lvl9_r79N :: [Base.Tree GHC.Types.Int] [GblId] lvl9_r79N = GHC.Base.map @ (Base.Rec (Base.Tree GHC.Types.Int) Base.:*: Base.Rec (Base.Tree GHC.Types.Int)) @ (Base.Tree GHC.Types.Int) lvl5_r79H a2_r79M
Main.$fGEnumTree_$cgenum [Occ=LoopBreaker] :: [Base.Tree GHC.Types.Int] [GblId, Str=DmdType] Main.$fGEnumTree_$cgenum = GEnum.||| @ (Base.Tree GHC.Types.Int) lvl4_r79G lvl9_r79N
Note how lvl9_r79N is a map over a2_r79M, and a2_r79M is a `diag`. Ok, we need the free theorem of `diag` to tell GHC how to commute the `diag` with map: {-# RULES "ft/diag" forall f l. map f (diag l) = diag (map (map f) l) #-}
Unfortunately this doesn't change the generated core code. With some more debugging looking at the generated code at each simplifier iteration, I believe that this is because a2_r79M got lifted out too soon, prevent the rule from applying. With some imagination I decided to try the -fno-full-laziness flag to prevent let-floating. I'm not sure this is a good idea in general, but in this particular case it gives much better results: a1_r72i :: [Base.Rec (Base.Tree GHC.Types.Int)]
[GblId, Str=DmdType] a1_r72i = GHC.Base.map @ (Base.Tree GHC.Types.Int) @ (Base.Rec (Base.Tree GHC.Types.Int)) ((\ (tpl_B1 :: Base.Tree GHC.Types.Int) -> tpl_B1) `cast` (
-> Sym (Base.NTCo:Rec ) :: (Base.Tree GHC.Types.Int -> Base.Tree GHC.Types.Int) ~# (Base.Tree GHC.Types.Int -> Base.Rec (Base.Tree GHC.Types.Int)))) Main.$fGEnumTree_$cgenum Main.$fGEnumTree_$cgenum [Occ=LoopBreaker] :: [Base.Tree GHC.Types.Int] [GblId, Str=DmdType] Main.$fGEnumTree_$cgenum = GEnum.||| @ (Base.Tree GHC.Types.Int) (GHC.Types.: @ (Base.Tree GHC.Types.Int) (Base.Leaf @ GHC.Types.Int) (GHC.Types.[] @ (Base.Tree GHC.Types.Int))) (GEnum.diag @ (Base.Tree GHC.Types.Int) (GHC.Base.map @ (Base.Rec (Base.Tree GHC.Types.Int)) @ [Base.Tree GHC.Types.Int] (\ (x_a1yQ :: Base.Rec (Base.Tree GHC.Types.Int)) -> GHC.Base.map @ (Base.Rec (Base.Tree GHC.Types.Int)) @ (Base.Tree GHC.Types.Int) (\ (x1_X1zB :: Base.Rec (Base.Tree GHC.Types.Int)) -> Base.Bin @ GHC.Types.Int a_r72h (x_a1yQ `cast` (Base.NTCo:Rec
:: Base.Rec (Base.Tree GHC.Types.Int) ~# Base.Tree GHC.Types.Int)) (x1_X1zB `cast` (Base.NTCo:Rec :: Base.Rec (Base.Tree GHC.Types.Int) ~# Base.Tree GHC.Types.Int))) a1_r72i) a1_r72i))
Note how our enum is now of the shape `[Leaf] ||| diag y`, which is good. The only catch is that there are some `Rec`s still laying around, with their associated newtype coercions, and a function a1_r72i that basically wraps the recursive enumeration in a Rec, only to be unwrapped in the body of `$fGEnumTree_$cgenum`. I don't know how to get GHC to simplify this code any further. *Question: why do I need -fno-full-laziness for the ft/diag rule to apply?* *Question: why is GHC not getting rid of the Rec newtype in this case?* I have also played with -O2, in particular because of the SpecConstr optimisation, but found that it does not affect these particular examples (perhaps it only becomes important with larger datatypes). I have also experimented with phase control in the rewrite rules and the inline pragmas, but didn't find it necessary for this example. In general, anyway, my experience with the inliner is that it is extremely fragile, especially across different GHC versions, and it's hard to get any guarantees of optimisation. I have also played with the -funfolding-* options before, with mixed results. [1] It's also a pity that certain flags are not explained in detail in the user's manual [2,3], like -fliberate-case, and -fspec-constr-count and threshold, for instance. Thank you for reading this. Any insights are welcome. In particular, I'm wondering if I might be missing some details regarding strictness. Cheers, Pedro [1] http://dreixel.net/research/pdf/ogie.pdf [2] http://www.haskell.org/ghc/docs/latest/html/users_guide/flag-reference.html [3] http://www.haskell.org/ghc/docs/latest/html/users_guide/options-optimise.htm...

Pedro Some responses to your long message! (Which I attach for background) Your example was unusual in that it used a lot of top-level definitions. GHC treats them slightly specially. Given: x = g 4 y = f x GHC does not transform into this: y = f (g 4) which it would do in a nested let. Why not? Because the latter will generate code that dynamically allocates a thunk for (g 4), while the former will make a static thunk. (An alternative would be to treat them uniformly and only pull out those nested thunks at the very last minute; but GHC doesn't do that right now.) A disadvantage is that it's not statically visible to the simplifier that x is used once. If we have a RULE for f (g n), it might not fire -- because of the worry that someone else might be sharing x. I think this is the root cause of much of your trouble. Incidentally , it makes no difference giving x an INLINE pragma. GHC is very cautious about duplicating non-values and currently not even INLINE will make it less cautious. That's another thing we could consider changing. I'll respond to part 2 (about generic programming) separately Simon

Pedro The second part of your message (attached), about generics, turned out to be *much* more subtle. You wondered why it made a difference whether you said instance GEnum Nat where genum = map to genum' or instance GEnum Nat -- Fill in from default method Well, it turns out that the difference is largely accidental. Here are the types of the functions involved: to :: Representable a => Rep a -> a genum' :: GEnum' a => [a] type instance Rep Nat = RepNat type RepNat = U :+: (Rec Nat) Consider the instance definition genum = map to genum' There are two different ways of typing it: (A) map @RepNat @Nat (to @Nat dReprNat |> g1) (genum' @RepNat dGEnum'_RepNat) where g1 :: Rep Nat -> Nat ~ RepNat -> Nat dReprNat :: Representable Nat dGEnum'Nat :: GEnum' RepNat or (B) map @(Rep Nat) @Nat (to @Nat dReprNat) (genum' @(Rep Nat) dGEnum'_Rep_Nat) where dReprNat :: Representable Nat dGEnum'Nat :: GEnum' (Rep Nat) Which of these is chosen depends on accidental things in the constraint solver; it's not supposed to matter. But it DOES affect whether the map/(|||) rule fires. {-# RULES "ft |||" forall f a b. map f (a ||| b) = map f a ||| map f b #-} It makes a difference because in (A) we have an instance for GEnum' RepNat that uses ||| directly, instance (GEnum' f, GEnum' g) => GEnum' (f :+: g) where genum' = map L genum' ||| map R genum' so we get map ... (blah1 ||| blah2) But in (B) we need an instance for GEnum' (Rep Nat) and that has an extra cast, so we get map ... ((blah1 ||| blah2) |> g) And the cast stops the RULE for map/(|||) firing. =============== Parametricity to the rescue ============= Note that (|||) :: [a] -> [a] -> [a] So by parametricity we know that if g :: [T1] ~ [T2] then ((|||) @T1 xs ys |> g) = ((|||) @T2 (xs |> g) (ys |> g) If we used that to push the cast inwards, the RULE would match. Likewise, map is polymorphic: map :: (a->b) -> [a] -> [b] So by parametricity if :: [T1] -> [T2] then map @T2 @TR f (xs |> g)] = map @T1 @TR (f |> sym (right g) -> TR) xs If we used that to move the cast out of the way, the RULE would match too. But GHC is nowhere near clever enough to do either of these things. And it's far from obvious what to do in general. ================= Bottom line: the choices made by the constraint solver can affect exactly where casts are inserted into the code. GHC knows how to move casts around to stop them getting in the way of its own transformations, but is helpless if they get in the way of RULES. I am really not sure how to deal with this. But it is very interesting! Simon

Simon,
Thanks a lot for looking into this. One question regarding maps that I
still don't understand: can you explain me if it is indeed to be expected
that GHC won't fuse `map f . map g` into `map (f . g)` by default? Also,
same for `map f [x]` ~> `[f x]`?
Regarding your reply:
2012/3/9 Simon Peyton-Jones
You wondered why it made a difference whether you said
instance GEnum Nat where genum = map to genum' or instance GEnum Nat -- Fill in from default method
Well, it turns out that the difference is largely accidental.
I understand your explanation for why this can result in different behaviour. But I think we should try to find a way to address this. We came up with DefaultSignatures to simplify instantiating generic functions, but now it turns out that using them can make the code slower! This is rather unexpected. Could we perhaps have a way to let users specify rewrite rules involving `cast`? Cheers, Pedro
participants (2)
-
José Pedro Magalhães
-
Simon Peyton-Jones