
Edward Z. Yang wrote:
* The bog-easy way of pulling this off (from the implementation side) would just be "seq", but ignore that it exists when you're performing rewrite rules. [1]
Yes, though I doubt it's all that easy. It isn't necessarily clear under exactly which conditions this should happen. We could say that we want the rule "f (g x) = f' x" to fire in this case: case g x of y -> f y What about this case, though? case g x of y -> case h x of z -> f y Or this? case g x of y -> case y of (y1,y2) -> case y1 of _ -> case y2 of _ -> f y In general, two function applications can be quite far apart in the presence of seqs and they won't be brought together by the simplifier because those seqs introduce an artificial ordering on the computations.
So it seems to me that once we decide that a structure ought to be unboxed (strict), then there's no point in applying anymore rewrite rules: we presumably want the unboxed structure because it's now going to be shipped to the far ends of the earth, traversed in non-rewritable ways, etc. So in this sense, the failure of strict rewrite rules to make sense is sensible.
I disagree. Here is an example from DPH. A central operation is mapD which applies in parallel a function to a distributed value (Dist). The intention is that each thread will apply this function to its private value: mapD :: (a -> b) -> Dist a -> Dist b We have the obvious rewrite rule: "mapD/mapD" mapD f (mapD g x) = mapD (f . g) x Now, mapD fully evaluates its argument before passing it to the function so using strict types here seems like a good idea. Suppose we use strict pairs: data a :*: b = !a :*: !b and get an intermediate term like this: mapD (\(x:*:y) -> f1 x :*: g1 y) $ mapD (\(x:*:y) -> f2 x :*: g2 y) xy The "mapD/mapD" rule gives us this: mapD (\(x :*: y) -> case f2 x of x' -> case g2 y of y' -> case f1 x' of x'' -> case g1 y' of y'' -> x'' :*: y'') xy If we have rules for f1/f2 and g1/g2, they won't fire even though we really want them to. Contrast this with lazy pairs: mapD (\(x,y) -> (f1 x, g1 y)) $ mapD (\(x,y) -> (f2 x, g2 y)) xy = mapD (\(x,y) -> (f1 (f2 x), g1 (g2 y))) xy Here, the rules will fire without any problems. But now GHC doesn't know that x and y are fully evaluated (or, rather, that it's ok to evaluate and unbox x and y before doing anything else like executing loops). Note that f1 and g1 will usually be strict so the two versions are really semantically equivalent. But GHC won't necessarily be able to figure this out on its own. Roman