
So a first comment on this. I spoke too soon, ghc clearly has a bug here.
It shouldn't reorder those matches against literals like that.
I suggest you report that bug, because, as you say, it violates the H98 report.
But I don't think that bug at all affects the function you had in your
original email.
-- Lennart
On Thu, Mar 26, 2009 at 5:39 PM, Claus Reinke
Sorry to be the odd man out - perhaps an example will help to clarify my reading of the language definition.
I find this "reordering" discussion somewhat nonsensical. Haskell specifies top-to-botton, left-to-right matching. This specifies exactly which tests that have to be made and in what order, and ghc does exactly those and in the correct order.
One can have a perception that when there are multiple arms in a case decided by a single test, then the first arm should somehow be reached quicker than the second one etc But that is something that the Haskell standard has never promised, nor has any compiler ever promised this.
When you say "test", which can decide multiple arms in a case, do you mean that, say 'null e' being 'True' implies that matching '[]' against 'e' will succeed while matching '_:_' against 'e' will fail? Because that kind of test is not what the Haskell'98 report talks about. It talks about individual matches of expressions against alternatives, and it does specify precisely in what order these are to be performed, hence which pattern is reached first:
A case expression is evaluated by pattern matching the expression e against the individual alternatives. The alternatives are tried sequentially, from top to bottom.
Patterns are matched against values. Attempting to match a pattern can have one of three results: it may fail; it may succeed, returning a binding for each variable in the pattern; or it may diverge (i.e. return _|_). Pattern matching proceeds from left to right, ..
Nothing abstract about that. So for a function application 'f e', where
f [] = True f (_:_) = False
the Haskell'98 report specifies that 'e' is first matched against '[]', then (if that fails) against (_:_). So the first pattern is reached/tried before the second. Of course, we can make use of the fact that these two matches are complementary, so we only need one "test" to decide, and if there are further '[]' patterns after the first, we don't have to match them again, but that is all in the realm of optimization, not language definition. The definition explicitly provides room for such optimization, but it still requires conformance to the rules set out in the definition, which include:
case e of {p1->e1;p2->e2} = case e of {p1->e1;_->case e of {p2->e2;_->error "No match"}}
GHC violates that rule, as we can demonstrate:
newtype N = N Int deriving (Show,Eq) instance Num N where fromInteger 0 = error "0" fromInteger 1 = N 0 fromInteger _ = N 1 f x = case x of 1 -> False 0 -> True g x = case x of 1 -> False _ -> case x of 0 -> True _ -> error "No match" main = do print $ g (N 0) print $ f (N 0)
-- ghc $ ghc -w -e main PMOrderSpec.hs False <interactive>: 0
-- hugs Main> main False False
One can presumably construct similar examples using 'Data.String.IsString', or using pattern guards, so just fixing the special case of 'Num' is not going to help, but this example seems firmly within Haskell'98.
And to me such a perception is counter-intuitive; Haskell is about specifying functions abstractly so order should only matter when it's a matter of semantics.
Any semantics should conform to the language definition, right?
On the other hand, adding some kind of pragma that indicates the likelyhood of a branch seems quite sensible to me.
Which doesn't mean that I wouldn't try such a pragma, if it existed. I'm just having difficulties understanding this universal resistance to what seems one of the few unambiguously defined parts of Haskell'98;-)
Claus