
Dear all, This post is partly a gripe about how poor the formal documentation for various GHC extensions is, partly a gripe about how GHC blurs the lines between syntactic and type-level issues as well as between various extensions, and partly a gripe about how the Haskell 98 report is sometimes similarly blurred where syntax is concerned (or not). All these things make the life of a poor parser implementor quite miserable at times. All in good jest of course, but with an edge of truth, especially regarding (lack of) formal documentation. The issue at hand which has caused my frustration is the FlexibleContexts [1] and FlexibleInstances [2] extensions, which lift restrictions imposed by Haskell 98 on the forms of contexts and instances that may be defined. Great extensions both of them - but what do they do, really really? The following toy program requires MultiParamTypeClasses OR FlexibleContexts in order to be accepted by GHC(i):
f :: (T a b) => a -> Int f _ = 0
This of course assumes that we import the definition of T, we *must* have MultiParamTypeClasses enabled if we want to declare T. Both extensions thus enable classes with more than one argument to appear in contexts. Changing the program to
f :: (T a ()) => a -> Int f _ = 0
i.e. changing the second argument to T to () instead, means we now *must* have FlexibleInstances, in order to allow the non-tyvar argument. This is nothing surprising, this is what FlexibleInstances are supposed to do. But the question is, is this a syntactic issue or a typing issue? In GHC proper this doesn't really matter much, as long as it is caught *somewhere* then all is dandy. GHC's parser lets everything pass, and it's the type checker that balks at this program. But for someone like me with *only* a parser, this is a question that needs a clear answer. Looking at the online report, the productions regarding contexts are context -> class | ( class1 , ... , classn ) (n>=0) class -> qtycls tyvar | qtycls ( tyvar atype1 ... atypen ) (n>=1) qtycls -> [ modid . ] tycls tycls -> conid tyvar -> varid Ok, so clearly the () is a syntactic extension enabled by FlexibleContexts, as it is not a tyvar nor a tyvar applied to a sequence of types. So this is something that a parser should handle. FlexibleContexts also enables similar parses of contexts in other places, for instance in class declarations, for which the Haskell 98 report says topdecl -> class [scontext =>] tycls tyvar [where cdecls] scontext -> simpleclass | ( simpleclass1 , ... , simpleclassn ) (n>=0) simpleclass -> qtycls tyvar The difference here is that the simpleclass doesn't allow the tyvar applied to a sequence of types bit. FlexibleContexts lifts that restriction too, so there should be no difference between the two kinds of contexts. So the new formal productions for flexible contexts should be something like fcontext -> fclass | ( fclass1 , ... , fclassn ) (n>=0) fclass -> qtycls type1 ... typen (n>=1) topdecl -> data [fcontext =>] simpletype = constrs [deriving] | newtype [fcontext =>] simpletype = newconstr [deriving] | class [fcontext =>] tycls tyvar [where cdecls] | instance [fcontext =>] qtycls inst [where idecls] gendecl -> vars :: [fcontext =>] type Does this seem correct? Now let's turn to FlexibleInstances, which similarly lifts restrictions, only to instance declarations instead of contexts. The Haskell 98 report says on instance declarations: topdecl -> instance [scontext =>] qtycls inst [where idecls] inst -> gtycon | ( gtycon tyvar1 ... tyvark ) (k>=0, tyvars distinct) | ( tyvar1 , ... , tyvark ) (k>=2, tyvars distinct) | [ tyvar ] | ( tyvar1 -> tyvar2 ) (tyvar1 and tyvar2 distinct) Note the re-appearance of scontext, which is the same as above. The instance head must be a type constructor, possibly applied to a number of type variables, or one of three built-in syntactic cases. This is where I consider the Haskell 98 report blurry - the fact that the tyvars must be distinct, is that truly a syntactic issue? It might be, it's certainly something that could be checked syntactically. But when you take into account that with the proper extensions, they no longer need to be distinct, at what level would we expect such a check to happen? My gut feeling is that this check for distinctness is something that a type checker might do better than a parser, though it's not clear cut by any means. But since I don't do any other kind of name resolution or checking in my parser even if it would be possible (e.g. multiple declarations of the same symbol), I would find it a bit anomalous to check this too. Turning on FlexibleInstances, we shouldn't need to follow any of the above restrictions on inst. In other words, the flexible production should simply be something like finst -> type Right? Now, FlexibleInstances *also* lifts the restriction on contexts, just like FlexibleContexts - but *only* for the contexts of instance declarations. This may seem like a reasonable thing, but it certainly gives me some headaches. It means I could not treat the contexts uniformly, but would need to have separate syntactic categories (or rather post-parse checks) that look different between instance contexts and other contexts (including class). So with FlexibleInstances on, there are *three* different kinds of contexts allowed: scontext for class declaration, fcontext for instance declarations, and context for all other uses of contexts (type signatures, data/newtype declarations). Just a small headache, since I already apparently needed two categories from Haskell 98, but still. I'm not sure I find it reasonable, that flexible instances are enabled just for instance declarations but not elsewhere, but I'm sure a lot of thought was given to that. If it was up to *me* though, I would leave the flexible contexts with FlexibleContexts entirely, which means you would have to use both flags if you wanted both in your instance declarations. Would that be a bad thing? Separation of concern seems desirable to me (and no, I'm not saying that just because it would be easier to implement in the parser)... At any rate, to make a long rant short: * Are my interpretations of the lifted restrictions by FlexibleContexts and FlexibleInstances correct? * Is it reasonable that the issue of checking that tyvars are distinct should not be considered syntactic? * Would it also be reasonable to make the separation of concern between FlexibleContexts and FlexibleInstances more clean? Thanks for reading, and please give me input! Cheers, /Niklas [1] http://www.haskell.org/ghc/docs/latest/html/users_guide/other-type-extension... [2] http://www.haskell.org/ghc/docs/latest/html/users_guide/type-class-extension...