
On Mon, Nov 24, 2008 at 12:23 AM, Simon Peyton-Jones
In my case, we had rigid type signatures all over the place. The wiki document says that the type must be rigid at the point of the match. I guess that's what we were violating. If the code I posted isn't supposed to type check then I would like to report, as user feedback, that GADTs have become unwieldy.
I grant that it's less convenient than one would like. The difficulty is that GADTs get you into territory where it's easy to write programs that have multiple **incomparable** types. That is, there is no "best" type (unlike Hindley-Milner). So we pretty much have to ask the programmer to express the type.
Once we are in that territory, we need to give simple rules that say when a type signature is needed. I know that I have not yet found a way to express these rules -- perhaps GHC's users can help. My initial shot is http://haskell.org/haskellwiki/Upgrading_packages%23Changes_to_GADT_matching...
I couldn't figure out how to fix that code by just adding a type signature.
Did you try giving a type signature to the (entire) case expression, as I suggested? That should do it.
Perhaps I don't understand the suggestion, but for me the only way I could fix it was to put all the pattern matches into local functions with type signatures. I can show you the diffs if you would find it useful to see how 'programmers in the wild' react to this.
I urge you to consider designing a modified or new syntactic form for working with GADT pattern matches. The quasi-dependent typing that GADTs give developers is very powerful and it would seem that GHC Haskell with GADTs is as close to dependent typing that developers writing "real-world" software can get. I know of no other production ready compilers that provide dependent or close to dependent typing. Dependent typing seems to be a growing area of interest. For these reasons I think it's important for GHC to focus on making them pleasanter to work with again; even if it means adding to the language again.
If I knew how to do that, I'd love to. Here's one idea you might not like: restrict GADT matching to function definitions only (not case expressions), and require a type signature for such pattern matches. That wouldn't require adding new stuff. But GHC's current story is a bit more flexible.
In practice that's what we are doing. Any new code we write using our GADT data types is written in exactly that style because removing case-expressions on each GHC release is mildly annoying. As a user of GHC I can't predict when a case-expression on a GADT will be valid and when it give an error. Thankfully there ended up being less than 10 functions that we had to modify to become compatible with ghc-6.10. Some tasks look more daunting from the outset than they are in reality.
I also feel that the type errors given when working with existential types, especially GADTs with existentials, are confusing. I think mostly because the types of the sub-expressions in the program are not visible to the user. More introspection into the inferred types would help users. I have some ideas on how to improve this, what the output should look like and I would like to implement it too, but I haven't had a chance to start a prototype yet.
I agree they are confusing. I always encourage people to suggest alternative wording for an error message that would make it less confusing (leaving aside how feasible or otherwise it'd be to generate such wording). So next time you are confused but figure it out, do suggest an alternative.
The first two stumbling blocks I ever had with the type errors was the meaning of rigid type and the difference between inferred and expected type. These days neither of those are an issue for me, but maybe its useful to hear. I take rigid type to mean a type that I specified. It may not be 100% accurate but it's close enough for intuiton. I think the problem with inferred vs. expected is that the difference in their meanings is too narrow. My understanding is that the expected type is based on rigid types and the inferred type is based on the type inference. Using words that are more polar in meaning, but still accurate, could potentially help here. Such as inferred vs. provided. But this is very minor. I think the real problem here is that the type checker does a very non-trivial amount of reasoning about your code then it spits out some well written error messages with a good deal of static only context. What I mean is, the errors do a good job of giving me back static information thus saving me some time of looking up the context and also helping me zero in on the trouble area, but it doesn't help me to understand the why of the error. I think the information that is still lacking is knowledge about the process that the type checker used to arrive at its conclusion. Every time I've been really stumped by the type checker I used the exact same trick to get unstuck. I mimicked a type checker on paper to the best of my ability until I had annotated a paper version of the code with types at each sub-expression. Then I compared that against the output from GHC. Only once I had that annotated version of the code could I sit back and think about what was wrong with the code to fix it. Obviously this annotation could be done by a machine. I think if the error message could dump your program annotated with types to html that we could achieve this introspection. I think there is a lot of fine tuning that would need to be done here, but that basic idea could be carried very far. For example, rendering all the eigenvariables that result from existential types in a different alphabet, such as greek, may be a useful fine tuning. I picture this as a -ddump-types flag or something of that nature and it would need to be able to bump the inferred versus expected types as well. Eventually it may even be handy if the programmer could 'single step' through the type check process. I would like to work on the above feature, and I really probably should have developed that before my work with GADTs, but it's hard to know ahead of time what will be a hurdle. I admit I know very little about how GHC is implemented, so do you think that the above should be possible? Do you have advice on getting started over that which can be found on the ghc trac? Thanks, Jason