Error message reform (was: Strange type error with associated type synonyms)

On Wed, May 27, 2009 at 11:05 PM, Bulat Ziganshin
i mean just changing the words to make obvious what type was got in what way. and check it on beginners who don't yet read your explanations, for example teachers may test it on their students
my English is limited... for example, it may be like this:
read x
Error: type of x is Integer while type of read argument should be String
For me, this word order is an improvement, but it doesn't help to leave out the words "expected" and "inferred". To me, those words add explanation. Of course they don't help if you don't know what they mean. I think they add something because they explains where the types "came from". When I get type errors, I often think "why do you say True has type Bool" (of course in most cases it is not so transparent). Although the compiler doesn't say *why* it inferred that type, at least it says it did infer it. I prefer this wording: The inferred type of `True' is `Bool', while the type of the first argument of `f' should be `Int'. In the expression: f True I prefer all three to Hugs's ERROR - Type error in application *** Expression : f True *** Term : True *** Type : Bool *** Does not match : Int --Max

Hello Max, Thursday, May 28, 2009, 1:30:28 AM, you wrote:
I prefer this wording:
The inferred type of `True' is `Bool', while the type of the first argument of `f' should be `Int'. In the expression: f True
yes, it's also self-explanatory
I prefer all three to Hugs's
ERROR - Type error in application *** Expression : f True *** Term : True *** Type : Bool *** Does not match : Int
for me, it was better than ghc errmsg. main thing is that i don't feel automatically what is expected and what is inferred. here Hugs says that True is Bool and the remaining is Int, so i "feel" the situation overall, this expected/inferred words are probably go directly from compiler algorithms. it looks natural for compiler developer - saying that some term has different types evaluated by two different compiler parts. but for me as a user it's natural to think in terms of function having parameter of some type and term having another type, so i need to fix the call giving it the right parameter:
The type of the first argument of `f' should be `Int', while the inferred type of `True' is `Bool'. In the expression: f True
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Thu, 2009-05-28 at 01:45 +0400, Bulat Ziganshin wrote:
for me, it was better than ghc errmsg. main thing is that i don't feel automatically what is expected and what is inferred. here Hugs says that True is Bool and the remaining is Int, so i "feel" the situation
I absolutely agree about expected/inferred. I always forget which is which, because I can figure both could apply to each. Say, in this simple example:
Prelude> let f = (+5) Prelude> f "abc"
<interactive>:1:2: Couldn't match expected type `Integer' against inferred type `[Char]' In the first argument of `f', namely `"abc"' In the expression: f "abc" In the definition of `it': it = f "abc"
Does expected mean that, based on the type signature, it should be an Integer, or based on the argument that I provided, it should be a String? The same goes for the inferred type: it knows what the type of the literal argument (String), so I would assume the inferred type was the type in the function's signature. Unfortunately, my reasoning in both cases can go the wrong way . . . Better language may be much more helpful, although I'm not sure what may be easier to interpret. Jeff Wheeler

On Thu, May 28, 2009 at 12:03 AM, Jeff Wheeler
I absolutely agree about expected/inferred. I always forget which is which, because I can figure both could apply to each.
That's actually true for me too. When you say it like that, I remember times when I've had the same confusion.
Better language may be much more helpful, although I'm not sure what may be easier to interpret.
I think one big improvement (demonstrated in Bulat's proposal) is to put the two types near where those types come from. GHC gives you the two types, and then the context in which they arose. Hugs gives the context and then two types, but doesn't say which type is which. So we say: you have here `True', it is a `Bool'. But the first argument of `f' should be an `Int'. Thus it's clearly indicated where the two types came from. My preference is still to include the words "expected" and "inferred" which indicate (to me) that *either* could be wrong. And they don't seem to me to be compiler writers' jargon. They're both ordinary (if not everyday) English words. But definitely we should use language which more clearly indicates which is which. I'll pay more attention to error messages in the future. GHC devs: would patches for error message language be considered? --Max

Hello Max, Thursday, May 28, 2009, 2:14:19 AM, you wrote:
I absolutely agree about expected/inferred. I always forget which is which, because I can figure both could apply to each.
That's actually true for me too. When you say it like that, I remember times when I've had the same confusion.
it's why i asked beginners. it seems that we all go through times when ghc errmsgs looks cryptic but then we start to live with it and forget the first period actually, i don't have much problems with errrmsgs now, but trying to grok how i interpret them i've found that i mainly use *position* part of message, it's enough for me most times :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin schrieb:
actually, i don't have much problems with errrmsgs now, but trying to grok how i interpret them i've found that i mainly use *position* part of message, it's enough for me most times :)
I have heard the statement "users are only interested in the error position" in a talk 2007 in RISC in Hagenberg. But I still try to understand error messages.

Hello Henning, Thursday, May 28, 2009, 2:30:18 AM, you wrote:
Bulat Ziganshin schrieb:
actually, i don't have much problems with errrmsgs now, but trying to grok how i interpret them i've found that i mainly use *position* part of message, it's enough for me most times :)
I have heard the statement "users are only interested in the error position" in a talk 2007 in RISC in Hagenberg. But I still try to understand error messages.
i'm not "uninterested". it's just faster for me to find message looking at the code rather than to decipher errmsg so while beginners crying at cryptic errmsgs, advanced beginners like me don't waste time trying to decrypt them :) but don't take it too seriously - for this particular message problem is only those inferred/expected interpretation, types by itself are helpful -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Wed, May 27, 2009 at 3:24 PM, Bulat Ziganshin
Hello Max,
Thursday, May 28, 2009, 2:14:19 AM, you wrote:
I absolutely agree about expected/inferred. I always forget which is which, because I can figure both could apply to each.
That's actually true for me too. When you say it like that, I remember times when I've had the same confusion.
it's why i asked beginners. it seems that we all go through times when ghc errmsgs looks cryptic but then we start to live with it and forget the first period
actually, i don't have much problems with errrmsgs now, but trying to grok how i interpret them i've found that i mainly use *position* part of message, it's enough for me most times :)
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
Hi, I like the "expected/inferred" vocabulary. Maybe it comes from being a native English speaker, but to me, it says "this is what we expected to get, but instead (through type inference), we got this type for this term". Of course, I've also been reading GHC error messages for a while, so I've gotten used to understanding what they mean. When I was new, I had more of a problem...but I'm not sure you can really eliminate that. Everything takes practice. :) Alex

I like the "expected/inferred" vocabulary. Maybe it comes from being a native English speaker, but to me, it says "this is what we expected to get, but instead (through type inference), we got this type for this term".
As another native English speaker, I found "expected/inferred" very intuitive when I was new to GHC, and to Haskell. I even think that "expected/inferred" helped me form my intuition about Haskell's type inference. There was one hang-up; it wasn't at all clear which referred to the term, and which referred to the context. (Really both types are inferred.) This stopped bothering me when I decided it didn't matter which was which, and I could generally find the problem pretty quickly just knowing the location and the types involved. Of course, I can see how the messages are probably much less useful to non-native speakers, and that's quite important. Something along the lines of "inferred type <droozle> for term, but expected type <snidgit> in context". John

On Wed, 2009-05-27 at 23:59 -0400, John Dorsey wrote:
There was one hang-up; it wasn't at all clear which referred to the term, and which referred to the context. (Really both types are inferred.) This stopped bothering me when I decided it didn't matter which was which, and I could generally find the problem pretty quickly just knowing the location and the types involved.
Of course, I can see how the messages are probably much less useful to non-native speakers, and that's quite important. Something along the lines of "inferred type <droozle> for term, but expected type <snidgit> in context".
As a native English speaker myself, I've also found it awkward because both types are inferred, I suppose. The alternate format you've suggested would make it much more clear, in my opinion, and I strongly feel that the current version should be replaced with yours. Jeff Wheeler

John Dorsey
As another native English speaker, I found "expected/inferred" very intuitive when I was new to GHC, and to Haskell. I even think that "expected/inferred" helped me form my intuition about Haskell's type inference.
First off, me too, and I'm not a native speaker. OTOH, while "expected" is not ambiguous, "inferred" is: After all, both types are inferred. I opt for expected/encountered -- (c) this sig last receiving data processing entity. Inspect headers for copyright history. All rights reserved. Copying, hiring, renting, performance and/or quoting of this signature prohibited.

On Fri, May 29, 2009 at 2:17 AM, Simon Michael
Achim Schneider wrote:
expected/encountered
Expected/actual ? Familiar to users of test frameworks.
That does sound better than expected/inferred to me. -- Johan

Max Rabkin wrote:
Jeff Wheeler wrote:
I absolutely agree about expected/inferred. I always forget which is which, because I can figure both could apply to each.
That's actually true for me too. When you say it like that, I remember times when I've had the same confusion.
[...]
My preference is still to include the words "expected" and "inferred" which indicate (to me) that *either* could be wrong. And they don't seem to me to be compiler writers' jargon. They're both ordinary (if not everyday) English words. But definitely we should use language which more clearly indicates which is which.
" Inferred the type `Bool' for the argument `True', but couldn't match it against the type `Int' expected for the first argument of `f'. in the ... at ... blah ... "? The problem with "expecting" is that there are many different actors (the function, the argument, the type inferer,...) and they could all plausibly be expecting something. Similarly, the types of both the argument and the function are inferred. The current message doesn't clarify the subjects of "expected" and "inferred", which is what makes it confusing. I think the vocabulary is very good, the message just needs to clear up the subjects of the two verbs. -- Live well, ~wren

One user's view of error message history, perhaps helpful to reformers:-) Once upon a time, Hugs tended to have "better" error messages than GHC. They still weren't perfect, mostly when begginners where confronted with messages referring to advanced concepts - eg, Simon Thompson had a list of translations of the kind "if Hugs reports this, it most likely means that", especially targetted to beginners in his courses, ie users who where unlikely to want things like 'Num (a->b)' instances, or find non-existing semicolons. (the advice to focus on the error location usually means that either the error message is misleading or too difficult for the user to interpret - it is a standard fallback in all programming language implementations, but no more than that; if we rely on it too often, the messages aren't doing their job) Even for more advanced users, it was helpful to get messages from several implementations (at least Hugs and GHC) for tricky cases, just to get different views from which to piece together a picture. This, sadly, is not as well supported these days as it used to be, but getting multiple "opinions" on type errors is still useful advice (if your code can be handled by multiple implementations, or if a single implementation can offer multiple views, eg, beginner/advanced or top-down/bottom-up or ..). Then Simon PJ invested a lot of energy into improving GHC's error messages, so the balance changed. Error message complaints didn't stop, though, and the error messages were improved further, with more text, and suggestions for possible fixes, etc. The suggestions are sometimes misleading, and I've felt there has been too much weight on narrative text, to the exclusion of actual type information (it is a programming language, after all, so why don't the error messages give type signatures for context instead of trying to "talk" me through it in natural language, without further references to types!-), but discussions like the present show that Simon has been on the right track with that, for many users. Still, I would really like a "just the facts, please" mode for GHC, with less text and more type signatures (especially for the contexts of type mismatches). Error messages simply not including the information I need has become my main issue with GHC messages, and seems to be a common thread over many tickets and threads apparently discussing phrasing (eg, expected/inferred: who expects and why, and what is the inference based on?). Somewhere in between, much research has focussed on type errors, and how to report them in general, and implementations like Helium have set out to target beginners with simpler error messages. (*) As for fixes, concrete suggestions are most likely to be adopted, but sometimes it just isn't clear how to improve the situation, and sometimes, there is no single best solution (which has spawned countless interesting papers on type-error reporting;-). It you want to help, file tickets for messages you find unhelpful, and make suggestions for possible improvements. Unfortunately, there doesn't seem to be an "error message" component in GHC's trac, so querying for relevant tickets is not straightforward. Simon keeps a partial list of error message related tickets here (keeping track of trac tickets - meta-trac?-): http://hackage.haskell.org/trac/ghc/wiki/Status/SLPJ-Tickets Just, please, keep in mind that there is no one-size-fits-all: improving a message for one group of users might well make it less useful for another group. Claus (*) A beginner mode for Haskell systems has often been suggested, even before Helium. Instead, error messages, and even language design decisions have been constrained by trying to serve everyone with a single mode. I'd have thought that Helium, and Dr Scheme, have sufficiently demonstrated the value of having separate levels of a language and the corresponding error messages.

Claus Reinke wrote:
Still, I would really like a "just the facts, please" mode for GHC, with less text and more type signatures (especially for the contexts of type mismatches). Error messages simply not including the information I need has become my main issue with GHC messages, and seems to be a common thread over many tickets and threads apparently discussing phrasing (eg, expected/inferred: who expects and why, and what is the inference based on?).
FWIW, the suggestion I floated has less text than the current message while preserving all the type and expression information ;) (Though it doesn't necessarily generalize to cover similar messages like: Prelude> :t (\x -> x) :: a -> b <interactive>:1:7: Couldn't match expected type `b' against inferred type `a' `b' is a rigid type variable bound by the polymorphic type `forall a b. a -> b' at <interactive>:1:0 `a' is a rigid type variable bound by the polymorphic type `forall a b. a -> b' at <interactive>:1:0 In the expression: x )
Just, please, keep in mind that there is no one-size-fits-all: improving a message for one group of users might well make it less useful for another group.
+1. There're worlds of difference between the kinds of error messages suitable for beginners, vs "regular users", vs type-level programmers. Also, while a user may have some given level of understanding, the level of messaging they'll want can vary a lot depending on the task a hand. -- Live well, ~wren

wren ng thornton wrote:
(Though it doesn't necessarily generalize to cover similar messages like:
Prelude> :t (\x -> x) :: a -> b <interactive>:1:7: Couldn't match expected type `b' against inferred type `a' `b' is a rigid type variable bound by the polymorphic type `forall a b. a -> b' at <interactive>:1:0 `a' is a rigid type variable bound by the polymorphic type `forall a b. a -> b' at <interactive>:1:0 In the expression: x )
I find this slightly more complicated case quite confusing with the current wording: Prelude> :t (\x -> x) :: (a -> b) -> (a -> a) <interactive>:1:7: Couldn't match expected type `a' against inferred type `b' `a' is a rigid type variable bound by an expression type signature at <interactive>:1:14 `b' is a rigid type variable bound by an expression type signature at <interactive>:1:19 In the expression: x In the expression: (\ x -> x) :: (a -> b) -> (a -> a) This message suggests that ghc has inferred type b for x. Tillmann

I find this slightly more complicated case quite confusing with the current wording:
Prelude> :t (\x -> x) :: (a -> b) -> (a -> a) <interactive>:1:7: Couldn't match expected type `a' against inferred type `b' `a' is a rigid type variable bound by an expression type signature at <interactive>:1:14 `b' is a rigid type variable bound by an expression type signature at <interactive>:1:19 In the expression: x In the expression: (\ x -> x) :: (a -> b) -> (a -> a)
This message suggests that ghc has inferred type b for x.
Not really; but this looks like a nice simple example of what I was referring to: GHC is so careful not to bother the user with scary type stuff that it mentions types as little as possible. In particular, it never mentions the type of 'x'! It only mentions that it has run into 'a' and 'b' somewhere *while looking into 'x's type*, so those are conflicting types for some parts of 'x's type, not 'x's type itself. In more complex code, this peephole view can be really unhelpful, which is why I suggested [1] that type errors should give types for all expressions mentioned in the type error context (here 'x' and '(\x->x)', the latter's type is there only because it was part of the input). Claus [1] http://hackage.haskell.org/trac/ghc/ticket/1928#comment:2

Tillmann Rendel wrote:
wren ng thornton wrote:
(Though it doesn't necessarily generalize to cover similar messages like:
Prelude> :t (\x -> x) :: a -> b <interactive>:1:7: Couldn't match expected type `b' against inferred type `a' `b' is a rigid type variable bound by the polymorphic type `forall a b. a -> b' at <interactive>:1:0 `a' is a rigid type variable bound by the polymorphic type `forall a b. a -> b' at <interactive>:1:0 In the expression: x
I find this slightly more complicated case quite confusing with the current wording:
Prelude> :t (\x -> x) :: (a -> b) -> (a -> a) <interactive>:1:7: Couldn't match expected type `a' against inferred type `b' `a' is a rigid type variable bound by an expression type signature at <interactive>:1:14 `b' is a rigid type variable bound by an expression type signature at <interactive>:1:19 In the expression: x In the expression: (\ x -> x) :: (a -> b) -> (a -> a)
This message suggests that ghc has inferred type b for x.
I agree with Claus that this (both of these, actually) are symptoms of not giving enough type information for the user to reconstruct what the problem actually is. I think an initial solution should be to print out _some_ type inferred for x. We can't print out "the" type, because the error is that there isn't one. As for which type to print for x, it'd be most informative (though perhaps not most helpful) to print both types that are being unified, and to alter the rigidity messages to indicate which type they're rigid for. For example, perhaps something like this: Couldn't match expected type `a' against inferred type `b' In the expression: x which should have type: (a -> a) where `a' is a rigid type variable bound by an expression with type signature at <interactive>:1:14 but actually has type: (a -> b) where `b' is a rigid type variable bound by an expression with type signature at <interactive>:1:19 In the expression: (\ x -> x) :: (a -> b) -> (a -> a) It's too wordy, but it's a start. This is also prime ground for wanting to have configurable levels of error reports, since some users will find it helpful to see both types but others will find it confusing. Sometimes, even when the complete type is given, that's not enough information because each of the type variables are non-linear and so it's not clear where exactly the problem crept up. We can see a little of this in the above example where, even with the new message, `a' occurs twice in the first type. By presenting both types, users can figure it out here--- though with only the last type signature given it wouldn't be clear whether it wants ((b->b)->a->a) or ((a->b)->a->b) or even ((a->b)->b->a). In more complex examples I've often found non-linearity to be the most uninformative part of the current messages. Perhaps a message like above which gives both types will be enough in practice to clear it up. If not, then it may be worth providing some sort of metasyntax to distinguish what subpart of the type is being discussed (e.g. bold face, or an underlining carrot on the next line, etc). For really intricate type hacking, even this isn't enough because the programming errors often happen far from where the type errors are finally caught. In an ideal world, ghc could dump the entire proof forest generated by inference, so an external tool[1] could be used to browse through it and track the complete history of where inferred types came from. This gives a partial view of the inferred types for a compilation unit, something I've often wanted (rather than the manual comment/reload/uncomment routine). The proof forest could even be used as an interlingua over which people can write filters to generate messages they find most helpful. Ah the joys of ideal worlds... ;) [1] Something like the Dynasty[2] debugger for logic programming in Dyna. [2] http://www.cs.jhu.edu/~jason/papers/eisner+al.infovis06-poster.pdf -- Live well, ~wren

It's too wordy, but it's a start. This is also prime ground for wanting to have configurable levels of error reports, since some users will find it helpful to see both types but others will find it confusing.
Indeed. For this simple example, I find Hugs' message nearly optimal, but as one could just as well construct examples where GHC's message was nearly optimal, I've tried instead to extend the example, until neither Hugs nor GHC gives an optimal message. Even so, it is worth comparing the two, and their complementary approaches to the problem: t = (let q=0 in(\y->(\x->(\z->let r=1 in x z))) ()) :: (a->b)->(a->a) D:\home\Haskell\tmp\desktop\errormessages.hs:2:41: Couldn't match expected type `a' against inferred type `b' `a' is a rigid type variable bound by an expression type signature at D:\home\Haskell\tmp\desktop\errormessages.hs:2:56 `b' is a rigid type variable bound by an expression type signature at D:\home\Haskell\tmp\desktop\errormessages.hs:2:59 In the expression: x z In the expression: let r = 1 in x z In the expression: (\ z -> let r = 1 in x z) ERROR file:.\errormessages.hs:2 - Inferred type is not general enough *** Expression : let {...} in (\y -> \x -> \z -> let {...} in x z) () *** Expected type : (a -> b) -> a -> a *** Inferred type : (a -> a) -> a -> a GHC delivers its messages from in-the-middle of its typing process, which one hopes might give as little information as neccessary to identify the issue, without irrelevant context. It then tries to explain that information nicely, followed by some value-level context. Hugs delivers its messages from on-the-outside-looking-in, which one hopes might give the maximum potentially relevant information. It makes no attempt to cushion the blow. In practice, neither approach works all the time, and even for this tiny contrived example, one would have to engage in some "type debugging" (inserting type signatures, to trigger type-level printfs; commenting out large parts of code to see whether they contribute to the issue; ..) to figure it out. Note that both approaches are fairly precise in locating the error at the type level - it is the cross-reference from type to value level that runs into trouble, especially if the former is only implicit in the latter (which is why type debugging by adding signatures helps).
For really intricate type hacking, even this isn't enough because the programming errors often happen far from where the type errors are finally caught. In an ideal world, ghc could dump the entire proof forest generated by inference, so an external tool[1] could be used to browse through it and track the complete history of where inferred types came from. This gives a partial view of the inferred types for a compilation unit, something I've often wanted (rather than the manual comment/reload/uncomment routine). The proof forest could even be used as an interlingua over which people can write filters to generate messages they find most helpful.
Yes, most research papers on this ended up suggesting visual representations and separation between generating and presenting/querying error information.
Ah the joys of ideal worlds... ;)
"if you don't have a dream, how you gonna have a dream come true?" (Bloody Mary) Claus

On Thu, 28 May 2009, Claus Reinke wrote:
Just, please, keep in mind that there is no one-size-fits-all: improving a message for one group of users might well make it less useful for another group.
I once thought, that error messages must be configurable by libraries, too. This would be perfect for EDSLs that shall be used by non-Haskellers. But I have no idea how to design that.

I once thought, that error messages must be configurable by libraries, too. This would be perfect for EDSLs that shall be used by non-Haskellers.
Yes, that is a problem.
But I have no idea how to design that.
There was some work in that direction in the context of the Helium project. See the publication lists of Bastiaan Heeren and Jurriaan Hage: http://people.cs.uu.nl/bastiaan/#Publications http://www.cs.uu.nl/research/techreps/aut/jur.html Claus

Hello Jeff, Thursday, May 28, 2009, 2:03:30 AM, you wrote:
I absolutely agree about expected/inferred. I always forget which is which, because I can figure both could apply to each.
thank you, it's what i meant! compiler infers types of both caller and its argument and then expect to see types inferred. these two words are actually describe two stages of process, not two opposite processes! -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
participants (13)
-
Achim Schneider
-
Alexander Dunlap
-
Bulat Ziganshin
-
Claus Reinke
-
Henning Thielemann
-
Henning Thielemann
-
Jeff Wheeler
-
Johan Tibell
-
John Dorsey
-
Max Rabkin
-
Simon Michael
-
Tillmann Rendel
-
wren ng thornton