Human-friendly compiler errors for GHC

I had some free time this afternoon so I put together an (experimental) patch for GHC that implements helpful error messages. Have a look at this GHCi session to see what I mean:

"""
$ stage2/ghc-inplace --interactive -fhelpful-errors
GHCi, version 6.9.20080711: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude> let foo = 10
Prelude> foa

<interactive>:1:0:
    Not in scope: `foa'
    Maybe you meant one of:
      `foo'
      `fst'
      `not'
-- Maybe the matching threshold could stand to be tweaked a bit, e.g. scaled with identifier string length..
Prelude> let myIdentifier = 10
Prelude> myIdentfiier

<interactive>:1:0:
    Not in scope: `myIdentfiier'
    Maybe you meant `myIdentifier'
Prelude>
"""

The feature was inspired by the equivalent feature in the Boo programming language (http://boo.codehaus.org/). I use the restricted Damerau–Levenshtein distance to do the fuzzy match (http://en.wikipedia.org/wiki/Damerau-Levenshtein_distance).

What do you think about this feature? Would it be genuinely helpful or annoying?

Max
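For readers unfamiliar with the metric, here is a minimal sketch of the restricted Damerau-Levenshtein (optimal string alignment) distance. It is not the code from the patch: the name `osaDistance` and the lazy-array table are illustrative choices only.

```haskell
import Data.Array

-- | Restricted Damerau-Levenshtein (optimal string alignment) distance.
-- Insertion, deletion, substitution and transposition of two adjacent
-- characters each cost 1. (Illustrative sketch, not the patch's code.)
osaDistance :: String -> String -> Int
osaDistance s t = table ! (m, n)
  where
    m = length s
    n = length t
    sa = listArray (1, m) s :: Array Int Char
    ta = listArray (1, n) t :: Array Int Char

    -- Lazily built dynamic-programming table: cell (i, j) holds the distance
    -- between the first i characters of s and the first j characters of t.
    table :: Array (Int, Int) Int
    table = array ((0, 0), (m, n))
      [ ((i, j), cell i j) | i <- [0 .. m], j <- [0 .. n] ]

    cell i 0 = i
    cell 0 j = j
    cell i j = minimum (transposition ++ [deletion, insertion, substitution])
      where
        deletion = table ! (i - 1, j) + 1
        insertion = table ! (i, j - 1) + 1
        substitution =
          table ! (i - 1, j - 1) + (if sa ! i == ta ! j then 0 else 1)
        -- Only applicable when the last two characters are swapped.
        transposition =
          [ table ! (i - 2, j - 2) + 1
          | i > 1, j > 1, sa ! i == ta ! (j - 1), sa ! (i - 1) == ta ! j ]
```

Under this metric `foa` is at distance 1 from `foo` and at distance 2 from both `fst` and `not`, while `myIdentfiier` is a single transposition away from `myIdentifier`.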

What do you think about this feature? Would it be genuinely helpful or annoying?
It could be handy if it understands qualified names. Occasionally typos e.g. just now Confg.default_x are surprisingly hard to see and I go around making sure Config is imported, making sure it exports default_x, etc. before finally figuring it out.

2008/7/12 Evan Laforge
What do you think about this feature? Would it be genuinely helpful or annoying?
It could be handy if it understands qualified names. Occasionally typos e.g. just now Confg.default_x are surprisingly hard to see and I go around making sure Config is imported, making sure it exports default_x, etc. before finally figuring it out.
Good point. It turns out that since my implementation doesn't include module names in the match at all, this works without writing more code on my part. But perhaps I should make it module-aware as I think that will allow more accurate matching.

I've also changed the output format and tweaked the match threshold algorithm, with this result:

"""
mbolingbroke@Equinox ~/Programming/Checkouts/ghc.working/compiler
$ stage2/ghc-inplace --interactive -fhelpful-errors
GHCi, version 6.9.20080711: http://www.haskell.org/ghc/  :? for help
Prelude> let foo = 10
Prelude> foa

<interactive>:1:0:
    Not in scope: `foa'
    Maybe you meant `foo'
Prelude> fts

<interactive>:1:0:
    Not in scope: `fts'
    Maybe you meant `fst'
Prelude> let foa = 20
Prelude> fof

<interactive>:1:0:
    Not in scope: `fof'
    Maybe you meant `foo' or `foa'
Prelude> import Data.Lost
Could not find module `Data.Lost':
-- Maybe it should do something better here..
  Use -v to see a list of the files searched for.
Prelude> :q

$ stage2/ghc-inplace --make Test.hs
[1 of 1] Compiling Main             ( Test.hs, Test.o )

Test.hs:5:14:
    Not in scope: `Chbr.isSpoce'
    Maybe you meant `Char.isSpace'
"""

Max
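The tweaked threshold itself is not shown in the thread. As a rough sketch of the idea of scaling the cutoff with identifier length, something like the following could work; `maxEdits` (one edit per three characters, at least one) and `suggest` are hypothetical, and the edit distance is passed in as a parameter so any metric can be plugged in.

```haskell
import Data.List (sortOn)

-- | Hypothetical cutoff: allow roughly one edit per three characters of the
-- misspelt name, but always at least one. The patch's real rule is not
-- given in the thread.
maxEdits :: String -> Int
maxEdits name = max 1 (length name `div` 3)

-- | Rank the in-scope names by distance to the unknown name and keep only
-- those within the length-scaled cutoff. Names at equal distance keep
-- their original relative order because 'sortOn' is stable.
suggest :: (String -> String -> Int) -> [String] -> String -> [String]
suggest dist inScope unknown =
  [ name | (d, name) <- sortOn fst scored, d <= maxEdits unknown ]
  where
    scored = [ (dist unknown name, name) | name <- inScope ]
```

With a cutoff of 1 for three-letter names, `foa` only matches `foo`, while longer identifiers tolerate a few more edits.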

On Sat, Jul 12, 2008 at 10:44 AM, Max Bolingbroke
I had some free time this afternoon so I put together an (experimental) patch for GHC that implements helpful error messages. Have a look at this GHCi session to see what I mean:
"" $ stage2/ghc-inplace --interactive -fhelpful-errors GHCi, version 6.9.20080711: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer ... linking ... done. Loading package base ... linking ... done. Prelude> let foo = 10 Prelude> foa
<interactive>:1:0: Not in scope: `foa' Maybe you meant one of: `foo' `fst' `not' -- Maybe the matching threshold could stand to be tweaked
That's pretty cool. Unfortunately in my early Haskell days the 'not in scope' errors were the only ones I _did_ understand. It would be nice to human-friendlify the other types of errors. I'm not judging your work though, this is helpful, and the other types of errors are of course much harder to friendlify. On the topic of things that aren't stupid complaints by me, a typo is the most likely cause for not in scope errors. As Evan points out, I think it would be more helpful to search for matching names in imported modules to see if the name was accidentally not qualified or exported. I don't know about this fuzzy matching business, since when I go to the line of the error message, I'm going to see my typo and what I meant. I don't think I'd ever use the suggestions... Luke

That's pretty cool. Unfortunately in my early Haskell days the 'not in scope' errors were the only ones I _did_ understand.
Heh :-)
It would be nice to human-friendlify the other types of errors. I'm not judging your work though, this is helpful, and the other types of errors are of course much harder to friendlify.
Yep, this would only be one small step forward in error message quality.
On the topic of things that aren't stupid complaints by me, a typo is the most likely cause for not in scope errors. As Evan points out, I think it would be more helpful to search for matching names in imported modules to see if the name was accidentally not qualified or exported.
Agreed: I've implemented this too. I've also added fuzzy matching to package search:

"""
$ stage2/ghc-inplace --make ../Test1.hs

../Test1.hs:3:7:
    Could not find module `Data.Lost':
      Use -v to see a list of the files searched for.
      Maybe you meant `Data.List'
$ stage2/ghc-inplace --make ../Test2.hs
[1 of 1] Compiling Main             ( ../Test2.hs, ../Test2.o )

../Test2.hs:7:14:
    Not in scope: `isSpace'
    Maybe you meant `Char.isSpace'
"""
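A minimal sketch of the "forgot the qualifier" check, assuming the compiler can hand over a map from each import qualifier to the names reachable through it. `QualifiedImports` and `qualifiedCandidates` are hypothetical names, not GHC's API.

```haskell
import qualified Data.Map as Map

-- | Hypothetical model of a module's qualified imports: each qualifier
-- (for example "Char") maps to the names reachable through it.
type QualifiedImports = Map.Map String [String]

-- | For an unqualified name that is not in scope, list every qualified
-- spelling that an existing qualified import would accept.
qualifiedCandidates :: QualifiedImports -> String -> [String]
qualifiedCandidates imports name =
  [ qualifier ++ "." ++ name
  | (qualifier, exports) <- Map.toList imports
  , name `elem` exports ]
```

Only modules that are already imported are consulted, so a name from a module with no import at all produces no suggestion.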
I don't know about this fuzzy matching business, since when I go to the line of the error message, I'm going to see my typo and what I meant. I don't think I'd ever use the suggestions...
I can think of a few times it would have helped me out, with identifiers that may or may not be pluralized or have surprising capitalisation. I don't know though, I guess you'd have to work with the feature turned on for a while to work out if it really was useful. I think this feature has shaped up pretty nicely after the helpful suggestions I received. I don't know if I'll be able to get the patch into GHC proper, though.. Max

Max Bolingbroke wrote:
Agreed: I've implemented this too. I've also added fuzzy matching to package search:
""" $ stage2/ghc-inplace --make ../Test1.hs
../Test1.hs:3:7: Could not find module `Data.Lost': Use -v to see a list of the files searched for. Maybe you meant `Data.List'
$ stage2/ghc-inplace --make ../Test2.hs [1 of 1] Compiling Main ( ../Test2.hs, ../Test2.o )
../Test2.hs:7:14: Not in scope: `isSpace' Maybe you meant `Char.isSpace' """
In terms of making error messages more helpful, I don't find general typos are much of an issue, but this part would be really nice! I've always been annoyed that GHC just says "no" rather than offering suggestions (-v is rarely helpful), especially since it knows about what modules are installed et al. Granted it's still an easy class of bugs to fix, but this is a much friendlier way of fixing them. -- Live well, ~wren

wren ng thornton wrote:
Max Bolingbroke wrote:
Agreed: I've implemented this too. I've also added fuzzy matching to package search:
""" $ stage2/ghc-inplace --make ../Test1.hs
../Test1.hs:3:7: Could not find module `Data.Lost': Use -v to see a list of the files searched for. Maybe you meant `Data.List'
$ stage2/ghc-inplace --make ../Test2.hs [1 of 1] Compiling Main ( ../Test2.hs, ../Test2.o )
../Test2.hs:7:14: Not in scope: `isSpace' Maybe you meant `Char.isSpace' """
In terms of making error messages more helpful, I don't find general typos are much of an issue, but this part would be really nice! I've always been annoyed that GHC just says "no" rather than offering suggestions (-v is rarely helpful), especially since it knows about what modules are installed et al.
It sounds like it only searches for modules you've imported (after all they might've been brought into scope with an as-clause, and ghc has no business poking in un-imported modules), but perhaps since in GHCi all modules are in scope (under their original names -- in addition to any imports in an interpreted file), it (ghci) searches all modules then? If I want a "perhaps you meant to import" message, I want it to be a *complete* listing of modules available that export that symbol, both local ones and library ones, ideally :-) (certainly don't want a recommendation to import e.g. List without at least an equal recommendation of Data.List...) -Isaac

2008/7/13 Isaac Dupree
wren ng thornton wrote:
In terms of making error messages more helpful, I don't find general typos are much of an issue, but this part would be really nice! I've always been annoyed that GHC just says "no" rather than offering suggestions (-v is rarely helpful), especially since it knows about what modules are installed et al.
It sounds like it only searches for modules you've imported (after all they might've been brought into scope with an as-clause, and ghc has no business poking in un-imported modules), but perhaps since in GHCi all modules are in scope (under their original names -- in addition to any imports in an interpreted file), it (ghci) searches all modules then?
You're right: if there is no qualified import of Char then it won't suggest Char.isSpace. The reason for this is that doing so would require GHC to load all the interface files for all exposed modules on disk in order to search exported names, which doesn't sound like a great idea performance-wise. The same thing applies to GHCi.
If I want a "perhaps you meant to import" message, I want it to be a *complete* listing of modules available that export that symbol, both local ones and library ones, ideally :-) (certainly don't want a recommendation to import e.g. List without at least an equal recommendation of Data.List...)
I agree this feature would be cool, I'm just not sure the possible memory/performance problems associated with loading all interfaces of all exposed modules in all packages (it would be most helpful but even more of a performance problem to even scan non-exposed packages) would be worth it. Probably it could be made practical by building a sort of index, which is cached on disk.. Max

I agree this feature would be cool, I'm just not sure the possible memory/performance problems associated with loading all interfaces of all exposed modules in all packages (it would be most helpful but even more of a performance problem to even scan non-exposed packages) would be worth it. Probably it could be made practical by building a sort of index, which is cached on disk..
That is the approach the haskellmode plugins for Vim take: build an index of Haddocumented identifiers once per installation, and build an index of import-available identifiers once per :make. The latter turns out to be a performance issue for large libraries, such as Gtk2hs, so some users had to switch it off and update less frequently; the former doesn't seem to cause any problems.

Then again, we don't do fuzzy matching, only completion of partial identifiers and suggesting possible qualified names and imports for unqualified ones. Completion alone can result in many matches, I'd expect fuzzy matching to be worse, and while edit distance is a useful criterion for well-chosen examples, I'd have to agree with those who have their doubts about its use in this context (but then, those who don't need a feature shouldn't stand in the way of those who do;-).

A general comment: once there was a time when GHC error messages were inferior to Hugs ones, then Simon PJ spent a lot of time improving GHC's messages, until there came a time when some of GHC's messages were too helpful - you couldn't find the error in all the help provided. It used to be a good idea to have several Haskell installations, just to be able to compare error messages for tricky code.

Things have levelled off since then, but there are still cases where longish fuzzy messages are provided by GHC when brief harsh messages would be more to the point - which wouldn't be a problem if the longer friendlier messages at least provided all the details of the short unfriendly ones, which isn't always the case. See, eg, #1928, #2360, #956, #589, #451, ..

By all means, record unhelpful error messages as tickets, especially, but not only, when you have an idea of the information that is missing/misleading in them.

It is an ongoing process, and balance is as important as perceived friendliness, and lots of "friendly" suggestions without concrete, specific and useful help may result in a very unfriendly effect (think telephone/email support..). Claus

2008/7/13 Claus Reinke
That is the approach the haskellmode plugins for Vim take: build an index of Haddocumented identifiers once per installation, build an index of import-available identifiers once per :make.
The latter turns out to be a performance issue for large libraries, such as Gtk2hs, so some users had to switch it off, updating less frequently, the former doesn't seem to cause any problems.
Interesting. GHC would probably want to do something a bit smarter than an index at /installation/ time as the package environment is dynamic:

1. Create the index upon the first error message that needs it
2. On all subsequent error messages, update the index by rescanning packages whose versions have changed or have been newly installed
3. Cache this index on disk in some quick-to-read binary format

This wouldn't actually be too hard, but is probably more effort than I'm willing to put in for an experimental weekend project!
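The first two steps can be sketched as a pure function over an in-memory index. Everything here is a hypothetical stand-in rather than GHC's own data structures: the `scan` argument represents reading a package's interface files, and the on-disk binary cache of step 3 is left out.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins, not GHC's own types.
type PackageName = String
type Version = String
type ModuleName = String
type Identifier = String

-- | Result of scanning one installed package: its version and the
-- identifiers exported by each of its exposed modules.
data PackageEntry = PackageEntry
  { entryVersion :: Version
  , entryExports :: Map.Map ModuleName [Identifier]
  } deriving (Eq, Show)

type Index = Map.Map PackageName PackageEntry

-- | Bring a cached index up to date: reuse entries whose version is
-- unchanged, rescan new or changed packages, and drop packages that are no
-- longer installed. An empty cache gives the "first error" behaviour.
refreshIndex
  :: (PackageName -> Version -> PackageEntry)  -- rescans one package
  -> Map.Map PackageName Version               -- currently installed
  -> Index                                     -- previously cached index
  -> Index
refreshIndex scan installed cached = Map.mapWithKey pick installed
  where
    pick pkg version = case Map.lookup pkg cached of
      Just entry | entryVersion entry == version -> entry
      _ -> scan pkg version

-- | Exact-match lookup: every indexed module that exports the identifier.
modulesExporting :: Index -> Identifier -> [ModuleName]
modulesExporting index ident =
  [ modName
  | entry <- Map.elems index
  , (modName, idents) <- Map.toList (entryExports entry)
  , ident `elem` idents ]
```

Keeping the refresh pure makes the reuse rule easy to test; a real version would do the scanning and caching in IO.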
Then again, we don't do fuzzy matching, only completion of partial identifiers and suggesting possible qualified names and imports for unqualified ones.
Agreed: doing fuzzy matching on >every< available identifier from all packages would truly suck. I would propose just looking for exact matches in non-imported modules for identifiers that are not in scope.
Completion alone can result in many matches, I'd expect fuzzy matching to be worse, and while edit distance is a useful criterion for well-chosen examples, I'd have to agree with those who have their doubts about its use in this context (but then, those who don't need a feature shouldn't stand in the way of those who do;-).
Well, no one has actually said they think fuzzy matching would be useful yet, so I suspect this patch is dead on the vine :). I've filed a ticket with the code anyway (http://hackage.haskell.org/trac/ghc/ticket/2442) so at least it's available for others to look at.
Things have levelled off since then, but there are still cases where longish fuzzy messages are provided by GHC when brief harsh messages would be more to the point - which wouldn't be a problem if the longer friendlier messages at least provided all the details of the short unfriendly ones, which isn't always the case. See, eg, #1928, #2360, #956, #589, #451, ..
...
It is an ongoing process, and balance is as important as perceived friendliness, and lots of "friendly" suggestions without concrete, specific and useful help may result in a very unfriendly effect (think telephone/email support..).
These are interesting comments and tickets: thanks! I don't know where this kind of unbound-name suggestion we've talked about here fits in the spectrum of useful<->overwhelming, but I would like to think it's more towards the left end. It's easy to see why opinions would differ though. Cheers, Max

On 2008.07.13 14:36:03 +0100, Max Bolingbroke
Well, no one has actually said they think fuzzy matching would be useful yet, so I suspect this patch is dead on the vine :). I've filed a ticket with the code anyway (http://hackage.haskell.org/trac/ghc/ticket/2442) so at least it's available for others to look at. ... Cheers, Max
Don't be discouraged; I think it would be useful. IMO, Haskellers tend to be pretty insensitive when it comes to usability. For example, before the GHC devs were asked for that feature, I wonder whether anyone ever thought: "Hey, when ghc -Wall complains about having no type signature - why doesn't it print out the inferred type signature? That would be helpful and convenient." I don't think it's because there's any sort of gung-ho elitism there, it's just not something anyone but the educators like the Helium folk really think about much. -- gwern virginia spies ISADC in rounds GRU Alex CQB Lander Elvis

Don't be discouraged; I think it would be useful. IMO, Haskellers tend to be pretty insensitive when it comes to usability.
For example, before the GHC devs were asked for that feature, I wonder whether anyone ever thought: "Hey, when ghc -Wall complains about having no type signature - why doesn't it print out the inferred type signature? That would be helpful and convenient."
I don't think it's because there's any sort of gung-ho elitism there, it's just not something anyone but the educators like the Helium folk really think about much.
I've been thinking about stuff like this as well. It happens to me too that I spend minutes finding an error until I notice that I've misspelled a word, but this has been while programming PHP. In Haskell this kind of error tends to take much less time IMHO. There is a "friendlier" shell (don't remember its name) which takes another approach: change color if the word is known. In the case of GHCi, the command line could switch color to green before pressing enter to indicate there are no errors left. Marc

Marc Weber wrote:
There is a "friendlier" shell (don't remember its name) which takes another approach: change color if the word is known. In the case of GHCi, the command line could switch color to green before pressing enter to indicate there are no errors left.
"fish", the "friendly interactive shell", which I use, although it has lots of minor friendliness problems (compared to ideal, not to bash) that could be fixed by more hacking on its C code :-P -Isaac

On 2008 Jul 13, at 9:36, Max Bolingbroke wrote:
2008/7/13 Claus Reinke
Then again, we don't do fuzzy matching, only completion of partial identifiers and suggesting possible qualified names and imports for unqualified ones.
Agreed: doing fuzzy matching on >every< available identifier from all packages would truly suck. I would propose just looking for exact matches in non-imported modules for identifiers that are not in scope.
I would fuzzy match on unqualified names in the current module (or directory), qualified names in the module in question, and if that doesn't exist (== have a .hi) then try fuzzy-matching the module name. If the unqualified fuzzy match doesn't work, look for exact matches in non-imported modules.
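The ordering described above can be sketched as a chain of fallbacks. `Name`, `Scope` and `suggestions` are hypothetical stand-ins for whatever the compiler really tracks, and the fuzzy matcher is a parameter that takes candidate strings plus the misspelt one and returns the close ones.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins for the compiler's view of the program.
data Name = Unqualified String | Qualified String String
  deriving (Eq, Show)

data Scope = Scope
  { localNames   :: [String]                 -- defined in the current module
  , importedMods :: Map.Map String [String]  -- imported module -> its exports
  , otherMods    :: Map.Map String [String]  -- available but not imported
  }

-- | Try each strategy in turn, stopping at the first that applies.
suggestions :: ([String] -> String -> [String]) -> Scope -> Name -> [Name]
suggestions fuzzy scope (Qualified m x) =
  case Map.lookup m (importedMods scope) of
    -- The module exists: fuzzy match among the names it exports.
    Just exports -> [ Qualified m y | y <- fuzzy exports x ]
    -- No such module: fuzzy match the module name instead.
    Nothing -> [ Qualified m' x | m' <- fuzzy (Map.keys (importedMods scope)) m ]
suggestions fuzzy scope (Unqualified x) =
  case fuzzy (localNames scope) x of
    -- Nothing close locally: look for exact matches in non-imported modules.
    [] -> [ Qualified m x
          | (m, exports) <- Map.toList (otherMods scope), x `elem` exports ]
    close -> map Unqualified close
```

Restricting fuzzy matching to local and already-qualified names keeps the candidate set small; only the cheap exact lookup touches modules that are not imported.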
Well, no one has actually said they think fuzzy matching would be useful yet, so I suspect this patch is dead on the vine :). I've filed
Huh? Edit distance is a good way to handle typos --- and, while some people have asserted that "we don't need that because I can see it already", most people *don't* see it right away. (I do tend to spot typos quickly --- in prose. It's nowhere near as accurate for code, and even when it does work it isn't exact: I'll have a "something's wrong" about some spot in the code and still take an hour to spot the typo.) I'm also *really* good at typos, which you won't see in email but will see regularly in #haskell. :)

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH

Following up on my own message... talking to yourself is a bad sign, right? :)

On 2008 Jul 13, at 11:21, Brandon S. Allbery KF8NH wrote:
Huh? Edit distance is a good way to handle typos --- and, while some people have asserted that "we don't need that because I can see it already", most people *don't* see it right away.
It occurs to me to mention, in the vein of "over-friendly-ing is bad", the example of WATFOR/WATFIV: even a well thought out typo detection facility can badly obscure things. But as long as the proposal is to conditionalize these on a compile flag, that isn't too much of an issue.

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH

On 13 jul 2008, at 15:36, Max Bolingbroke wrote: [snip, patches towards friendlier error messages in GHC]
Then again, we don't do fuzzy matching, only completion of partial identifiers and suggesting possible qualified names and imports for unqualified ones.
Agreed: doing fuzzy matching on >every< available identifier from all packages would truly suck. I would propose just looking for exact matches in non-imported modules for identifiers that are not in scope.
The approach taken by Helium (which doesn't quite do full Haskell 98 yet) is to not just do fuzzy matching, but to figure out if the identifiers that were fuzzily matched also have the right type. For more info, read Bastiaan Heeren's PhD thesis, "Top Quality Type Error Messages", available at http://people.cs.uu.nl/bastiaan/phdthesis/
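As a toy illustration of the idea only (this is not Helium's algorithm), fuzzy candidates can be filtered by comparing their types against the type expected at the use site. The `Type` representation, the `close` predicate and `typedSuggestions` are all hypothetical, and plain structural equality stands in for what a real type checker would do.

```haskell
-- | A deliberately tiny, hypothetical type representation: constructors and
-- function arrows only, with no polymorphism.
data Type = TCon String | TFun Type Type
  deriving (Eq, Show)

-- | Keep the in-scope names that are both close to the unknown name
-- (according to the supplied predicate) and have exactly the expected type.
typedSuggestions
  :: (String -> String -> Bool)  -- is the candidate close to the typo?
  -> [(String, Type)]            -- names in scope with their types
  -> String                      -- the unknown name
  -> Type                        -- type expected at the use site
  -> [String]
typedSuggestions close env unknown expected =
  [ name | (name, ty) <- env, close unknown name, ty == expected ]
```

A real implementation would ask the type checker whether a candidate could be used at that position instead of requiring equal types.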
Well, no one has actually said they think fuzzy matching would be useful yet, so I suspect this patch is dead on the vine :). I've filed a ticket with the code anyway (http://hackage.haskell.org/trac/ghc/ticket/2442) so at least it's available for others to look at.
More help is very much useful. However, there are different use cases that imply different types of help and different types of searching.

With kind regards, Arthur van Leeuwen.

--
  /\   / |      arthurvl@cs.uu.nl        | Work like you don't need the money
 /__\ /  | A friend is someone with whom | Love like you have never been hurt
/    \/__ | you can dare to be yourself  | Dance like there's nobody watching
participants (10)
- Arthur van Leeuwen
- Brandon S. Allbery KF8NH
- Claus Reinke
- Evan Laforge
- Gwern Branwen
- Isaac Dupree
- Luke Palmer
- Marc Weber
- Max Bolingbroke
- wren ng thornton