[GHC] #11418: Suggest correct spelling when module is not found because of typo

#11418: Suggest correct spelling when module is not found because of typo
-------------------------------------+-------------------------------------
Reporter: syd                        | Owner:
Type: feature request                | Status: new
Priority: lowest                     | Milestone:
Component: Compiler                  | Version:
Keywords:                            | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple       | Type of failure: None/Unknown
Test Case:                           | Blocked By:
Blocking:                            | Related Tickets:
Differential Rev(s):                 | Wiki Page:
-------------------------------------+-------------------------------------
Given these two modules:

Aaa.hs:
{{{#!hs
module Aaa where

import BBb

main :: IO ()
main = putStrLn myString
}}}

Bbb.hs:
{{{#!hs
module Bbb where

myString :: String
myString = "hi"
}}}

There's a typo in `Aaa.hs`: the import should be `Bbb` instead of `BBb`.

Running `runhaskell Aaa.hs` results in this error:

{{{
Aaa.hs:3:18:
    Could not find module ‘BBb’
    Use -v to see a list of the files searched for.
}}}

The request is to have the compiler suggest that this is a typo and that it should be `Bbb` instead. It already does this for misspelled functions, if I recall correctly.

Because the compiler will not continue when finding an error like this, it won't be harmful to spend a little extra time looking for possible misspellings.

How typos should be recognized is something that should be fleshed out later. For example, a Hamming distance of 2 or less could indicate a typo.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
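The "Hamming distance of 2 or less" idea from the description can be sketched as follows. This is only an illustration, not code from GHC: the names `hammingDistance` and `looksLikeTypo` are hypothetical, and the threshold of 2 is the one floated in the ticket. Note that Hamming distance is only defined for strings of equal length, so a misspelling that drops or adds a character (such as `Data.Lis` for `Data.List`) would not be caught by this metric.

```haskell
-- Hamming distance between two module names: the number of positions at
-- which the characters differ. It is only defined for names of equal
-- length, so the result is Nothing otherwise.
hammingDistance :: String -> String -> Maybe Int
hammingDistance xs ys
  | length xs == length ys = Just (length (filter id (zipWith (/=) xs ys)))
  | otherwise              = Nothing

-- Hypothetical rule taken from the ticket text: a non-zero distance of
-- 2 or less counts as a likely typo.
looksLikeTypo :: String -> String -> Bool
looksLikeTypo wanted candidate =
  case hammingDistance wanted candidate of
    Just d  -> d > 0 && d <= 2
    Nothing -> False
```

With these definitions, `looksLikeTypo "BBb" "Bbb"` holds because the names differ in one position, while names of different lengths are never matched.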

Description changed by syd:

New description:

Given these two modules:

Aaa.hs:
{{{#!hs
module Aaa where

import BBb

main :: IO ()
main = putStrLn myString
}}}

Bbb.hs:
{{{#!hs
module Bbb where

myString :: String
myString = "hi"
}}}

There's a typo in `Aaa.hs`: the import should be `Bbb` instead of `BBb`.

Running `runhaskell Aaa.hs` results in this error:

{{{
Aaa.hs:3:18:
    Could not find module ‘BBb’
    Use -v to see a list of the files searched for.
}}}

The request is to have the compiler suggest that this is a typo and that it should be `Bbb` instead. It already does this for misspelled functions, if I recall correctly.

Because the compiler will not continue when finding an error like this, it won't be harmful to spend a little extra time looking for possible misspellings.

How typos should be recognized is something that should be fleshed out later. For example, a Hamming distance of 2 or less could indicate a typo.

I'd like to pick this one up and try to implement it as my first patch to GHC.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:1

Comment (by nomeata):

I think the difference to functions is that with modules, there is no list of modules in scope. And hammering the file system with random guesses is probably a bad idea...

It might work for modules that are imported from packages.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:2

Comment (by syd):

Replying to [comment:2 nomeata]:
> I think the difference to functions is that with modules, there is no list of modules in scope. And hammering the file system with random guesses is probably a bad idea...
>
> It might work for modules that are imported from packages.

I haven't really looked into how GHC handles imports too deeply, but: Does that mean ghc hammers the file system with guesses for every import?

Is there not some sort of cache of previously found modules that we can check? That won't be able to correct every typo but at least every typo for modules that have been compiled already.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:3

Comment (by thomie):

This is the ticket that added the `-fhelpful-errors` flag (yes, you can turn it off with `-fno-helpful-errors`): #2442.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:4

Comment (by nomeata):

> Is there not some sort of cache of previously found modules that we can check? That won't be able to correct every typo but at least every typo for modules that have been compiled already.

That might be possible, but such erratic randomness might not give the best user experience. Not sure.

I suggest first implementing this (good idea!) for package imports. When that is done, whoever did it is surely in a good position to assess whether it is feasible for other imports as well.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:5

Comment (by thomie):

Replying to [comment:2 nomeata]:
> It might work for modules that are imported from packages.

Like this you mean?
{{{#!hs
import Data.Lis
}}}
{{{
Test.hs:1:8:
    Could not find module ‘Data.Lis’
    Perhaps you meant
      Data.List (from base-4.8.2.0)
      Data.Bits (from base-4.8.2.0)
      Data.DList (from dlist-0.7.1.2@23izrBUDDH96xJKcDju2CZ)
    Use -v to see a list of the files searched for.
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:6

Comment (by nomeata):

Eh, right. Should have known. Sorry for the noise then :-)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:7

Comment (by thomie):

syd: so what is your plan here? `ls` all files in current and parent directories, and then try to guess what the user meant?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:8

Comment (by syd):

Replying to [comment:8 thomie]:
> syd: so what is your plan here? `ls` all files in current and parent directories, and then try to guess what the user meant?

That is a possibility. If that's the other option, compared to looking at the modules already compiled, we have to trade off the cost of looking into the directories against the risk of a false negative.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:9

Comment (by thomie):

> compared to looking at the modules already compiled

I don't understand. What are these "modules already compiled" in your example from the description?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:10

Comment (by syd):

Replying to [comment:10 thomie]:
> > compared to looking at the modules already compiled
>
> I don't understand. What are these "modules already compiled" in your example from the description?

As said above:

> Is there not some sort of cache of previously found modules that we can check? That won't be able to correct every typo but at least every typo for modules that have been compiled already.

I didn't really get an answer then. If there is no such cache, then looking into the directories shouldn't be a problem, because that would mean that GHC does that for every import.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:11

Comment (by thomie):

Ok, let's back up.

> Does that mean ghc hammers the file system with guesses for every import?

Given the two files from the description, here's what happens when you run `ghc Aaa.hs`:

 * GHC asks the OS to open `./Aaa.hs`, and reads its contents
 * GHC figures out it needs module `BBb`
 * GHC "guesses" that `BBb` is either in `./BBb.hs` or in `./BBb.lhs`
 * GHC asks the OS to open those two files in order
 * Since neither file exists, an error message is shown (after also consulting the package database, but let's ignore that for the moment)

So in total GHC tries to open 3 files: `Aaa.hs`, `BBb.hs` and `BBb.lhs`. It doesn't have to ask the OS for a list of all files in the directory.

If you run `ghc Aaa.hs` again later, it will do the exact same thing. There is no cache.

Even if there are another one million files in the current directory, GHC still only has to open those 3 files. It doesn't "hammer" the file system.

> Is there not some sort of cache of previously found modules that we can check?

Not in the above scenario. But just to be clear: do you mean a cache that could be used between separate invocations of GHC, or within a single invocation of GHC?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:12
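The lookup steps listed in comment 12 can be sketched as a pure function that produces the files to try, in order, for one search directory. This is an illustration under stated assumptions rather than GHC's actual finder code: `splitModuleName` and `candidatePaths` are hypothetical names, paths are built with `/` only, and the package database lookup is left out.

```haskell
import Data.List (intercalate)

-- Split a dotted module name such as "Data.List" into its components.
splitModuleName :: String -> [String]
splitModuleName name = case break (== '.') name of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitModuleName rest

-- Candidate source files for a module relative to one search directory,
-- in the order they would be tried: the .hs file first, then the .lhs file.
candidatePaths :: FilePath -> String -> [FilePath]
candidatePaths dir modName =
  [ dir ++ "/" ++ base ++ ext | ext <- [".hs", ".lhs"] ]
  where
    -- A hierarchical name maps to nested directories.
    base = intercalate "/" (splitModuleName modName)
```

For the example in the description, `candidatePaths "." "BBb"` yields `./BBb.hs` and `./BBb.lhs`, which together with `Aaa.hs` are the three files mentioned above. No directory listing is needed to build this list.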

Comment (by syd):

Replying to [comment:12 thomie]:
> Ok, let's back up.
>
> > Does that mean ghc hammers the file system with guesses for every import?
>
> Given the two files from the description, here's what happens when you run `ghc Aaa.hs`:
>  * GHC asks the OS to open `./Aaa.hs`, and reads its contents
>  * GHC figures out it needs module `BBb`
>  * GHC "guesses" that `BBb` is either in `./BBb.hs` or in `./BBb.lhs`
>  * GHC asks the OS to open those two files in order
>  * Since neither file exists, an error message is shown (after also consulting the package database, but let's ignore that for the moment)
>
> So in total GHC tries to open 3 files: `Aaa.hs`, `BBb.hs` and `BBb.lhs`. It doesn't have to ask the OS for a list of all files in the directory.

Does that mean GHC tries both of those (multiplied by each extra source directory specified) for every import of `Aaa`? That may be something that could be optimized for `ghc --make`.

> If you run `ghc Aaa.hs` again later, it will do the exact same thing. There is no cache.

Thank you for this clarification!

> Even if there are another one million files in the current directory, GHC still only has to open those 3 files. It doesn't "hammer" the file system.
>
> > Is there not some sort of cache of previously found modules that we can check?
>
> Not in the above scenario. But just to be clear: do you mean a cache that could be used between separate invocations of GHC, or within a single invocation of GHC?

I keep forgetting that there is a big difference between `ghc` and `ghc --make`. I meant `ghc --make`, I think, so a cache that works across separate invocations.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:13

Comment (by thomie):

If the `.hs` file exists, GHC doesn't check for the `.lhs` file. I didn't make that clear.

There is no difference between `ghc` and `ghc --make`: `--make` is the default mode. There is a difference between `-c` and `--make`.

It would help if you described how you would like this caching mechanism to work, and how it would help with suggesting the correct spelling when a module is not found. Use the example from the description, or a different example if that makes things more clear. Don't worry too much about how GHC currently works; we can get back to that afterwards.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:14

Comment (by syd):

Replying to [comment:14 thomie]:
> If the `.hs` file exists, GHC doesn't check for the `.lhs` file. I didn't make that clear.

I assumed as much. I was just stating the worst case.

> There is no difference between `ghc` and `ghc --make`: `--make` is the default mode. There is a difference between `-c` and `--make`.

Okay. Thanks for clarifying.

> It would help if you described how you would like this caching mechanism to work, and how it would help with suggesting the correct spelling when a module is not found. Use the example from the description, or a different example if that makes things more clear. Don't worry too much about how GHC currently works; we can get back to that afterwards.

The caching system for module locations could work like this:

Before compiling anything, ghc could scan the directory to find all the Haskell modules that are there and build a `Map ModuleName FilePath` (pseudocode). This assumes that there is not much in the directories other than the code you're trying to compile. It also assumes that you're not moving around modules during compilation. Both are relatively valid assumptions.

Applied to the example above:

`ghc --make Aaa.hs` first looks through the directory and finds `Aaa` in file `Aaa.hs` and `Bbb` in file `Bbb.hs`. It then gives this information to any subsequent compilation procedures.

Next, when going through `Aaa`, `ghc` figures out it needs to compile `Bbb`, so it looks in the map to find the filepath where it's stored. (No more lookups in the filesystem at this point.)

When `ghc` encounters `import BBb` (the spelling error), it looks in the map and notices that it cannot find `BBb`. It then goes through all the keys of the map and calculates the Hamming distance/insert distance/some other metric between the key and `BBb`. It notices that `Bbb` and `BBb` are similar and outputs the following helpful error message.

{{{
Aaa.hs:3:1:
    Could not find module BBb.
    Perhaps you meant Bbb instead of BBb at line 3 of module Aaa (Aaa.hs)
}}}

Perhaps it could even locate the spelling mistake:
{{{
Actual   BBb
Expected Bbb
Diff     _b_
}}}

If anything is still unclear about this idea, please don't hesitate to ask.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:15
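The map-based lookup proposed in comment 15 can be sketched like this. It is a minimal illustration, not a proposal for GHC's internals: `ModuleName` is a plain `String` alias here, `resolveImport` is a hypothetical name, and the metric is Levenshtein edit distance (one possible reading of "insert distance/some other metric"), chosen because it also handles names of different lengths. `Data.Map.Strict` comes from the `containers` package.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (sortOn)

type ModuleName = String

-- Levenshtein edit distance (insertions, deletions, substitutions),
-- computed one row of the dynamic-programming table at a time.
editDistance :: String -> String -> Int
editDistance xs ys = last (foldl nextRow [0 .. length ys] xs)
  where
    nextRow [] _ = []
    nextRow prev@(p : ps) x = scanl step (p + 1) (zip3 ys prev ps)
      where
        step left (y, diag, up) =
          minimum [left + 1, up + 1, diag + fromEnum (x /= y)]

-- Look a module up in the pre-built map. On failure, return the known
-- module names within maxDist edits of the requested name, closest first.
resolveImport :: Int -> Map.Map ModuleName FilePath -> ModuleName
              -> Either [ModuleName] FilePath
resolveImport maxDist known wanted =
  case Map.lookup wanted known of
    Just path -> Right path
    Nothing   -> Left (map snd (sortOn fst close))
  where
    close = [ (d, name)
            | name <- Map.keys known
            , let d = editDistance wanted name
            , d <= maxDist ]
```

With a map built from `[("Aaa", "Aaa.hs"), ("Bbb", "Bbb.hs")]` and a threshold of 2, resolving `Bbb` returns its file path, while resolving `BBb` returns `Bbb` as the only suggestion. The threshold is an assumption and would need tuning.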

Comment (by thomie):

 * Would ghc scan subdirectories as well (recursively?), when creating the map?
 * Would ghc persist this map to disk? (so it could be used on the next `ghc --make` run)
 * You mentioned a trade-off before. What would the other side of the trade-off look like?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:16

Comment (by syd):

Replying to [comment:16 thomie]:
>  * Would ghc scan subdirectories as well (recursively?), when creating the map?

It would have to, I think. The idea is to have a comprehensive mapping of all the (non-package) modules that are available for imports.

>  * Would ghc persist this map to disk? (so it could be used on the next `ghc --make` run)

No. Between runs of `ghc --make` the user may want to add new modules or move modules, and that would invalidate this map.

>  * You mentioned a trade-off before. What would the other side of the trade-off look like?

If you mean the 'trade-off' in comment 9, that is no longer applicable.

What we are trading off here is:

Loss:
 - Cost of scanning the entire directory.

Gain:
 - Comprehensive mapping of modules that can be imported.
 - No file-system calls to find modules while resolving the imports.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:17

Comment (by thomie):

This feature could be called `-fhelpful-import-errors`. Do you think it should be enabled by default? We would have to ask others as well.

Not only does scanning a directory take time, there is also a memory cost. A pretty bad scenario would be running ghc in $HOME.

On my system:
{{{
$ find ~ > all-files-in-home
$ wc -l all-files-in-home
168014 all-files-in-home
$ du -h all-files-in-home
14M     all-files-in-home
}}}

You could exclude subdirectories that don't start with an uppercase letter from the scan. But Windows paths are case-insensitive, so it wouldn't help there.

Or stop scanning after the first N=1000 or so files (do some measurements to see what's reasonable).

How about this partial solution:
 * A lot of people use Cabal for library development
 * Cabal already asks you to specify all known modules in either `exposed-modules` or `other-modules` (I guess you could make typos here..)
 * Cabal already passes this list of modules to GHC
 * So GHC already knows the names of all modules that could possibly be imported (not quite, see https://github.com/haskell/cabal/issues/2982#issuecomment-169786310)
 * Use that list of modules to make spelling suggestions on import errors

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:18
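The two pruning ideas from comment 18 (skip directories that don't start with an uppercase letter, and stop after the first N files) can be sketched as a pure filter over a list of relative paths such as the output of `find`. Everything here is hypothetical and for illustration only: the names `pathToModule` and `candidateModules`, the assumption that paths use `/` and carry no leading `./`, and the limit parameter are not taken from GHC.

```haskell
import Data.Char (isUpper)
import Data.List (intercalate, isSuffixOf)
import Data.Maybe (mapMaybe)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitOn sep rest

-- Strip a Haskell source extension; Nothing for any other file.
dropSourceExtension :: FilePath -> Maybe FilePath
dropSourceExtension file
  | ".lhs" `isSuffixOf` file = Just (take (length file - 4) file)
  | ".hs"  `isSuffixOf` file = Just (take (length file - 3) file)
  | otherwise                = Nothing

-- "Data/Foo.hs" becomes Just "Data.Foo". Paths with a component that does
-- not start with an uppercase letter cannot name a module and are dropped.
pathToModule :: FilePath -> Maybe String
pathToModule path = do
  stem <- dropSourceExtension path
  let parts = splitOn '/' stem
  if all startsUpper parts then Just (intercalate "." parts) else Nothing
  where
    startsUpper (c : _) = isUpper c
    startsUpper []      = False

-- Examine at most 'limit' paths and keep those that look like modules.
candidateModules :: Int -> [FilePath] -> [String]
candidateModules limit = mapMaybe pathToModule . take limit
```

Applied to a listing containing `Aaa.hs`, `Data/Foo.lhs`, `dist/build/Main.hs` and `README.md`, only `Aaa` and `Data.Foo` survive. As noted above, the uppercase check relies on the file system preserving case, and a hard limit can miss modules that appear late in the listing.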

Comment (by syd):

Replying to [comment:18 thomie]:
> This feature could be called `-fhelpful-import-errors`. Do you think it should be enabled by default?

That depends. See below.

> We would have to ask others as well.
>
> Not only does scanning a directory take time, there is also a memory cost. A pretty bad scenario would be running ghc in $HOME.
>
> On my system:
> {{{
> $ find ~ > all-files-in-home
> $ wc -l all-files-in-home
> 168014 all-files-in-home
> $ du -h all-files-in-home
> 14M     all-files-in-home
> }}}

Yes. If we take this approach, the flag should definitely not be enabled by default.

> Or stop scanning after the first N=1000 or so files (do some measurements to see what's reasonable).

That is not feasible. We could miss out on modules we need.

> How about this partial solution:
>  * A lot of people use Cabal for library development
>  * Cabal already asks you to specify all known modules in either `exposed-modules` or `other-modules` (I guess you could make typos here..)
>  * Cabal already passes this list of modules to GHC
>  * So GHC already knows the names of all modules that could possibly be imported (not quite, see https://github.com/haskell/cabal/issues/2982#issuecomment-169786310)
>  * Use that list of modules to make spelling suggestions on import errors

If we take this approach, then I think the flag should be enabled by default.

However: I, for example, use makefiles during development instead of cabal because it allows for a faster code/compile/fix type errors cycle. This solution would therefore not help me at all.

I prefer the situation in which this is a non-default flag and cabal is not required for it to function.

Feedback welcome.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:19

Comment (by thomie):

> I, for example, use makefiles during development instead of cabal because it allows for a faster code/compile/fix type errors cycle.

Just curious: wouldn't `cabal repl` be faster? Or adding `ghc-options: -O0` to your `.cabal` file?

If you want to proceed with the plan of scanning the filesystem, I think it is best to create a small wiki page with a specification. See https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/AddingFeatures. Basically copy what you wrote in comment:15 plus clarifications from later comments. That way it is easier to digest for others, so they don't have to read through all these comments. Paste a link here, and send it to glasgow-haskell-users@haskell.org, requesting feedback. Hammering the file system might be frowned upon by some, so make sure you explain that the flag (name to be decided by you) is off by default.

In case you don't get any reactions, go to the #ghc channel on freenode IRC, and ask bgamari (ghc maintainer) if he thinks it is ok to implement this. Then do it and submit a patch.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:20

Comment (by thomie):

To be clear: I'm not against this feature. Good error messages are important. I just think you should get some more feedback from others before spending time on implementation.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:21

#11418: Suggest correct spelling when module is not found because of typo

Comment (by syd):

Replying to [comment:21 thomie]:
> To be clear: I'm not against this feature. Good error messages are important. I just think you should get some more feedback from others before spending time on implementation.

Oh, you're absolutely right. Your suggestion of a workflow sounds entirely agreeable.

At the moment I'm in the middle of exams, so that wiki page won't be created until at least Feb 13 2016.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:22

#11418: Suggest correct spelling when module is not found because of typo

Comment (by syd):

For the record, I created the wiki page: https://ghc.haskell.org/trac/ghc/wiki/Proposal/HelpfulImportError

I'm sending it to glasgow-haskell-users@ now.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:24

#11418: Suggest correct spelling when module is not found because of typo

Comment (by hvr):

Replying to [comment:24 syd]:
> For the record, I created the wiki page: https://ghc.haskell.org/trac/ghc/wiki/Proposal/HelpfulImportError

Just to clarify, this feature is only relevant to `ghc --make` (and therefore may not benefit future cabal versions, as there's a chance that cabal will move away from using `ghc --make` altogether)?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:25

#11418: Suggest correct spelling when module is not found because of typo

Comment (by syd):

Replying to [comment:25 hvr]:
> Just to clarify, this feature is only relevant to `ghc --make` (and therefore may not benefit future cabal versions, as there's a chance that cabal will move away from using `ghc --make` altogether)?

The mapping suggestion is only relevant to `ghc --make`, yes. If cabal moves away from using `ghc --make`, then that becomes cabal's responsibility. The import typo warning will stay relevant either way, but it may be implemented by using cabal's `exposed-modules` or `other-modules` instead.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:26
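As an illustration of the idea in comment:26, here is a minimal sketch of how a typo suggestion could be ranked once a list of declared module names is available (for instance, the names listed under `exposed-modules` and `other-modules`). The function names `editDistance` and `suggestModules` are hypothetical and are not part of GHC or cabal. The sketch uses Levenshtein edit distance rather than the Hamming distance mentioned in the ticket description, because it also handles inserted and deleted characters; the threshold is an assumption left to the caller.

```haskell
import Data.List (sortOn)

-- | Levenshtein edit distance, computed one table row at a time.
-- (Hypothetical helper for illustration; not GHC's own implementation.)
editDistance :: String -> String -> Int
editDistance s t = last (foldl nextRow [0 .. length t] s)
  where
    -- Rows are never empty, but keep the function total anyway.
    nextRow [] _ = []
    nextRow prev@(p : ps) c = scanl step (p + 1) (zip3 t prev ps)
      where
        -- left: cell to the left, up: cell above, diag: cell above-left.
        step left (tc, diag, up) =
          minimum [left + 1, up + 1, diag + (if tc == c then 0 else 1)]

-- | Declared module names within 'maxDist' edits of the missing name,
-- closest first. The list of declared names is an input assumption.
suggestModules :: Int -> [String] -> String -> [String]
suggestModules maxDist declared missing =
  [ name | (d, name) <- sortOn fst scored, d <= maxDist ]
  where
    scored = [ (editDistance missing name, name) | name <- declared ]
```

For the example in the ticket description, `suggestModules 2 ["Aaa", "Bbb"] "BBb"` evaluates to `["Bbb"]`. A ranking like this would only need to run after the normal lookup has already failed, so it does not touch the file system and adds nothing to a successful compilation.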

#11418: Suggest correct spelling when module is not found because of typo

Comment (by simonpj):

From Sven Panne:

Just a few quick remarks:

 * Whatever you do, never walk the file system tree up or down in an uncontrolled way, this will kill basically all benefits and is a show-stopper. File systems like NFS, NTFS, stuff on USB sticks etc. are so **horribly** slow when used that way that the walks will probably dominate your compilation time. And even under Linux it's not uncommon to have a few dozen directory levels and hundreds of thousands of files below our cwd: Just check out a few repositories, have some leftovers from compilations, tons of documentations in small HTML files etc., and this sums up quickly. Git walks up the tree, but only looking for a specific directory and will e.g. not cross mount points under normal circumstances. This is probably the limit of what you can do.
 * Caching between runs will be tricky: How will you invalidate the cache? People can (and will :-) do all kinds of evil things between runs, so how can you (in-)validate the cache quicker than re-scanning the file system again?
 * As a general rule to keep in mind during the design: Successful compiler runs should not pay a price. It's OK if things are a little bit slower when an error occurs, but the main use case is successful compilation. This is a bit like exceptions in most programming language implementations: They are more or less for free when you don't use them (yes, they have a cost even then because they complicate/invalidate some compiler optimizations, but let's forget that for now), and are often costly when you actually raise them.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:27

#11418: Suggest correct spelling when module is not found because of typo

Comment (by syd):

Replying to [comment:27 simonpj]:
> From Sven Panne: Just a few quick remarks:
> * Whatever you do, never walk the file system tree up or down in an uncontrolled way, this will kill basically all benefits and is a show-stopper. File systems like NFS, NTFS, stuff on USB sticks etc. are so **horribly** slow when used that way that the walks will probably dominate your compilation time. And even under Linux it's not uncommon to have a few dozen directory levels and hundreds of thousands of files below our cwd: Just check out a few repositories, have some leftovers from compilations, tons of documentations in small HTML files etc., and this sums up quickly. Git walks up the tree, but only looking for a specific directory and will e.g. not cross mount points under normal circumstances. This is probably the limit of what you can do.

For the record: I never meant to walk the entire cwd, only source directories. While I get the general idea of 'never walk directory trees', in the end that is what you're doing anyway. In directories that only contain the source files, this approach would only reduce the amount of files opened. That said, I'm starting to think the cabal approach would be both easier and a smaller change.

> * Caching between runs will be tricky: How will you invalidate the cache? People can (and will :-) do all kinds of evil things between runs, so how can you (in-)validate the cache quicker than re-scanning the file system again?

Good point. Caching is probably not feasible here.

> * As a general rule to keep in mind during the design: Successful compiler runs should not pay a price. It's OK if things are a little bit slower when an error occurs, but the main use case is successful compilation. This is a bit like exceptions in most programming language implementations: They are more or less for free when you don't use them (yes, they have a cost even then because they complicate/invalidate some compiler optimizations, but let's forget that for now), and are often costly when you actually raise them.

Good feedback, thank you very much.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:28

#11418: Suggest correct spelling when module is not found because of typo

Comment (by svenpanne):

Replying to [comment:28 syd]:
> For the record: I never meant to walk the entire cwd, only source directories.

Hmm, what is a "source directory" then? And what is the exact use case for the feature? I'm a bit lost in this long ticket by now... For anything non-trivial, most people will probably use cabal or stack, anyway (and even for trivial stuff it's often quicker to do a "cabal init" or "stack new" than to fiddle around with epic GHC command lines).

> While I get the general idea of 'never walk directory trees', in the end that is what you're doing anyway.

Hmmm, unless things have changed, I don't think GHC walks the directory tree somehow. Trying to open a file in a few(!) directories is a fundamentally different operation than walking a directory hierarchy. If you try to open another file a tiny bit later, the OS will probably still have the relevant parts of the file system in its cache, so somehow your proposal is moving the caching strategy from the OS to user land, something which is rarely worthwhile and is tricky to get right and perform well.

> In directories that only contain the source files, this approach would only reduce the amount of files opened. [...]

Your proposal actually trades off trying to open several files from probably OS-cached directories against a single retrieval of a potentially large directory hierarchy. You probably save a few context switches, but lose in most other aspects.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:29
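To make the distinction in comment:29 concrete, the following sketch models "trying to open a file in a few directories" as a pure function that lists the paths to probe for one import. It is a simplified model under stated assumptions, not GHC's actual module finder: the names `moduleToPath` and `candidatePaths` are hypothetical, paths are joined with `/` only, and the set of extensions is supplied by the caller.

```haskell
-- | Turn a dotted module name into a relative path without extension,
-- so "Data.Map" becomes "Data/Map". (Hypothetical helper.)
moduleToPath :: String -> FilePath
moduleToPath = map (\c -> if c == '.' then '/' else c)

-- | The fixed set of files to probe for one import: exactly one path per
-- (import directory, extension) pair, in search order.
candidatePaths :: [FilePath] -> [String] -> String -> [FilePath]
candidatePaths importDirs exts modName =
  [ dir ++ "/" ++ moduleToPath modName ++ "." ++ ext
  | dir <- importDirs
  , ext <- exts
  ]
```

For instance, `candidatePaths ["src", "."] ["hs", "lhs"] "Data.Map"` evaluates to `["src/Data/Map.hs", "src/Data/Map.lhs", "./Data/Map.hs", "./Data/Map.lhs"]`. The number of probes is the number of import directories times the number of extensions, regardless of how many files live below those directories, whereas a directory walk grows with the size of the tree.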

#11418: Suggest correct spelling when module is not found because of typo

Comment (by syd):

Replying to [comment:29 svenpanne]:
> For anything non-trivial, most people will probably use cabal or stack, anyway (and even for trivial stuff it's often quicker to do a "cabal init" or "stack new" than to fiddle around with epic GHC command lines).

You're right. I modified the page (heavily) to reflect the result of this discussion. That should make it easier to follow along.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11418#comment:30