
| Then a second pass is done over the syntax tree. This pass does | keep | track of the context. Whenever it finds some error evidence, it | reports it. The syntax tree is a big type. A second pass would be a fairly big deal. But doable. You'd need to be able to look at the nature of the error; SDocs won't do. Maybe errors become a data type. That might be a good idea but it would be another big deal. I very much doubt that you'll be able to discard the context information from the type checker. Maybe some of it. I can't say exactly why, it's a gut feel for now. There is a rich literature on type errors, worth digging into. Alejandro Serrano and his adviser Juraiaan Hage are thinking about this at the moment. SherLoc was interesting too; worked for Haskell. Lennart gave a nice talk at the Haskell Implementors meeting last year about error provenance. Maybe available online? Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of | Joachim Breitner | Sent: 29 November 2016 16:20 | To: ghc-devs@haskell.org | Subject: Separating typechecking and type error reporting in two | passes? | | Hi, | | the following is a rough idea that I came up with while pondering | #12864, which is about better (and fewer!) error messages when the | user forgot an argument to a function, or swapped their order. | | | The current scheme of type checking and error reporting is the | following (please correct me if I am wrong, I have actually done | little in this area so far): | | The typechecker traverses the syntax tree from top to bottom. | Along | the way, it keeps track of the current context in terms of SDoc | fragments of the form “In the second argument of…”. It generates | constraints (e.g. the argument type of a function is the type of | the | actual argument), which it either solves, or propagate outwards as | “wanted constraints”, which refer to their context, and refer to a | “hole” in the program where, if this constraint gets solved later, | some form of evidence gets filled in. | At the end, if there are any wanted constraints left, they are | reported as type errors, together with the context they are | mentioned. | But with -fdefer-type-errors, they are not reported, but the | evidence holes are filled in with essentially calls to `error` | that | print the error message at runtime. | | The problem I see with this approach is that the type checker has to | remember interesting bits about context that might possibly eventually | be relevant when reporting the error. In particular, it makes it | harder to report multiple related error messages at once. | | | So I am wondering if this approach would be better: | | The type checker does *not* bother keeping track of context: It | traverses the syntax tree, creates constraints, and unsolvable | constraints get filled with “error evidence”. Basically as if | -fdefer-type-errors is one. | | Then a second pass is done over the syntax tree. This pass does | keep | track of the context. Whenever it finds some error evidence, it | reports it. | | | I see the following advantages: | | * The type checker code gets a bit simpler and tidier. | * As most compiled programs are type correct [citation needed], the | type checker, but not the error reporting, is compiler-performance- | critical. This might make type checking a (possibly insignificant) | tad faster. | * The error reporting pass can “look around” more. For example, | before | recursing into a pair `(e1,e2)`, it could check for the special | case | of `(e1 ▷ TyError Int Bool, e1 ▷ TyError Bool Int)` and report this | as one error “Tuple with swapped arguments” instead of two errors. | The #12864 is a more elaborate application of this. | * There could even be an intermediate pass over the syntax tree | between typechecking and error reporting that “transform” or | “optimizes” the AST by moving type errors around, by coalescing | error or other things, all with the aim of giving easier to | understand error messages. | * (This is getting into the area of wild dreams:) | Library authors could somehow express custom error messages | for certain patterns of the form | If the AST now contains an application of `Library.foo` where | the first argument is a type error `TyError Int Bool`, then | give this particular domain-specific error message. | | I don’t see any disadvantages now, but there probably are some, and I | would be grateful about feedback from those who actually have worked | with this part of GHC before. | | Thanks, | Joachim | | -- | Joachim “nomeata” Breitner | mail@joachim-breitner.de • | https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.j | oachim- | breitner.de%2F&data=02%7C01%7Csimonpj%40microsoft.com%7C41dcebe5f26043 | 94f94108d41873a1f0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636160 | 332322948063&sdata=hoylWVvBSQjqGmm0ICnr3nDFz80waxDStDOjWDPQb5A%3D&rese | rved=0 | XMPP: nomeata@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F | Debian Developer: nomeata@debian.org