
Doaitse Swierstra wrote:
On 2006 mei 30, at 17:33, Brian Hulley wrote:
But the buffer will nearly always be incomplete as you're editing it.
I was kind of hoping that the syntax of Haskell could be changed so that for any sequence of characters there would be a unique parse that had a minimum number of "gaps" inserted by the editor to create a complete parse tree, and moreover that this parse could be found by deterministic LL1 recursive descent.
If you use my parsercombinators, and are willing to work on the grammar I think this can in principle be done. The combinators automatically "correct" incorrect (i.e. in this case incomplete) input, but: - you may really need some time to tune the grammar so the corrections are what a user might expect (there are many hooks for doing so, but it will takje some effort, since this s also a human interface issue) - making a Haskell grammar that parsers top-down is not an easy task, and making it LL1 is definitely impossible, but also not needed if you use my combinators [rearranged] Not only the =>, but e.g. the commonality between patterns and expressions makes left factorisation a not so simple task.
My idea was to use a coarse grammar for parsing, rather than H98 directly, so that the parser could "see" common errors like newtype T a = T a a -- only one value may be wrapped by a newtype The coarse grammar can be LL1. The purpose of using LL1 is to ensure that the fontification would remain stable whenever the user types text from left to right ie gaps get filled in rather than the parser deciding on wildly different parses. (Just as type inference has to be predictable enough for users to get a feeling for so does parsing I think.) As John pointed out, code can be edited by going back and forward filling in gaps but in this case there is no issue of stability to worry about. I'm also not at all keen on using higher levels of analysis (such as knowing whether SomeId referes to a tycon or class name) to drive the parsing. I'm really wanting something quite simple so that the editor doesn't become so "clever" that it becomes unpredictable and annoying.
- we could in principle provide you with a complete parser for Haskell using our combinators that was tested by replacing the GHC parser with this parser, and that worked (contact athurb@cs.uu.nl to get a copy of his code)
Thanks for the pointer - I'll keep this possibility in mind.
- did you think about how to handle the offside rule? If not, the good news is that we have combinators for that too.
In my C implementation it turned out to be quite simple to handle, but also was something which prevented me from using a table-based approach (although at the time I'd only understood the rule as inserting an implicit } whenever the parse would otherwise get stuck). The needs of incremental parsing also force some changes to Haskell syntax. For example, currently the layout rule is switched off by an opening {, but this is no use if you need to maintain invariants about the structure of the buffer eg that you can search back and forwards for the beginning of top level declarations by examining the leftmost column (a simpler solution would have been to use a symbol such as {} at the top of a module to indicate that the module as a whole doesn't use layout at all). Also, multi-line strings (being multi-line lexemes) are a real nusiance as well, and I don't think they look good in any case (unlines can always be used instead and looks much neater imho) - so probably in the first instance at least I won't bother trying to support these little-used quirks and with any luck they'll be killed off in future versions of Haskell... ;-) Anyway, thanks (also to everyone else who replied) - there's lots of ideas for me to consider, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com