Editors for Haskell

Walter Potter

25 May 2006

3:02 p.m.

All,

I hope that this is the right place for this question.

I'm using Haskell (GHC and Hugs) on several different platforms: Windows, OS X and Linux.

I'd like to have an IDE that works well for medium to large projects. I know of Eclipse and hIDE. Vim works fine but I'd like more. hIDE seems to be "in process."

What would you suggest? I'll be asking my students to use the same IDE.

Thanks,
Walt

Christopher Brown

25 May

3:10 p.m.

Hi Walt,

For Mac OS X I would strongly recommend using SubEthaEdit. It's a very simple editor to use, and offers a lot of power and flexibility. It also has a Haskell highlighting mode.

You can find it at:

http://www.codingmonkeys.de/subethaedit/

Chris.

Bjorn Bringert

30 May

1:55 a.m.

Hi Chris,

I followed your advice and tried SubEthaEdit. It seems to work really well, except that I can't figure out how to get it to indent my Haskell code correctly. What I expected was something like the Emacs Haskell mode, where I can hit tab to cycle between the different reasonable indentations for a line. Am I right that SubEthaEdit does not have this feature? Or did I not RTFM enough? Maybe it's just me, but I find it difficult to write Haskell code without it.

/Björn

Mathew Mills

2:18 a.m.

With Haskell's lovely strong static typing, it is a crying shame we don't have an editor with immediate feedback, à la Eclipse.

Brian Hulley

6:59 p.m.

Mathew Mills wrote:

...

With Haskell's lovely strong static typing, it is a crying shame we don't have an editor with immediate feedback, ala Eclipse.

I've started writing an editor for Haskell. (It will be a commercial product.) The first prototype was in C; now I'm rewriting it from scratch in Haskell.

It is quite a tall order to provide immediate typed feedback of an edit buffer that will in general be syntactically incomplete, but this is my eventual aim.

One issue in the area of immediate feedback is that Haskell's syntax is troublesome in a few areas. Consider:

foo :: SomeClass a => a -> a

when the user has just typed:

foo :: SomeClass a

Should the editor assume SomeClass is a tycon or a class name?

One idea I had to solve this problem was to change the syntax of Haskell slightly so that constraints would be enclosed in {} instead of preceding =>, i.e.:

foo :: {SomeClass a} a->a

so that in

foo :: {SomeClass

it is already determined that SomeClass must be a class name.

Another thing which causes difficulty is the use of qualified operators, and the fact that the qualification syntax is in the context-free grammar instead of being kept in the lexical syntax (where I think it belongs). For example, afaiu according to H98 (but not GHCi) it is permissible to write:

a Prelude . + b -- qvarsym -> [ modid . ] varsym

whereas in my prototype I put all this into a level immediately above the lexer but below the CFG, so that no spaces are allowed, thus:

a Prelude.+ b -- no spaces in the qvarsym
a `Prelude.(+)` b -- a little generalization
a `Prelude.(+) b -- no need for closing `

(The generalization above is intended for when you don't know whether or not the function you're qualifying has been declared as an operator but you want to use it as an operator, e.g. if a pop-up list would appear after you typed `Prelude. with entries such as (+) plus add etc.)

With the above changes, it is possible to parse Haskell (or at least as much as I got round to implementing in my C++ prototype) using a simple deterministic recursive descent parser with only 1 token of lookahead.

(There is possibly some confusion in the H98 report about exactly how ambiguous expressions involving typed case alternatives might be parsed, e.g.

x :: a->b -> if x 4 then ...

but I'm hoping it will be ok to just fix the syntax here by requiring extra brackets.)

Anyway, I suppose the point of this post is to see whether or not people feel that such changes are acceptable in an editor, or whether an editor must adhere exactly to the standard (or whether the standard can be changed to enable the determinism and ease of parsing necessary for interactive editing with immediate feedback)?

Regards, Brian.

-- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com
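The "no spaces in the qvarsym" idea can be illustrated with a very small lexer. This is only a sketch under stated assumptions, not the prototype described in the post: the Token type, symChars and tokenize are hypothetical names, identifiers are simplified to letters and digits, and literals, comments and backticks are not modelled at all.

```haskell
import Data.Char (isAlphaNum, isLower, isSpace, isUpper)

-- Hypothetical token type for this sketch only.
data Token
  = VarId String          -- lower-case identifier, e.g. plus
  | ConId String          -- upper-case identifier, e.g. Prelude
  | Sym String            -- operator symbol, e.g. +
  | QualSym String String -- qualified operator written without spaces, e.g. Prelude.+
  deriving (Eq, Show)

-- Characters that may appear in an operator symbol (simplified).
symChars :: String
symChars = "!#$%&*+./<=>?@\\^|-~:"

-- Split a line into tokens. A module name immediately followed by '.'
-- and an operator (no spaces) becomes a single QualSym token.
tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c:cs)
  | isSpace c = tokenize cs
  | isUpper c =
      let (name, rest) = span isAlphaNum s
      in case rest of
           ('.' : r@(x:_)) | x `elem` symChars ->
             let (op, rest') = span (`elem` symChars) r
             in QualSym name op : tokenize rest'
           _ -> ConId name : tokenize rest
  | isLower c =
      let (name, rest) = span isAlphaNum s
      in VarId name : tokenize rest
  | c `elem` symChars =
      let (op, rest) = span (`elem` symChars) s
      in Sym op : tokenize rest
  | otherwise = tokenize cs -- skip anything this sketch does not model
```

With this sketch, tokenize "a Prelude.+ b" gives three tokens with the qualified operator as a single one, while tokenize "a Prelude . + b" gives five separate tokens, so the decision is made before the parser ever sees the input.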

Benjamin Franksen

7:43 p.m.

On Tuesday 30 May 2006 20:59, Brian Hulley wrote:

...

It is quite a tall order to provide immediate typed feedback of an edit buffer that will in general be syntactically incomplete but this is my eventual aim.

One issue in the area of immediate feedback is that Haskell's syntax is troublesome in a few areas. Consider:

foo :: SomeClass a => a -> a

when the user has just typed:

foo :: SomeClass a

should the editor assume SomeClass is a Tycon or a class name?

If SomeClass has been defined somewhere (in the same buffer or in some imported module), the editor will know whether it is a class or a type and can react accordingly (e.g. propose to insert '=>' or use a special color for 'SomeClass'). If not, then the editor should remain agnostic. What is the problem here? No user will expect the editor to unambiguously parse /incomplete/ code, or will they?

...

One idea I had to solve this problem was to change the syntax of Haskell slightly so that constraints would be enclosed in {} instead of preceding =>, i.e.:

foo :: {SomeClass a} a->a

so that in

foo :: {SomeClass

it is already determined that SomeClass must be a class name.

Another thing which causes difficulty is the use of qualified operators, and the fact that the qualification syntax is in the context free grammar instead of being kept in the lexical syntax (where I think it belongs). For example, afaiu according to H98 (but not GHCi) it is permissible to write:

a Prelude . + b

-- qvarsym -> [ modid . ] varsym

whereas in my prototype I put all this into a level immediately above the lexer but below the CFG so that no spaces are allowed thus:

a Prelude.+ b -- no spaces in the qvarsym
a `Prelude.(+)` b -- a little generalization
a `Prelude.(+) b -- no need for closing `

(The generalization above is intended for when you don't know whether or not the function you're qualifying has been declared as an operator but you want to use it as an operator eg if a pop-up list would appear after you typed `Prelude. with entries such as (+) plus add etc)

With the above changes, it is possible to parse Haskell (or at least as much as I got round to implementing in my C++ prototype) using a simple deterministic recursive descent parser with only 1 token of lookahead.

(There is possibly some confusion in the H98 report about exactly how ambiguous expressions involving typed case alternatives might be parsed, e.g.

x :: a->b -> if x 4 then ...

but I'm hoping it will be ok to just fix the syntax here by requiring extra brackets.)

Anyway I suppose the point of this post is to see whether or not people feel that such changes are acceptable in an editor, or whether an editor must adhere exactly to the standard (or whether the standard can be changed to enable the determinism and ease of parsing necessary for interactive editing with immediate feedback)?

I would not like an editor that forces me to use a special coding style (like using brackets where not strictly necessary). Even less would I like to use one that introduces non-standard syntax.

My humble opinion is that you'll have to bite the bullet and implement your syntax recognizer so that it conforms to Haskell'98 (including the approved addenda) [and additionally however many of the existing extensions you'll manage to support]. It may be more difficult but in the end will also be a lot more useful. And you'll have to find a way to (unobtrusively!) let the user know whether some piece of code does not yet have enough context to parse it unambiguously.

Ben
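The "use what is in scope, otherwise stay agnostic" approach suggested above could look roughly like this. It is a minimal sketch with hypothetical names throughout (NameKind, Scope, classify, suggest, exampleScope); a real editor would fill the scope from the buffer and its imports rather than from a hard-coded list.

```haskell
-- Hypothetical sketch: what an editor might know about an upper-case name
-- that appears right after "::" in a partially typed signature.
data NameKind = ClassName | TypeConstructor | Unknown
  deriving (Eq, Show)

-- Names currently in scope, e.g. collected from the buffer and its imports.
type Scope = [(String, NameKind)]

-- Look the name up; stay agnostic when nothing is known about it.
classify :: Scope -> String -> NameKind
classify scope name = maybe Unknown id (lookup name scope)

-- What the editor could offer next, depending on the classification.
suggest :: NameKind -> String
suggest ClassName       = "offer to insert '=>' after the constraint"
suggest TypeConstructor = "treat as a type; '->' may follow"
suggest Unknown         = "no suggestion; wait for more input"

-- A tiny example scope, assumed for illustration only.
exampleScope :: Scope
exampleScope = [("Show", ClassName), ("Maybe", TypeConstructor)]
```

For a buffer containing "foo :: SomeClass a", classify exampleScope "SomeClass" returns Unknown, so the editor makes no guess until the name is defined or imported.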

Brian Hulley

9:33 p.m.

Benjamin Franksen wrote:

...

On Tuesday 30 May 2006 20:59, Brian Hulley wrote:

...
It is quite a tall order to provide immediate typed feedback of an edit buffer that will in general be syntactically incomplete but this is my eventual aim.

One issue in the area of immediate feedback is that Haskell's syntax is troublesome in a few areas. Consider:

foo :: SomeClass a => a -> a

when the user has just typed:

foo :: SomeClass a

should the editor assume SomeClass is a Tycon or a class name?

If SomeClass has been defined somewhere (in the same buffer or in some imported module), the editor will know whether it is a class or a type and can react accordingly (e.g. propose to insert '=>' or use a special color for 'SomeClass'). If not, then the editor should remain agnostic. What is the problem here? No user will expect the editor to unambiguously parse /incomplete/ code, or will they?

But the buffer will nearly always be incomplete as you're editing it.

I was kind of hoping that the syntax of Haskell could be changed so that for any sequence of characters there would be a unique parse that had a minimum number of "gaps" inserted by the editor to create a complete parse tree, and moreover that this parse could be found by deterministic LL1 recursive descent.

Haskell already seems so very close to having this property - it's a real pity about =>

The problem is: when Haskell was designed, no one seems to have thought about the syntax from the point of view of making it easy to parse or ensuring that it would have this nice property for editing.

...

...
foo :: {SomeClass a} a->a

a Prelude.+ b -- no spaces in the qvarsym
a `Prelude.(+)` b -- a little generalization
a `Prelude.(+) b -- no need for closing `

I would not like an editor that forces me to use a special coding style (like using brackets where not strictly necessary). Even less would I like to use one that introduces non-standard syntax.

My humble opinion is that you'll have to bite the bullet and implement your syntax recognizer so that it conforms to Haskell'98 (including the approved addenda) [and additionally however many of the existing extensions you'll manage to support]. It may be more difficult but in the end will also be a lot more useful. And you'll have to find a way to (unobtrusively!) let the user know whether some piece of code does not yet have enough context to parse it unambiguously.

Thanks for the feedback - I suppose I was being overly optimistic to think I could just change the language to suit my editor ;-).

I'll have to find a balance between what I'm able to implement and what is specified in Haskell98 (or Haskell' etc) - just as GHCi has done with qvarsym - and perhaps have different levels of conformance to H98 so people could choose their preferred balance between conformity and instantaneousness of feedback (everything would of course still be loaded/saved as H98).

Regards, Brian.

John Meacham

10:46 p.m.

On Tue, May 30, 2006 at 10:33:05PM +0100, Brian Hulley wrote:

...

I was kind of hoping that the syntax of Haskell could be changed so that for any sequence of characters there would be a unique parse that had a minimum number of "gaps" inserted by the editor to create a complete parse tree, and moreover that this parse could be found by deterministic LL1 recursive descent.

A problem is that people don't write code like they write novels. At least I don't. I skip around, adding the beginning and ending of functions and going back and filling in the center, or writing the => first, then the type, and going back and filling in the context. Or taking existing code and changing it in pieces, deleting something there, adding it somewhere else, changing names, all of which leave intermediate states that aren't just truncated programs.

I don't see such a property actually helping that much in terms of this problem. I think you have no choice but to use heuristics to decide what the user intends and try to limit the scope of potentially incomplete source souring later things typed. Although not as popular as they once were, error-correcting parsers are a fairly advanced field in computer science; combined with the knowledge of what names are in scope, I think you can come up with fairly robust heuristics. It is by no means a trivial problem, but it is certainly a tractable one.

John

-- John Meacham - ⑆repetae.net⑆john⑈

George Beshers

11:32 p.m.

Well, my thesis (many moons ago, I assure you) was on syntax-directed editors. I came to the conclusion that letting the user do what they want is a requirement, but that "heuristics" and other "smarts" were to be avoided on the grounds that, at least for my implementation, they were more trouble than they were worth. Thus I would avoid error-correcting parsers unless you are very confident that the correction used is at least type-safe and that it is not "sticking things in" that are unwanted (or, even more maddening, removing what I just typed and which **was** what I wanted).

So my recommendation is that pointing out where the syntax and typing errors are without having to leave the editor would be great. Then the time required to actually make the corrections is minimal in terms of overall development time.

The "interesting" (graveyard laugh) problems revolve around editing a library and the program that uses it at the same time with a few obvious extensions. The "graveyard laugh" is because I rapidly found I needed transactions, and as the implementation was in C++ it had some very nasty pointer issues going to and from disk. Performance was also an issue, but that was a pre-SPARC Sun, M68020 w/ 4Meg of RAM if memory serves me correctly.

Good luck.

Daniel McAllansmith

31 May

12:19 a.m.

On Wednesday 31 May 2006 11:32, George Beshers wrote:

...

Well, my thesis (many moons ago I assure you) was on syntax directed editors. I came to the conclusion that letting the user do what they want is a requirement, but that "heuristics" and other "smarts" were to be avoided on the grounds that at least for my implementation they were more trouble than they were worth. Thus I would avoid error correcting parsers unless you are very confident that the correction used is at least type-safe and that it is not "sticking things in" that are unwanted (or even more maddening removing what I just typed and which **was** what I wanted).

I certainly agree. I've ended up loathing any editor which unilaterally decides to change what I've typed. That _might_ be because they weren't done properly... maybe.

...

So my recommendation is that pointing out where the syntax and typing errors are without having to leave the editor would be great. Then the time required to actually make the corrections is minimal in terms of overall development time.

It might have been mentioned before, but I think IntelliJ's 'Idea' does an excellent job as a Java editor. It'd be worth looking at for... ideas, as it were. It doesn't automatically correct anything; when it detects an error it makes it obvious by highlighting the offending code, putting marks on the scrollbar and colouring an indicator. When the cursor is on an error you can hit a key-combo to bring up a list of potential remedial actions.

...

The "interesting" (graveyard laugh) problems revolve around editing a library and the program that uses it at the same time with a few obvious extensions. The "graveyard laugh" is because I rapidly found I needed transactions, and as the implementation was in C++ it had some very nasty pointer issues going to and from disk. Performance was also an issue, but that was a pre-SPARC Sun, M68020 w/ 4Meg of RAM if memory serves me correctly.

Idea is, more or less, transactional. Its 'refactorings' can affect multiple entities, and they're stored in a local history so you can roll back when you want to. I'm not sure it'd run too well on that sort of machine though. :)

For what it's worth, I'd love a Haskell editor the likes of Idea (and better). Lots of refactory-goodness, auto-importing of functions, function suggestion from type, display of inferred types, etc, etc, etc.

Daniel

John Meacham

12:30 a.m.

On Wed, May 31, 2006 at 12:19:40PM +1200, Daniel McAllansmith wrote:

...

On Wednesday 31 May 2006 11:32, George Beshers wrote:

...
Well, my thesis (many moons ago I assure you) was on syntax directed editors. I came to the conclusion that letting the user do what they want is a requirement, but that "heuristics" and other "smarts" were to be avoided on the grounds that at least for my implementation they were more trouble than they were worth. Thus I would avoid error correcting parsers unless you are very confident that the correction used is at least type-safe and that it is not "sticking things in" that are unwanted (or even more maddening removing what I just typed and which **was** what I wanted).

I certainly agree. I've ended up loathing any editor which unilaterally decides to change what I've typed. That _might_ be because they weren't done properly... maybe.

Oh, I did not mean error correcting parsers as in something that would change what you wrote; I meant ones that can deal with parse errors in a local fashion without catastrophically failing to parse the whole program. Parsers that can recover from incomplete input. Parser combinators might be a better solution, as their grammar can actually be guided and changed by what is in scope rather than fixed at the parser generator phase.

Editors should never change what the person wrote in general without some prodding by the user, IMHO.

John

-- John Meacham - ⑆repetae.net⑆john⑈
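Dealing with errors "in a local fashion" can be sketched without a real parser. The example below is an assumption-laden illustration, not anything proposed in the thread: it splits the buffer into top-level chunks at lines that start in column 0 and checks each chunk on its own, using bracket balancing as a stand-in for a real per-declaration parse. The names chunks, balanced and brokenChunks are hypothetical, and the check ignores strings and comments.

```haskell
-- Split a buffer into top-level chunks: a new chunk starts at every
-- non-empty line that begins in column 0; indented lines continue a chunk.
chunks :: String -> [[String]]
chunks = foldr step [] . filter (not . null) . lines
  where
    step l ((h:t):cs) | indented h = (l : h : t) : cs
    step l cs                      = [l] : cs
    indented (ch:_) = ch == ' ' || ch == '\t'
    indented []     = False

-- Stand-in for a real parser: are the brackets in this text balanced?
-- (Strings and comments are ignored in this sketch.)
balanced :: String -> Bool
balanced = go []
  where
    go stack (c:cs)
      | c `elem` "([{" = go (c : stack) cs
      | c `elem` ")]}" = case stack of
          (o:rest) | matches o c -> go rest cs
          _                      -> False
      | otherwise      = go stack cs
    go stack [] = null stack
    matches '(' ')' = True
    matches '[' ']' = True
    matches '{' '}' = True
    matches _   _   = False

-- Zero-based indices of the chunks that fail the check. A broken chunk
-- does not stop the remaining chunks from being checked.
brokenChunks :: String -> [Int]
brokenChunks buf =
  [ i | (i, c) <- zip [0 ..] (chunks buf), not (balanced (unlines c)) ]
```

The point of the design is only that an unfinished definition early in the buffer is reported by itself, and later definitions are still checked and can still be highlighted.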

Thomas Hallgren

1 Jun

6:05 a.m.

Brian Hulley wrote:

...

Another thing which causes difficulty is the use of qualified operators, and the fact that the qualification syntax is in the context free grammar instead of being kept in the lexical syntax (where I think it belongs).

You are in luck, because according to the Haskell 98 Report, qualified names are in the lexical syntax!

http://www.haskell.org/onlinereport/syntax-iso.html

So, C.f is a qualified name, but C . f is composition of the Constructor C with the function f.

-- Thomas H
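The distinction can be seen in ordinary Haskell. In this small sketch the module alias C and the function names shout and next are chosen only for illustration.

```haskell
import qualified Data.Char as C

-- No spaces: C.toUpper is one lexeme, a qualified name from the module
-- imported as C.
shout :: String -> String
shout = map C.toUpper

-- With spaces: Just . succ is three lexemes, the constructor Just
-- composed with the function succ.
next :: Int -> Maybe Int
next = Just . succ
```

Here shout "abc" evaluates to "ABC" and next 1 evaluates to Just 2.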

Brian Hulley

7:48 a.m.

Thomas Hallgren wrote:

...

Brian Hulley wrote:

...
Another thing which causes difficulty is the use of qualified operators, and the fact that the qualification syntax is in the context free grammar instead of being kept in the lexical syntax (where I think it belongs).

You are in luck, because according to the Haskell 98 Report, qualified names are in the lexical syntax!

http://www.haskell.org/onlinereport/syntax-iso.html

So, C.f is a qualified name, but C . f is composition of the Constructor C with the function f.

Thanks for pointing this out. Although there is still a problem with the fact that var, qvar, qcon etc is in the context free syntax instead of the lexical syntax, so you could write:

2 ` plus ` 4
( Prelude.+ {- a comment -} ) 5 6

I think this must have been what was in the back of my mind.

To make parsing operator expressions simple (i.e. LL1), it is necessary to somehow treat ` plus ` as a single lexeme, but by having such a thing in the CFG instead of the lexical grammar, a "lexeme" can then occupy multiple lines (which means you can't associate each line with a list of lexemes for incremental parsing)...

Allowing "lexemes" to contain spaces and comments also makes fontification a bit more tricky. Also, I can't see any sense in making such things part of the CFG instead of just keeping them lexical - whoever would want to put spaces in a var, qcon etc? I suppose it's not an impossible problem to solve but it just makes life a lot harder for no good purpose that I can see.

Best regards, Brian.

Malcolm Wallace

2 Jun

9:55 a.m.

"Brian Hulley" wrote:

...

Thanks for pointing this out. Although there is still a problem with the fact that var, qvar, qcon etc is in the context free syntax instead of the lexical syntax so you could write:

2 ` plus ` 4
( Prelude.+ {- a comment -} ) 5 6

You appear to be right. However, I don't think I have ever seen a piece of code that actually used the first form. People seem to naturally place the backticks right next to the variable name. Should we consider the fact that whitespace and comments are permitted between backticks to be a bug in the Report? It certainly feels like it should be a lexical issue.

On the other hand, the second form looks a lot like just bracketing an ordinary expression, and whitespace and comments can frequently be seen in such a position. If we disallow whitespace in the backtick case, it would feel wrong to permit it in the parenthesised "dual".

Does anyone from the original language committee have any memory of why these choices were taken?

Regards,
Malcolm

Simon Marlow

11:18 a.m.

Malcolm Wallace wrote:

...

"Brian Hulley" wrote:

...
Thanks for pointing this out. Although there is still a problem with the fact that var, qvar, qcon etc is in the context free syntax instead of the lexical syntax so you could write:

2 ` plus ` 4
( Prelude.+ {- a comment -} ) 5 6

You appear to be right. However, I don't think I have ever seen a piece of code that actually used the first form. People seem to naturally place the backticks right next to the variable name. Should we consider the fact that whitespace and comments are permitted between backticks to be a bug in the Report? It certainly feels like it should be a lexical issue.

I tend in the other direction: I'd rather see as much as possible pushed into the context-free syntax. The only reason that qualified identifiers are in the lexical syntax currently is because of the clash with the '.' operator.

I'm not sure I can concisely explain why I think it is better to use the context-free syntax than the lexical syntax, but I'll try. I believe the lexical syntax should adhere, as far as possible, to the following rule: juxtaposition of lexemes of different classes should not affect the lexical interpretation.

In other words, whitespace between different lexemes is irrelevant. I know this rule is violated in many places in the Haskell lexical syntax, but at least for me it serves as a guideline for what is tasteful. For example, it rules out `x` as a lexeme, because ` ought to be a reserved symbol, and then `x` would be a juxtaposition of 3 lexemes.

This also explains to me why I think many of GHC's syntactic extensions are ugly (42#, (#..#), $x, [d|..|], etc.). However, it's really hard to extend the lexical syntax and stick to this rule, especially if you want to add brackets. So you should consider this a rant, nothing more.

Cheers,
Simon

Brian Hulley

1:57 p.m.

Simon Marlow wrote:

...

Malcolm Wallace wrote:

...
"Brian Hulley" wrote:

...
Thanks for pointing this out. Although there is still a problem with the fact that var, qvar, qcon etc is in the context free syntax instead of the lexical syntax so you could write:

2 ` plus ` 4
( Prelude.+ {- a comment -} ) 5 6

You appear to be right. However, I don't think I have ever seen a piece of code that actually used the first form. People seem to naturally place the backticks right next to the variable name. Should we consider the fact that whitespace and comments are permitted between backticks to be a bug in the Report? It certainly feels like it should be a lexical issue.

I tend in the other direction: I'd rather see as much as possible pushed into the context-free syntax. The only reason that qualified identifiers are in the lexical syntax currently is because of the clash with the '.' operator.

I'm not sure I can concisely explain why I think it is better to use the context-free syntax than the lexical syntax, but I'll try. I believe the lexical syntax should adhere, as far as possible, to the following rule: juxtaposition of lexemes of different classes should not affect the lexical interpretation.

in other words, whitespace between different lexemes is irrelevant.

A question here is: what is a lexeme?

For example, there are floating point numbers, which are written without spaces, but which could be considered to consist of primitive whole-number lexemes interspersed with . e -

34.678e-98

I don't see what the difference is between them and

Prelude.+

especially since we *really* need the dot for other purposes in the CFG such as composition and (hopefully at some point) field selection.

Since Prelude.+ is by the above argument a single lexeme, it seems consistent to say that

`Mod.Id`
(Mod.+)

are also single lexemes. The brackets in (Mod.+) have a lexical purpose, to turn a symbol into an id, which is very different imho from the use of brackets to parenthesise expressions or form sections.

For example, should a parser consider

( + )

to be an incomplete parenthesised expression with 2 gaps or an id formed from the symbol + ? At the moment of course it would be an id, but this causes problems when you're trying to parse Haskell and highlight incomplete expressions, because you'd expect that if the user intended to just make an id there wouldn't be any reason to leave spaces between the symbol and the brackets.

In many ways it would be a lot easier if the (lexical) grammar was changed so that the "turning a symbol into an id" would just be indicated by parentheses round the (unqualified part of the) symbol alone, not the whole thing, thus:

Prelude.(+)

so that the first lexical rule would be

1) Parentheses around an unqualified symbol turn it into an id

Then the ` could be used to turn a (possibly qualified) id into a symbol:

`Prelude.plus
`Prelude.(+)

and there would be no need for a closing `, so the second rule would be:

2) A grave before an id turns it into a symbol (that can't subsequently be turned back into an id!)

There are at least five motivations for suggesting the above changes:

1) It allows operator expressions to be parsed by LL1 recursive descent :-)
2) The low level details of whether or not a symbol or id is used are kept to the lexical level
3) You can use a qualified function as an operator without knowing in advance whether it has been declared as a symbol or an id in the module. For example, you could type x `Mod. and expect to get a pop-up list of functions in Mod, such as (+) add etc, whereas with the current rules, you'd have to go back and add graves around the qualified function if the function was declared as an id and remove the grave if it was already declared as an operator.
4) Only one grave is needed :-)
5) An editor can give more feedback, by distinguishing between incomplete expressions and the turning of symbols into ids

Regards, Brian.

Doaitse Swierstra

4 Jun

7:27 p.m.

One might want to take a look at:

http://www.cs.uu.nl/research/projects/proxima/

where we have built (among other things) an editing environment for Helium programs (Helium is a subset of Haskell).

Doaitse

On 2006 jun 02, at 10:57, Brian Hulley wrote:

...

Simon Marlow wrote:

...
Malcolm Wallace wrote:

...
"Brian Hulley" wrote:

...
Thanks for pointing this out. Although there is still a problem with the fact that var, qvar, qcon etc is in the context free syntax instead of the lexical syntax so you could write:

2 ` plus ` 4 ( Prelude.+ {- a comment -} ) 5 6

You appear to be right. However, I don't think I have ever seen a piece of code that actually used the first form. People seem to naturally place the backticks right next to the variable name. Should we consider the fact that whitespace and comments are permitted between backticks to be a bug in the Report? It certainly feels like it should be a lexical issue.

I tend in the other direction: I'd rather see as much as possible pushed into the context-free syntax. The only reason that qualified identifiers are in the lexical syntax currently is because of the clash with the '.' operator.

I'm not sure I can concisely explain why I think it is better to use the context-free syntax than the lexical syntax, but I'll try. I believe the lexical syntax should adhere, as far as possible, to the following rule: juxtaposition of lexemes of different classes should not affect the lexical interpretation.

in other words, whitespace between different lexemes is irrelevant.

A question here is: what is a lexeme?

For example there are floating point numbers, which are written without spaces, but which could be considered to consist of primitive whole-number lexemes interspersed with . e -

34.678e-98

I don't see what the difference is between them and

Prelude.+

especially since we *really* need the dot for other purposes in the CFG such as composition and (hopefully at some point) field selection.

Since Prelude.+ is by the above argument a single lexeme, it seems consistent to say that

`Mod.Id` (Mod.+)

are also single lexemes. The brackets in (Mod.+) have a lexical purpose, to turn a symbol into an id, which is very different imho from the use of brackets to parenthesise expressions or form sections.

For example, should a parser consider ( + ) to be an incomplete parenthesised expression with 2 gaps or an id formed from the symbol + ? At the moment of course it would be an id, but this causes problems when you're trying to parse Haskell and highlight incomplete expressions, because you'd expect that if the user intended to just make an id there wouldn't be any reason to leave spaces between the symbol and the brackets.

In many ways it would be a lot easier if the (lexical) grammar was changed so that the "turning a symbol into an id" would just be indicated by parentheses round the (unqualified part of the) symbol alone not the whole thing thus:

Prelude.(+)

so that the first lexical rule would be

1) Parentheses around an unqualified symbol turn it into an id

Then the ` could be used to turn a (possibly qualified) id into a symbol:

`Prelude.plus `Prelude.(+)

and there would be no need for a closing `, so the second rule would be:

2) A grave before an id turns it into a symbol (that can't subsequently be turned back into an id!)

There are at least five motivations for suggesting the above changes:

1) It allows operator expressions to be parsed by LL1 recursive descent :-)
2) The low level details of whether or not a symbol or id is used is kept to the lexical level
3) You can use a qualified function and an operator without knowing in advance whether it has been declared as a symbol or an id in the module. For example, you could type x `Mod. and expect to get a pop-up list of functions in Mod, such as (+) add etc, whereas with the current rules, you'd have to go back and add graves around the qualified function if the function was declared as an id and remove the grave if it was already declared as an operator.
4) Only one grave is needed :-)
5) An editor can give more feedback, by distinguishing between incomplete expressions and the turning of symbols into ids

Regards, Brian.

-- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com

Simon Peyton-Jones

11:53 a.m.

Brian

You probably know this, but your kind of application is a big reason that we now make GHC available as a library. (Just say 'import GHC'.)

You shouldn't need to parse Haskell yourself: just call GHC's parser. You get back a syntax tree with very precise location information that can guide your editor (e.g. if you want to select a sub-expression). Similarly, you can call the type checker.

All this is much harder with half-written programs, but it seems acceptable to re-run the process at every keystroke, or at least at every pause.

Simon

| -----Original Message-----
| From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of
| Brian Hulley
| Sent: 30 May 2006 20:00
| To: Mathew Mills; Bjorn Bringert; Christopher Brown
| Cc: Walter Potter; haskell-cafe@haskell.org
| Subject: Re: [Haskell-cafe] Editors for Haskell
|
| Mathew Mills wrote:
| > With Haskell's lovely strong static typing, it is a crying shame we
| > don't have an editor with immediate feedback, ala Eclipse.
|
| I've started writing an editor for Haskell. (It will be a commercial
| product)
| The first prototype was in C - now I'm re-writing from scratch in Haskell.
|
| It is quite a tall order to provide immediate typed feedback of an edit
| buffer that will in general be syntactically incomplete but this is my
| eventual aim.
|
| One issue in the area of immediate feedback is that Haskell's syntax is
| troublesome in a few areas. Consider:
|
| foo :: SomeClass a => a -> a
|
| when the user has just typed:
|
| foo :: SomeClass a
|
| should the editor assume SomeClass is a Tycon or a class name?
|
| One idea I had to solve this problem was to change the syntax of Haskell
| slightly so that constraints would be enclosed in {} instead of preceding
| => ie:
|
| foo :: {SomeClass a} a->a
|
| so that in
|
| foo :: {SomeClass
|
| it is already determined that SomeClass must be a class name.
|
| Another thing which causes difficulty is the use of qualified operators, and
| the fact that the qualification syntax is in the context free grammar
| instead of being kept in the lexical syntax (where I think it belongs). For
| example, afaiu according to H98 (but not GHCi) it is permissible to write:
|
| a Prelude . + b
|
| -- qvarsym -> [ modid . ] varsym
|
| whereas in my prototype I put all this into a level immediately above the
| lexer but below the CFG so that no spaces are allowed thus:
|
| a Prelude.+ b -- no spaces in the qvarsym
| a `Prelude.(+)` b -- a little generalization
| a `Prelude.(+) b -- no need for closing `
|
| (The generalization above is intended for when you don't know whether or not
| the function you're qualifying has been declared as an operator but you want
| to use it as an operator eg if a pop-up list would appear after you typed
| `Prelude. with entries such as (+) plus add etc)
|
| With the above changes, it is possible to parse Haskell (or at least as much
| as I got round to implementing in my C++ prototype) using a simple
| deterministic recursive descent parser with only 1 token of lookahead.
|
| (There is possibly some confusion in the H98 report about exactly how
| ambiguous expressions involving typed case alternatives might be parsed eg x
| :: a->b -> if x 4 then ... but I'm hoping it will be ok to just fix the
| syntax here by requiring extra brackets)
|
| Anyway I suppose the point of this post is to see whether or not people feel
| that such changes are acceptable in an editor, or whether an editor must
| adhere exactly to the standard (or whether the standard can be changed to
| enable the determinism and ease of parsing necessary for interactive editing
| with immediate feedback)?
|
| Regards, Brian.
| --
| Logic empowers us and Love gives us purpose.
| Yet still phantoms restless for eras long past,
| congealed in the present in unthought forms,
| strive mightily unseen to destroy us.
|
| http://www.metamilk.com

Pete Kazmier

7 Jun

12:12 p.m.

"Simon Peyton-Jones" writes:

...

You probably know this, but your kind of application is a big reason that we now make GHC available as a library. (Just say 'import GHC'.)

You shouldn't need to parse Haskell yourself: just call GHC's parser. You get back a syntax tree with very precise location information that can guide your editor (e.g. if you want to select a sub-expression). Similarly, you can call the type checker.

Are there any small examples of using GHC's parser? I'm a complete newbie so perhaps I'm not checking all of the relevant locations for docs, but I can't seem to find this parser that is being referred to. I checked out the source tree to GHC as well, but I have no idea where to look in there (not to mention it's a bit intimidating). Pointers would be appreciated!

As part of my learning experience, I think I want to see if I can write a haskell pastebin that does proper syntax highlighting. Someone in #haskell suggested that I use just a lexer because using a parser is overkill. However, I can't make this assessment until I see how to use the parser and the information it can supply.

Thanks, Pete

Neil Mitchell

12:25 p.m.

Hi Pete

...

As part of my learning experience, I think I want to see if I can write a haskell pastebin that does proper syntax highlighting. Someone in #haskell suggested that I use just a lexer because using a parser is overkill. However, I can't make this assessment until I see how to use the parser and the information it can supply.

You might want to check out HsColour, which does pretty much exactly what you asked for:

http://www.cs.york.ac.uk/fp/darcs/hscolour/

Thanks

Neil

Malcolm Wallace

12:31 p.m.

Pete Kazmier wrote:

...

As part of my learning experience, I think I want to see if I can write a haskell pastebin that does proper syntax highlighting. Someone in #haskell suggested that I use just a lexer because using a parser is overkill. However, I can't make this assessment until I see how to use the parser and the information it can supply.

For simple static highlighting, a lexical analysis is more than adequate. (You've seen http://www.cs.york.ac.uk/fp/darcs/hscolour, right?) You only need a full parser if you want to do (e.g.) hyperlinks from a variable usage to its definition site. (You've seen Programatica as well, http://www.cse.ogi.edu/~hallgren/h2h.html, right?)

Regards, Malcolm
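To give a feel for how far purely lexical highlighting gets, here is a minimal sketch that splits source text into tokens with the Prelude's `lex` and assigns each token a class. It is not how HsColour works internally; the names `TokenClass`, `keywords`, `tokens`, `classify` and `highlight` are made up for this example. It also assumes the input contains no comments, and since `lex` discards whitespace, a real highlighter would need its own tokenizer to preserve layout.

```haskell
import Data.Char (isDigit, isLower, isUpper)
import Data.List (unfoldr)

-- Classes a highlighter might colour differently (hypothetical names).
data TokenClass = Keyword | Constructor | Variable | Number | Literal | Symbol
  deriving (Eq, Show)

-- Reserved words of Haskell 98.
keywords :: [String]
keywords =
  [ "case", "class", "data", "default", "deriving", "do", "else", "if"
  , "import", "in", "infix", "infixl", "infixr", "instance", "let"
  , "module", "newtype", "of", "then", "type", "where"
  ]

-- Split a string into tokens using the Prelude's 'lex'.
-- Stops at end of input or at the first thing 'lex' cannot handle.
tokens :: String -> [String]
tokens = unfoldr step
  where
    step s = case lex s of
      ((tok@(_ : _), rest) : _) -> Just (tok, rest)
      _ -> Nothing

-- Decide the class of a token from its spelling alone.
classify :: String -> TokenClass
classify tok@(c : _)
  | tok `elem` keywords = Keyword
  | isUpper c = Constructor
  | isLower c || c == '_' = Variable
  | isDigit c = Number
  | c == '"' || c == '\'' = Literal
  | otherwise = Symbol
classify [] = Symbol

-- Pair every token with its class.
highlight :: String -> [(String, TokenClass)]
highlight src = [(t, classify t) | t <- tokens src]
```

For instance, `highlight "let x = Just 42"` pairs `let` with `Keyword`, `x` with `Variable`, `=` with `Symbol`, `Just` with `Constructor` and `42` with `Number`. Nothing here knows where `x` is defined, which is the point at which a parser becomes necessary.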

Pete Kazmier

3:50 p.m.

Pete Kazmier writes:

...

As part of my learning experience, I think I want to see if I can write a haskell pastebin that does proper syntax highlighting. Someone in #haskell suggested that I use just a lexer because using a parser is overkill. However, I can't make this assessment until I see how to use the parser and the information it can supply.

Thanks for the responses and pointers to the other projects. I'll investigate those after the day-job (the one that pays the bills). As for using the lexer vs the parser, I was hoping to do things such as folding and/or nifty mouse-overs of logical blocks of code, which is why I was interested in the parser. I'm not sure if I could do the same with only a lexer. I'm basically just looking for something concrete to tinker with as I learn Haskell and it seems that Haskell is missing a snazzy pastebin. Thanks, Pete
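As a rough illustration of what folding could look like without a parser, the sketch below treats every line that starts in column 0 as the start of a block and lets the block run over the indented lines that follow. This is only an approximation under stated assumptions: `foldRegions` and `startsBlock` are names invented here, a type signature and its definition come out as two separate regions, and comments or literate Haskell are not considered. Anything finer than top-level blocks (a `where` clause, a `case` alternative) would need the layout rule or a real parser.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | For each top-level block, its first and last line number (1-based).
foldRegions :: [String] -> [(Int, Int)]
foldRegions ls = go (zip [1 ..] ls)
  where
    go [] = []
    go ((n, l) : rest)
      | startsBlock l =
          let (body, others) = break (startsBlock . snd) rest
              -- Trailing blank lines are not part of the block.
              kept = dropWhileEnd (all isSpace . snd) body
              end = if null kept then n else fst (last kept)
           in (n, end) : go others
      | otherwise = go rest

    -- A block starts at a non-empty line whose first character is not a space.
    startsBlock (c : _) = not (isSpace c)
    startsBlock [] = False
```

Given the four lines `f x =`, `  x + 1`, an empty line and `g = 2`, this yields the regions `(1,2)` and `(4,4)`.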

Simon Peyton-Jones

8 Jun

8:19 a.m.

| > You probably know this, but your kind of application is a big reason
| > that we now make GHC available as a library. (Just say 'import GHC'.)
| >
| > You shouldn't need to parse Haskell yourself: just call GHC's parser.
| > You get back a syntax tree with very precise location information that
| > can guide your editor (e.g. if you want to select a sub-expression).
| > Similarly, you can call the type checker.
|
| Are there any small examples of using GHC's parser? I'm a complete
| newbie so perhaps I'm not checking all of the relevant locations for
| docs, but I can't seem to find this parser that is being referred to.
| I checked out the source tree to GHC as well, but I have no idea where
| to look in there (not to mention it's a bit intimidating). Pointers
| would be appreciated!

The interface is exported by the module 'GHC', which is in GHC's sources in main/GHC.hs. Sadly it does not have Haddock-ised documentation, because Haddock isn't fully-featured enough to read GHC's source code. (With a bit of luck the SoC project will fix that problem.) Meanwhile, your best bet is to look at the interface and explore.

There is a short web page giving an example of using GHC as a library here:

http://haskell.org/haskellwiki/GHC/As_a_library

Please add to it as you find out what to do!

simon

|
| As part of my learning experience, I think I want to see if I can
| write a haskell pastebin that does proper syntax highlighting.
| Someone in #haskell suggested that I use just a lexer because using a
| parser is overkill. However, I can't make this assessment until I see
| how to use the parser and the information it can supply.
|
| Thanks,
| Pete

22 comments

16 participants

Benjamin Franksen
Bjorn Bringert
Brian Hulley
Christopher Brown
Daniel McAllansmith
Doaitse Swierstra
George Beshers
John Meacham
Malcolm Wallace
Mathew Mills
Neil Mitchell
Pete Kazmier
Simon Marlow
Simon Peyton-Jones
Thomas Hallgren
Walter Potter