A better syntax for qualified operators?

Hi - Consider the scenario when you want to find a function that returns the i'th element of an array but all you know is that there is a module called Data.Array.IArray that will probably have such a function in it. So you start typing in your program:

let ith = Data.Array.IArray.

At this point, you'd hope the editor you're using would somehow display a list of available values exported from Data.Array.IArray, including the indexing function, so you could select it. Thus I would *like* to be able to use the syntax:

let ith = Data.Array.IArray.(!)

because it's not the user's fault that the person who wrote Data.Array.IArray decided to use a symbol instead of an identifier for this function - the user of Data.Array.IArray in this case just wants to see normal identifiers to use with prefix application, so the use of (!) at this point effectively gets rid of the unwanted operatorness associated with the function.

However the current syntax of Haskell would not allow this. Instead you have to write:

let ith = (Data.Array.IArray.!)

The problem is that the user of Data.Array.IArray has to know already in advance, before typing the 'D' of "Data", that the indexing function has been named with a symbol instead of an identifier, but this knowledge is only available later, when the user has typed the '.' after "IArray", so the current syntax would be frustrating for the user because the user then has to go all the way back and insert an opening paren before the 'D'.

Also, consider the appearance of:

let ith = (Data.Array.IArray.!) arr i
    b   = Data.Array.IArray.bounds arr

vs

let ith = Data.Array.IArray.(!) arr i
    b   = Data.Array.IArray.bounds arr

I'm not sure if I've managed to explain this problem clearly enough, but my proposal is that we might consider changing the lexical syntax of Haskell as follows:

varId      ::= id
varOp      ::= symbol
varIdOp    ::= ` varId
varOpId    ::= ( varOp )
varOpIdOp  ::= ` varOpId

qvarId     ::= {conId .}+ varId      -- { }+ denotes 1 or more times
qvarIdOp   ::= ` qvarId
qvarOp     ::= {conId .}+ varOp
qvarOpId   ::= {conId .}+ varOpId
qvarOpIdOp ::= ` qvarOpId

In other words, to turn an operator symbol into an id, the parentheses would be put immediately around the symbol (with no spaces, since this is lexical syntax), and to turn an id into an operator the backquote is put in front of the entire (qualified) id. (Also, the trailing backquote in the existing syntax is redundant.)

The above syntax would have 3 advantages:

1) It allows the client of a module to write code without having to worry whether the author of the module used symbols or identifiers to name functions - everything exported from the module can be made to appear as if it was named by an identifier (ie OpId)

2) Moving the parentheses to the lexical syntax makes syntax highlighting easier (because there are no comments to worry about inside the OpId) and also makes parsing simpler, because all the mess associated with Ops versus Ids is handled by the lexer

3) It allows an editor to make a distinction between

(+)     -- an operator turned into an identifier - varOpId
( + )   -- an expression with 2 gaps in it which should be marked as incomplete
(+ )    -- a section with 1 gap

Some examples of the proposed syntax are:

let ith = Data.Array.IArray.(!) arr i
foo = k `Math.(+) 6   -- default precedence
bar = k Math.+ 6      -- using precedence of + in module Math

When you try to write an editor for Haskell (or some subset of it), you quickly discover areas of Haskell syntax like the above which need to be changed to get an optimum interactive editing experience. I think it *is* possible to adjust the Haskell grammar so that it is LL(1), and the only reason it is not already LL(1) seems to be that the grammar has been designed with compilers (which only need to deal with complete modules) in mind rather than with programmers interactively editing in mind. (The other change needed for LL(1) is to give contexts a marker before they appear, eg:

foo :: {MonadIO m} a -> m a )

By LL(1) I really mean that the grammar for interactive editing needs to be adjusted so that it is possible to maintain the invariant that, as code is entered from left to right, constructs and identifiers can be highlighted according to their grammatical role, and the highlighting (modulo incompleteness) must remain unchanged regardless of whatever is typed afterwards to the right; otherwise it can become more of a liability than a help. Hence my hope that some future revision of the Haskell grammar might consider taking the above points into account.

Regards, Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com
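[For reference, a minimal sketch of what the current syntax requires here - prefix use wraps the whole qualified name in parentheses, while infix use keeps the qualifier attached to the symbol. It assumes GHC with the standard array package; the function bodies are only illustrative.]

    module Main where

    import qualified Data.Array.IArray

    arr :: Data.Array.IArray.Array Int Char
    arr = Data.Array.IArray.listArray (0, 2) "abc"

    -- prefix use today: the parentheses must wrap the whole qualified name
    ith :: Data.Array.IArray.Array Int Char -> Int -> Char
    ith = (Data.Array.IArray.!)

    -- infix use today: the qualifier stays attached to the operator symbol
    second :: Char
    second = arr Data.Array.IArray.! 1

    main :: IO ()
    main = print (ith arr 0, second)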

Brian Hulley wrote:
Hi - Consider the scenario when you want to find a function that returns the i'th element of an array but all you know is that there is a module called Data.Array.IArray that will probably have such a function in it. So you start typing in your program:
let ith = Data.Array.IArray.
at this point, you'd hope the editor you're using would somehow display a list of available values exported from Data.Array.IArray including the indexing function, so you could select it, thus I would *like* to be able to use the syntax:
let ith = Data.Array.IArray.(!)
because it's not the user's fault that the person who wrote Data.Array.IArray decided to use a symbol instead of an identifier for this function - the user of Data.Array.IArray in this case just wants to see normal identifiers to use with prefix application so the use of (!) at this point effectively gets rid of the unwanted operatorness associated with the function.
This is a nice argument
However the current syntax of Haskell would not allow this. Instead you have to write:
let ith = (Data.Array.IArray.!)
The problem is that the user of Data.Array.IArray has to know already in advance, before typing the 'D' of "Data", that the indexing function has been named with a symbol instead of an identifier, but this knowledge is only available later, when the user has typed the '.' after "IArray", so the current syntax would be frustrating for the user because the user then has to go all the way back and insert an opening paren before the 'D'.
Also, consider the appearance of:
let ith = (Data.Array.IArray.!) arr i
    b   = Data.Array.IArray.bounds arr

vs

let ith = Data.Array.IArray.(!) arr i
    b   = Data.Array.IArray.bounds arr
I'm not sure if I've managed to explain this problem clearly enough, but my proposal is that we might consider changing the lexical syntax of Haskell as follows:
varId      ::= id
varOp      ::= symbol
varIdOp    ::= ` varId
varOpId    ::= ( varOp )
varOpIdOp  ::= ` varOpId

qvarId     ::= {conId .}+ varId      -- { }+ denotes 1 or more times
qvarIdOp   ::= ` qvarId
qvarOp     ::= {conId .}+ varOp
qvarOpId   ::= {conId .}+ varOpId
qvarOpIdOp ::= ` qvarOpId
In other words, to turn an operator symbol into an id, the parentheses would be put immediately around the symbol (with no spaces since this is lexical syntax), and to turn an id into an operator the backquote is put in front of the entire (qualified) id.

Why does the nice argument not apply equally well to infixifying things?
If I think I want to use something infix from Data.Array and start typing

myArr Data.Array.

and find out element access has become "get" while I wasn't looking, it's not my fault the author of Data.Array decided to use a plain identifier when I was expecting an operator - shouldn't I be able to write

myArr Data.Arr.`get` ix
(Also the trailing backquote in the existing syntax is redundant)
The trailing backquote is just as redundant as the trailing close paren in the syntax for using a symbol as a prefix function, and it is just as important to my comment on backticks as the closing paren is to your proposal for sections - it means it's lexically apparent, at least at one side of the identifier, that it's a section/infixification.

Brandon
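[For comparison, a minimal sketch of what today's syntax already allows here - qualified names can go inside backticks, and both delimiters are required; it assumes GHC's base library, and the names are only illustrative.]

    import qualified Data.List

    -- current syntax: the backticks wrap the whole qualified identifier,
    -- and both the opening and the closing backquote must be present
    thirdChar :: Char
    thirdChar = "abcd" `Data.List.genericIndex` (2 :: Integer)

    -- the closing delimiter is what makes a one-sided section like this
    -- lexically recognisable as complete
    half :: Integer -> Integer
    half = (`div` 2)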

Brandon Moore wrote:
Brian Hulley wrote:
I would *like* to be able to use the syntax:

ith = Data.Array.IArray.(!)
Why does the nice argument not apply equally well to infixifying things? Shouldn't I be able to write
myArr Data.Arr.`get` ix
Good point. This would also remove the need for allowing double conversion as in OpIdOp, which was an element of asymmetry in my original proposal. Thus I revise my proposal to the following:

varId   ::= id
varOp   ::= symbol
varIdOp ::= ` varId `
varOpId ::= ( varOp )
q<x>    ::= {conId .}+ x

so the concerns of qualification and Id, Op, Id<->Op would now be separated (the point being that you can only make a decision regarding Id<->Op when you know whether or not you're starting with an Id or an Op, and you only know this latter fact when you've already "arrived" at the module by typing the qualifier).
(Also the trailing backquote in the existing syntax is redundant)
The trailing backquote is just as redundant as the trailing close paren in the syntax for using a symbol as a prefix function and just as important for my comment on backticks as the closing paren is to your proposal for sections - it means it's lexically apparent at least at one side of the identifier that it's a section/infixification
I'm not sure I understand this argument for why a trailing backquote is needed, though I can see it would be needed if the dist-fix idea proposed by Benjamin Franksen in http://www.haskell.org/pipermail/haskell-cafe/2006-August/017373.html was adopted, eg:

Control.K.`if cond Control.K.`then` t Control.K.`else` f Control.K.fi`

Regards, Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com

On Wednesday 27 September 2006 22:20, Brian Hulley wrote:
(The other change needed for LL(1) is to give contexts a marker before they appear eg:
foo :: {MonadIO m} a -> m a )
Or move contexts to the end of a type and separate them with a | like Clean does (see 6.2 of http://clean.cs.ru.nl/download/Clean20/doc/CleanRep2.0.pdf):

foo :: a -> m a | MonadIO m

Personally I like this style because I always think first about what the type of the function should be (a -> m a) and then about the contexts / restrictions that hold for the variables (MonadIO).

I do the same thinking when writing list comprehensions. First I think of the general form of the elements in the list:

[ (a, b) ...

then I think about the restrictions on the variables:

| a <- [1..10], b <- [1..10], a > b]

Bas van Dijk
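[For comparison, the same two fragments in today's Haskell - the context leads the type, while the list comprehension already puts the restrictions last. A minimal sketch assuming GHC's base library; the function bodies are only illustrative.]

    import Control.Monad.IO.Class (MonadIO)

    -- current Haskell: the context comes before the type it constrains
    foo :: MonadIO m => a -> m a
    foo = return

    -- the list-comprehension analogy: the shape of the elements first,
    -- the restrictions on the variables afterwards
    pairs :: [(Int, Int)]
    pairs = [ (a, b) | a <- [1..10], b <- [1..10], a > b ]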

Bas van Dijk wrote:
On Wednesday 27 September 2006 22:20, Brian Hulley wrote:
(The other change needed for LL(1) is to give contexts a marker before they appear eg:
foo :: {MonadIO m} a -> m a )
Or move contexts to the end of a type and separate it with a | like Clean does: (See 6.2 of http://clean.cs.ru.nl/download/Clean20/doc/CleanRep2.0.pdf)
foo :: a -> m a | MonadIO m
Personally I like this style because I always think first about what the type of the function should be (a -> m a) and then about the contexts / restrictions that hold for the variables (MonadIO).
I do the same thinking when writing list comprehensions. First I think of the general form of the elements in the list: [ (a, b) ... then I think about the restrictions on the variables: | a <- [1..10], b <- [1..10], a > b]
I also like this style (introduced to the Haskell world by Bulat in http://www.haskell.org/pipermail/haskell/2006-September/018466.html) for the same reasons, and also because visually I think it looks really great.

The problem with it as far as an editor is concerned is that the type variables are used before they are quantified, so any feedback about lexical scoping may be wrong until the context has been entered, eg:

class C q where
    foo :: q -> q

instance C (T a b) | Eq a, Eq b where
    foo x =
        let
            g :: a -> a | a, Eq a
            g = ...
        in g x

Here the local nature of the type variable 'a' in 'g' is not known while a -> a is being entered, ie the user knows that the 'a' is going to be a local type variable, but the editor will tell the user it is the 'a' of the instance decl, thus resulting in a poor editing experience. An alternative would be to write:

let g :: forall a. a -> a | Eq a

but this isn't so nice, because then the quantifier is separated from the restrictions in the context, which I think should really be regarded as part of the quantifier itself. A possible solution would be to just forbid shadowing of type variables altogether, which is maybe not as draconian as it sounds, especially when there is so much to gain from using the neat | syntax and a good precedent for its use.

As an aside, regarding lexical scoping of type variables, it might in future be very useful to have the rule that variables not explicitly quantified are scoped over the entire enclosing *top level* declaration (or the highest level decl that makes sense), hence the need for a very parsimonious syntax such as

f :: a -> b | a, b

instead of the long-winded and sometimes confusing "forall" keyword.

Regards, Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com
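[For reference, how today's Haskell handles the scoping question raised above: with GHC's ScopedTypeVariables the explicit forall both quantifies and scopes the variable. A minimal sketch; the function names are made up.]

    {-# LANGUAGE ScopedTypeVariables #-}

    -- the explicit forall is what brings 'a' into scope in the where clause
    f :: forall a. Eq a => a -> a
    f x = g x
      where
        g :: a -> a   -- this 'a' is the one quantified by the forall above
        g y = if y == x then y else y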

Hello Bas, Thursday, September 28, 2006, 2:39:13 AM, you wrote:
foo :: {MonadIO m} a -> m a
Or move contexts to the end of a type and separate it with a | like Clean
foo :: a -> m a | MonadIO m
I've proposed both these constructs here on the list some time ago :)))) but we didn't decide...

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com

On Wed, 27 Sep 2006, Brian Hulley wrote:
Hi - Consider the scenario when you want to find a function that returns the i'th element of an array but all you know is that there is a module called Data.Array.IArray that will probably have such a function in it. So you start typing in your program:
let ith = Data.Array.IArray.
at this point, you'd hope the editor you're using would somehow display a list of available values exported from Data.Array.IArray including the indexing function, so you could select it, thus I would *like* to be able to use the syntax:
let ith = Data.Array.IArray.(!)
because it's not the user's fault that the person who wrote Data.Array.IArray decided to use a symbol instead of an identifier for this function - the user of Data.Array.IArray in this case just wants to see normal identifiers to use with prefix application so the use of (!) at this point effectively gets rid of the unwanted operatorness associated with the function.
This cool editor is able to show a list of functions with the given qualification but is not able to enclose the qualified identifier in parentheses?

I don't think that it is a good idea to move the qualification away from the qualified identifier. The parentheses around the infix operator are a special case of sections. With the proposed syntactic change, we would have two meanings of parentheses: sections, and making an infix operator prefix. One can also mix up Data.Array.IArray.(!) more easily with the function composition Data.Array.IArray . (!), if Data.Array.IArray is also a constructor.
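[A sketch of the ambiguity described above, with made-up names: when a capitalised name happens to be a data constructor, putting spaces around the dot already gives ordinary function composition, so the proposed form would differ from composition only in whitespace. Data.Map is used only to get a qualified (!).]

    import qualified Data.Map as Map

    newtype Lookup = IArray (Int -> String)   -- 'IArray' is a made-up constructor here

    table :: Map.Map Int String
    table = Map.fromList [(0, "zero"), (1, "one")]

    -- with spaces this is composition: IArray composed with (Map.!)
    composed :: Lookup
    composed = IArray . (Map.!) $ table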

Henning Thielemann wrote:
On Wed, 27 Sep 2006, Brian Hulley wrote:
ith = Data.Array.IArray.(!)
This cool editor is able to show a list of functions with the given qualification but is not able to enclose the qualified identifier in parentheses?
I hadn't thought of that...
I don't think that it is a good idea to move the qualification away from the qualified identifier. The parentheses around the infix operator are a special case of sections. With the proposed syntactic change, we would have two meanings of parentheses: Section and making an infix operator prefix. One can also mix up Data.Array.IArray.(!) more easily with function composition Data.Array.IArray . (!) if Data.Array.IArray is also a constructor.
Yes, this problem was at the back of my mind. I think I'm coming round to the view that the original syntax is actually better after all. It also turns out that it's rather difficult to deal with Data.Array.IArray.(+) at the lexical level, because there are 3 different ways in which this can be incomplete, eg:

Data.Array.IArray.
Data.Array.IArray.(
Data.Array.IArray.(+

leading either to too many different tokens to represent the above variations, or else a very heavyweight token with many fields.

I had originally thought it would be easier to have (+) and `div` as lexemes so that it would be easier to parse expressions involving operators in the CFG, but as long as http://hackage.haskell.org/trac/haskell-prime/wiki/FixityResolution is accepted, this reason will disappear.

Thanks for the good justification of the existing syntax,
Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com
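[A hypothetical sketch of the "very heavyweight token" mentioned above - the type and field names are invented, just to show why tracking an incomplete qualified operator in a single lexeme gets messy.]

    -- one lexeme that has to remember how far an incomplete qualified
    -- operator has got (all names here are hypothetical)
    data QualOpLexeme = QualOpLexeme
        { qualifier    :: [String]       -- e.g. ["Data", "Array", "IArray"]
        , sawOpenParen :: Bool           -- has "(" been typed after the final "."?
        , partialOp    :: Maybe String   -- operator characters typed so far, if any
        , closed       :: Bool           -- has the closing ")" been typed?
        }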

Brian Hulley wrote:
Consider the scenario when you want to find a function that returns the i'th element of an array but all you know is that there is a module called Data.Array.IArray that will probably have such a function in it. So you start typing in your program:
let ith = Data.Array.IArray.
at this point, you'd hope the editor you're using would somehow display a list of available values exported from Data.Array.IArray including the indexing function, so you could select it, thus I would *like* to be able to use the syntax:
let ith = Data.Array.IArray.(!)
because it's not the user's fault that the person who wrote Data.Array.IArray decided to use a symbol instead of an identifier for this function - the user of Data.Array.IArray in this case just wants to see normal identifiers to use with prefix application so the use of (!) at this point effectively gets rid of the unwanted operatorness associated with the function.
However the current syntax of Haskell would not allow this. Instead you have to write:
let ith = (Data.Array.IArray.!)
The problem is that the user of Data.Array.IArray has to know already in advance, before typing the 'D' of "Data", that the indexing function has been named with a symbol instead of an identifier, but this knowledge is only available later, when the user has typed the '.' after "IArray", so the current syntax would be frustrating for the user because the user then has to go all the way back and insert an opening paren before the 'D'.
Sorry, but I can't see the problem here. Why can't the editor offer the operator as '!' in the list of options, and if the user selects it insert both '(' and ')' at the right places (i.e. before the module name and after the operator)? Is there some unwritten law that forbids editors or IDEs from inserting stuff at positions other than the current cursor position? Isn't Haskell (assuming you are programming your editor in Haskell) supposed to make hard things easy(er)?

Rather than complain about libraries that offer operators (which I personally like very much, thank you) or propose to change the language, use your imagination and design a new human interface that deals with the language in the most useful way. [BTW, I recommend you take a look at what Conor McBride did to support 'visual' (dependently typed) programming in Epigram. There are many good ideas there to steal from ;)]

Generally speaking, I would always hesitate to change the language so it better suits programming tools(*). It is the tools which should adapt to the language, even if that means the programmer has to find new ways of supporting the user (and the language). The most important reason being that code is more often read than written.

That said, there still might be good arguments to change the syntax in the way you propose, just not the one you gave above. In fact, I am half of a mind to say I like Data.Array.IArray.(!) better than (Data.Array.IArray.!). My reason is that the former is more readable, because it highlights the operator symbol by surrounding it with parentheses, whereas the latter obscures it.

At the danger of becoming completely off-topic now (sorry!), I have to say that I find /both/ versions ugly and unnecessarily hard to read. My personal solution is to generally avoid qualified imports. I use them only if absolutely necessary to disambiguate some symbol, and then just for that symbol. I am aware that there is an opposing faction here, who tries to convince everyone that qualified import should be the standard (and the actual exported symbols --at least some of them-- meaningless, such as 'C' or 'T').

I think such a convention is inappropriate for a functional language (especially one with user-defined operators). There simply is no natural 1:1 correspondence between a data type declaration and the functions acting on that data built into the language, as opposed to e.g. OO languages. Extensibility in the functional dimension, i.e. the ability to arbitrarily add functions that operate on some data without having to change the code (module) that defines the data, is one of the hallmarks of functional programming, as opposed to OO programming.(**)

Cheers
Ben

--
(*) Yes, I know it has at least to be compilable, preferably in an efficient way. However, that doesn't invalidate the argument. Indeed, it is bad enough that we have to compromise in order to make our languages implementable with current compiler technology. No need to compromise on much less essential equipment, such as syntax-aware editing tools.

(**) One could argue that this extensibility is lost anyway as soon as /abstract/ data types come into play. However, [warning: it gets even more off-topic from here on] nothing prevents us from offering /two/ interfaces (visible modules), one where the data type is abstract ("client interface") and a different one where it is concrete ("extension interface"). Of course, only the abstract (client) interface guarantees that invariants cannot be broken by the user code. But that doesn't mean a concrete interface wouldn't be useful. How often have there been requests to add this or that function to some standard library abstract data type? If the concrete type were available via an extension interface, users who need additional functions that operate directly on the underlying data structure (for instance for efficiency reasons) could then build and also publish such extension modules (with, again, an abstract interface so usage is safe). [I am aware that this can lead to maintenance problems if the internal data type or invariants get changed. I still think it is better to have the possibility to add new functions (and live with the eventual consequences) than not to have it.]
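[A minimal sketch of the two-interface idea from footnote (**): the module and function names are made up, and the two modules would of course live in separate files.]

    -- Stack/Internal.hs : the "extension interface", representation visible
    module Stack.Internal where

    newtype Stack a = Stack { unStack :: [a] }

    -- Stack.hs : the "client interface", same type re-exported abstractly
    module Stack (Stack, empty, push, pop) where

    import Stack.Internal (Stack(..))

    empty :: Stack a
    empty = Stack []

    push :: a -> Stack a -> Stack a
    push x (Stack xs) = Stack (x : xs)

    pop :: Stack a -> Maybe (a, Stack a)
    pop (Stack [])       = Nothing
    pop (Stack (x : xs)) = Just (x, Stack xs)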

Benjamin Franksen wrote:
Brian Hulley wrote:
ith = Data.Array.IArray.(!)
Sorry, but I can't see the problem here. Why can't the editor offer the operator as '!' in the list of options, and if the user selects it insert both '(' and ')' at the right places (i.e. before the module name and after the operator)? Is there some unwritten law that forbids editors or IDEs to insert stuff at positions other than the current cursor position?
I hadn't thought of that - I've now decided to just use the existing syntax here.
Generally speaking, I would always hesitate to change the language so it better suits programming tools(*). It is the tools which should adapt to the language, even if that means the programmer has to find new ways of suporting the user (and the language). The most important reason being that code is more often read than written.
My motivation for the change was that it would better suit the human user of the programming tool, though in this particular instance you and Henning have convinced me that the original syntax was better after all.
At the danger of becoming completely off-topic now (sorry!), I have to say that I find /both/ versions ugly and unnecessarily hard to read. My personal solution is to generally avoid qualified imports.
How does this solution scale? Surely it only works if you're lucky and people happen to choose names that don't clash with those of other modules?
I use it only if absolutely necessary to disambiguate some symbol, and then just for that symbol. I am aware that there is an opposing faction here, who tries to convinve everyone that qualified import should be the standard (and the actual exported symbols --at least some of them-- meaningless, such as 'C' or 'T').
Although C and T are in themselves meaningless, the name of the module itself is not. As I understand it, this convention makes the module name carry the meaning so you use Set.T instead of Set.Set where the meaning is duplicated (a rather uneasy situation) in both the module name and type name.
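[A toy sketch of the convention being discussed - everything below is made up; the point is just that the module name, not the type name, carries the meaning, so clients write Set.T rather than Set.Set.]

    -- Set.hs : the module name carries the meaning, the main type is just T
    module Set (T, empty, insert) where

    newtype T a = T [a]

    empty :: T a
    empty = T []

    insert :: a -> T a -> T a
    insert x (T xs) = T (x : xs)

    -- in a client module:
    --   import qualified Set
    --   s :: Set.T Int
    --   s = Set.insert 1 Set.empty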
I think such a convention is inappropriate for a functional language (especially one with user defined operators). There simply is no natural 1:1 correspondence between data type declaration and functions acting on that data built into the language, as opposed to e.g. OO languages. Extensibility in the functional dimension, i.e. the ability to arbitrarily add functions that operate on some data without having to change the code (module) that defines the data, is one of the hallmarks of functional programming, as opposed to OO programming.
If you have an abstract data type then it *is* like an object (though in a potentially more powerful way than in OOP) because there is no other way to manipulate values of that type. If the type is not abstract, the advantage of calling it T is just that it avoids naming it twice (by type name and module name) in the situation where you want to not worry about name clashes with constructors of other types.
However, nothing prevents us from offering two interfaces (visible modules), one where the data type is abstract ("client interface") and a different one where it is concrete ("extension interface")
You can still call both types T... :-)

Regards, Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.

http://www.metamilk.com

Brian Hulley wrote:
Benjamin Franksen wrote:
At the danger of becoming completely off-topic now (sorry!), I have to say that I find /both/ versions ugly and unnecessarily hard to read. My personal solution is to generally avoid qualified imports.
How does this solution scale? Surely it's only lucky if people happen to choose names that don't clash with those of other modules?
Many, if not most, name clashes result from two different things actually being "the same concept". Such entities should be refactored into classes, rather than disambiguated using qualified names, if possible. Apart from these, a large class of name clashes is the result of choosing inappropriate names. Choosing good names is an art in itself; and the more carefully you name things, the lower the likelihood that they will accidentally clash with somebody else's. For the remaining cases, you need to use qualified import; I meant these cases when I said:
I use it only if absolutely necessary to disambiguate some symbol, and then just for that symbol. I am aware that there is an opposing faction here, who tries to convinve everyone that qualified import should be the standard (and the actual exported symbols --at least some of them-- meaningless, such as 'C' or 'T').
Although C and T are in themselves meaningless, the name of the module itself is not.
Ok, but I still don't like it. I would much rather use 'Set' than 'Set.T'. Since it is always clear from the context whether something is a data type or a class, the additional '.T' or '.C' only adds noise, making the program uglier and harder to read.

BTW, it has been argued by others that a single exported data type or class per module is a special case (however common) which cannot be generally assumed for library modules. Even in the common case where there is one 'main' data type exported, you often need auxiliary data types (which are often concrete types) to go along. I would find it strange to call the main concept just 'T' while the minor, auxiliary stuff gets a 'real' name.
As I understand it, this convention makes the module name carry the meaning so you use Set.T instead of Set.Set where the meaning is duplicated (a rather uneasy situation) in both the module name and type name.
Set.Set is horrible indeed. That is why I would rather not import all of Set qualified. If there is a name clash, e.g. we also need some different data type Set which gets exported from module Foo, then my solution is to explicitly disambiguate Set, as in

import qualified Foo (Set)
import Foo hiding (Set)
import qualified Set (Set)
import Set hiding (Set)

type FooBar = Foo.Set   -- this one is often not necessary
type Set    = Set.Set

This buys me more readable functions at the cost of adding more noise at the top of the module for disambiguating imports. I think the gain is worth the cost, because the function definitions are where the non-trivial (i.e. hard to understand) stuff is.

As to scalability: in practice I always wait for the compiler to complain about ambiguous imports and only then fix things as indicated above. A great feature of the Haskell module system is that it /never/ allows any ambiguity to actually appear in your source code. There is no way you could accidentally use an imported entity when it is not absolutely clear (to the compiler) which module it is imported from. OTOH, the compiler /only/ reports an error if you actually use such an ambiguously imported name. Otherwise the above method of 'manual' disambiguation would indeed be very impractical.
I think such a convention is inappropriate for a functional language (especially one with user defined operators). There simply is no natural 1:1 correspondence between data type declaration and functions acting on that data built into the language, as opposed to e.g. OO languages. Extensibility in the functional dimension, i.e. the ability to arbitrarily add functions that operate on some data without having to change the code (module) that defines the data, is one of the hallmarks of functional programming, as opposed to OO programming.
If you have an abstract data type then it *is* like an object (though in a potentially more powerful way than in OOP) because there is no other way to manipulate values of that type.
Yes, right.
If the type is not abstract, the advantage of calling it T is just that it avoids naming it twice (by type name and module name) in the situation where you want to not worry about name clashes with constructors of other types.
You never need to worry beforehand! You can rest assured that the compiler will mercilessly flag all ambiguous uses of imported names. Only if and when it does do you need to start thinking about how you want to name these different entities in your module.

[NB however far this discussion has digressed from its origin, there is still a common theme here: I keep arguing against making life simpler for the program /writer/ if this causes more difficulty for the (human) /reader/ of a program. Here my argument is that manually resolving ambiguities may be more work initially, but (I think) it pays off in more beautiful code that is easier to read (and therefore to understand).]
However, nothing prevents us from offering two interfaces (visible modules), one where the data type is abstract ("client interface") and a different one where it is concrete ("extension interface")
You can still call both types T... :-)
...and next we name "the" function exported from each module 'f', or what? Imagine

set2 = Data.Set.Insert.f x set1

"What a beautiful world this could be..." ;-)) (*)

Cheers,
Ben

(*) Donald Fagen (forgot the name of the song)

On Fri, 29 Sep 2006, Benjamin Franksen wrote:
Brian Hulley wrote:
Benjamin Franksen wrote:
At the danger of becoming completely off-topic now (sorry!), I have to say that I find /both/ versions ugly and unnecessarily hard to read. My personal solution is to generally avoid qualified imports.
How does this solution scale? Surely it's only lucky if people happen to choose names that don't clash with those of other modules?
Many, if not most, name clashes result from two different things actually being "the same concept". Such entities should be refactored into classes, rather than disambiguated using qualified names, if possible.
That's the problem. If one instance of such a conceptual method takes one additional parameter, you cannot merge it into a class with another instance. What if the functions only differ in laziness? See for instance

http://www.cs.york.ac.uk/fp/darcs/HaXml/src/Text/XML/HaXml/Parse.hs
http://www.cs.york.ac.uk/fp/darcs/HaXml/src/Text/XML/HaXml/ParseLazy.hs
BTW, it has been argued by others that a single exported data type or class per module is a special case (however common) which cannot be generally assumed for library modules.
I like to use the style for the modules where it fits, and do things differently where it does not fit. However, I'm trying to make the modules fit, that is, I try to collect the functions concerning one data type in one module.
Even in the cmmon case where there is one 'main' data type exported, you often need auxiliary data types (which are often concrete types) to go along. I would find it strange to call main concept just 'T' while the minor, auxiliary stuff gets a 'real' name.
Not very nice, indeed. However, I would not call them 'real' names, but 'more special' names. Say

Text.Html.T           (main data type)
Text.Html.Element     (auxiliary type)
Text.Html.Attribute   (auxiliary type)

Network.URI.T         (main data type)
Network.URI.Authority (auxiliary type)
As to scalability: In practice I always wait for the compiler to complain about ambiguous imports and only then fix like indicated above. A great feature of the Haskell module system is that it /never/ allows any ambiguity to actually appear in your source code. There is no way you could accidentally use an imported entity when it is not absolutely clear (to the compiler) which module it is imported from. OTOH, the compiler /only/ reports an error if you actually use such an ambigously imported name. Otherwise the above method of 'manual' disambiguation would indeed be very impractical.
When reading modules, I find it difficult to find out where identifiers come from. The foreign identifiers used in a module may not be defined in any of the directly imported modules; this is the case if an imported module only re-exports other module interfaces. Because of this I prefer imports like

import A (f,g,h)
import qualified B

to

import A
import B

Incidentally, these are the only choices in the Modula languages. :-)
If the type is not abstract, the advantage of calling it T is just that it avoids naming it twice (by type name and module name) in the situation where you want to not worry about name clashes with constructors of other types.
You never need to worry beforehand! You can rest assured that the compiler will mercilessly flag all ambiguous uses of imported names. Only if and ehen it does you need to start thinking about how you want to name these different entities in your module.
A program which is written as ambiguously as the compiler allows is quite hard to read.
...and next come we name "the" function exported from each module 'f' or what? Imagine
set2 = Data.Set.Insert.f x set1
I have no preference for a particular name, but indeed, if a module merely implements one function and there are more modules (expected) with the same interface but different implementations, I would use the same identifier.

Brian Hulley wrote:
When you try to write an editor for Haskell (or some subset of it), you quickly discover these areas of Haskell syntax like the above which need to be changed to get an optimum interactive editing experience. I think it *is* possible to adjust the Haskell grammar so that it is LL(1) and the only reason it is not already LL(1) seems to be that the grammar has been designed with compilers (which only need to deal with complete modules) in mind rather than programmers interactively editing in mind.
(The other change needed for LL(1) is to give contexts a marker before they appear eg:
foo :: {MonadIO m} a -> m a )
Just catching up on haskell-cafe... The other notorious part of the Haskell grammar that isn't LL/LR(1) is expressions vs. patterns. In a statement, if you see a variable, you don't know whether it is a pattern variable (apat) or an expression variable (aexpr). This is why Haskell grammars generally parse expressions and patterns using the same non-terminals.
By LL(1) I'm really meaning that the grammar for interactive editing needs to be adjusted so that it is possible to maintain the invariant that as code is entered from left to right constructs and identifiers can be highlighted according to their grammatical role and highlighting (modulo incompleteness) must remain unchanged regardless of whatever is typed afterwards to the right otherwise it can become more of a liability than a help, hence my hope that some future revision of Haskell grammar might consider taking the above points into account.
So you won't be able to colour patterns differently from expressions, but that doesn't seem any worse than the context vs. type issue. Indeed, I'm not even sure you can colour types vs. values properly; look at this:

data T = C [Int]

at this point, is C a constructor? What if I continue the declaration like this:

data T = C [Int] `F`

now it's a type!

Cheers,
Simon
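[A minimal sketch of the two completions described above, with invented names so both fit in one file: in the first declaration the capitalised name is a data constructor, while in the second the same-looking name has become a type argument to the infix data constructor F.]

    -- first completion: C1 is a data constructor taking a list of Int
    data T1 = C1 [Int]

    -- second completion: F is the (infix) data constructor, and the
    -- capitalised name to its left has turned into a type application
    data C2 a = MkC2 a
    data T2 = C2 [Int] `F` Int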

On Wed, Oct 18, 2006 at 04:42:00PM +0100, Simon Marlow wrote:
The other notorious part of the Haskell grammar that isn't LL/LR(1) is expressions vs. patterns. In a statement, if you see a variable, you don't know whether it is a pattern variable (apat) or an expression variable (aexpr). This is why Haskell grammars generally parse expressions and patterns using the same non-terminals.
It should be noted that all of Haskell (including the maximal-munch rules and lexing, but not the layout without a little preprocessing, AFAIK) can easily be parsed by a PEG. I am going to switch jhc to using one eventually, as the maintainability advantages of a PEG grammar are persuasive (and I pull my hair out every time I have to modify the current happy LALR parser). If you are writing something like a syntax highlighter for an editor, I'd strongly recommend checking them out as a basis. They are a straightforward generalization of regular expressions and can be made to deal gracefully with errors by simply choosing the "most plausible" choice from ambiguous or incomplete code, something that is extremely useful for an editor.
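[A toy sketch of the PEG idea described above - not jhc's actual parser; all names are invented. Ordered choice commits to the first alternative that succeeds, and because each alternative is tried against the whole remaining input, lookahead is effectively unlimited.]

    -- a minimal PEG-style parser sketch (hypothetical, for illustration)
    newtype P a = P { runP :: String -> Maybe (a, String) }

    -- match a literal string at the front of the input
    str :: String -> P String
    str lit = P $ \s -> case splitAt (length lit) s of
        (pre, rest) | pre == lit -> Just (lit, rest)
        _                        -> Nothing

    -- ordered choice: commit to the first alternative that succeeds,
    -- otherwise backtrack and try the second on the same input
    (</>) :: P a -> P a -> P a
    P p </> P q = P $ \s -> maybe (q s) Just (p s)

    -- e.g. runP (str "then" </> str "the") "then t" == Just ("then", " t")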
By LL(1) I'm really meaning that the grammar for interactive editing needs to be adjusted so that it is possible to maintain the invariant that as code is entered from left to right constructs and identifiers can be highlighted according to their grammatical role and highlighting (modulo incompleteness) must remain unchanged regardless of whatever is typed afterwards to the right otherwise it can become more of a liability than a help, hence my hope that some future revision of Haskell grammar might consider taking the above points into account.
So you won't be able to colour patterns differently from expressions, that doesn't seem any worse than the context vs. type issue. Indeed, I'm not even sure you can colour types vs. values properly, look at this:
data T = C [Int]
at this point, is C a constructor? What if I continue the declaration like this:
data T = C [Int] `F`
No problem for a PEG, as the infinite lookahead allows it to see the `F` no matter how far away it is. jhc may give horrible type errors, but by golly it's gonna give some good parse errors. :)

John
--
John Meacham - ⑆repetae.net⑆john⑈

On Wed, Oct 18, 2006 at 04:42:00PM +0100, Simon Marlow wrote:
So you won't be able to colour patterns differently from expressions, that doesn't seem any worse than the context vs. type issue. Indeed, I'm not even sure you can colour types vs. values properly, look at this:
data T = C [Int]
at this point, is C a constructor? What if I continue the declaration like this:
data T = C [Int] `F`
Haskell is easier than Java in this type of situation, because Haskell's VARID and CONID are lexically distinct, whereas Java's VARID and CONID are lexically equivalent. Modern Java IDEs color them correctly by doing (at least) two passes of highlighting: one during lexing and one after renaming/typechecking. As a result, they color identifiers based on lots of semantic information, including their scope, visibility, and other factors. IntelliJ will even do data flow analysis to color an identifier differently within a single method depending on whether or not the variable can be null at each occurrence.

I think that an editor for Haskell would need to use a similar technique to be useful. For example, I want top-level values colored differently than local values, and I want exported, non-exported, imported, and unbound identifiers highlighted differently. And I want parameters to be highlighted based on their strictness (determined automatically). This cannot generally be done until the entire module (as well as all of the modules it depends on) has been parsed.

In summary, I think that doing any syntax highlighting or other analysis of a Haskell module before it has gone through the renaming phase is a dead end.

Regards,
Brian
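[A hypothetical sketch of the two-pass idea described above - the types and names are invented: a quick lexical pass assigns a provisional colour, and a later pass, run after renaming, refines it with scope information.]

    -- provisional classification available straight from the lexer
    data LexClass = Varid | Conid | Symbol | Literal | Comment
        deriving Show

    -- semantic classification only available after the renamer has run
    data SemClass = TopLevel | LocalVar | Imported | Exported | Unbound
        deriving Show

    data Highlight = Highlight
        { lexClass :: LexClass
        , semClass :: Maybe SemClass   -- Nothing until the second pass
        } deriving Show

    -- the second pass just fills in the semantic field
    refine :: SemClass -> Highlight -> Highlight
    refine sc h = h { semClass = Just sc }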
participants (9)
- Bas van Dijk
- Benjamin Franksen
- Brandon Moore
- Brian Hulley
- Brian Smith
- Bulat Ziganshin
- Henning Thielemann
- John Meacham
- Simon Marlow