Avoiding name collisions by using value spaces instead of modules

Hi - A main problem I've found with Haskell is that within a module, too many things are put into the same scope. For example data Tree a b = Leaf a | Node {elem::b, lhs::Tree a b, rhs::Tree a b} puts Leaf, Node, elem, lhs, and rhs, into the module's namespace, as well as Tree. This means that if you want another kind of tree in the same module you've got to try and think of another word for Leaf, or else prefix every value constructor with some prefix of the tycon to try and prevent name collisions. I propose that the above declaration should introduce a *new module* Tree, as a sub module of the containing module, and Leaf, Node, elem etc will be put into this module, and not the module containing the data declaration itself. Identifiers in this child module could then either be used qualified or unqualified as long as they don't conflict with anything in the parent (containing) module, thus in the parent module, we could have: a = Tree.Leaf 3 b = Node{elem = 6, lhs = Leaf 2, rhs = Leaf 3} c = Tree.elem b c = b^elem where ^ is just a helpful sugar which binds tighter than function application and can be used anywhere to allow an "object oriented" way of thinking where the first arg is put on the left of the function (more on this later). Also, we could generalise the GHC GADT style data declarations by allowing any declarations to appear in the where part, and allow modules to consist of a single data declaration (ie begin with the keyword data instead of module) so that the contents of the Data.Set module would instead be something like: data Ord a => Set a = .... where insert :: Set a -> a -> Set a ... This would allow Set to be imported qualified into other modules where we could refer to the Set type by the single word Set instead of Set.Set (which I always think looks very strange indeed) ie module M where import qualified Data.Set as Set f :: a -> Set a -- no longer need to write a -> Set.Set a In fact, the whole complexity asociated with hiding components of a module and allowing unqualified imports could be discarded, since the correct resolution for each identifier would be determined by its typed context, just as in OOP the resolution is determined by the type of the object. So to summarise so far, for any value f :: T0 -> T1 -> ... -> Tn, we construct the tuple (Q0, Q1, ... Qn) where for each i, Qi is obtained from Ti by ignoring all predicates and quantification and replacing each tyvar by the symbol '?'. For example, for module M where foo :: forall b. Eq a => Set (Tree a) -> Tree ([a],b) -> [Tree (a,b)] we ignore the forall and Eq, and replace tyvars and tuple the results to get: M.(Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo as the fully qualified reference to the entity we've just declared. If we call the tuple (Set, ...) the "ty-tuple", which specifies a space of values (ie something that each value is considered to "belong" to - the analogue of the object in OOP), then we can define a projection from a space into a containing space as follows: 1) Let Qi == Ui Si0 Si1 ... Sik be any element of a ty-tuple P. Then we can form a new ty-tuple P' by replacing Qi by Ui. 2) Let Qi be any element of a ty-tuple P of arity n+1. Then we can form P' == (Q0, ..., Qi-1, Qi+1, ... Qn) of arity n In both cases, we can use P' to qualify the identifier to get a value reference as long as this still has enough information for disambiguating other uses of the identifier in M. Applying these transformations of the ty-tuple to the example above, any of the following could be used as a partial qualification of foo in M: (Set Tree, Tree ([],?), [Tree]).foo (Set, Tree, []).foo Tree.foo foo Rule 2 allows us to propagate identifiers back up to their ancestor modules as long as there are no conflicts along the way. The type inference algorithm could then be modified (I hope) to use the top-level annotation for a value declaration to determine the entities that identifiers refer to, by searching in the appropriate value spaces determined by the types of the args and return value. label :: Tree a -> Int -> (Tree a, Int) label (Leaf x) i = (Leaf i, i+1) The identifier Leaf is resolved in the current namespace augmented by the namespace for module Tree, thus we don't have to explicitly write Tree.Leaf even though the declaration of label occurs outside the Tree module itself. (I suggest that top-level type annotations should be mandatory since without them one just drowns in a sea of TI confusion when there is a type error. By making them mandatory the TI algorithm could make real use of them to resolve identifier bindings as in OOP) Similarly: foo :: a -> Set a foo x = singleton x There is no need to say Set.singleton x, because we know that the result of foo has type Set a therefore we search in the current module augmented by the contents of the Set module (every type is also a module) for a binding for "singleton" which maps from a to Set a. Finally (apologies for this long post), returning to the use of ^ to allow an object oriented way of thinking consider: insert :: a -> Set a -> Set a ps = singleton 3 qs = insert 4 ps rs = ps^insert 4 When resolving "insert" used in the binding for rs, the compiler should see that we are looking for some function Set Int -> Int -> Set Int, and hence will be looking in the current module augmented by the Set module. However the Set module only has a binding for insert with type a -> Set a -> Set a. So the compiler should generate a new function insert' from insert by moving the first Set a arg to the front. Summary: 1) Every value binding belongs to a value space represented by a ty-tuple which abstracts away from the details of the actual type 2) The space of ty-tuples forms a lattice, and in particular this means qualification is optional when there are no conflicts (we can use Leaf 3 instead of Tree.Leaf 3), and we can use partial qualification to supply just the disambiguation info needed. 3) The TI algorithm should use the type of the expression (generated top-down from the top-level annotation) to search in the correct value spaces to resolve identifiers 4) Record field selection doesn't need any special scoping rules, and is just one example of a general object-oriented way of thinking about function application. 5) We can get all the advantages of automatic namespace management the OOP programmers take for granted, in functional programming, by using value spaces as the analogue of objects, and can thereby get rid of complicated import/export directives and improve code readability with less typing (but making more use of typing - excuse the pun :-)) I'm in the process of trying to implement a pre-processor for Haskell which would use the algorithms sketched above (as well as fixing a few other things I think are wrong with Haskell such as the way the current layout rule allows one to write code that would break if you replaced an identifier with one that had a different number of characters in it etc), so any feedback on these ideas would be welcome. Thanks for reading so far! Brian Hulley PS Everything above is purely intended for resolving the kind of ad-hoc overloading that is not amenable to treatment by type classes (which transforms some subset of the ad-hoc-overloaded functions which share the same shape into a single function with implicit dictionary arguments (in the usual implementation)).

Am Sonntag, 8. Januar 2006 14:06 schrieb Brian Hulley:
Hi - A main problem I've found with Haskell is that within a module, too many things are put into the same scope. For example
data Tree a b = Leaf a | Node {elem::b, lhs::Tree a b, rhs::Tree a b}
puts Leaf, Node, elem, lhs, and rhs, into the module's namespace, as well as Tree. This means that if you want another kind of tree in the same module you've got to try and think of another word for Leaf, or else prefix every value constructor with some prefix of the tycon to try and prevent name collisions.
I propose that the above declaration should introduce a *new module* Tree, as a sub module of the containing module, and Leaf, Node, elem etc will be put into this module, and not the module containing the data declaration itself.
What speaks against putting the data declaration in a separate module: module ThisKindOfTrees where data Tree a b = ... and then use qualified imports (with a short alias), if you want to use different kinds of trees in one module? Yes, more files, but, IMHO, much more readable.
Identifiers in this child module could then either be used qualified or unqualified as long as they don't conflict with anything in the parent (containing) module, thus in the parent module, we could have:
a = Tree.Leaf 3 b = Node{elem = 6, lhs = Leaf 2, rhs = Leaf 3} c = Tree.elem b
You get that by unqualified import.
c = b^elem
where ^ is just a helpful sugar which binds tighter than function application and can be used anywhere to allow an "object oriented" way of thinking where the first arg is put on the left of the function (more on this later).
'^' is definitely a bad choice, 'cause it's exponentiation (and has a long and venerable history as symbol therefore).
Also, we could generalise the GHC GADT style data declarations by allowing any declarations to appear in the where part, and allow modules to consist of a single data declaration (ie begin with the keyword data instead of module) so that the contents of the Data.Set module would instead be something like:
data Ord a => Set a = .... where insert :: Set a -> a -> Set a ...
This would allow Set to be imported qualified into other modules where we could refer to the Set type by the single word Set instead of Set.Set (which I always think looks very strange indeed) ie
module M where import qualified Data.Set as Set
f :: a -> Set a -- no longer need to write a -> Set.Set a
import qualified Data.Set as Set import Data.Set (Set) does the trick and I prefer it over your suggestions below.
In fact, the whole complexity asociated with hiding components of a module and allowing unqualified imports could be discarded, since the correct resolution for each identifier would be determined by its typed context, just as in OOP the resolution is determined by the type of the object.
So to summarise so far, for any value f :: T0 -> T1 -> ... -> Tn, we construct the tuple (Q0, Q1, ... Qn) where for each i, Qi is obtained from Ti by ignoring all predicates and quantification and replacing each tyvar by the symbol '?'.
For example, for
module M where foo :: forall b. Eq a => Set (Tree a) -> Tree ([a],b) -> [Tree (a,b)]
we ignore the forall and Eq, and replace tyvars and tuple the results to get:
M.(Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo
as the fully qualified reference to the entity we've just declared.
Looks absolutely horrible to me. What would we gain? We could have M.(Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo and M.(Int, Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo but why? Do we really want it? I certainly don't.
If we call the tuple (Set, ...) the "ty-tuple", which specifies a space of values (ie something that each value is considered to "belong" to - the analogue of the object in OOP), then we can define a projection from a space into a containing space as follows:
1) Let Qi == Ui Si0 Si1 ... Sik be any element of a ty-tuple P. Then we can form a new ty-tuple P' by replacing Qi by Ui.
2) Let Qi be any element of a ty-tuple P of arity n+1. Then we can form P' == (Q0, ..., Qi-1, Qi+1, ... Qn) of arity n
In both cases, we can use P' to qualify the identifier to get a value reference as long as this still has enough information for disambiguating other uses of the identifier in M.
Applying these transformations of the ty-tuple to the example above, any of the following could be used as a partial qualification of foo in M:
(Set Tree, Tree ([],?), [Tree]).foo (Set, Tree, []).foo Tree.foo foo
Rule 2 allows us to propagate identifiers back up to their ancestor modules as long as there are no conflicts along the way.
The type inference algorithm could then be modified (I hope) to use the top-level annotation for a value declaration to determine the entities that identifiers refer to, by searching in the appropriate value spaces determined by the types of the args and return value.
label :: Tree a -> Int -> (Tree a, Int) label (Leaf x) i = (Leaf i, i+1)
The identifier Leaf is resolved in the current namespace augmented by the namespace for module Tree, thus we don't have to explicitly write Tree.Leaf even though the declaration of label occurs outside the Tree module itself.
(I suggest that top-level type annotations should be mandatory since without them one just drowns in a sea of TI confusion when there is a type error. By making them mandatory the TI algorithm could make real use of them to resolve identifier bindings as in OOP)
Similarly:
foo :: a -> Set a foo x = singleton x
There is no need to say Set.singleton x, because we know that the result of foo has type Set a therefore we search in the current module augmented by the contents of the Set module (every type is also a module) for a binding for "singleton" which maps from a to Set a.
import qualified Data.Set as Set import Data.Set (Set, singleton, ... ) But yes, it's a nice idea to have that done automatically, however, plain import Data.Set gives you qualified and unqualified access and you only need to use the qualified form in case of ambiguities. I admit that sometimes it's annoying to have to write Data.Set.map (or whatever) when you apply it to a Set, rather than the type-checker determines it automatically, but if it did, you'd havoc because sometimes you've just made an error -- which wouldn't then be spotted by the type-checker.
Finally (apologies for this long post), returning to the use of ^ to allow an object oriented way of thinking consider:
insert :: a -> Set a -> Set a ps = singleton 3 qs = insert 4 ps rs = ps^insert 4
When resolving "insert" used in the binding for rs, the compiler should see that we are looking for some function Set Int -> Int -> Set Int, and hence will be looking in the current module augmented by the Set module. However the Set module only has a binding for insert with type a -> Set a -> Set a. So the compiler should generate a new function insert' from insert by moving the first Set a arg to the front.
Automatic permutation of arguments? Has its merits, but goodbye to type-safety, I believe.
Summary:
1) Every value binding belongs to a value space represented by a ty-tuple which abstracts away from the details of the actual type
2) The space of ty-tuples forms a lattice, and in particular this means qualification is optional when there are no conflicts (we can use Leaf 3 instead of Tree.Leaf 3), and we can use partial qualification to supply just the disambiguation info needed.
3) The TI algorithm should use the type of the expression (generated top-down from the top-level annotation) to search in the correct value spaces to resolve identifiers
4) Record field selection doesn't need any special scoping rules, and is just one example of a general object-oriented way of thinking about function application.
5) We can get all the advantages of automatic namespace management the OOP programmers take for granted, in functional programming, by using value spaces as the analogue of objects, and can thereby get rid of complicated import/export directives and improve code readability with less typing (but making more use of typing - excuse the pun :-))
I'm in the process of trying to implement a pre-processor for Haskell which would use the algorithms sketched above (as well as fixing a few other things I think are wrong with Haskell such as the way the current layout rule allows one to write code that would break if you replaced an identifier with one that had a different number of characters in it etc),
So you would have to adjust your indentation in that case. If you consider that a serious problem, what about using braces and semicolons? If you do it like foo bar = do { x <- something bar ; more x ; yetMore x bar } you get the advantages of both, layout-enhanced readability and layout-insensitivity due to explicit braces and semicolons.
so any feedback on these ideas would be welcome.
You want object-oriented Haskell? Go ahead (there's something around already), but make it a language of its own. No offence intended, but I like Haskell as it is and haven't really seen the merits of OO-overloading yet.
Thanks for reading so far!
Brian Hulley
PS Everything above is purely intended for resolving the kind of ad-hoc overloading that is not amenable to treatment by type classes (which transforms some subset of the ad-hoc-overloaded functions which share the same shape into a single function with implicit dictionary arguments (in the usual implementation)).
Cheers, Daniel

----- Original Message -----
From: "Daniel Fischer"
Am Sonntag, 8. Januar 2006 14:06 schrieb Brian Hulley:
Hi - A main problem I've found with Haskell is that within a module, too many things are put into the same scope. For example
data Tree a b = Leaf a | Node {elem::b, lhs::Tree a b, rhs::Tree a b}
< snip> I propose that the above declaration should introduce a *new module* Tree, as a sub module of the containing module, and Leaf, Node, elem etc will be put into this module, and not the module containing the data declaration itself.
What speaks against putting the data declaration in a separate module:
module ThisKindOfTrees where
data Tree a b = ...
and then use qualified imports (with a short alias), if you want to use different kinds of trees in one module?
All I'm proposing is that the compiler should do all this painful work for you, so that you don't need to bother creating a different file that then needs two import directives to achieve the effect I want. Is there any case where you would *not* want a type to be declared in its own module?
Yes, more files, but, IMHO, much more readable.
In what way is it more readable?
<snip> For example, for
module M where foo :: forall b. Eq a => Set (Tree a) -> Tree ([a],b) -> [Tree (a,b)]
we ignore the forall and Eq, and replace tyvars and tuple the results to get:
M.(Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo
as the fully qualified reference to the entity we've just declared.
Looks absolutely horrible to me. What would we gain? We could have
M.(Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo and M.(Int, Set (Tree ?), Tree ( (,) ( [] ?) ?), [] (Tree ( (,) ? ?))).foo
but why? Do we really want it? I certainly don't.
I agree that I would certainly not want to have to write out the fully qualified name (or superfluously qualified name as you point out with Int,...), but I think we would gain a great deal from this, because just by making a declaration in module M, we've effectively created an infinite number of child modules that the declaration belongs to, without having to create an infinite number of files and write an infinite number of import directives in M. For example, suppose I'm writing a module M that deals with grammar, where the elements in a grammar rule are parameterised so that rules can be written using strings but processed as if we'd used ints instead: data Element a = Terminal a | Nonterminal a | Action a data Rule a = Rule (Element a) [[Element a]] Now I want to convert elements and rules from a to Int, so at the moment I have to write: convertElement :: Element a -> CM (Element Int) ... convertRule :: Rule a -> CM (Rule Int) for some appropriate monad CM. Whereas I would have much preferred to use just the word "convert" in both cases. It is tremendously annoying to have to suffix everything with the type name. In another situation, suppose we have two types T1 and T2, and some function convert :: T1 -> T2 The problem I have is which module (if I used a separate module for T1 and T2), should I put the convert function in? Essentially I think it belongs to the space of relations between T1 and T2, hence my idea to use tuple notation to get a module called (Q1,Q2) eg (Set,Map). But I certainly don't want the bother of having to create a new file and type import directives into M every time I want to define a function on some different relation space. Really I don't want to have to think about modules at all, since I'd like to write code that organises itself into modules (using these ty-tuples and top-down type/identifier-resolution inference) so I can concentrate on typed values and relations between them without all the module-level plumbing.
<snip> you'd havoc because sometimes you've just made an error -- which wouldn't then be spotted by the type-checker.
I agree this could be a disadvantage - ease of coding is gained but some kinds of errors cannot be caught so easily.
Finally (apologies for this long post), returning to the use of ^ to allow an object oriented way of thinking consider:
insert :: a -> Set a -> Set a ps = singleton 3 qs = insert 4 ps rs = ps^insert 4
When resolving "insert" used in the binding for rs, the compiler should see that we are looking for some function Set Int -> Int -> Set Int, and hence will be looking in the current module augmented by the Set module. However the Set module only has a binding for insert with type a -> Set a -> Set a. So the compiler should generate a new function insert' from insert by moving the first Set a arg to the front.
Automatic permutation of arguments? Has its merits, but goodbye to type-safety, I believe.
Yes, perhaps this does make life too complicated.
<snip> things I think are wrong with Haskell such as the way the current layout rule allows one to write code that would break if you replaced an identifier with one that had a different number of characters in it etc),
So you would have to adjust your indentation in that case.
Surely that is most terrible! :-)
If you consider that a serious problem, what about using braces and semicolons? If you do it like
foo bar = do { x <- something bar ; more x ; yetMore x bar }
you get the advantages of both, layout-enhanced readability and layout-insensitivity due to explicit braces and semicolons.
It is quite simple to create a new layout rule. My idea with this is that all lines should start with zero or more tab characters (non-tab leading whitespace is disallowed), and all layout blocks should start on a new line. Moreover, it is possible to completely dump the ugly let..in construct, and make "=" one of the tokens that can start a new layout block, so instead of: f x = let a = x+1 b = x + 2 in a + b one would simply write: f x = a = x+1 b = x+2 a + b This allows you to use a variable width font and does not break when the identifiers are renamed. Also, I would use ',' instead of ';' in an explicit block, so that {,} becomes a general construct that can be used for records as well as blocks (so you can use layout for records too), and, while I'm on the subject of things to change, I would use {,} for predicates instead of (...)=> and use => (or =) instead of -> in the value syntax: data (Eq a, Ord a) => Set a ... -- looks like a tuple but has nothing to do with tuples data {Eq a, Ord a} Set a -- looks more like a set of predicates this way case p of [] => q \x y => x+y (avoids all the problems when you want to type things) <Rearranged>
'^' is definitely a bad choice, 'cause it's exponentiation (and has a long and venerable history as symbol therefore).
:: would mean list cons, as it has been since the birth of ML : would be used for type annotations as it has been since the birth of Pascal (at least) o would be used for function composition so that . can be used for qualification user defined fixities would be discarded because they make code unreadable by anyone else
so any feedback on these ideas would be welcome.
You want object-oriented Haskell? Go ahead (there's something around already)
I think this value space idea is far more powerful than just object orientation, because in OOP, functions can only be associated with their first argument, whereas with value spaces, values are associated with the whole space of relations (ie all args + return value) they are concerned with.
but make it a language of its own. No offence intended, but I like Haskell as it is and haven't really seen the merits of OO-overloading yet.
Fair enough. I hope it was ok to post these ideas to this group because Haskell is very similar to the language I'm trying to create and it is very useful to get this feedback. I'm happy if anyone wants to incorporate any of my ideas above into Haskell but equally happy if they don't :-) Best regards, Brian Hulley.

On 08/01/06, Brian Hulley
For example, suppose I'm writing a module M that deals with grammar, where the elements in a grammar rule are parameterised so that rules can be written using strings but processed as if we'd used ints instead:
data Element a = Terminal a | Nonterminal a | Action a
data Rule a = Rule (Element a) [[Element a]]
Now I want to convert elements and rules from a to Int, so at the moment I have to write:
convertElement :: Element a -> CM (Element Int) ...
convertRule :: Rule a -> CM (Rule Int)
for some appropriate monad CM. Whereas I would have much preferred to use just the word "convert" in both cases. It is tremendously annoying to have to suffix everything with the type name.
This is what typeclasses are for. class Convert c where convert :: c a -> CM (c Int) - Cale

----- Original Message -----
From: "Cale Gibbard"
On 08/01/06, Brian Hulley
wrote: For example, suppose I'm writing a module M that deals with grammar, where the elements in a grammar rule are parameterised so that rules can be written using strings but processed as if we'd used ints instead:
data Element a = Terminal a | Nonterminal a | Action a
data Rule a = Rule (Element a) [[Element a]]
Now I want to convert elements and rules from a to Int, so at the moment I have to write:
convertElement :: Element a -> CM (Element Int) ...
convertRule :: Rule a -> CM (Rule Int)
for some appropriate monad CM. Whereas I would have much preferred to use just the word "convert" in both cases. It is tremendously annoying to have to suffix everything with the type name.
This is what typeclasses are for.
class Convert c where convert :: c a -> CM (c Int)
Type classes just seem overkill for this kind of thing. All I want is compile time resolution of an overloaded identifier, whereas type classes give all the machinery that would be needed if I wanted runtime ad-hoc polymorphism, with all the attendant verbosity, just so that the compiler can then optimize out the runtime polymorphism behind the scenes for cases like the example above. After all, I just want to write two very simple functions, so the effort to factor into a type class + two instances, also having to include the Convert c in the context whenever I call one of them just seems really painful. Also, the word "Convert" is now used up as well... Also, when I'm just writing code in an exploratory way, I don't know in advance what the common things are that could be factored out into type classes (except perhaps in very simple examples like that above), so while I'm writing the code for the first time there is still a problem trying to think up different names. Regards, Brian.

On 08/01/06, Brian Hulley
----- Original Message ----- From: "Cale Gibbard"
To: "Brian Hulley" Cc: "Daniel Fischer" ; "Haskell-cafe" Sent: Sunday, January 08, 2006 5:54 PM Subject: Re: [Haskell-cafe] Avoiding name collisions by using value spaces instead of modules On 08/01/06, Brian Hulley
wrote: For example, suppose I'm writing a module M that deals with grammar, where the elements in a grammar rule are parameterised so that rules can be written using strings but processed as if we'd used ints instead:
data Element a = Terminal a | Nonterminal a | Action a
data Rule a = Rule (Element a) [[Element a]]
Now I want to convert elements and rules from a to Int, so at the moment I have to write:
convertElement :: Element a -> CM (Element Int) ...
convertRule :: Rule a -> CM (Rule Int)
for some appropriate monad CM. Whereas I would have much preferred to use just the word "convert" in both cases. It is tremendously annoying to have to suffix everything with the type name.
This is what typeclasses are for.
class Convert c where convert :: c a -> CM (c Int)
Type classes just seem overkill for this kind of thing. All I want is compile time resolution of an overloaded identifier, whereas type classes give all the machinery that would be needed if I wanted runtime ad-hoc polymorphism, with all the attendant verbosity, just so that the compiler can then optimize out the runtime polymorphism behind the scenes for cases like the example above.
There's no such runtime polymorphism going on there. Barring existential types, everything is always resolved at compile time. Further, I wouldn't consider the polymorphism ad-hoc, so much as it is a restricted form of parametric polymorphism. Further, it's not really that much typing. You type: class Convert c where convert :: c a -> CM (c Int) instance Convert Rule where convert (Rule lhs rhss) = ... instance Convert Element where convert (Terminal t) = ... convert (Nonterminal n) = ... convert (Action a) = ... rather than convertRule :: Rule a -> CM (Rule Int) convert (Rule lhs rhss) = ... convertElement :: Element a -> CM (Element Int) convert (Terminal t) = ... convert (Nonterminal n) = ... convert (Action a) = ... which is 8 lines rather than 6, but, hey, what do you want? You get the convenience of not typing qualifiers everywhere, and restricted parametric polymorphism if you ever need it.
After all, I just want to write two very simple functions, so the effort to factor into a type class + two instances, also having to include the Convert c in the context whenever I call one of them just seems really painful. Also, the word "Convert" is now used up as well...
First, you only add the context when the type variable escapes and gets universally quantified (modulo that class restriction). That is, you only end up typing anything extra if you use the convert function in a way which is polymorphic, something that you can't do with your ad-hoc polymorphism, or with modules. Convert is only used up *as the name of a typeclass*, which is a pretty sparse namespace to begin with. Just name the classes after the functions you're generalising, and you'll run out of names at exactly the same time. Secondly, if the functions are really different, and you never plan to use them polymorphically, why the heck would you want to call them the same thing? That's just confusing to anyone that has to read the code later. If you end up qualifying all the uses of them with implicitly declared module names, and all the modules are in (qualified) scope, that's basically the same as adding qualifications to the names in the first place. It's just as bad to end up with a bunch of almost-empty namespaces as it is to end up with one large too-cluttered namespace.
Also, when I'm just writing code in an exploratory way, I don't know in advance what the common things are that could be factored out into type classes (except perhaps in very simple examples like that above), so while I'm writing the code for the first time there is still a problem trying to think up different names.
Perhaps a thesaurus? :) I'm sorry, but I've always had a really hard time taking this sort of complaint seriously. You *really* can't come up with a unique name? Try adding an adjective, or if that's too much typing, even a (meaningful) single letter prefix/postfix. Some people hate those, but I see no problem with them when used sparingly. In the case of different tree types, you can name one of them RoseTree, and another BinaryTree. If it's still really hard to come up with a name, possibly you shouldn't be naming the thing in the first place. Rather than trying to keep your fingers moving, think about the design a moment longer. Why are you defining it? What function does it serve to the program? Programming in Haskell is more about contemplation of a problem than lots of typing anyway. - Cale

Cale Gibbard wrote:
<snip>
Thanks for the illustration - I see another advantage with type classes is that you only need to write the type signature once (in the class declaration) instead of before each instance binding.
Secondly, if the functions are really different, and you never plan to use them polymorphically, why the heck would you want to call them the same thing? That's just confusing to anyone that has to read the code later.
For example, Data.Map declares: insert :: Ord k => k -> a -> Map k a -> Map k a whereas Data.Set declares: insert :: Ord a => a -> Set a -> Set a This is an example where type classes can't be applied even though the functions in a sense do the same thing. My system would solve this problem, by allowing the programmer to type d = insert a b c and have the type inference algorithm work out that Data.Map.insert was meant, as long as c or d has been typed as Map p q. But perhaps there is a way to get the signature for Data.Map.insert into the same form as that of Data.Set.insert? Regards, Brian.

On 08/01/06, Brian Hulley
Cale Gibbard wrote:
<snip>
Thanks for the illustration - I see another advantage with type classes is that you only need to write the type signature once (in the class declaration) instead of before each instance binding.
Secondly, if the functions are really different, and you never plan to use them polymorphically, why the heck would you want to call them the same thing? That's just confusing to anyone that has to read the code later.
For example, Data.Map declares:
insert :: Ord k => k -> a -> Map k a -> Map k a
whereas Data.Set declares:
insert :: Ord a => a -> Set a -> Set a
This is an example where type classes can't be applied even though the functions in a sense do the same thing. My system would solve this problem, by allowing the programmer to type d = insert a b c and have the type inference algorithm work out that Data.Map.insert was meant, as long as c or d has been typed as Map p q.
But perhaps there is a way to get the signature for Data.Map.insert into the same form as that of Data.Set.insert?
Regards, Brian.
Well, that's an interesting case, since the types are actually reasonably different. Prior to these being named the same way, we had addToSet / addToFM, which didn't require qualified imports. Of course, with qualified imports, we get the same effect as postfixing, so it's basically the same thing. Unifying these two under a single operation is certainly trickier, and it's a little more questionable that it should be done at all, given that their types are so different -- below is the closest I could come to it off-hand. --- {-# OPTIONS_GHC -fglasgow-exts #-} -- for fundeps/multiparameter classes import qualified Data.Map as Map import Data.Map (Map) import qualified Data.Set as Set import Data.Set (Set) class Insert t c a | c a -> t where insert :: t -> c a -> c a instance (Ord a) => Insert a Set a where insert x s = Set.insert x s instance (Ord k) => Insert (k,a) (Map k) a where insert (k,v) m = Map.insert k v m exampleSet = insert 5 $ insert 6 $ Set.empty exampleMap = insert (1,2) $ insert (2,7) $ Map.empty ---- Perhaps someone else will have some ideas as to suitable typeclass magic to allow for the curried form rather than using tuples. - Cale

Unifying these two under a single operation is certainly trickier, and it's a little more questionable that it should be done at all, given that their types are so different -- below is the closest I could come to it off-hand.
--- {-# OPTIONS_GHC -fglasgow-exts #-} -- for fundeps/multiparameter classes import qualified Data.Map as Map import Data.Map (Map) import qualified Data.Set as Set import Data.Set (Set)
class Insert t c a | c a -> t where insert :: t -> c a -> c a
instance (Ord a) => Insert a Set a where insert x s = Set.insert x s
instance (Ord k) => Insert (k,a) (Map k) a where insert (k,v) m = Map.insert k v m
exampleSet = insert 5 $ insert 6 $ Set.empty exampleMap = insert (1,2) $ insert (2,7) $ Map.empty
----
Perhaps someone else will have some ideas as to suitable typeclass magic to allow for the curried form rather than using tuples.
- Cale
Oh, this is a little less general, but simpler to use: {-# OPTIONS_GHC -fglasgow-exts #-} import qualified Data.Map as Map import Data.Map (Map) import qualified Data.Set as Set import Data.Set (Set) class Insert t c | c -> t where insert :: t -> c -> c instance (Ord a) => Insert a (Set a) where insert x s = Set.insert x s instance (Ord k) => Insert (k,a) (Map k a) where insert (k,v) m = Map.insert k v m exampleSet = insert 5 $ insert 6 $ Set.empty exampleMap = insert (1,2) $ insert (2,7) $ Map.empty

Cale Gibbard wrote:
Unifying these two under a single operation is certainly trickier, and it's a little more questionable that it should be done at all, given that their types are so different -- below is the closest I could come to it off-hand.
--- {-# OPTIONS_GHC -fglasgow-exts #-} -- for fundeps/multiparameter classes import qualified Data.Map as Map import Data.Map (Map) import qualified Data.Set as Set import Data.Set (Set)
class Insert t c a | c a -> t where insert :: t -> c a -> c a
instance (Ord a) => Insert a Set a where insert x s = Set.insert x s
instance (Ord k) => Insert (k,a) (Map k) a where insert (k,v) m = Map.insert k v m
exampleSet = insert 5 $ insert 6 $ Set.empty exampleMap = insert (1,2) $ insert (2,7) $ Map.empty
----
Perhaps someone else will have some ideas as to suitable typeclass magic to allow for the curried form rather than using tuples.
- Cale
Oh, this is a little less general, but simpler to use:
{-# OPTIONS_GHC -fglasgow-exts #-} import qualified Data.Map as Map import Data.Map (Map) import qualified Data.Set as Set import Data.Set (Set)
class Insert t c | c -> t where insert :: t -> c -> c
instance (Ord a) => Insert a (Set a) where insert x s = Set.insert x s
instance (Ord k) => Insert (k,a) (Map k a) where insert (k,v) m = Map.insert k v m
exampleSet = insert 5 $ insert 6 $ Set.empty exampleMap = insert (1,2) $ insert (2,7) $ Map.empty
Thanks! I'm impressed. Obviously there is a lot more power in type classes than I'd thought. I hadn't realised that you could separate the Ord a and Ord k from the type signature in the class declaration, and put them in instance declarations like that (for example). It would be really interesting to see how far one could go in factoring all the collection type functions/values into type classes. Best regards, Brian.

Brian Hulley wrote:
Cale Gibbard wrote:
Unifying these two under a single operation is certainly trickier, and it's a little more questionable that it should be done at all, given that their types are so different -- below is the closest I could come to it off-hand. <snip>
Thanks! I'm impressed. Obviously there is a lot more power in type classes than I'd thought.
Of course, this still doesn't solve the original problem I was trying to address, namely that I want identifier bindings to be pulled into scope by their typed context (ie value type or return type + arg types) so that functional programmers could get the same advantages (in fact even more advantages) than OOP programmers. With type classes, every use of any specific identifier, within the whole program, would have to be declared an instance of a single global type class, which would then be imported unqualified into the module so that you could write insert (1,2) m etc without having to qualify the word "insert". (Because if these type classes/instances were imported qualified we'd just swap "Set.insert" for "Collection.insert") With the ty-tuple idea, all functions in a module are automatically organised into sub-modules and brought into scope only when needed, so every function in a program could be typed into a single file thus freeing the programmer from the onerous burden of having to work out where to put them manually. (The programmer would still put data declarations into different modules) I must admit my thinking is strongly influenced by years of C++ programming, but so far I haven't seen any description of how one decides what module a function should be placed in in functional programming, and the existing module system seems like a pauper's alternative to static class methods in C++, C#, or Java, since in Haskell, you need to use the same name for the module and the type (for Data.Set etc) yet this choice of same name is not enforced by the language even though the user thinks of them as being "the same", and in fact two import directives are needed to maintain the illusion that they are the same so you can write Set a and Set.insert etc... Regards, Brian Hulley

[A bit late reply - I've just returned from vacation] On Sun, Jan 08, 2006 at 05:47:19PM -0000, Brian Hulley wrote:
All I'm proposing is that the compiler should do all this painful work for you, so that you don't need to bother creating a different file that then needs two import directives to achieve the effect I want. Is there any case where you would *not* want a type to be declared in its own module?
I can think of such cases - for example consider a set of mutually recursive datatypes used to represent abstract syntax trees in some language. Of course, I imagine that "your modules" could be introduced in such a way that would still allow recursion, but it's simply more natural for me to place all those declarations in one module named Syntax or AST.
It is quite simple to create a new layout rule. My idea with this is that all lines should start with zero or more tab characters (non-tab leading whitespace is disallowed),
All lines start with at least zero tab characters, trivially.
and all layout blocks should start on a new line.
That's a good coding practice (yes, you can write like this in Haskell already), making your code more change-friendly, which is especially important when you use some version control tool. It would be nice if this could be enforced by the compiler, at least as some kind of a warning. I encourage you to add such option to some Haskell compiler, or a coding policy checking tool :-)
Moreover, it is possible to completely dump the ugly let..in construct, and make "=" one of the tokens that can start a new layout block, so instead of:
f x = let a = x+1 b = x + 2 in a + b
How about f x = let a = x + 1 b = x + 2 in a + b Anyway, I would use "where" here.
one would simply write:
f x = a = x+1 b = x+2 a + b
I don't like it. It's shorter, but less readable too. You could simply write: f x = a + b where a = x+1 b = x+2 which is not that much longer, but much more readable. Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland

Tomasz Zielonka wrote:
[A bit late reply - I've just returned from vacation]
On Sun, Jan 08, 2006 at 05:47:19PM -0000, Brian Hulley wrote:
All I'm proposing is that the compiler should do all this painful work for you, so that you don't need to bother creating a different file that then needs two import directives to achieve the effect I want. Is there any case where you would *not* want a type to be declared in its own module?
I can think of such cases - for example consider a set of mutually recursive datatypes used to represent abstract syntax trees in some language. Of course, I imagine that "your modules" could be introduced in such a way that would still allow recursion, but it's simply more natural for me to place all those declarations in one module named Syntax or AST.
My idea was that modules would be hierarchical, so that for example you could have a module AST as follows: module AST where data Data1 = ... data Data2 = ... This would be equivalent to writing something like: module AST where -- in file prefixPath/AST.hs import AST.Data1 as Data1 (Data1) -- using partially qualified module names (not yet supported AFAIK?) import AST.Data2 as Data2 (Data2) module Data1 where -- in file prefixPath/AST/Data1.hs data Data1 = ... module Data2 where data Data2 = ... If the fully qualified name of the AST module was prefixSeq.AST then the fully qualified names of the DataN modules would be prefixSeq.AST.DataN In other words, one file could contain a tree of modules without having to physically put each module into files arranged as a tree on disk with import directives. An advantage of this would be that you'd no longer need to think up different field names for each record that you use to keep track of state when traversing a data structure - something that quickly becomes extremely difficult as things are at the moment for large modules.
It is quite simple to create a new layout rule. My idea with this is that all lines should start with zero or more tab characters (non-tab leading whitespace is disallowed),
All lines start with at least zero tab characters, trivially.
I could also have said, all characters in leading whitespace must be tabs.
and all layout blocks should start on a new line.
That's a good coding practice
Thanks - glad to see someone agrees with me! :-)
(yes, you can write like this in Haskell already), making your code more change-friendly, which is especially important when you use some version control tool. It would be nice if this could be enforced by the compiler, at least as some kind of a warning. I encourage you to add such option to some Haskell compiler, or a coding policy checking tool :-)
Moreover, it is possible to completely dump the ugly let..in construct, and make "=" one of the tokens that can start a new layout block, so instead of:
f x = let a = x+1 b = x + 2 in a + b
How about
f x = let a = x + 1 b = x + 2 in a + b
This is how I indent let..in at the moment. However I feel that "let" and "in" just takes up space, and while useful in describing the abstract syntax, I don't see what use it has in the concrete syntax of a language, apart from the fact that in the current layout rule, "let" is needed to introduce a block. I think that because "let" and "in" are rather redundant, people like to squash them into the same lines as non-redundant code, leading to: f x = let a = x + 1 b = x+2 in a+ b or a million times worse: f x = let a = a+1 ; b = x+2 -- mixing up two different ways of writing a block of things c = a + b in c
Anyway, I would use "where" here.
I agree this would be neater in this case - sometimes let..in is still needed eg in a branch of an if construct etc.
one would simply write:
f x = a = x+1 b = x+2 a + b
I don't like it. It's shorter, but less readable too.
Yet this is very similar to how you write functions in C++, C, Java, C# etc so for anyone used to these languages, it seems very natural ie f(x) { a = x+1; // Hurray! no need to write "let" :-))) b = x+2; // ((single assignment simulating binding)) return a+b; }
You could simply write:
f x = a + b where a = x+1 b = x+2
or even f x = a + b where a = x+1 b = x+2 Regards, Brian.

On Tue, Jan 17, 2006 at 08:43:55PM -0000, Brian Hulley wrote:
An advantage of this would be that you'd no longer need to think up different field names for each record that you use to keep track of state when traversing a data structure - something that quickly becomes extremely difficult as things are at the moment for large modules.
Not a bad idea. Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland

On Sun, Jan 08, 2006 at 01:06:18PM -0000, Brian Hulley wrote:
5) We can get all the advantages of automatic namespace management the OOP programmers take for granted, in functional programming, by using value spaces as the analogue of objects, and can thereby get rid of complicated import/export directives
There is nothing complicated in Haskell's module system. It's very simple, explicit, independent from the type system and therefore easy to understand. "Verbose" or "low-level" would be better accusations. It seems that you want to introduce some kind of C++'s Koenig's lookup to Haskell. Is it your inspiration? For me, C++ doesn't seem to be a source of good ideas for programming language design ;-) Best regards Tomasz -- I am searching for programmers who are good at least in (Haskell || ML) && (Linux || FreeBSD || math) for work in Warsaw, Poland

Tomasz Zielonka wrote:
On Sun, Jan 08, 2006 at 01:06:18PM -0000, Brian Hulley wrote:
5) We can get all the advantages of automatic namespace management the OOP programmers take for granted, in functional programming, by using value spaces as the analogue of objects, and can thereby get rid of complicated import/export directives
There is nothing complicated in Haskell's module system. It's very simple, explicit, independent from the type system and therefore easy to understand.
There are many concepts that one needs to understand ie importing/exporting, qualified/unqualified, hiding, selecting, and a strange syntax that looks like tuples but isn't anything to do with tuples.
"Verbose" or "low-level" would be better accusations.
I agree with those as well.
It seems that you want to introduce some kind of C++'s Koenig's lookup to Haskell. Is it your inspiration?
First argument lookup works well in C++, but perhaps wouldn't work nearly so well in Haskell due to the presence of polymorphism/higher order functions. Another source of inspiration was Smalltalk, where instead of thinking in terms of files (modules) one just thinks in terms of objects(types) and methods(functions) and enters new types/functions via a browser instead of a text editor. My real purpose is to try to find a way to be able to concentrate on values/types without having to worry about where to put them. Just as we take garbage collection (or other memory management) for granted, I'm trying to find some way of automatically managing the "storage" of value/type declarations. In C# and Java, every class must be stored in a file of that name, and most C++ programmers follow this rule as a convention. Whereas in Haskell (or ML), where there are lots of small data declarations, I don't see any simple rule for where to put stuff. It is all too easy to end up with a gigantic module where one type has been converted into another and another etc etc and soon it is very difficult to think up different names for functions/variants/fields, and remember even where the functions are defined within the file, leading to much scrolling and search operations when editing code.
For me, C++ doesn't seem to be a source of good ideas for programming language design ;-)
Certainly C++ has its problems, but I think some rather clever ideas can be salvaged from it (eg the use of traits in template programming) :-) Best regards, Brian.

On Tuesday 17 January 2006 22:44, Brian Hulley wrote:
Tomasz Zielonka wrote:
On Sun, Jan 08, 2006 at 01:06:18PM -0000, Brian Hulley wrote:
5) We can get all the advantages of automatic namespace management the OOP programmers take for granted, in functional programming, by using value spaces as the analogue of objects, and can thereby get rid of complicated import/export directives
There is nothing complicated in Haskell's module system. It's very simple, explicit, independent from the type system and therefore easy to understand.
There are many concepts that one needs to understand ie importing/exporting, qualified/unqualified, hiding, selecting, and a strange syntax that looks like tuples but isn't anything to do with tuples.
Hey, you managed to summarize almost all features of Haskell's module system within less than 4 lines of text. It can't be that complicated, can it? ;-) (This isn't meant to dismiss your proposal. I've sometimes been thinking along similar lines.) Ben
participants (5)
-
Benjamin Franksen
-
Brian Hulley
-
Cale Gibbard
-
Daniel Fischer
-
Tomasz Zielonka