Text.ParserCombinators.* is a bad name

* Firstly, a library naming issue. Text.ParserCombinators feels like a clunky name. This week I was working on a set of regex combinators, that ideally would go under:
Text.Combinators.Regex
but we're stuck with Text.ParserCombinators.Regex. Making me think that:
Text.ParserCombinators.ReadP Text.ParserCombinators.Parsec Text.PrettyPrint.HughesPJ
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
Reflecting the wide variety of text-manipulation combinators we use. The latter seem more elegant. Parsec in particular can lead to some noisy import statements in its current namespace.
* Secondly, an FAQ on #haskell is "Where is fst3/snd3/thd3?"
Lacking some fun generic system for hacking at tuples, perhaps fst3/snd3/thd3 should be put back into the base library under Data.Tuple, next to fst/snd? I note they were in (at least some versions of) Haskell 1.2, and they're even defined locally in GHC.PArr.
I'll draft the patches if people think these are reasonable.
Cheers, Don
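
For readers who have not met the functions in the second point, here is a minimal sketch of what fst3/snd3/thd3 would plausibly look like. The definitions are an assumption inferred from the names and from fst/snd; they are not copied from Haskell 1.2 or GHC.PArr, and to my knowledge Data.Tuple does not export them.

```haskell
-- Hypothetical triple accessors, modelled on fst and snd from the Prelude.
-- These are illustrative definitions, not part of Data.Tuple.

-- | First component of a triple.
fst3 :: (a, b, c) -> a
fst3 (x, _, _) = x

-- | Second component of a triple.
snd3 :: (a, b, c) -> b
snd3 (_, y, _) = y

-- | Third component of a triple.
thd3 :: (a, b, c) -> c
thd3 (_, _, z) = z
```

Each accessor is a one-line pattern match, which is probably why so many programs redefine them locally.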

On Wed, Mar 08, 2006 at 04:42:08PM +1100, Donald Bruce Stewart wrote:
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
Sounds reasonable. I wonder whether HughesPJ is also a candidate for renaming to something more descriptive? (No offence to John and Simon) Bernie.

Bernard James POPE writes:
On Wed, Mar 08, 2006 at 04:42:08PM +1100, Donald Bruce Stewart wrote:
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
Sounds reasonable. I wonder whether HughesPJ is also a candidate for renaming to something more descriptive? (No offence to John and Simon)
One disadvantage of this scheme is that it obscures the difference
between the parsing combinators and the pretty-printing combinators.
We wouldn't want to make the name of the HughesPJ module too generic, as
there are also other libraries of pretty printing combinators out there,
such as Daan Leijen's PPrint.
--
David Menendez

David Menendez wrote:
Sounds reasonable. I wonder whether HughesPJ is also a candidate for renaming to something more descriptive? (No offence to John and Simon)
We wouldn't want to make the name of the HughesPJ module too generic, as there are also other libraries of pretty printing combinators out there, such as Daan Leijen's PPrint.
Indeed, the whole point of the 'HughesPJ' suffix was to make the name _more_ descriptive. There are several widespread combinator libraries for both parsing and pretty-printing, all with different interfaces. They all used to be named simply and blandly, like:
ParseLib Pretty
and it was difficult to know exactly whose library was intended. In the absence of any truly distinguishing internal features, at least naming the authors gives a clue to which paper you should be reading. :-)
Regards, Malcolm

On Wed, 2006-03-08 at 12:15 +0000, Malcolm Wallace wrote:
David Menendez wrote:
Sounds reasonable. I wonder whether HughesPJ is also a candidate for renaming to something more descriptive? (No offence to John and Simon)
We wouldn't want to make the name of the HughesPJ module too generic, as there are also other libraries of pretty printing combinators out there, such as Daan Leijen's PPrint.
Indeed, the whole point of the 'HughesPJ' suffix was to make the name _more_ descriptive. There are several widespread combinator libraries for both parsing and pretty-printing, all with different interfaces. They all used to be named simply and blandly, like:
ParseLib Pretty
and it was difficult to know exactly whose library was intended. In the absence of any truly distinguishing internal features,
Yes. I meant that Text.Combinators.HughesPJ doesn't really say what job the combinators are for.
at least naming the authors gives a clue to which paper you should be reading. :-)
:-) But then HughesPJ should be called Hughes!
Anyway I agree that classifying things can be a rather tricky business. There must be a significant point of difference between, say, the Wadler library and the Hughes library, and I wonder if that could not be exposed in the name? (a rhetorical question)
Cheers, Bernie.

Hello Bernard, Wednesday, March 8, 2006, 9:13:23 AM, you wrote:
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
BJP> Sounds reasonable. I wonder whether HughesPJ is also a candidate for
BJP> renaming to something more descriptive? (No offence to John and Simon)
to Text.Combinators.Donald.Bruce.Stewart ? :)
--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Hello Donald, Wednesday, March 8, 2006, 8:42:08 AM, you wrote:
DBS> * Secondly, an FAQ on #haskell is "Where is fst3/snd3/thd3?"
DBS> Lacking some fun generic system for hacking at tuples, perhaps
DBS> fst3/snd3/thd3 should be put back into the base library under
DBS> Data.Tuple, next to fst/snd? I note they were in (at least some
DBS> versions of) Haskell 1.2, and they're even defined locally in GHC.PArr.
DBS> I'll draft the patches if people think these are reasonable.
I strongly support this. There are plenty of trivial functions that are repeated in almost every program. There is the GenUtil module by John Meacham, NewPrelude on the HaWiki, the Util directory in the GHC compiler, and many other places where these trivial functions hide, so adding even some of these functions to the standard libraries would be great, especially in view of Haskell', where these functions could be standardized. I would especially like to improve the situation with string processing: even an analog of strstr() is absent from the standard libraries!
--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
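
As an illustration of the kind of small utility being asked for, a strstr()-like search can be sketched with two functions that Data.List does export, isPrefixOf and tails. The name findSubstring and its argument order are assumptions made for this example, not an existing library function. (Current versions of base also export Data.List.isInfixOf, which answers the yes/no version of the same question.)

```haskell
import Data.List (isPrefixOf, tails)

-- Hypothetical helper in the spirit of C's strstr(): return the suffix of
-- the haystack that starts at the first occurrence of the needle, or
-- Nothing if the needle does not occur.
findSubstring :: String -> String -> Maybe String
findSubstring needle haystack =
  -- 'tails' lists every suffix of the haystack, longest first, so the
  -- first suffix that begins with the needle is the leftmost match.
  case filter (needle `isPrefixOf`) (tails haystack) of
    (match : _) -> Just match
    []          -> Nothing
```

This is a naive search (quadratic in the worst case), chosen for brevity rather than speed.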

dons@cse.unsw.edu.au (Donald Bruce Stewart) writes:
* Firstly, a library naming issue. Text.ParserCombinators feels like a clunky name.
Amen. I've never understood why there is a need for a deep, sparse hierarchy. You're practically never going to navigate for a library anyway, and even if you were, a deep hierarchy will unavoidably have ambiguities. (Of course, in this case, the obvious "Text.Regex" is already taken)
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
My preference is to name modules after their purpose or application area ("Text") rather than after their implementation technique ("Combinators"). -k -- If I haven't seen further, it is by standing in the footprints of giants

On Wed, Mar 08, 2006 at 04:42:08PM +1100, Donald Bruce Stewart wrote:
* Firstly, a library naming issue. Text.ParserCombinators feels like a clunky name. This week I was working on a set of regex combinators, that ideally would go under:
Text.Combinators.Regex
but we're stuck with Text.ParserCombinators.Regex. Making me think that:
Text.ParserCombinators.ReadP Text.ParserCombinators.Parsec Text.PrettyPrint.HughesPJ
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
If you want to make them shorter, it's the redundant "Combinators" that should go. While we're complaining about module names, how about System.Console.GetOpt?

Ross Paterson wrote:
Text.ParserCombinators.Parsec namespace should be deprecated in favour of: Text.Combinators.Parsec
If you want to make them shorter, it's the redundant "Combinators" that should go.
Is there not a valid distinction between a Parser library (e.g. for XML documents), and a ParserCombinator library (which enables one to write parsers)? Is it helpful to blur the boundaries between the glue and the things being glued?
While we're complaining about module names, how about System.Console.GetOpt?
What is wrong with it? (Not a rhetorical question - I really can't guess!)
Ketil Malde wrote:
I've never understood why there is a need for a deep, sparse hierarchy.
Really, it is just a social mechanism to encourage developers to come up with long, descriptive names. You could just as easily drop the dots and have TextParserCombinatorsParsec or Text_ParserCombinators_Parsec with almost no technical impact. But lots of people would complain about the amount of letters to type if that was the scheme, and somehow, if we use dots, it lessens the resistance. :-) I'm sure, without dots, we would still be stuck with lots of short module names in every project, clashing with everyone else's libraries.
Regards, Malcolm

On Wed, Mar 08, 2006 at 12:27:43PM +0000, Malcolm Wallace wrote:
Ross Paterson wrote:
Text.ParserCombinators.Parsec namespace should be deprecated in favour of: Text.Combinators.Parsec
If you want to make them shorter, it's the redundant "Combinators" that should go.
Is there not a valid distinction between a Parser library (e.g. for XML documents), and a ParserCombinator library (which enables one to write parsers)?
An XML parser would presumably be under Text.XML, etc, while a general one would be higher in the hierarchy.
Is it helpful to blur the boundaries between the glue and the things being glued?
OK, but to replace ParserCombinators with Combinators is to forget what kind of glue we're talking about.
While we're complaining about module names, how about System.Console.GetOpt?
What is wrong with it? (Not a rhetorical question - I really can't guess!)
Sorry, it's the Console I object to, as in System.Console.Readline, when it's for parsing the command line. There may be no console involved at all.

Ross Paterson wrote:
Is it helpful to blur the boundaries between the glue and the things being glued?
OK, but to replace ParserCombinators with Combinators is to forget what kind of glue we're talking about.
Oh, I agree with you there. What I'm saying is that trimming either half of the name, "Parser" or "Combinator", potentially loses a meaningful distinction.
System.Console.GetOpt?
Sorry, it's the Console I object to, as in System.Console.Readline, when it's for parsing the command line. There may be no console involved at all.
I always assumed "Console" was basically synonymous with "command-line". Are arguments ever passed to a program by some other means? Or is it the fact that getOpt just parses strings, regardless of where they came from (program arguments or otherwise)?
Text.Parser.GetOpt Language.CmdLine.Options Language.Gnu.GetOpt
Any better suggestions?
Regards, Malcolm

On Wed, Mar 08, 2006 at 01:55:11PM +0000, Malcolm Wallace wrote:
System.Console.GetOpt?
Sorry, it's the Console I object to, as in System.Console.Readline, when it's for parsing the command line. There may be no console involved at all.
I always assumed "Console" was basically synonymous with "command-line". Are arguments ever passed to a program by some other means?
execl()
It's Cmdline or Args, not Console.
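
The point that getOpt just parses strings, wherever they come from, can be seen in a short sketch. getOpt from System.Console.GetOpt is a pure function over a list of strings; the Flag type, the option table and the parseArgs name below are hypothetical, made up for this example.

```haskell
import System.Console.GetOpt
  ( ArgDescr (NoArg, ReqArg)
  , ArgOrder (Permute)
  , OptDescr (Option)
  , getOpt
  )

-- Hypothetical flag type for illustration.
data Flag = Verbose | Output FilePath
  deriving (Eq, Show)

-- Hypothetical option table: a -v/--verbose switch and an -o/--output FILE option.
options :: [OptDescr Flag]
options =
  [ Option "v" ["verbose"] (NoArg Verbose) "chatty output"
  , Option "o" ["output"] (ReqArg Output "FILE") "write output to FILE"
  ]

-- getOpt takes a plain [String]; no console or IO is involved.
-- The result is (recognised flags, non-option arguments, error messages).
parseArgs :: [String] -> ([Flag], [String], [String])
parseArgs = getOpt Permute options
```

In a real program the list would usually come from System.Environment.getArgs, but it could equally be built by a parent process, a test, or a configuration file.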

On 3/8/06, Donald Bruce Stewart wrote:
* Firstly, a library naming issue. Text.ParserCombinators feels like a clunky name. This week I was working on a set of regex combinators, that ideally would go under:
Text.Combinators.Regex
but we're stuck with Text.ParserCombinators.Regex. Making me think that:
Text.ParserCombinators.ReadP Text.ParserCombinators.Parsec Text.PrettyPrint.HughesPJ
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
Reflecting the wide variety of text-manipulation combinators we use. The latter seem more elegant. Parsec in particular can lead to some noisy import statements in its current namespace.
Why do we need the "Combinators" bit at all? Will we really have so many different kinds of parsers and other text manipulation tools that we need to separate them based on their technique? I'd rather have
Text.Parsers.Parsec
Or even just
Text.Parsec
/S
--
Sebastian Sylvan +46(0)736-818655 UIN: 44640862
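
Whatever the modules end up being called, the import noise mentioned in the original post can be contained with a qualified alias. The sketch below uses Text.ParserCombinators.ReadP, since it ships with base; the alias P and the example parser (numberList, parseNumbers) are assumptions chosen for illustration.

```haskell
import Data.Char (isDigit)
import qualified Text.ParserCombinators.ReadP as P

-- Hypothetical example parser: a comma-separated list of non-negative
-- integers that must consume the whole input.
numberList :: P.ReadP [Int]
numberList = P.sepBy1 number (P.char ',') <* P.eof
  where
    -- munch1 greedily consumes one or more digits.
    number = read <$> P.munch1 isDigit

-- Run the parser and accept only a single, complete parse.
parseNumbers :: String -> Maybe [Int]
parseNumbers input =
  case P.readP_to_S numberList input of
    [(ns, "")] -> Just ns
    _          -> Nothing
```

With the alias, the long module path appears once, in the import line, so renaming the module would touch only that line.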

And I step into the minefield: I need to name my Text.* module... Donald Bruce Stewart wrote:
* Firstly, a library naming issue. Text.ParserCombinators feels like a clunky name. This week I was working on a set of regex combinators, that ideally would go under:
Text.Combinators.Regex
Not coincidentally, I have my own regex module that I need to name. This is the one that I mentioned a few days ago that uses Parsec to replace Text.Regex (it currently matches Text.Regex on all the tests I have collected). The most interesting thing about my new library is that it is lazy.
And now Donald and I both need part of the Text*Regex* namespace.
So I could name the module path using some combination of relevant adjectives "Parser,Parsec,Lazy,Native,Faster,Replacement". Perhaps:
Text.Regex.Parsec Text.Regex.Lazy
This is where Java's namespace is so brilliant. I could have used reverse-dns:
Com.MightyReason.Haskell.Text.Regex
So how should Donald and I avoid a namespace collision?
-- Chris

Hello Chris, Thursday, March 9, 2006, 2:17:30 PM, you wrote:
CK> Not coincidentally, I have my own regex module that I need to name. This is the
CK> one that I mentioned a few days ago that uses Parsec to replace Text.Regex (it
CK> currently matches Text.Regex on all the tests I have collected). The most
CK> interesting thing about my new library is that it is lazy.
I think it's better to use the name "Text.Regex" for your library, and rename the old one to something like "Text.Regex.C" or "Text.Regex.PackedString".
--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Chris Kuklewicz wrote:
And now Donald and I both need part of the Text*Regex* namespace.
So I could name the module path using some combination of relevant adjectives "Parser,Parsec,Lazy,Native,Faster,Replacement"
Text.Regex.Parsec Text.Regex.Lazy
Yes, any of those seem adequately descriptive.
This is where Java's namespace is so brilliant. I could have used reverse-dns:
Com.MightyReason.Haskell.Text.Regex
And of course you can use that namespace in Haskell too, if you wish. That scheme is even described as a possibility in the Report Addendum for hierarchical namespaces.
So how should Donald and I avoid a namespace collision?
By agreement. :-) Just because plain Text.Regex is already taken, doesn't mean it has to stay that way. Another common approach is to select a "project name" as the discriminating element of the namespace, e.g.
Text.XML.HaXml.* Graphics.UI.WX.* Graphics.UI.Gtk.*
So you could agree on
Text.Regex.MightyReason Text.Regex.Dons
and rename the existing package as Text.Regex.Gnu or whatever (I don't know if it really is from Gnu).
Regards, Malcolm

Disclaimer: random noise follows.
There is an opportunity for Haskell to innovate here. We could move away from a mere hierarchical namespace to something more semantically rich, e.g.
import Parser | author=="Hughes" && abstraction=="Combinators"
The attributes would be normal top-level definitions in the modules, and the guard any boolean Haskell expression. The import would be rejected if not exactly one module matches.
(No, this is not a serious proposal... But I will look like a visionary when someone actually implements this :)
Cheers, JP.

Donald Bruce Stewart wrote:
* Firstly, a library naming issue. Text.ParserCombinators feels like a clunky name. This week I was working on a set of regex combinators, that ideally would go under:
Text.Combinators.Regex
but we're stuck with Text.ParserCombinators.Regex. Making me think that:
Text.ParserCombinators.ReadP Text.ParserCombinators.Parsec Text.PrettyPrint.HughesPJ
namespace should be deprecated in favour of:
Text.Combinators.ReadP Text.Combinators.Parsec Text.Combinators.HughesPJ
Reflecting the wide variety of text-manipulation combinators we use. The latter seem more elegant. Parsec in particular can lead to some noisy import statements in its current namespace.
At the time, I think my preference was for Text.Parsing.* and Text.Pretty.* (maybe Text.Parser.*, or Text.Parsers.*, I can't remember exactly). Malcolm W. liked the longer version, and I didn't feel strongly enough to pursue it. Still, I prefer Parsing and even ParserCombinators to just Combinators. I don't think Combinators is a particularly useful classification; most modules define combinators.
I think the intention with ParserCombinators was to distinguish the category from parsers for a particular grammar; on the other hand, the hierarchy puts parsers for particular grammars elsewhere (Text.XML.Parser, Language.Haskell.Parser). That is, everything to do with a particular language is collected in a sub-hierarchy, rather than collecting by operation first.
Designing a hierarchy is never an exact science; you often come across several categories on which you can split, and it's a difficult choice as to which category should be nearer the root of the tree (System.Posix.IO, or System.IO.Posix?). You could argue all day about this stuff and not make any useful progress.
Can I remind people of this:
http://haskell.org/~simonmar/lib-hierarchy.html
also available in GHC's darcs repo in libraries/doc. This is the place we keep the "design" of the hierarchy. It's not an official document in any sense, but it is the result of our original discussions about the hierarchy, updated from time to time as new libraries have come along. I hope as part of the Haskell' process we'll fix a design for the hierarchy.
* Secondly, an FAQ on #haskell is "Where is fst3/snd3/thd3?"
Lacking some fun generic system for hacking at tuples, perhaps fst3/snd3/thd3 should be put back into the base library under Data.Tuple, next to fst/snd? I note they were in (at least some versions of) Haskell 1.2, and they're even defined locally in GHC.PArr.
I'll draft the patches if people think these are reasonable.
Data.Tuple for these is fine by me. Cheers, Simon
participants (12)
- Bernard James POPE
- Bernard Pope
- Bulat Ziganshin
- Chris Kuklewicz
- David Menendez
- dons@cse.unsw.edu.au
- Jean-Philippe Bernardy
- Ketil Malde
- Malcolm Wallace
- Ross Paterson
- Sebastian Sylvan
- Simon Marlow