Add haskell-src as an official machine-readable component of the Haskell standard

I propose that the haskell-src package be renamed haskell20nn-src for each revision Haskell 20nn of the standard, and be made an official machine-readable component of the standard.

This has the following advantages:

1. It would require almost no extra work, because haskell-src already exists, and syntax changes, if any, will be very minimal for each revision.
2. For the portion of the standard that it covers, any ambiguity that might creep into the human-readable standard document would be resolved.
3. It would serve as a basic machine-verification tool which is easily extensible.
4. As a side-effect, the haskell-src package would be continually maintained. The package is useful in its own right as a much lighter-weight version of haskell-src-exts. It is much easier to use when an application does not require full support of all of Haskell's syntax.

This proposal is a natural extension of a proposal raised on the libraries mailing list in the context of the Haskell Platform: Ian Lynagh proposed that the current haskell-src-exts be renamed haskell-src (and included in the Haskell Platform, as suggested by Sterling Clover), and that the current haskell-src be renamed to haskell98-src (and removed from the Platform).

The libraries subthread is here:
http://www.haskell.org/pipermail/libraries/2010-November/015018.html

Thanks,
Yitz

On Tue, Nov 16, 2010 at 9:13 AM, Yitzchak Gale wrote:
I propose that the haskell-src package be renamed haskell20nn-src for each revision Haskell 20nn of the standard, and be made an official machine-readable component of the standard.
As much as I like the idea of standardising a representation of Haskell syntax, it's a highly nontrivial library and so coming to consensus on the various design decisions involved in producing the AST and so forth would be thorny if we started demanding that every implementation upheld them. I think that in general, libraries in the Report should be minimal, and generally only provide "obvious" or primitive constructs which would likely be the same in every implementation, and on which can be built more interesting libraries separately. It would become necessary to include this sort of thing, I think, if we ever wanted something like Template Haskell or any other metaprogramming facilities to be included in the language. But I don't think anyone believes that TH or anything like it is ready for inclusion in haskell' yet. (Examples of controversies possible in haskell-src: we have the Hs prefix on constructors everywhere, we can't provide fixity information (and the haskell-src-exts implementation of this is unsatisfactory in several important ways), a lot of type class instances are absent (even Ord!), the distribution of SrcLocs is a little awkward when manipulating source abstractly, and some constructors allow impossible values, e.g. HsLambda can contain zero patterns)

Ben Millwood wrote:
As much as I like the idea of standardising a representation of Haskell syntax, it's a highly nontrivial library and so coming to consensus on the various design decisions involved in producing the AST and so forth would be thorny if we started demanding that every implementation upheld them. I think that in general, libraries in the Report should be minimal, and generally only provide "obvious" or primitive constructs which would likely be the same in every implementation, and on which can be built more interesting libraries separately.
I am not proposing that haskell-src become part of the standard libraries. Neither its design, nor its suitability for use in any application, are relevant to this proposal, except for one application: machine verification of the compliance of certain aspects of the syntax of a Haskell program with the standard. The library itself (after being generated by Alex and Happy, and apart from the Data and Typeable instances and the pretty printer, which are all outside the scope of this proposal) is Haskell 20nn. It is simply Haskell 20nn compliant code which, when compiled by any compliant compiler and fed a program, will determine by definition whether the program is compliant with the Haskell 20nn standard or not with respect to those aspects of Haskell 20nn that are within its scope. The current use of Alex and Happy does create the uncomfortable situation that in practice it is hardly possible for a human to read and understand the full source code of haskell-src without looking at the Alex and Happy from which it was generated, while the Alex and Happy sources themselves are not actually part of the standard. That can be remedied in future versions if someone gets around to writing a lexer and parser without resorting to those tools. But I don't see the use of Alex and Happy as an obstacle in principle, as long as the generated code is Haskell 20nn compliant.
It would become necessary to include this sort of thing, I think, if we ever wanted something like Template Haskell or any other metaprogramming facilities to be included in the language. But I don't think anyone believes that TH or anything like it is ready for inclusion in haskell' yet.
I don't see why the presence or absence of TH makes any difference.
(Examples of controversies possible in haskell-src: we have the Hs prefix on constructors everywhere,
Why does this matter?
we can't provide fixity information
Therefore fixity is currently beyond the scope of this proposal.
a lot of type class instances are absent (even Ord!),
Those instances are defined in the libraries; they are certainly outside the scope of this proposal.
the distribution of SrcLocs is a little awkward
SrcLoc is irrelevant.
some constructors allow impossible values, e.g. HsLambda can contain zero patterns
If it would accept (\ -> undefined) as valid Haskell, that is a bug that would need to be fixed. Otherwise it is not an impediment, though not ideal. Thanks, Yitz
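The zero-pattern point can be made concrete with a toy syntax tree. The sketch below is not the haskell-src definition; the types LooseExp and StrictExp and the function tighten are hypothetical names invented for illustration. It contrasts a list-based lambda constructor, which admits an empty pattern list, with one built on Data.List.NonEmpty from base, which rules the empty case out by construction.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Hypothetical toy types for illustration only; NOT the haskell-src AST.
type Name = String

newtype Pat = PVar Name
  deriving (Eq, Show)

-- Loose encoding: the pattern list may be empty, so a lambda with zero
-- patterns is a well-typed value even though the grammar forbids it.
data LooseExp
  = LVar Name
  | LLambda [Pat] LooseExp
  deriving (Eq, Show)

-- Strict encoding: NonEmpty guarantees at least one pattern.
data StrictExp
  = SVar Name
  | SLambda (NonEmpty Pat) StrictExp
  deriving (Eq, Show)

-- Convert a loose tree to a strict one, rejecting any zero-pattern lambda.
tighten :: LooseExp -> Maybe StrictExp
tighten (LVar n) = Just (SVar n)
tighten (LLambda ps body) = SLambda <$> NE.nonEmpty ps <*> tighten body
```

With the loose type, the check has to happen in the parser or in a separate validation pass such as tighten; with the strict type, the impossible value cannot be built at all. Which of the two a reference parser should expose is exactly the kind of design question raised above.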

On Tue, Nov 16, 2010 at 3:40 PM, Yitzchak Gale wrote:
I am not proposing that haskell-src become part of the standard libraries.
Right, I misunderstood here.
Neither its design, nor its suitability for use in any application, are relevant to this proposal, except for one application: machine verification of the compliance of certain aspects of the syntax of a Haskell program with the standard.
The library itself (after being generated by Alex and Happy, and apart from the Data and Typeable instances and the pretty printer, which are all outside the scope of this proposal) is Haskell 20nn.
It is simply Haskell 20nn compliant code which, when compiled by any compliant compiler and fed a program, will determine by definition whether the program is compliant with the Haskell 20nn standard or not with respect to those aspects of Haskell 20nn that are within its scope.
So essentially, all you are asking for is an official implementation of haskell parsing, so that you input a program and it spits out either "valid" or "not valid", according to the parts of the spec that it audits. This is not such a bad idea, except that I feel like there are a lot of examples of languages /without/ an official standard whose compliance to the spec is therefore determined, as you seem to suggest, by compliance to a reference implementation, and I think it tends to be more painful as a process. If there are bugs in the reference implementation, other implementations then have to decide whether to "implement" them or do what they think is best. If there are disagreements between the reference implementation and language spec, or ambiguities in language spec, the spec should certainly be fixed! I strongly believe that a bureaucratic, democratic, and conceptual approach to the design of the Haskell language is one of its major strengths - its design decisions are made by committee and after lengthy discussion, rather than by implementors according to whatever works at the time. So I'm not convinced that converting part of the language description into a machine-readable form is necessarily for the best.
(Examples of controversies possible in haskell-src: we have the Hs prefix on constructors everywhere,
Why does this matter?
It's just annoying and makes code more verbose for no obvious reason. It's not a big deal but it's one of the things we'd have to talk about if we wanted your library to expose its parsing API - it's something that, for example, haskell-src and haskell-src-exts have diverged on.
we can't provide fixity information
Therefore fixity is currently beyond the scope of this proposal.
Hmm. But fixity resolution is one of the trickiest parts of Haskell parsing, imo. It seems like an awful cop-out to put the really difficult cases - like parsing \x -> x == x == 0 - out of the scope of your verification tool. Everything else I said assumed people wanted to use the official parsing library to manipulate the AST it generated, in which case the random SrcLocs and too-liberal constructors are real annoyances, but I suppose that's not really what you're after. But it seems a bit odd to have a complete parsing library that nevertheless doesn't provide AST inspection and manipulation, which is what I guess most people use a haskell parser in Haskell for. (Other things that I would find really nice in developing applications for manipulating source are resumability, like attoparsec, or error-correcting, like uu-parsinglib. More control over how I feed the input into the parser could also be useful for running parses of files without resorting to unsafe interleaving. But that's all way out of the scope of this proposal and for discussion somewhere else entirely).
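To illustrate why x == x == 0 is a hard case: == is declared infix 4 in the Prelude, so it is non-associative and the chained form has to be rejected, not silently bracketed. The following is a minimal sketch of fixity resolution over a flat operator sequence, loosely adapted from the algorithm in section 10.6 of the Haskell 2010 Report with negation left out. The Exp type, the small fixities table, and the function names are assumptions made for this example. The table is hard-coded here, whereas a real tool would have to collect fixities from declarations and imports.

```haskell
import Data.Maybe (fromMaybe)

data Assoc = AssocL | AssocR | AssocN
  deriving (Eq, Show)

data Fixity = Fixity Assoc Int
  deriving (Eq, Show)

-- Hypothetical toy expression type for illustration.
data Exp
  = Var String
  | Lit Integer
  | BinOp String Exp Exp
  deriving (Eq, Show)

-- A few Prelude fixities, hard-coded for the sketch.
fixities :: [(String, Fixity)]
fixities =
  [ ("$", Fixity AssocR 0)
  , ("&&", Fixity AssocR 3)
  , ("==", Fixity AssocN 4)
  , ("+", Fixity AssocL 6)
  , ("*", Fixity AssocL 7)
  ]

-- Operators without a fixity declaration default to infixl 9.
fixityOf :: String -> Fixity
fixityOf op = fromMaybe (Fixity AssocL 9) (lookup op fixities)

-- Resolve "e0 op1 e1 op2 e2 ..." into a tree, or report a fixity conflict.
resolve :: Exp -> [(String, Exp)] -> Either String Exp
resolve e0 rest = fst <$> go (Fixity AssocN (-1)) e0 rest

-- The first argument is the fixity of the operator to the left of e1.
go :: Fixity -> Exp -> [(String, Exp)] -> Either String (Exp, [(String, Exp)])
go _ e1 [] = Right (e1, [])
go fix1@(Fixity a1 p1) e1 toks@((op2, e2) : rest)
  -- Same precedence, but non-associative or mixed associativity: illegal.
  | p1 == p2 && (a1 /= a2 || a1 == AssocN) =
      Left ("cannot mix operators at precedence " ++ show p2 ++ " near " ++ op2)
  -- The operator on the left binds tighter: hand e1 back to the caller.
  | p1 > p2 || (p1 == p2 && a1 == AssocL) = Right (e1, toks)
  -- Otherwise op2 claims e1 as its left operand.
  | otherwise = do
      (r, rest') <- go fix2 e2 rest
      go fix1 (BinOp op2 e1 r) rest'
  where
    fix2@(Fixity a2 p2) = fixityOf op2
```

Here resolve (Var "x") [("==", Var "x"), ("==", Lit 0)] yields a Left, while a + b * c resolves with * nested under +. The sketch also shows why the parser alone cannot decide the case: the outcome depends entirely on the fixity table.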

Ben Millwood wrote:
So essentially, all you are asking for is an official implementation of haskell parsing, so that you input a program and it spits out either "valid" or "not valid", according to the parts of the spec that it audits.
Yes, that is the most essential requirement. It is a desirable feature for it to work as a parser, too. It can then be used as the basis for further verification tools, and for parsing with guarantees about standards compliance. And haskell-src does work as a parser, though it is perhaps not as polished as some other parsers. If we can get a parser based on haskell-src-exts instead, that would be great. But it's more work. haskell-src is basically ready today.
...compliance to a reference implementation... tends to be more painful as a process. If there are bugs in the reference implementation, other implementations then have to decide whether to "implement" them or do what they think is best. If there are disagreements between the reference implementation and language spec, or ambiguities in language spec, the spec should certainly be fixed! ...So I'm not convinced that converting part of the language description into a machine-readable form is necessarily for the best.
I am not suggesting converting part of the spec into code and dispensing with that part of the document. I am suggesting that both the human-readable document and the reference parser should be officially part of the spec. If there is inconsistency between them, that is a bug in the spec which needs to be fixed like any other.
...fixity resolution is one of the trickiest parts of Haskell parsing, imo. It seems like an awful cop-out to put the really difficult cases - like parsing \x -> x == x == 0 - out of the scope of your verification tool.
Yes, it should be in scope if possible. If the fixity handling of haskell-src-exts is deemed sufficient and we move to that now or in the future, we'll have it.
...it seems a bit odd to have a complete parsing library that nevertheless doesn't provide AST inspection and manipulation, which is what I guess most people use a haskell parser in Haskell for.
haskell-src does provide that. It works and it is usable. Again, if we can be based on haskell-src-exts now or in the future, all the better. Regards, Yitz
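The "valid" or "not valid" use described in this exchange can be sketched in a few lines. This assumes the haskell-src package is installed and uses its parseModule function and ParseResult type from Language.Haskell.Parser; the Verdict type and the check function are hypothetical names introduced for the example, not part of the package or of the proposal.

```haskell
import Language.Haskell.Parser (ParseResult (..), parseModule)
import Language.Haskell.Syntax (SrcLoc (..))

-- Hypothetical verdict type: either the source parses, or it does not,
-- in which case the line, column and message from the parser are kept.
data Verdict
  = Valid
  | NotValid Int Int String
  deriving (Eq, Show)

-- Run the haskell-src parser and discard the syntax tree, keeping only
-- the yes/no answer. Only the aspects the parser covers are audited;
-- fixity resolution, for instance, is not.
check :: String -> Verdict
check src = case parseModule src of
  ParseOk _ -> Valid
  ParseFailed loc msg -> NotValid (srcLine loc) (srcColumn loc) msg
```

A caller that also wants the tree can keep the ParseOk payload instead of discarding it, which is the "works as a parser, too" case.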

On Wed, Nov 17, 2010 at 8:52 AM, Yitzchak Gale wrote:
Ben Millwood wrote:
So essentially, all you are asking for is an official implementation of haskell parsing, so that you input a program and it spits out either "valid" or "not valid", according to the parts of the spec that it audits.
Yes, that is the most essential requirement.
It is a desirable feature for it to work as a parser, too. It can then be used as the basis for further verification tools, and for parsing with guarantees about standards compliance. And haskell-src does work as a parser, though it is perhaps not as polished as some other parsers.
But if we make the official parser usable for AST manipulation, we have to rule on the design issues I raised above: whether to make efforts to stop invalid lambdas being constructed, how to name the types and constructors, etc. All of this relatively special-interest stuff now becomes the business of the language design committee, which sounds like an unnecessary burden to me. Also, don't get me wrong - haskell-src-exts is a leader in its field, it's more mature than any other standalone Haskell parser I know of. But it's still a complex library with difficult issues to be tackled, and I don't think the place to tackle those issues is in the language specification.
...compliance to a reference implementation... tends to be more painful as a process. If there are bugs in the reference implementation, other implementations then have to decide whether to "implement" them or do what they think is best. If there are disagreements between the reference implementation and language spec, or ambiguities in language spec, the spec should certainly be fixed! ...So I'm not convinced that converting part of the language description into a machine-readable form is necessarily for the best.
I am not suggesting converting part of the spec into code and dispensing with that part of the document.
I am suggesting that both the human-readable document and the reference parser should be officially part of the spec. If there is inconsistency between them, that is a bug in the spec which needs to be fixed like any other.
Well, sure, but then I wonder what your automated verification program
is actually useful for. What purpose does it serve that a Haskell
parser independent of the report does not? We can't guarantee it's
bug-free, so we can't make any more assurances than a third party
could about the correctness of its parsing behaviour. We still have to
(and should!) maintain the abstract language description, which may
well involve duplicating information: indeed, if the reference parser
is /not/ redundant, that's a bug :P
I suppose providing a reference parser at least ensures that any
modification to Haskell syntax is implementable, and the issues in its
implementation will have been considered, but in practice every
alteration to the Haskell language from now on is going to be
standardising extensions that already exist, so I don't really think
this is a priority.

On Wed, Nov 17, 2010 at 12:18 PM, S. Doaitse Swierstra wrote:
Reading this proposal I think it clearly states my point made earlier: allowing infix specifications everywhere provides unneeded flexibility and unnecessary complexity.
Ideally I would like to see them even before the module keyword: they state how to read the text that follows, and thus fall in the category of:
- LANGUAGE pragmas, which sometimes add extra syntax
- imports, which extend the name space
Restricting them to occur only directly after the imports is something I cannot see anyone objecting to, and would enable the immediate correct parsing of all expressions to follow.
Doaitse
This is an interesting idea! It would certainly solve a fair few issues with fixity parsing, but I worry that we'd lose a lot of consistency, and/or gain a lot of redundancy - we want operators to associate the same way in every file, but people will have different ideas about which way to associate what (I like associating $ to the left, but I generally don't for the sake of my readers' sanity). Plus, like explicit import lists, I suspect that a list of all operators used in the program, potentially some distance away from their usage site, is going to invite subtle errors when people forget to add one, redundancy when people forget to remove one, and noise in patch files when people do the right thing. So as much as I want to see Haskell's infix syntax simplified, I'm not sure this is a practical way to do so. I once had the idea of having fixity determined in some sense by the name of the operator - long operators binding more tightly/loosely than short ones, or an angle bracket in the right place changing the associativity - but I don't think there's any satisfactory way of doing that either.

Please explain. Fixity information cannot be provided unless you find all the imported modules and process those, so I'm not sure how haskell-src-exts could do any better than it currently does.
(Examples of controversies possible in haskell-src: we have the Hs prefix on constructors everywhere, we can't provide fixity information (and the haskell-src-exts implementation of this is unsatisfactory in several important ways), a lot of type class instances are absent (even Ord!), the distribution of SrcLocs is a little awkward when manipulating source abstractly, and some constructors allow impossible values, e.g. HsLambda can contain zero patterns)

On Tue, Nov 16, 2010 at 7:51 PM, Lennart Augustsson wrote:
Please explain. Fixity information cannot be provided unless you find all the imported modules and process those, so I'm not sure how haskell-src-exts could do any better than it currently does.
The tickets I had in mind were:
http://trac.haskell.org/haskell-src-exts/ticket/197
http://trac.haskell.org/haskell-src-exts/ticket/191
and this one I've just submitted:
http://trac.haskell.org/haskell-src-exts/ticket/207

Thanks, I'll look into all of that when I get a chance, hopefully soonish.
Cheers,
/Niklas
_______________________________________________
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime

See http://hackage.haskell.org/trac/ghc/ticket/4430 for what we are proposing for Template Haskell.

S

There is nothing to stop a library author doing exactly this, and it
might even be useful for some people (personally I'm going to stick to
haskell-src-exts, because it's a brilliant library). However, I don't
think we should make it official or part of the standard. I've found
plenty of HSE/GHC parsing differences in my work, and my suspicion is
that several of them are probably also present in haskell-src. I also
don't want the Haskell Prime Committee to take on the jobs of library
maintainership or implementation, those are best kept elsewhere.
Thanks, Neil

Hi Neil, Neil Mitchell wrote:
There is nothing to stop a library author doing exactly this, and it might even be useful for some people (personally I'm going to stick to haskell-src-exts, because it's a brilliant library).
Yes, it is. I am not proposing changing in any way how we develop and use Haskell parsers as tools. I am proposing taking an existing parser and using it in a different way.
I've found plenty of HSE/GHC parsing differences in my work, and my suspicion is that several of them are probably also present in haskell-src.
Indeed. Easily finding and documenting such discrepancies is one of the major benefits of having a machine-readable component in the standard.
I also don't want the Haskell Prime Committee to take on the jobs of library maintainership or implementation, those are best kept elsewhere.
Agreed. I am not proposing that. The machine-readable portion of the standard will be based on a parser library that exists quite apart from the standard. Thanks, Yitz
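The idea of finding discrepancies mechanically amounts to differential testing: run the same inputs through a reference checker and another implementation, and record where the verdicts differ. Below is a minimal sketch using only base. The Checker type, the discrepancies function, and the two stand-in checkers are hypothetical and deliberately trivial (they only look at parentheses); in practice the two arguments would be wrappers around real parsers.

```haskell
-- A checker maps source text to a verdict: True for "valid".
type Checker = String -> Bool

-- Inputs on which the reference and the implementation disagree,
-- together with both verdicts (reference first).
discrepancies :: Checker -> Checker -> [String] -> [(String, Bool, Bool)]
discrepancies ref impl inputs =
  [ (src, r, i)
  | src <- inputs
  , let r = ref src
  , let i = impl src
  , r /= i
  ]

-- Stand-in "reference": parentheses must be properly nested.
balanced :: Checker
balanced = go (0 :: Int)
  where
    go n [] = n == 0
    go n (c : cs)
      | c == '(' = go (n + 1) cs
      | c == ')' = n > 0 && go (n - 1) cs
      | otherwise = go n cs

-- Stand-in "implementation" with a bug: it only compares the counts.
countOnly :: Checker
countOnly s = count '(' == count ')'
  where
    count ch = length (filter (== ch) s)
```

For the inputs "(x)", ")(" and "(()", only ")(" is reported, since the reference rejects it and the buggy implementation accepts it. Each reported input is then either a bug in one of the two checkers or an ambiguity in the human-readable text.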
participants (6)
- Ben Millwood
- Lennart Augustsson
- Neil Mitchell
- Niklas Broberg
- Simon Peyton-Jones
- Yitzchak Gale