More flexible literate Haskell extensions (Trac #9789), summary on wiki

As requested on my ticket I summarised the entire proposal on the wiki here: https://ghc.haskell.org/trac/ghc/wiki/FlexibleLiterateExtension
I don't expect a lot of disagreement on discussion, aside from minor bike shedding on the flavour of the extension. I've started implementing this already. I'm open to bikesheds on exact extension, as it shouldn't affect the implementation.
Unless there are any vehement objections, I'll produce a diff on Phabricator asap.
Cheers,
Merijn

> I don't expect a lot of disagreement on discussion, aside from minor bike shedding on the flavour of the extension. I've started implementing this already. I'm open to bikesheds on exact extension, as it shouldn't affect the implementation.
Joining the party late, my main use case for literate Haskell is README.md files. The solution I currently use is to have a symlink: README.lhs -> README.md (see e.g. https://github.com/hspec/hspec-wai)
As I understand it the current proposal neither helps nor conflicts with this use case/solution. I leave this here as a comment mainly to indicate that:
* the proposal as it stands does not solve my use case
* there are other ways to solve this (namely: using symlinks)
Cheers,
Simon Hengel

Thanks. I don't have strong opinions about any of this. But I would love to have an actual specification of what is proposed.
The wiki page starts with "Proposal" but lists a set of alternatives. Later "Concrete proposal" discusses only file suffixes, and has lots of discussion of alternatives.
Would it be possible to have a section
a) describing a single alternative, as precisely as possible.
b) saying what the effect or meaning of the proposal is
For (a), is Foo.hs still ok? Foo.lhs? What if both exist and/or Foo.md.hs or whatever?
For (b) what does a suffix of Foo.hs.md mean? Presumably there is some markdown in there. But how is it delimited? Is md the only one proposed or are there others? Is it meant to be extensible or is there a fixed set?
In short, a *specification* of what is proposed. I think that would be helpful for people who come to this without having participated in the discussion that led up to it.
Simon
| -----Original Message-----
| From: Glasgow-haskell-users [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Merijn Verstraaten
| Sent: 14 November 2014 04:22
| To: ghc-devs@haskell.org; GHC Users Mailing List
| Subject: More flexible literate Haskell extensions (Trac #9789), summary on wiki
|
| As requested on my ticket I summarised the entire proposal on the wiki
| here: https://ghc.haskell.org/trac/ghc/wiki/FlexibleLiterateExtension
|
| I don't expect a lot of disagreement on discussion, aside from minor
| bike shedding on the flavour of the extension. I've started
| implementing this already. I'm open to bikesheds on exact extension,
| as it shouldn't affect the implementation.
|
| Unless there are any vehement objections, I'll produce a diff on
| Phabricator asap.
|
| Cheers,
| Merijn
| _______________________________________________
| Glasgow-haskell-users mailing list
| Glasgow-haskell-users@haskell.org
| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Hi Simon, Thanks for the comments. I think most of the confusion stems from people overthinking the scope of what I was proposing. I'll clear up the page a bit as it's currently conflating implementation details with semantics.
On 14 Nov 2014, at 2:29, Simon Peyton Jones wrote:
> Would it be possible to have a section a) describing a single alternative, as precisely as possible.
The single alternative would simply be: If GHC tries to find the source for a module Foo and none of "Foo.hs", "Foo.lhs", "Foo.hsig" or "Foo.lhsig" are found, it will accept any file with a "Foo.lhs.*" extension, e.g., "Foo.lhs.md", "Foo.lhs.tex", etc.
> b) saying what the effect or meaning of the proposal is
The proposal does NOT modify the way GHC treats the contents of files or unlits literate Haskell in any way. While I'm in favour of supporting more literate formats, that's orthogonal to this proposal.
> For (a), is Foo.hs still ok? Foo.lhs? What if both exist and/or Foo.md.hs or whatever?
Yes, both "Foo.hs" and "Foo.lhs" are still ok. I don't think the manual specifies what GHC does in the case "Foo.hs" AND "Foo.lhs" both exist. But the implementation prefers extensions in the following order: "hs", "lhs", "hsig" and "lhsig". I would just add the new allowed extension behind that as lower priority than the current ones.
> For (b) what does a suffix of Foo.hs.md mean? Presumably there is some markdown in there. But how is it delimited? Is md the only one proposed or are there others? Is it meant to be extensible or is there a fixed set?
See my earlier point, I do *not* intend to affect the way GHC interprets/unlits the contents of files. Pandoc is already perfectly happy to work with literate files, it just currently lacks a way to determine what the content type of the non-literate bits is. Which is what I hope to deal with here.
Cheers,
Merijn
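The lookup rule described in this message can be sketched in a few lines of Haskell. This is only an illustration, not GHC's finder code: `knownExtensions` and `findSource` are hypothetical names, the function works on a plain list of file names from a single directory, and breaking ties between several "Foo.lhs.*" files by sorting is an assumption that the thread does not settle.

```haskell
import Data.List (isPrefixOf, sort)
import Data.Maybe (listToMaybe)

-- | Extensions GHC already probes, in the priority order quoted in the thread.
knownExtensions :: [String]
knownExtensions = ["hs", "lhs", "hsig", "lhsig"]

-- | Pick the source file for a module from the file names in one directory.
-- Existing extensions win; "Foo.lhs.*" is only a lower-priority fallback.
findSource :: String -> [FilePath] -> Maybe FilePath
findSource modName files = listToMaybe (exact ++ flexible)
  where
    -- Candidates with a known extension, already in priority order.
    exact =
      [ f
      | ext <- knownExtensions
      , let f = modName ++ "." ++ ext
      , f `elem` files
      ]
    prefix = modName ++ ".lhs."
    -- Require at least one character after "Foo.lhs.". Sorting only makes the
    -- choice deterministic (assumption: the proposal does not define a tie-break).
    flexible =
      sort
        [ f
        | f <- files
        , prefix `isPrefixOf` f
        , length f > length prefix
        ]
```

For example, `findSource "Foo" ["Foo.lhs.md", "Foo.lhs"]` evaluates to `Just "Foo.lhs"`, because the existing extensions keep priority over the new fallback.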

Merijn,
Thanks. Can you make sure that you update the wiki page to reflect what you say here? Email is transitory; the wiki page gives the *specification* of the feature, and says unambiguously what you intend. Misunderstandings expressed in email simply tell you how to improve the wiki page!
thanks
Simon
| -----Original Message-----
| From: Merijn Verstraaten [mailto:merijn@inconsistent.nl]
| Sent: 16 November 2014 21:42
| To: Simon Peyton Jones
| Cc: ghc-devs@haskell.org; GHC Users Mailing List
| Subject: Re: More flexible literate Haskell extensions (Trac #9789),
| summary on wiki
|
| Hi Simon,
|
| Thanks for the comments. I think most of the confusion stems from people
| overthinking the scope of what I was proposing. I'll clear up the page a
| bit as it's currently conflating implementation details with semantics.
|
| > On 14 Nov 2014, at 2:29, Simon Peyton Jones

On 16 Nov 2014, at 14:09, Simon Peyton Jones wrote:
> Thanks. Can you make sure that you update the wiki page to reflect what you say here? Email is transitory; the wiki page gives the *specification* of the feature, and says unambiguously what you intend. Misunderstandings expressed in email simply tell you how to improve the wiki page!
I had already updated the wiki with the relevant notes before replying, but apparently I forgot to mention this!
Cheers,
Merijn

Hello Merijn, On 2014-11-14 at 05:22:11 +0100, Merijn Verstraaten wrote:
> As requested on my ticket I summarised the entire proposal on the wiki here: https://ghc.haskell.org/trac/ghc/wiki/FlexibleLiterateExtension
I don't see Cabal mentioned anywhere on that wiki page. Doesn't Cabal need to be made aware of those new extensions?
Cheers,
hvr

Hi Merijn,
there is one possible problem with this approach. Currently, the compiler never has to read the contents of the directory (or at least that’s what I assume; is that correct?) but only has to probe the existence of a fixed finite set of files.
With your extensions it will have to read the directory contents. In most situations, that should be fine, but it might cause minor inconveniences with very large directories, many search paths (-i flags) and/or very weird file systems (compiling from a FUSE-mounted HTTP-Server that does not support directory listing? Would work now...)
A fixed set of extensions (e.g. just "md" and "tex") would avoid this problem, but goes against the spirit of the proposal.
This is not an objection against the proposal, just a minor point to be considered.
Greetings,
Joachim
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org

2014-11-20 9:36 GMT+01:00 Joachim Breitner:
> [...] With your extensions it will have to read the directory contents. In most situations, that should be fine, but it might cause minor inconveniences with very large directories, many search paths (-i flags) and/or very weird file systems (compiling from a FUSE-mounted HTTP-Server that does not support directory listing? Would work now...)
Hmmm, IMHO reading directory contents is not a good idea for a compiler, for just the reasons you mentioned.
> A fixed set of extensions (e.g. just "md" and "tex") would avoid this problem, but goes against the spirit of the proposal.
I think we can get the best of both worlds by adding a compiler flag, e.g. --literate-extensions=md,tex. This way the compiler still has to probe only a finite set of filenames *and* we are still flexible. Cheers, S.
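To make the "finite set of filenames" point concrete, here is a small Haskell sketch of how such a flag value could be turned into the list of names to probe. Note that `--literate-extensions` is only the flag suggested in this message, not an existing GHC option, and `parseExtensionList` and `candidateFiles` are hypothetical helper names.

```haskell
-- | Split a comma-separated flag value such as "md,tex", dropping empty entries.
parseExtensionList :: String -> [String]
parseExtensionList = filter (not . null) . splitCommas
  where
    splitCommas s = case break (== ',') s of
      (x, [])       -> [x]
      (x, _ : rest) -> x : splitCommas rest

-- | All file names that would be probed for a module, in priority order:
-- first the extensions GHC already knows, then "Foo.lhs.<ext>" for each
-- user-supplied literate extension.
candidateFiles :: [String] -> String -> [FilePath]
candidateFiles literateExts modName =
  [ modName ++ "." ++ ext | ext <- ["hs", "lhs", "hsig", "lhsig"] ]
    ++ [ modName ++ ".lhs." ++ ext | ext <- literateExts ]
```

With this shape the number of probes per search path is fixed by the flag value: four existing names plus one per listed extension, and no directory listing is needed.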

Hey,
I'm not really happy with an explicit extension list passed through flags, as it seems far too manual. It doesn't strike me as a very significant problem as my current work-in-progress patch only resorts to scanning the directory listing if none of the existing extensions are found.
It seems unlikely that extremely large directories are in the search path. I don't think we're expecting anyone to have thousands of files in their module directories (and even thousands would take 0 time to scan...). We already try all search paths during probing so the extra overhead from large number of search paths shouldn't be substantially more than we have now. As for the weird FS support, I really don't think it's GHC's job to support people doing ridiculous things like "compiling on a filesystem that doesn't support directory listings", I mean, come on!
To be honest, I consider both directories with huge numbers of files (keep in mind, we'd only scan the actual module sub-directory AND that we'd need to run into directories with tens if not hundreds of thousands of files to notice any real slowdown) and weird filesystems that don't support directory listing cases of "so don't do that, then". Especially since these are only problems for users using non-standard module extensions.
Cheers,
Merijn
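The strategy described in this message (probe the existing extensions first, and read the directory only when none of them is found) might look roughly like the following Haskell sketch. It is not the work-in-progress patch itself: `firstExisting` and `locateModule` are hypothetical names, a single search directory is assumed, and ties between several "Foo.lhs.*" files are broken by sorting, which is an assumption.

```haskell
import Data.List (isPrefixOf, sort)
import Data.Maybe (listToMaybe)
import System.Directory (doesFileExist, listDirectory)

-- | Return the first path that exists, probing one file at a time.
firstExisting :: [FilePath] -> IO (Maybe FilePath)
firstExisting [] = pure Nothing
firstExisting (p : ps) = do
  exists <- doesFileExist p
  if exists then pure (Just p) else firstExisting ps

-- | Locate the source of a module inside one search directory.
locateModule :: FilePath -> String -> IO (Maybe FilePath)
locateModule dir modName = do
  let inDir name = dir ++ "/" ++ name
      fixed =
        [ inDir (modName ++ "." ++ ext)
        | ext <- ["hs", "lhs", "hsig", "lhsig"]
        ]
  found <- firstExisting fixed
  case found of
    -- Common case: a standard extension exists, so the directory is never listed.
    Just path -> pure (Just path)
    -- Fallback: only now pay for reading the directory contents.
    Nothing -> do
      entries <- listDirectory dir
      let prefix = modName ++ ".lhs."
          -- For brevity, entries are not checked to be regular files.
          matches =
            sort
              [ e
              | e <- entries
              , prefix `isPrefixOf` e
              , length e > length prefix
              ]
      pure (inDir <$> listToMaybe matches)
```

In this sketch, projects that only use the standard extensions never trigger `listDirectory`, which is the reason given above for expecting the overhead to fall only on users of the new extensions.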
On 20 Nov 2014, at 2:40, Sven Panne wrote:
> 2014-11-20 9:36 GMT+01:00 Joachim Breitner:
>> [...] With your extensions it will have to read the directory contents. In most situations, that should be fine, but it might cause minor inconveniences with very large directories, many search paths (-i flags) and/or very weird file systems (compiling from a FUSE-mounted HTTP-Server that does not support directory listing? Would work now...)
> Hmmm, IMHO reading directory contents is not a good idea for a compiler, for just the reasons you mentioned.
>> A fixed set of extensions (e.g. just "md" and "tex") would avoid this problem, but goes against the spirit of the proposal.
> I think we can get the best of both worlds by adding a compiler flag, e.g. --literate-extensions=md,tex. This way the compiler still has to probe only a finite set of filenames *and* we are still flexible.

Did you try measuring on, e.g., edwardk's gl package?
On Thu, Nov 20, 2014 at 12:52 PM, Merijn Verstraaten wrote:
> Hey,
> I'm not really happy with an explicit extension list passed through flags, as it seems far too manual. It doesn't strike me as a very significant problem as my current work-in-progress patch only resorts to scanning the directory listing if none of the existing extensions are found.
> It seems unlikely that extremely large directories are in the search path. I don't think we're expecting anyone to have thousands of files in their module directories (and even thousands would take 0 time to scan...). We already try all search paths during probing so the extra overhead from large number of search paths shouldn't be substantially more than we have now. As for the weird FS support, I really don't think it's GHC's job to support people doing ridiculous things like "compiling on a filesystem that doesn't support directory listings", I mean, come on!
> To be honest, I consider both directories with huge numbers of files (keep in mind, we'd only scan the actual module sub-directory AND that we'd need to run into directories with tens if not hundreds of thousands of files to notice any real slowdown) and weird filesystems that don't support directory listing cases of "so don't do that, then". Especially since these are only problems for users using non-standard module extensions.
> Cheers,
> Merijn
> On 20 Nov 2014, at 2:40, Sven Panne wrote:
>>> [...] With your extensions it will have to read the directory contents. In most situations, that should be fine, but it might cause minor inconveniences with very large directories, many search paths (-i flags) and/or very weird file systems (compiling from a FUSE-mounted HTTP-Server that does not support directory listing? Would work now...)
>> Hmmm, IMHO reading directory contents is not a good idea for a compiler, for just the reasons you mentioned.
>>> A fixed set of extensions (e.g. just "md" and "tex") would avoid this problem, but goes against the spirit of the proposal.
>> I think we can get the best of both worlds by adding a compiler flag, e.g. --literate-extensions=md,tex. This way the compiler still has to probe only a finite set of filenames *and* we are still flexible.
participants (8)
- Carter Schonwald
- Herbert Valerio Riedel
- Jan Stolarek
- Joachim Breitner
- Merijn Verstraaten
- Simon Hengel
- Simon Peyton Jones
- Sven Panne