PROPOSAL: Literate haskell and module file names

Ola! I didn't know what the most appropriate venue for this proposal was so I crossposted to haskell-prime and glasgow-haskell-users, if this isn't the right venue I welcome advice where to take this proposal. Currently the report does not specify the mapping between filenames and module names (this is an issue in itself, it essentially makes writing haskell code that's interoperable between compilers impossible, as you can't know what directory layout each compiler expects). I believe that a minimal specification *should* go into the report (hence, haskell-prime). However, this is a separate issue from this proposal, so please start a new thread rather than sidetracking this one :) The report only mentions that "by convention" .hs extensions imply normal haskell and .lhs literate haskell (Section 10.4). In the absence of guidance from the report GHC's convention of mapping module Foo.Bar.Baz to Foo/Bar/Baz.hs or Foo/Bar/Baz.lhs seems the only sort of standard that exists. In general this standard is nice enough, but the mapping of literate haskell is a bit inconvenient, it leaves it completelyl ambiguous what the non-haskell content of said file is, which is annoying for tool authors. Pandoc has adopted the policy of checking for further file extensions for literate haskell source, e.g. Foo.rst.lhs and Foo.md.lhs. Here .rst.lhs gets interpreted as being reStructured Text with literate haskell and .md.lhs is Markdown with literate haskell. Unfortunately GHC currently maps filenames like this to the module names Foo.rst and Foo.md, breaking anything that wants to import the module Foo. I would like to propose allowing an optional extra extension in the pandoc style for literate haskell files, mapping Foo.rst.lhs to module name Foo. This is a backwards compatible change as there is no way for Foo.rst.lhs to be a valid module in the current GHC convention. Foo.rst.lhs would map to module name "Foo.rst" but module name "Foo.rst" maps to filename "Foo/rst.hs" which is not a valid haskell module anyway as the rst is lowercase and module names have to start with an uppercase letter. Pros: - Tool authors can more easily determine non-haskell content of literate haskell files - Currently valid module names will not break - Report doesn't specify behaviour, so GHC can do whatever it likes Cons: - Someone has to implement it - ?? Discussion: 4 weeks Cheers, Merijn

One problem with Foo.*.hs or even Foo.md.hs mapping to the module name
Foois that as I recall JHC will look for
Data.Vector in Data.Vector.hs as well as Data/Vector.hs
This means that on a case insensitive file system Foo.MD.hs matches both
conventions.
Do I want to block an change to GHC because of an incompatible change in
another compiler? Not sure, but I at least want to raise the issue so it
can be discussed.
Another small issue is that this means you need to actually scan the
directory rather than look for particular file names, but off my head
really I don't expect directories to be full enough for that to be a
performance problem.
-Edward
On Sun, Mar 16, 2014 at 8:56 AM, Merijn Verstraaten
Ola!
I didn't know what the most appropriate venue for this proposal was so I crossposted to haskell-prime and glasgow-haskell-users, if this isn't the right venue I welcome advice where to take this proposal.
Currently the report does not specify the mapping between filenames and module names (this is an issue in itself, it essentially makes writing haskell code that's interoperable between compilers impossible, as you can't know what directory layout each compiler expects). I believe that a minimal specification *should* go into the report (hence, haskell-prime). However, this is a separate issue from this proposal, so please start a new thread rather than sidetracking this one :)
The report only mentions that "by convention" .hs extensions imply normal haskell and .lhs literate haskell (Section 10.4). In the absence of guidance from the report GHC's convention of mapping module Foo.Bar.Baz to Foo/Bar/Baz.hs or Foo/Bar/Baz.lhs seems the only sort of standard that exists. In general this standard is nice enough, but the mapping of literate haskell is a bit inconvenient, it leaves it completelyl ambiguous what the non-haskell content of said file is, which is annoying for tool authors.
Pandoc has adopted the policy of checking for further file extensions for literate haskell source, e.g. Foo.rst.lhs and Foo.md.lhs. Here .rst.lhs gets interpreted as being reStructured Text with literate haskell and .md.lhs is Markdown with literate haskell. Unfortunately GHC currently maps filenames like this to the module names Foo.rst and Foo.md, breaking anything that wants to import the module Foo.
I would like to propose allowing an optional extra extension in the pandoc style for literate haskell files, mapping Foo.rst.lhs to module name Foo. This is a backwards compatible change as there is no way for Foo.rst.lhs to be a valid module in the current GHC convention. Foo.rst.lhs would map to module name "Foo.rst" but module name "Foo.rst" maps to filename "Foo/rst.hs" which is not a valid haskell module anyway as the rst is lowercase and module names have to start with an uppercase letter.
Pros: - Tool authors can more easily determine non-haskell content of literate haskell files - Currently valid module names will not break - Report doesn't specify behaviour, so GHC can do whatever it likes
Cons: - Someone has to implement it - ??
Discussion: 4 weeks
Cheers, Merijn
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

I agree that this could collide, see my beginning remark that I believe that the report should provide a minimal specification how to map modules to filenames and vice versa. Anyhoo, I'm not married to this specific suggestion. Carter suggested "Foo+rst.lhs" on IRC, other options would be "Foo.rst+lhs" or "Foo.lhs+rst", I don't particularly care what as long as we pick something. Patching tools to support whatever solution we pick should be trivial. Cheers, Merijn On Mar 16, 2014, at 16:41 , Edward Kmett wrote:
One problem with Foo.*.hs or even Foo.md.hs mapping to the module name Foo is that as I recall JHC will look for Data.Vector in Data.Vector.hs as well as Data/Vector.hs
This means that on a case insensitive file system Foo.MD.hs matches both conventions.
Do I want to block an change to GHC because of an incompatible change in another compiler? Not sure, but I at least want to raise the issue so it can be discussed.
Another small issue is that this means you need to actually scan the directory rather than look for particular file names, but off my head really I don't expect directories to be full enough for that to be a performance problem.
-Edward
On Sun, Mar 16, 2014 at 8:56 AM, Merijn Verstraaten
wrote: Ola! I didn't know what the most appropriate venue for this proposal was so I crossposted to haskell-prime and glasgow-haskell-users, if this isn't the right venue I welcome advice where to take this proposal.
Currently the report does not specify the mapping between filenames and module names (this is an issue in itself, it essentially makes writing haskell code that's interoperable between compilers impossible, as you can't know what directory layout each compiler expects). I believe that a minimal specification *should* go into the report (hence, haskell-prime). However, this is a separate issue from this proposal, so please start a new thread rather than sidetracking this one :)
The report only mentions that "by convention" .hs extensions imply normal haskell and .lhs literate haskell (Section 10.4). In the absence of guidance from the report GHC's convention of mapping module Foo.Bar.Baz to Foo/Bar/Baz.hs or Foo/Bar/Baz.lhs seems the only sort of standard that exists. In general this standard is nice enough, but the mapping of literate haskell is a bit inconvenient, it leaves it completelyl ambiguous what the non-haskell content of said file is, which is annoying for tool authors.
Pandoc has adopted the policy of checking for further file extensions for literate haskell source, e.g. Foo.rst.lhs and Foo.md.lhs. Here .rst.lhs gets interpreted as being reStructured Text with literate haskell and .md.lhs is Markdown with literate haskell. Unfortunately GHC currently maps filenames like this to the module names Foo.rst and Foo.md, breaking anything that wants to import the module Foo.
I would like to propose allowing an optional extra extension in the pandoc style for literate haskell files, mapping Foo.rst.lhs to module name Foo. This is a backwards compatible change as there is no way for Foo.rst.lhs to be a valid module in the current GHC convention. Foo.rst.lhs would map to module name "Foo.rst" but module name "Foo.rst" maps to filename "Foo/rst.hs" which is not a valid haskell module anyway as the rst is lowercase and module names have to start with an uppercase letter.
Pros: - Tool authors can more easily determine non-haskell content of literate haskell files - Currently valid module names will not break - Report doesn't specify behaviour, so GHC can do whatever it likes
Cons: - Someone has to implement it - ??
Discussion: 4 weeks
Cheers, Merijn
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Foo+rst.lhs does nicely dodge the collision with jhc.
How does ghc do the search now? By trying each alternative in turn?
On Sun, Mar 16, 2014 at 1:14 PM, Merijn Verstraaten
I agree that this could collide, see my beginning remark that I believe that the report should provide a minimal specification how to map modules to filenames and vice versa.
Anyhoo, I'm not married to this specific suggestion. Carter suggested "Foo+rst.lhs" on IRC, other options would be "Foo.rst+lhs" or "Foo.lhs+rst", I don't particularly care what as long as we pick something. Patching tools to support whatever solution we pick should be trivial.
Cheers, Merijn
On Mar 16, 2014, at 16:41 , Edward Kmett wrote:
One problem with Foo.*.hs or even Foo.md.hs mapping to the module name Foois that as I recall JHC will look for Data.Vector in Data.Vector.hs as well as Data/Vector.hs
This means that on a case insensitive file system Foo.MD.hs matches both conventions.
Do I want to block an change to GHC because of an incompatible change in another compiler? Not sure, but I at least want to raise the issue so it can be discussed.
Another small issue is that this means you need to actually scan the directory rather than look for particular file names, but off my head really I don't expect directories to be full enough for that to be a performance problem.
-Edward
On Sun, Mar 16, 2014 at 8:56 AM, Merijn Verstraaten < merijn@inconsistent.nl> wrote:
Ola!
I didn't know what the most appropriate venue for this proposal was so I crossposted to haskell-prime and glasgow-haskell-users, if this isn't the right venue I welcome advice where to take this proposal.
Currently the report does not specify the mapping between filenames and module names (this is an issue in itself, it essentially makes writing haskell code that's interoperable between compilers impossible, as you can't know what directory layout each compiler expects). I believe that a minimal specification *should* go into the report (hence, haskell-prime). However, this is a separate issue from this proposal, so please start a new thread rather than sidetracking this one :)
The report only mentions that "by convention" .hs extensions imply normal haskell and .lhs literate haskell (Section 10.4). In the absence of guidance from the report GHC's convention of mapping module Foo.Bar.Baz to Foo/Bar/Baz.hs or Foo/Bar/Baz.lhs seems the only sort of standard that exists. In general this standard is nice enough, but the mapping of literate haskell is a bit inconvenient, it leaves it completelyl ambiguous what the non-haskell content of said file is, which is annoying for tool authors.
Pandoc has adopted the policy of checking for further file extensions for literate haskell source, e.g. Foo.rst.lhs and Foo.md.lhs. Here .rst.lhs gets interpreted as being reStructured Text with literate haskell and .md.lhs is Markdown with literate haskell. Unfortunately GHC currently maps filenames like this to the module names Foo.rst and Foo.md, breaking anything that wants to import the module Foo.
I would like to propose allowing an optional extra extension in the pandoc style for literate haskell files, mapping Foo.rst.lhs to module name Foo. This is a backwards compatible change as there is no way for Foo.rst.lhs to be a valid module in the current GHC convention. Foo.rst.lhs would map to module name "Foo.rst" but module name "Foo.rst" maps to filename "Foo/rst.hs" which is not a valid haskell module anyway as the rst is lowercase and module names have to start with an uppercase letter.
Pros: - Tool authors can more easily determine non-haskell content of literate haskell files - Currently valid module names will not break - Report doesn't specify behaviour, so GHC can do whatever it likes
Cons: - Someone has to implement it - ??
Discussion: 4 weeks
Cheers, Merijn
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Mon, Mar 17, 2014 at 9:08 AM, Edward Kmett
Foo+rst.lhs does nicely dodge the collision with jhc.
Is this legal on Windows? -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

2014-03-17 14:22 GMT+01:00 Brandon Allbery
On Mon, Mar 17, 2014 at 9:08 AM, Edward Kmett
wrote: Foo+rst.lhs does nicely dodge the collision with jhc.
Is this legal on Windows?
According to http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).as... it is, although I can't test this at the moment.

On 17/03/2014 13:08, Edward Kmett wrote:
Foo+rst.lhs does nicely dodge the collision with jhc.
How does ghc do the search now? By trying each alternative in turn?
Yes - see compiler/main/Finder.hs Cheers, Simon
On Sun, Mar 16, 2014 at 1:14 PM, Merijn Verstraaten
mailto:merijn@inconsistent.nl> wrote: I agree that this could collide, see my beginning remark that I believe that the report should provide a minimal specification how to map modules to filenames and vice versa.
Anyhoo, I'm not married to this specific suggestion. Carter suggested "Foo+rst.lhs" on IRC, other options would be "Foo.rst+lhs" or "Foo.lhs+rst", I don't particularly care what as long as we pick something. Patching tools to support whatever solution we pick should be trivial.
Cheers, Merijn
On Mar 16, 2014, at 16:41 , Edward Kmett wrote:
One problem with Foo.*.hs or even Foo.md.hs mapping to the module name Foo is that as I recall JHC will look for Data.Vector in Data.Vector.hs as well as Data/Vector.hs
This means that on a case insensitive file system Foo.MD.hs matches both conventions.
Do I want to block an change to GHC because of an incompatible change in another compiler? Not sure, but I at least want to raise the issue so it can be discussed.
Another small issue is that this means you need to actually scan the directory rather than look for particular file names, but off my head really I don't expect directories to be full enough for that to be a performance problem.
-Edward
On Sun, Mar 16, 2014 at 8:56 AM, Merijn Verstraaten
mailto:merijn@inconsistent.nl> wrote: Ola!
I didn't know what the most appropriate venue for this proposal was so I crossposted to haskell-prime and glasgow-haskell-users, if this isn't the right venue I welcome advice where to take this proposal.
Currently the report does not specify the mapping between filenames and module names (this is an issue in itself, it essentially makes writing haskell code that's interoperable between compilers impossible, as you can't know what directory layout each compiler expects). I believe that a minimal specification *should* go into the report (hence, haskell-prime). However, this is a separate issue from this proposal, so please start a new thread rather than sidetracking this one :)
The report only mentions that "by convention" .hs extensions imply normal haskell and .lhs literate haskell (Section 10.4). In the absence of guidance from the report GHC's convention of mapping module Foo.Bar.Baz to Foo/Bar/Baz.hs or Foo/Bar/Baz.lhs seems the only sort of standard that exists. In general this standard is nice enough, but the mapping of literate haskell is a bit inconvenient, it leaves it completelyl ambiguous what the non-haskell content of said file is, which is annoying for tool authors.
Pandoc has adopted the policy of checking for further file extensions for literate haskell source, e.g. Foo.rst.lhs and Foo.md.lhs. Here .rst.lhs gets interpreted as being reStructured Text with literate haskell and .md.lhs is Markdown with literate haskell. Unfortunately GHC currently maps filenames like this to the module names Foo.rst and Foo.md, breaking anything that wants to import the module Foo.
I would like to propose allowing an optional extra extension in the pandoc style for literate haskell files, mapping Foo.rst.lhs to module name Foo. This is a backwards compatible change as there is no way for Foo.rst.lhs to be a valid module in the current GHC convention. Foo.rst.lhs would map to module name "Foo.rst" but module name "Foo.rst" maps to filename "Foo/rst.hs" which is not a valid haskell module anyway as the rst is lowercase and module names have to start with an uppercase letter.
Pros: - Tool authors can more easily determine non-haskell content of literate haskell files - Currently valid module names will not break - Report doesn't specify behaviour, so GHC can do whatever it likes
Cons: - Someone has to implement it - ??
Discussion: 4 weeks
Cheers, Merijn
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org mailto:Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Ola!
I raised this proposal earlier this year and got to busy to follow up, this week I was suddenly reminded and decided to reraise this. To summarise the discussion up until this point:
There was no real opposition to the general idea, the only real objection to the original proposal was that “Foo.lhs.md” and “Foo.md.lhs” would collide with the naming scheme used by JHC on case insensitive filesystems. Alternative proposal raised during the discussion: "Foo+md.lhs", "Foo.lhs+md” and “Foo.md+lhs”.
According to MS documentation and testing the + should not be an issue on windows, the + doesn’t collide with any other haskell compiler (at least, not any I’m aware off) and since the report doesn’t specify any module name resolution mechanism, it does not conflict with the report either.
My personal preferences goes to either “.lhs+md” or “.md+lhs”, since GHC currently tries every alternative in turn, I propose to just extend this list to look for any file whose extension is “.lhs+*” or “.*+lhs”.
Are there any objections to this? If not, I’m just going to produce a patch + ticket as there were no real objections to the proposal last time.
Cheers,
Merijn
On 16 Mar 2014, at 05:56 , Merijn Verstraaten
Ola!
I didn't know what the most appropriate venue for this proposal was so I crossposted to haskell-prime and glasgow-haskell-users, if this isn't the right venue I welcome advice where to take this proposal.
Currently the report does not specify the mapping between filenames and module names (this is an issue in itself, it essentially makes writing haskell code that's interoperable between compilers impossible, as you can't know what directory layout each compiler expects). I believe that a minimal specification *should* go into the report (hence, haskell-prime). However, this is a separate issue from this proposal, so please start a new thread rather than sidetracking this one :)
The report only mentions that "by convention" .hs extensions imply normal haskell and .lhs literate haskell (Section 10.4). In the absence of guidance from the report GHC's convention of mapping module Foo.Bar.Baz to Foo/Bar/Baz.hs or Foo/Bar/Baz.lhs seems the only sort of standard that exists. In general this standard is nice enough, but the mapping of literate haskell is a bit inconvenient, it leaves it completelyl ambiguous what the non-haskell content of said file is, which is annoying for tool authors.
Pandoc has adopted the policy of checking for further file extensions for literate haskell source, e.g. Foo.rst.lhs and Foo.md.lhs. Here .rst.lhs gets interpreted as being reStructured Text with literate haskell and .md.lhs is Markdown with literate haskell. Unfortunately GHC currently maps filenames like this to the module names Foo.rst and Foo.md, breaking anything that wants to import the module Foo.
I would like to propose allowing an optional extra extension in the pandoc style for literate haskell files, mapping Foo.rst.lhs to module name Foo. This is a backwards compatible change as there is no way for Foo.rst.lhs to be a valid module in the current GHC convention. Foo.rst.lhs would map to module name "Foo.rst" but module name "Foo.rst" maps to filename "Foo/rst.hs" which is not a valid haskell module anyway as the rst is lowercase and module names have to start with an uppercase letter.
Pros: - Tool authors can more easily determine non-haskell content of literate haskell files - Currently valid module names will not break - Report doesn't specify behaviour, so GHC can do whatever it likes
Cons: - Someone has to implement it - ??
Discussion: 4 weeks
Cheers, Merijn
participants (5)
-
Brandon Allbery
-
Edward Kmett
-
Merijn Verstraaten
-
Simon Marlow
-
Sven Panne