What separates lines in Haskell code?

In the report, under the layout rule (section 9.3), "The characters newline, return, linefeed, and formfeed, all start a new line." (Which four characters are those? From http://en.wikipedia.org/wiki/Linefeed , I'm guessing "LF: Line Feed, U+000A", "CR: Carriage Return, U+000D", "FF: Form Feed, U+000C", and what's the fourth one? Newline usually refers to '\n', which is LF, but linefeed has a direct name correspondence to that also!)

The literate Haskell section (9.4) just talks about lines without being specific about how they are delimited.

My proposed sample implementation uses Prelude.lines ... Prelude.lines presumes that lines are separated only by '\n'. (Of course, for Prelude.unlines to be an inverse operation (which it's not anyway), there has to be only one character that makes a line separation.)

Isaac
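To make the point about Prelude.lines concrete, here is a small sketch that can be loaded into GHCi. The names crlfExample and roundTrip are made up for illustration; only lines and unlines come from the Prelude.

```haskell
-- Prelude.lines splits on '\n' only; '\r' and '\f' stay inside the pieces.
crlfExample :: [String]
crlfExample = lines "a\r\nb\rc\fd\n"
-- The '\r' before '\n' is kept, and neither the lone '\r' nor the '\f'
-- starts a new line: ["a\r", "b\rc\fd"]

-- unlines always appends a final '\n', so it is not an exact inverse
-- of lines when the input lacks a trailing newline.
roundTrip :: String
roundTrip = unlines (lines "a\nb")
-- "a\nb\n"
```

So a CRLF-terminated source file read this way would leave a stray '\r' at the end of every line, and a form feed would not start a new line at all.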

On Thu, Jun 14, 2007 at 09:11:12AM -0400, Isaac Dupree wrote:
> In the report, under the layout rule (section 9.3), "The characters newline, return, linefeed, and formfeed, all start a new line." (Which four characters are those? from http://en.wikipedia.org/wiki/Linefeed , I'm guessing "LF: Line Feed, U+000A", "CR: Carriage Return, U+000D", "FF: Form Feed, U+000C", and what's the fourth one? Newline usually refers to '\n', which is LF, but linefeed has a direct name correspondence to that also!)
The H98 lexical syntax defines newline as

    newline -> return linefeed | return | linefeed | formfeed

It could, I suppose, also refer to the Unicode character U+2028 LINE SEPARATOR, but then probably U+2029 PARAGRAPH SEPARATOR ought to be included as well.

There are, BTW, Unicode guidelines for newline usage in section 5.8 of the Unicode 5.0 online edition.

--
Antti-Juhani Kaijanaho, Jyväskylä
http://antti-juhani.kaijanaho.fi/newblog/
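As a rough illustration of that production (not taken from any actual compiler), one could recognise a single H98 newline at the front of a string like this. The name stripNewline is hypothetical; the point is only that the two-character "return linefeed" alternative has to be tried before the lone "return".

```haskell
-- Try to consume one H98 `newline` from the front of the input:
--   newline -> return linefeed | return | linefeed | formfeed
-- Returns the remaining input on success, Nothing otherwise.
stripNewline :: String -> Maybe String
stripNewline ('\r' : '\n' : rest) = Just rest  -- return linefeed (checked first)
stripNewline ('\r' : rest)        = Just rest  -- return
stripNewline ('\n' : rest)        = Just rest  -- linefeed
stripNewline ('\f' : rest)        = Just rest  -- formfeed
stripNewline _                    = Nothing    -- anything else is not a newline
```

Read this way, the four names in the layout-rule sentence cover three characters plus the CR LF pair, which is counted as one line break rather than two.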

Antti-Juhani Kaijanaho wrote:
> On Thu, Jun 14, 2007 at 09:11:12AM -0400, Isaac Dupree wrote:
>> In the report, under the layout rule (section 9.3), "The characters newline, return, linefeed, and formfeed, all start a new line." (Which four characters are those? from http://en.wikipedia.org/wiki/Linefeed , I'm guessing "LF: Line Feed, U+000A", "CR: Carriage Return, U+000D", "FF: Form Feed, U+000C", and what's the fourth one? Newline usually refers to '\n', which is LF, but linefeed has a direct name correspondence to that also!)
> The H98 lexical syntax defines newline as
>     newline -> return linefeed | return | linefeed | formfeed
> It could, I suppose, also refer to the Unicode character U+2028 LINE SEPARATOR, but then probably U+2029 PARAGRAPH SEPARATOR ought to be included as well.
> There are, BTW, Unicode guidelines for newline usage in section 5.8 of the Unicode 5.0 online edition.
http://www.unicode.org/versions/Unicode5.0.0/ch05.pdf#G10213

Alright, I think the comment in the layout-rule section should not try to enumerate newlines, but rather should refer back to the lexical definition of 'newline'.

Going by the above Unicode guideline, the existing set of characters that Haskell98 accepts as newlines, and a section of the Unicode regex guidelines (http://unicode.org/reports/tr18/), I propose that all of the following be accepted as line separators:

    \u000A | \u000B | \u000C | \u000D | \u0085 | \u2028 | \u2029 | \u000D\u000A

i.e. (not in the same order) CR, LF, CRLF, NEL, VT, FF, LS, PS.

Unfortunately that makes it a little hard to process; maybe translate all into '\n' before doing any processing (such as unliteration).

Isaac
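The "translate all into '\n' first" idea might look roughly like the following. This is only a sketch of the proposal, not an agreed design; normalizeNewlines and splitLines are names invented here, and the set of separators is exactly the eight listed above.

```haskell
-- Rewrite every proposed line separator as a single '\n'.
-- CR LF is matched first so that it yields one line break, not two.
normalizeNewlines :: String -> String
normalizeNewlines ('\r' : '\n' : rest) = '\n' : normalizeNewlines rest
normalizeNewlines (c : rest)
  | c `elem` separators = '\n' : normalizeNewlines rest
  | otherwise           = c : normalizeNewlines rest
  where
    -- LF, VT, FF, CR, NEL, LS, PS
    separators = ['\n', '\v', '\f', '\r', '\x85', '\x2028', '\x2029']
normalizeNewlines [] = []

-- After normalization, Prelude.lines can be used unchanged.
splitLines :: String -> [String]
splitLines = lines . normalizeNewlines
```

With this, splitLines "one\r\ntwo\x2028three\vfour" gives ["one","two","three","four"], and later stages such as unliteration only ever have to deal with '\n'.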