Proposal: Fix a "bug" in the layout interpretation algorithm in Section 10.3 of Report 2010

Hello,
I found that the layout interpretation algorithm in Section 10.3 of Report
2010
produces parse-error when applied to the following code snippet:
[Snippet 1 : The code that fails to be parsed by Report]
main = print t where { t = s where s = 1 :: Int }
[/Snippet 1]
GHC 8.4.4 and GHC 8.6.4 accept this code, which is a good behavior in my
opinion.
If we regard the behavior of those current GHCs as correct,
this "bug" in the Report lies in the following lines in the definition of
the function L:
[Snippet 2 : A part of the layout interpretation algorithm in the Report]
L (} : ts) (0 : ms) = } : L ts ms (Note 3)
L (} : ts) ms = parse-error (Note 3)
[/Snippet 2]
I suggest this be modified to what follows:
[Snippet 3 : A Suggested modification to Snippet 2]
L (} : ts) (m : ms)
| m == 0 = } : L ts ms
| m > 0 = } : L (} : ts) ms
L (} : ts) [] = parse-error
[/Snippet 3]
Let me explain in detail.
First of all, how is Snippet 1 refused by the Report 2010 algorithm?
Let us emulate it by hand. Firstly pre-process Snippet 1:
[Snippet 4 : Pre-processed Snippet 1]
{1} main = print t where { t = s where {36} s = 1 :: Int }
[/Snippet 4]
I assumed that Snippet 1 is the only line in a file.
We may compute:
[Computation 5 : Apply L to Snippet 4]
L

On 2019-04-29 3:54 a.m., 佐藤玄基 wrote:
Hello,
I found that the layout interpretation algorithm in Section 10.3 of Report 2010 produces parse-error when applied to the following code snippet:
[Snippet 1 : The code that fails to be parsed by Report] main = print t where { t = s where s = 1 :: Int } [/Snippet 1]
GHC 8.4.4 and GHC 8.6.4 accept this code, which is a good behavior in my opinion.
It's worth noting that the snippet is accepted by GHC even with -XAlternativeLayoutRule. Overall, I think I'd rather just have the concrete AlternativeLayoutRule algorithm in the specification than tweak the current broken and vague pseudo-code.
If we regard the behavior of those current GHCs as correct, this "bug" in the Report lies in the following lines in the definition of the function L:
[Snippet 2 : A part of the layout interpretation algorithm in the Report] L (} : ts) (0 : ms) = } : L ts ms (Note 3) L (} : ts) ms = parse-error (Note 3) [/Snippet 2]
I suggest this be modified to what follows:
[Snippet 3 : A Suggested modification to Snippet 2] L (} : ts) (m : ms) | m == 0 = } : L ts ms | m > 0 = } : L (} : ts) ms L (} : ts) [] = parse-error [/Snippet 3]
Let me explain in detail. First of all, how is Snippet 1 refused by the Report 2010 algorithm? Let us emulate it by hand. Firstly pre-process Snippet 1:
[Snippet 4 : Pre-processed Snippet 1] {1} main = print t where { t = s where {36} s = 1 :: Int } [/Snippet 4]
I assumed that Snippet 1 is the only line in a file. We may compute:
[Computation 5 : Apply L to Snippet 4] L
[] = "{ main = print t where { t = s where { s = 1 :: Int" ++ L "}" [36,0,1] [/Computation 5] Wait here. What does L do next? Looking from top to bottom, we hit the second line in Snippet 2, which leads us to parse-error. Now you might expect that the line
[Snippet 6 : Another part of the Report algorithm] L (t : ts) (m : ms) = } : (L (t : ts) ms) if m /= 0 and parse-error(t) (Note 5) [/Snippet 6]
would help, but unfortunately it doesn't work. Even if we put aside the fact that Snippet 6 is lower in the text than Snippet 2 and has lower priority of execution according to the usual Haskell matching rule, this line in L won't be triggered by this parse error at all, since *parse-error('}') is false!* Let us go back to the definition of parse-error(t).
[Quotation 7 : Note 5 in the Report algorithm] The side condition parse-error(t) is to be interpreted as follows: if the tokens generated so far by L together with the next token t represent an invalid prefix of the Haskell grammar, and the tokens generated so far by L followed by the token "}" represent a valid prefix of the Haskell grammar, then parse-error(t) is true. [/Quotation 7]
Now, this is the point. In this case, "the tokens generated so far by L together with the next token t" is:
[Snippet 8] { main = print t where { t = s where { s = 1 :: Int } [/Snippet 8]
This is a *valid prefix of the Haskell grammar*, and hence parse-error('}') is false.
Therefore, any Haskell2010-compliant compiler should reject Snippet 1, and this doesn't seem to be any sensible choice of specification. Speaking generally, I guess the Report 2010's authors wanted the case where a inner implicit brace and a outer explicit brace is closed at the same time to be processed by the rule in Snippet 6, but it doesn't work since Snippet 2 is before Snippet 6 in the definition of L and parse-error(t) is always false if t = '}'.
The fix of this problem is easy: replace Snippet 2 with Snippet 3. The added case of m > 0 is doing almost the same thing as Snippet 6, but parse-error(t) is removed. If we distinguish implicit close-braces and explicit close-braces, the condition m > 0 fully does the job of parse-error('}'), so I expect there will be no problem with this modification.
I apologize you if this long text has exhausted your eyes. I hope this suggestion would help.
Sincerely yours, Genki SATO
_______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-prime
participants (2)
-
Mario Blažević
-
佐藤玄基