Proposal: Fix a "bug" in the layout interpretation algorithm in Section 10.3 of Report 2010

29 Apr 2019

      Hello,

I found that the layout interpretation algorithm in Section 10.3 of Report
2010
produces parse-error when applied to the following code snippet:

[Snippet 1 : The code that fails to be parsed by Report]
main = print t where { t = s where s = 1 :: Int }
[/Snippet 1]

GHC 8.4.4 and GHC 8.6.4 accept this code, which is a good behavior in my
opinion.
If we regard the behavior of those current GHCs as correct,
this "bug" in the Report lies in the following lines in the definition of
the function L:

[Snippet 2 : A part of the layout interpretation algorithm in the Report]
L (} : ts) (0 : ms) = } : L ts ms (Note 3)
L (} : ts) ms = parse-error (Note 3)
[/Snippet 2]

I suggest this be modified to what follows:

[Snippet 3 : A Suggested modification to Snippet 2]
L (} : ts) (m : ms)
  | m == 0 = } : L ts ms
  | m > 0 = } : L (} : ts) ms
L (} : ts) [] = parse-error
[/Snippet 3]

Let me explain in detail.
First of all, how is Snippet 1 refused by the Report 2010 algorithm?
Let us emulate it by hand. Firstly pre-process Snippet 1:

[Snippet 4 : Pre-processed Snippet 1]
{1} main = print t where { t = s where {36} s = 1 :: Int }
[/Snippet 4]

I assumed that Snippet 1 is the only line in a file.
We may compute:

[Computation 5 : Apply L to Snippet 4]
L  []
 = "{ main = print t where { t = s where { s = 1 :: Int"
       ++ L "}" [36,0,1]
[/Computation 5]

Wait here. What does L do next?
Looking from top to bottom, we hit the second line in Snippet 2,
which leads us to parse-error.
Now you might expect that the line

[Snippet 6 : Another part of the Report algorithm]
L (t : ts) (m : ms) = } : (L (t : ts) ms)   if m /= 0 and parse-error(t)
(Note 5)
[/Snippet 6]

would help, but unfortunately it doesn't work.
Even if we put aside the fact that Snippet 6 is lower in the text than
Snippet 2
and has lower priority of execution according to the usual Haskell matching
rule,
this line in L won't be triggered by this parse error at all,
since *parse-error('}') is false!*
Let us go back to the definition of parse-error(t).

[Quotation 7 : Note 5 in the Report algorithm]
The side condition parse-error(t) is to be interpreted as follows:
if the tokens generated so far by L together with the next token t
represent an invalid prefix of the Haskell grammar,
and the tokens generated so far by L followed by the token "}"
represent a valid prefix of the Haskell grammar, then parse-error(t) is
true.
[/Quotation 7]

Now, this is the point.
In this case, "the tokens generated so far by L together with the next
token t" is:

[Snippet 8]
{ main = print t where { t = s where { s = 1 :: Int }
[/Snippet 8]

This is a *valid prefix of the Haskell grammar*, and hence parse-error('}')
is false.

Therefore, any Haskell2010-compliant compiler should reject Snippet 1,
and this doesn't seem to be any sensible choice of specification.
Speaking generally, I guess the Report 2010's authors wanted
the case where a inner implicit brace and a outer explicit brace is closed
at the same time
to be processed by the rule in Snippet 6,
but it doesn't work since Snippet 2 is before Snippet 6 in the definition
of L
and parse-error(t) is always false if t = '}'.

The fix of this problem is easy: replace Snippet 2 with Snippet 3.
The added case of m > 0 is doing almost the same thing as Snippet 6, but
parse-error(t) is removed.
If we distinguish implicit close-braces and explicit close-braces,
the condition m > 0 fully does the job of parse-error('}'),
so I expect there will be no problem with this modification.

I apologize you if this long text has exhausted your eyes.
I hope this suggestion would help.

Sincerely yours,
Genki SATO

佐藤玄基

Mario Blažević

tags

participants (2)