[GHC] #12414: Ill-formed or incorrect multiline string in compiler/main/GHC.hs

#12414: Ill-formed or incorrect multiline string in compiler/main/GHC.hs
-------------------------------------+-------------------------------------
Reporter: ifigueroap                 | Owner:
Type: bug                            | Status: new
Priority: lowest                     | Milestone:
Component: Compiler                  | Version: 8.0.1
Keywords:                            | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple       | Type of failure: Other
Test Case:                           | Blocked By:
Blocking:                            | Related Tickets:
Differential Rev(s):                 | Wiki Page:
-------------------------------------+-------------------------------------
Dear all,

After trying to (indirectly) parse GHC.hs using hothasktags, I kept getting an "Illegal character in string gap" error. After reconstructing the file using cpphs I found that the issue is caused by an incorrect multiline string in the raw (i.e. before cpp includes) GHC.hs, that trips the (perhaps stricter?) parser in haskell-src-exts...

The following part of the code:

{{{#!hs
Nothing -> panic "compileToCoreModule: target FilePath not found in\
                  module dependency graph"
}}}

should be

{{{#!hs
Nothing -> panic "compileToCoreModule: target FilePath not found in\
                 \module dependency graph"
}}}

It seems the issue goes back to several other versions of the file. Of course it is not really a big deal, but it was hard to trigger (try to parse the file with haskell-src-exts), and to hunt!!

Best regards

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by osa1):

This looks like a bug in the parser -- this string should be rejected according to Haskell2010.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:1

Comment (by ifigueroap):

Replying to [comment:1 osa1]:
> This looks like a bug in the parser -- this string should be rejected according to Haskell2010.

You mean a bug in the internal ghc parser, right?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:2

Comment (by osa1):

Yes, Haskell2010 says this (https://www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-200002.6):

> A string may include a “gap”—two backslants enclosing white characters—which is ignored. This allows one to write long strings on more than one line by writing a backslant at the end of one line and at the start of the next.

It seems like we don't follow this rule: we only have one backslash here.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:3
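As an illustration of the string-gap rule quoted from the Haskell 2010 report, here is a minimal sketch. The names `gapped` and `single` are made up for this example, and it assumes the module is compiled as plain Haskell, without the CPP extension, so that no preprocessor touches the backslashes first.

```haskell
module Main (main) where

-- A Haskell 2010 string gap: a backslash, then whitespace (here a newline
-- plus indentation), then a closing backslash. The whole gap is ignored.
gapped :: String
gapped = "compileToCoreModule: target FilePath not found in \
         \module dependency graph"

-- The same message written on one line, for comparison.
single :: String
single = "compileToCoreModule: target FilePath not found in module dependency graph"

main :: IO ()
main = print (gapped == single)
```

Both definitions denote the same string, so the comparison is `True`. A literal with only the first backslash and no closing one is not a valid gap, which is the case haskell-src-exts reports as "Illegal character in string gap".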

Changes (by rwbarton):

 * status: new => closed
 * resolution: => invalid

Comment:

No, it's correct as-is. The C preprocessor consumes the line continuation marker `\`, so the Haskell source is a single-line string constant. It would be wrong to add a second `\` before `module`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:4
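The effect rwbarton describes can be sketched with a deliberately simplified model of line splicing. The function `spliceLines` and the value `rawSource` are hypothetical names for this illustration only; a real C preprocessor does much more than this.

```haskell
module Main (main) where

-- Simplified model of C-preprocessor line splicing: every backslash that is
-- immediately followed by a newline is deleted together with that newline.
-- Illustration only; this is not how any real preprocessor is implemented.
spliceLines :: String -> String
spliceLines ('\\' : '\n' : rest) = spliceLines rest
spliceLines (c : rest)           = c : spliceLines rest
spliceLines []                   = []

-- The two physical lines from the ticket, written as an escaped literal.
rawSource :: String
rawSource =
  "panic \"compileToCoreModule: target FilePath not found in\\\n    module dependency graph\""

main :: IO ()
main = putStrLn (spliceLines rawSource)
```

After splicing, the literal sits on a single line and contains no backslash, so the Haskell lexer never sees a string gap at all. Adding a second `\` before `module` would leave a stray backslash inside that single-line string.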

Changes (by ifigueroap):

 * Attachment "GHC.hs" added.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414

Comment (by ifigueroap):

I just attached the file produced when building ghc-8.0.1 from github. I cloned the repo, fixed the package-library redirects, and then performed ./boot, configure and make. The particular string I point out in this issue starts on line 1021 and continues on line 1022.

Perhaps I'm doing something wrong when building?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:5

Comment (by rwbarton):

I don't understand the question...? That is a source file, it is not produced by anything. GHC automatically runs it through CPP before compiling, because it has `{-# LANGUAGE CPP #-}`. The output of CPP is a valid Haskell module.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:6

Comment (by ifigueroap):

Replying to [comment:6 rwbarton]:
> I don't understand the question...? That is a source file, it is not produced by anything. GHC automatically runs it through CPP before compiling, because it has `{-# LANGUAGE CPP #-}`. The output of CPP is a valid Haskell module.

Well, my point is that when running hothasktags, which internally runs cpphs and then tries to parse with haskell-src-exts, I get the aforementioned "Illegal character in string gap", ''which in this particular case may mean that the output of the preprocessor is not a valid Haskell module.'' I'm not sure whether this is the case in the build process, as I didn't find how to get the post-processed file during compilation.

To illustrate my point, if I manually run the preprocessor (in the compiler/main dir):

{{{
cpphs -I../ -I../stage1 GHC.hs > GHC2.hs
}}}

(Note the two includes are for HsVersions.h in compiler/ and for ghc_autoconf.h in compiler/stage1.)

I get the "ill-formed" two-line string in the resulting GHC2.hs file. Of course I'm not sure what other flags are being passed to cpphs...

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:7

Changes (by ifigueroap):

 * Attachment "GHC.hs" added.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414

Changes (by ifigueroap):

 * Attachment "GHC2.hs" added.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414

Comment (by rwbarton):

Oh, cpphs doesn't seem to understand the line continuation syntax. I guess you need a real C preprocessor.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:8

Comment (by ifigueroap):

Replying to [comment:8 rwbarton]:
> Oh, cpphs doesn't seem to understand the line continuation syntax. I guess you need a real C preprocessor.

Thanks, you are right. I just ran

{{{
gcc -E -I../ -I../stage1 GHC.c
}}}

(with a dirty rename to .c) in order to use a real C preprocessor, and although I got several errors, the particular string is now on one line.

Thanks!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:9

Comment (by malcolmw):

Indeed, cpphs only lexes the contents of CPP directives as C. It lexes all other text as Haskell. This is entirely intentional, and is designed to work around numerous issues people have with traditional cpp doing the wrong thing with otherwise-correct Haskell source.

I would suggest that ghc source files should not be relying on these bad features of cpp. What next? /* */ style comments?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12414#comment:10
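The difference malcolmw describes can be sketched as a toy model in which continuation lines are joined only inside preprocessor directives. This is an assumption-laden illustration of the described behaviour, not cpphs's actual implementation; `isDirective`, `spliceDirectives` and `example` are hypothetical names.

```haskell
module Main (main) where

import Data.List (isPrefixOf, isSuffixOf)

-- Treat a line as a directive if its first non-space character is '#'.
isDirective :: String -> Bool
isDirective = ("#" `isPrefixOf`) . dropWhile (== ' ')

-- Join a line with its successor only when it is a directive ending in a
-- backslash; every other line, including Haskell string literals, is left
-- untouched. Toy model only, not taken from the cpphs sources.
spliceDirectives :: [String] -> [String]
spliceDirectives (l : next : rest)
  | isDirective l && "\\" `isSuffixOf` l =
      spliceDirectives ((init l ++ next) : rest)
spliceDirectives (l : rest) = l : spliceDirectives rest
spliceDirectives []         = []

example :: [String]
example =
  [ "#define TWO_LINES 1 + \\"
  , "  2"
  , "msg = \"not found in\\"
  , "  module dependency graph\""
  ]

main :: IO ()
main = mapM_ putStrLn (spliceDirectives example)
```

Under this model the `#define` is joined into one line, while the string literal keeps its trailing backslash and line break, which is the shape that haskell-src-exts then rejects as an invalid string gap.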