[GHC] #13015: Remove as much Haskell code as we can from Lexer.x

20 Dec 2016

      #13015: Remove as much Haskell code as we can from Lexer.x
-------------------------------------+-------------------------------------
           Reporter:  ezyang         |             Owner:  ezyang
               Type:  task           |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.1
  (Parser)                           |
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 I want to reduce the amount of code in Lexer.x so that (1) when I use
 hasktags I jump to real source, not the postprocessed output, and (2) we
 only have to suppress warnings on the generated lexer code, not on our
 hand-written Haskell source as well.

 If we are willing to introduce an hs-boot file with the following
 signatures:

 {{{
 bol :: Int
 layout :: Int
 layout_do :: Int
 layout_if :: Int
 layout_left :: Int
 option_prags :: Int
 }}}

 We can do quite well; the only functions we can't extract are: begin, pop,
 multiline_doc_comment, lineCommentToken, nested_doc_comment,
 withLexedDocType, do_bol, setLine, setFile, warnTab, addTabWarning, lexer,
 lexTokenAlr, lexToken, lexTokenStream, linePrags, fileHeaderPrags,
 ignoredPrags, oneWordPrags, twoWordPrags, dispatch_pragmas and
 known_pragma.

 Some of these, like lexToken, we can't expect to be able to move to a
 separate file, as they depend on Alex-generated data types and functions
 (AlexRoutine, alexScanUser). But if we boot-ify lexToken, we can move even
 more code out.
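
 As a rough sketch of the layout this implies (not the actual patch): the
 module name LexerUtils and the helper isLayoutState are hypothetical and
 exist only for illustration; the boot file reuses two of the signatures
 listed above. The three parts are separate files, shown together here.

 {{{
 -- Lexer.hs-boot (hand-written, sits next to the Alex-generated Lexer.hs)
 module Lexer where

 bol :: Int
 layout :: Int

 -- LexerUtils.hs (hypothetical home for the extracted code)
 module LexerUtils (isLayoutState) where

 -- The SOURCE pragma makes GHC read Lexer.hs-boot instead of Lexer.hs,
 -- which breaks the import cycle between the two modules.
 import {-# SOURCE #-} Lexer (bol, layout)

 -- Illustrative helper only; not a function taken from Lexer.x.
 isLayoutState :: Int -> Bool
 isLayoutState sc = sc == bol || sc == layout

 -- Lexer.x (Haskell portion): imports the extracted module normally.
 -- Alex still generates the definitions of bol and layout from the
 -- start codes used in the rules.
 import LexerUtils (isLayoutState)
 }}}

 Booting lexToken would follow the same pattern, with its signature added
 to the boot file.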

 One question, however, is the performance cost of going through an hs-boot
 file, since definitions imported that way are not inlined. I think the
 constants are fairly safe (inlining opportunities should only occur when we
 compile Lexer.x, at which point we will have unfoldings for them), but I am
 less certain about lexToken.

 Any advice?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13015
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler