[Git][ghc/ghc][wip/az/ghc-cpp] 112 commits: Simplifier: Constant fold invalid tagToEnum# calls to bottom expr.

Alan Zimmerman pushed to branch wip/az/ghc-cpp at Glasgow Haskell Compiler / GHC

Commits:

2e204269 by Andreas Klebinger at 2025-04-22T12:20:41+02:00
Simplifier: Constant fold invalid tagToEnum# calls to bottom expr.

When applying tagToEnum# to an out-of-range value it's best to simply constant fold it to a bottom expression. That potentially allows more dead code elimination and makes debugging easier.

Fixes #25976

- - - - -
7250fc0c by Matthew Pickering at 2025-04-22T16:24:04-04:00
Move -fno-code note into Downsweep module

This note was left behind when all the code which referred to it was moved into the GHC.Driver.Downsweep module

- - - - -
d2dc89b4 by Matthew Pickering at 2025-04-22T16:24:04-04:00
Apply editing notes to Note [-fno-code mode] suggested by sheaf

These notes were suggested in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/14241

- - - - -
f760da42 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
GHC-CPP: first rough proof of concept

Processes

    #define FOO
    #ifdef FOO
    x = 1
    #endif

Into

    [ITcppIgnored [L loc ITcppDefine]
    ,ITcppIgnored [L loc ITcppIfdef]
    ,ITvarid "x"
    ,ITequal
    ,ITinteger (IL {il_text = SourceText "1", il_neg = False, il_value = 1})
    ,ITcppIgnored [L loc ITcppEndif]
    ,ITeof]

In time, ITcppIgnored will be pushed into a comment

- - - - -
ba8eca9c by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Tidy up before re-visiting the continuation mechanic

- - - - -
37f29de9 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Switch preprocessor to continuation passing style

Proof of concept, needs tidying up

- - - - -
26a44e45 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Small cleanup

- - - - -
957e9fbc by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Get rid of some cruft

- - - - -
fd702662 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Starting to integrate.
Need to get the pragma recognised and set

- - - - -
725f77c9 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Make cppTokens extend to end of line, and process CPP comments

- - - - -
196b44dc by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Remove unused ITcppDefined

- - - - -
a76e00af by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Allow spaces between # and keyword for preprocessor directive

- - - - -
f328c99b by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Process CPP continuation lines

They are emitted as separate ITcppContinue tokens. Perhaps the processing should be more like a comment, and keep on going to the end. BUT, the last line needs to be slurped as a whole.

- - - - -
337b85ae by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Accumulate CPP continuations, process when ready

Can be simplified further, we only need one CPP token

- - - - -
39bb5972 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Simplify Lexer interface. Only ITcpp

We transfer directive lines through it, then parse them from scratch in the preprocessor.

- - - - -
d793afa9 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Deal with directive on last line, with no trailing \n

- - - - -
73aa548c by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Start parsing and processing the directives

- - - - -
43fd6c16 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Prepare for processing include files

- - - - -
ea14b5d1 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Move PpState into PreProcess

And initParserState, initPragState too

- - - - -
31691fc8 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Process nested include files

Also move PpState out of Lexer.x, so it is easy to evolve it in a ghci session, loading utils/check-cpp/Main.hs

- - - - -
cefc0b99 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Split into separate files

- - - - -
7fdb5df3 by Alan Zimmerman at 2025-04-23T18:20:32+01:00
Starting on expression parser.

But it hangs.
Time for Text.Parsec.Expr

- - - - -
f6234083 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Start integrating the ghc-cpp work

From https://github.com/alanz/ghc-cpp

- - - - -
f9c295b7 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
WIP

- - - - -
81269b9e by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Fixup after rebase

- - - - -
79082e9c by Alan Zimmerman at 2025-04-23T18:20:33+01:00
WIP

- - - - -
a1a2928e by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Fixup after rebase, including all tests pass

- - - - -
640d5544 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Change pragma usage to GHC_CPP from GhcCPP

- - - - -
ca99b0f7 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Some comments

- - - - -
1005c070 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Reformat

- - - - -
6473a770 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Delete unused file

- - - - -
d724e532 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Rename module Parse to ParsePP

- - - - -
ce03ec8d by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Clarify naming in the parser

- - - - -
9733b4b6 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
WIP. Switching to alex/happy to be able to work in-tree

Since Parsec is not available

- - - - -
583e0b18 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Layering is now correct

- GHC lexer, emits CPP tokens
- accumulated in Preprocessor state
- Lexed by CPP lexer, CPP command extracted, tokens concatenated with spaces (to get rid of token pasting via comments)
- if directive lexed and parsed by CPP lexer/parser, and evaluated

- - - - -
644dcd5e by Alan Zimmerman at 2025-04-23T18:20:33+01:00
First example working

Loading Example1.hs into ghci, getting the right results

```
{-# LANGUAGE GHC_CPP #-}
module Example1 where

y = 3

x = "hello" "bye now"

foo = putStrLn x
```

- - - - -
f0c697ec by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Rebase, and all tests pass except whitespace for generated parser

- - - - -
e98d5c51 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
More plumbing.
Ready for testing tomorrow.

- - - - -
416afac3 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Progress. Renamed module State from Types

And at first blush it seems to handle preprocessor scopes properly.

- - - - -
96c3bbfa by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Insert basic GHC version macros into parser

__GLASGOW_HASKELL__
__GLASGOW_HASKELL_FULL_VERSION__
__GLASGOW_HASKELL_PATCHLEVEL1__
__GLASGOW_HASKELL_PATCHLEVEL2__

- - - - -
0f44f60b by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Re-sync check-cpp for easy ghci work

- - - - -
b597d3af by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Get rid of warnings

- - - - -
e2c95bd2 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Rework macro processing, in check-cpp

Macros kept at the top level, looked up via name, multiple arity versions per name can be stored

- - - - -
71c8b69b by Alan Zimmerman at 2025-04-23T18:20:33+01:00
WIP. Can crack arguments for #define

Next step is to crack out args in an expansion

- - - - -
65952a64 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
WIP on arg parsing.

- - - - -
cc5b7bae by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Progress. Still screwing up nested parens.

- - - - -
97cf33ce by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Seems to work, but has redundant code

- - - - -
122e4141 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Remove redundant code

- - - - -
d1cd5d4a by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Reformat

- - - - -
eae48cdb by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Expand args, single pass

Still need to repeat until fixpoint

- - - - -
19fa7863 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Fixed point expansion

- - - - -
9dcb60a4 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Sync the playground to compiler

- - - - -
f763146c by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Working on dumping the GHC_CPP result

But we need to keep the BufSpan in a comment

- - - - -
a4b011ef by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Keep BufSpan in queued comments in GHC.Parser.Lexer

- - - - -
e6217a9d by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Getting close to being able to print the combined tokens

showing what is in and what is out

- - - - -
d88f20db by Alan Zimmerman at 2025-04-23T18:20:33+01:00
First implementation of dumpGhcCpp.

Example output

First dumps all macros in the state, then the source, showing which lines are in and which are out

------------------------------

- |#define FOO(A,B) A + B
- |#define FOO(A,B,C) A + B + C
- |#if FOO(1,FOO(3,4)) == 8
- |-- a comment
  |x = 1
- |#else
- |x = 5
- |#endif

- - - - -
3e5a0ceb by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Clean up a bit

- - - - -
f608ef03 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Add -ddump-ghc-cpp option and a test based on it

- - - - -
f915ca13 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Restore Lexer.x rules, we need them for continuation lines

- - - - -
f2563161 by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Lexer.x: trying to sort out the span for continuations

- We need to match on \n at the end of the line
- We cannot simply back up for it

- - - - -
42a531cc by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Inserts predefined macros. But does not dump properly

Because the cpp tokens have a trailing newline

- - - - -
a1cc1d0d by Alan Zimmerman at 2025-04-23T18:20:33+01:00
Remove unnecessary Lexer rules

We *need* the ones that explicitly match to the end of the line.

- - - - -
4a83790f by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Generate correct span for ITcpp

Dump now works, except we do not render trailing `\` for continuation lines. This is good enough for use in test output.

- - - - -
87f30d80 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Reduce duplication in lexer

- - - - -
cf56bda3 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Tweaks

- - - - -
93fd2d62 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Insert min_version predefined macros into state

The mechanism now works. Still need to flesh out the full set.

- - - - -
32b4ce0c by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Trying my alternative pragma syntax.

It works, but dumpGhcCpp is broken, I suspect from the ITcpp token span update.

- - - - -
ea43b3e5 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Pragma extraction now works, with both CPP and GHC_CPP

For the following

    {-# LANGUAGE CPP #-}
    #if __GLASGOW_HASKELL__ >= 913
    {-# LANGUAGE GHC_CPP #-}
    #endif

We will enable GHC_CPP only

- - - - -
28d56603 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Remove some tracing

- - - - -
82e3be4f by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Fix test exes for changes

- - - - -
94076bba by Alan Zimmerman at 2025-04-23T18:20:34+01:00
For GHC_CPP tests, normalise config-time-based macros

- - - - -
006c0eb7 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
WIP

- - - - -
2169b0b2 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
WIP again. What is wrong?

- - - - -
16aba8e4 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Revert to dynflags for normal not pragma lexing

- - - - -
66e7ea5c by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Working on getting check-exact to work properly

- - - - -
bf5b32e9 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Passes CppCommentPlacement test

- - - - -
67b46e02 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Starting on exact printing with GHC_CPP

While overriding normal CPP

- - - - -
67e5784b by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Correctly store CPP ignored tokens as comments

By populating the lexeme string in it, based on the bufpos

- - - - -
1ede50b5 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
WIP

- - - - -
2494ae3b by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Simplifying

- - - - -
8513be6d by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Update the active state logic

- - - - -
5497f216 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Work the new logic into the mainline code

- - - - -
62907fad by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Process `defined` operator

- - - - -
d9f18ec7 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Manage lexer state while skipping tokens

There is very intricate layout-related state used when lexing.
If a CPP directive blanks out some tokens, store this state when the blanking starts, and restore it when they are no longer being blanked.

- - - - -
6440b6fd by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Track the last token buffer index, for ITCppIgnored

We need to attach the source being skipped in an ITCppIgnored token. We cannot simply use its BufSpan as an index into the underlying StringBuffer as it counts unicode chars, not bytes. So we update the lexer state to store the starting StringBuffer location for the last token, and use the already-stored length to extract the correct portion of the StringBuffer being parsed.

- - - - -
94f160f2 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Process the ! operator in GHC_CPP expressions

- - - - -
043cd4fc by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Predefine a constant when GHC_CPP is being used.

- - - - -
856fc6d8 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
WIP

- - - - -
979b586e by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Skip lines directly in the lexer when required

- - - - -
cf16b372 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Properly manage location when accepting tokens again

- - - - -
7147c8aa by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Seems to be working now, for Example9

- - - - -
38b6f99e by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Remove tracing

- - - - -
63000ef1 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Fix parsing '*' in block comments

Instead of replacing them with '-'

- - - - -
b8b8683d by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Keep the trailing backslash in an ITcpp token

- - - - -
917cf766 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Deal with only enabling one section of a group.

A group is an instance of a conditional introduced by #if/#ifdef/#ifndef, and ending at the final #endif, including intermediate #elif sections

- - - - -
8a61cdb0 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Replace remaining identifiers with 0 when evaluating

As per the spec

- - - - -
8d6e5059 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Snapshot before rebase

- - - - -
406f310f by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Skip non-processed lines starting with #

- - - - -
f09197cd by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Export generateMacros so we can use it in ghc-exactprint

- - - - -
4400f575 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Fix rebase

- - - - -
b3f1b80b by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Expose initParserStateWithMacrosString

- - - - -
ebd1e495 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Fix buggy lexer cppSkip

It was skipping all lines, not just ones prefixed by #

- - - - -
b209229d by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Fix evaluation of && to use the correct operator

- - - - -
59dd06b5 by Alan Zimmerman at 2025-04-23T18:20:34+01:00
Deal with closing #-} at the start of a line

- - - - -
b15335d6 by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Add the MIN_VERSION_GLASGOW_HASKELL predefined macro

- - - - -
2462186e by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Include MIN_VERSION_GLASGOW_HASKELL in GhcCpp01.stderr

- - - - -
ad5c2dbd by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Use a strict map for macro defines

- - - - -
28e8853d by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Process TIdentifierLParen

Which only matters at the start of #define

- - - - -
5c80a798 by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Do not provide TIdentifierLParen paren twice

- - - - -
fbcdf374 by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Handle whitespace between identifier and '(' for directive only

- - - - -
37b1be80 by Alan Zimmerman at 2025-04-23T18:20:35+01:00
Expose some Lexer bitmap manipulation helpers

- - - - -
557ec85d by Alan Zimmerman at 2025-04-23T20:50:22+01:00
Deal with line pragmas as tokens

Blows up for dumpGhcCpp though

- - - - -

82 changed files:

- compiler/GHC.hs
- compiler/GHC/Cmm/Lexer.x
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/Parser/Monad.hs
- compiler/GHC/Core/Opt/ConstantFold.hs
- compiler/GHC/Driver/Backpack.hs
- compiler/GHC/Driver/Config/Parser.hs
- compiler/GHC/Driver/Downsweep.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Driver/Make.hs
- compiler/GHC/Driver/Pipeline.hs
- compiler/GHC/Driver/Pipeline/Execute.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Parser.hs-boot
- compiler/GHC/Parser.y
- compiler/GHC/Parser/Annotation.hs
- compiler/GHC/Parser/HaddockLex.x
- compiler/GHC/Parser/Header.hs
- compiler/GHC/Parser/Lexer.x
- compiler/GHC/Parser/PostProcess.hs
- compiler/GHC/Parser/PostProcess/Haddock.hs
- + compiler/GHC/Parser/PreProcess.hs
- + compiler/GHC/Parser/PreProcess/Eval.hs
- + compiler/GHC/Parser/PreProcess/Lexer.x
- + compiler/GHC/Parser/PreProcess/Macro.hs
- + compiler/GHC/Parser/PreProcess/ParsePP.hs
- + compiler/GHC/Parser/PreProcess/Parser.y
- + compiler/GHC/Parser/PreProcess/ParserM.hs
- + compiler/GHC/Parser/PreProcess/State.hs
- compiler/GHC/Parser/Utils.hs
- compiler/GHC/SysTools/Cpp.hs
- compiler/ghc.cabal.in
- docs/users_guide/debugging.rst
- ghc/GHCi/UI.hs
- hadrian/src/Rules/SourceDist.hs
- hadrian/stack.yaml.lock
- libraries/ghc-internal/src/GHC/Internal/LanguageExtensions.hs
- testsuite/tests/count-deps/CountDepsParser.stdout
- testsuite/tests/driver/T4437.hs
- testsuite/tests/ghc-api/T11579.hs
- + testsuite/tests/ghc-cpp/GhcCpp01.hs
- + testsuite/tests/ghc-cpp/GhcCpp01.stderr
- + testsuite/tests/ghc-cpp/all.T
- testsuite/tests/interface-stability/template-haskell-exports.stdout
- + testsuite/tests/printer/CppCommentPlacement.hs
- + testsuite/tests/simplCore/should_compile/T25976.hs
- testsuite/tests/simplCore/should_compile/all.T
- + utils/check-cpp/.ghci
- + utils/check-cpp/.gitignore
- + utils/check-cpp/Eval.hs
- + utils/check-cpp/Example1.hs
- + utils/check-cpp/Example10.hs
- + utils/check-cpp/Example11.hs
- + utils/check-cpp/Example12.hs
- + utils/check-cpp/Example13.hs
- + utils/check-cpp/Example2.hs
- + utils/check-cpp/Example3.hs
- + utils/check-cpp/Example4.hs
- + utils/check-cpp/Example5.hs
- + utils/check-cpp/Example6.hs
- + utils/check-cpp/Example7.hs
- + utils/check-cpp/Example8.hs
- + utils/check-cpp/Example9.hs
- + utils/check-cpp/Lexer.x
- + utils/check-cpp/Macro.hs
- + utils/check-cpp/Main.hs
- + utils/check-cpp/ParsePP.hs
- + utils/check-cpp/ParseSimulate.hs
- + utils/check-cpp/Parser.y
- + utils/check-cpp/ParserM.hs
- + utils/check-cpp/PreProcess.hs
- + utils/check-cpp/README.md
- + utils/check-cpp/State.hs
- + utils/check-cpp/run.sh
- utils/check-exact/Main.hs
- utils/check-exact/Parsers.hs
- utils/check-exact/Preprocess.hs
- utils/check-exact/Utils.hs
- utils/haddock/haddock-api/src/Haddock/Backends/Hyperlinker/Parser.hs
- utils/haddock/haddock-api/src/Haddock/Parser.hs
- utils/haddock/haddock-api/src/Haddock/Types.hs

The diff was not included because it is too large.

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/c88a8a4bb158989f999278708c9ce7c...
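
The rule from commit 917cf766, that at most one section of a conditional group is enabled, can be illustrated with a small stand-alone sketch. This is not the code in GHC.Parser.PreProcess: the `Line` and `Group` types and the `markLines` function are hypothetical names made up for this illustration, and each condition is assumed to have been evaluated to a `Bool` already.

```haskell
-- One input line. Conditions are assumed to be pre-evaluated.
-- (Hypothetical type for this sketch, not taken from the GHC sources.)
data Line
  = If Bool
  | Elif Bool
  | Else
  | Endif
  | Source String

-- State of one open conditional group.
data Group = Group
  { parentActive :: Bool -- was the enclosing context emitting lines?
  , taken        :: Bool -- has some section of this group been enabled?
  , active       :: Bool -- is the current section emitting lines?
  }

-- Mark each source line as kept (True) or skipped (False).
-- Directive lines themselves are not returned.
markLines :: [Line] -> [(Bool, String)]
markLines = go []
  where
    isActive []      = True
    isActive (g : _) = active g

    go _ [] = []
    go stack (l : ls) = case l of
      Source s -> (isActive stack, s) : go stack ls
      If c ->
        let p  = isActive stack
            on = p && c
        in go (Group p on on : stack) ls
      Elif c -> case stack of
        g : rest ->
          -- Only enabled if no earlier section of the group was taken.
          let on = parentActive g && not (taken g) && c
          in go (g { taken = taken g || on, active = on } : rest) ls
        [] -> go stack ls -- stray #elif: ignored in this sketch
      Else -> case stack of
        g : rest ->
          let on = parentActive g && not (taken g)
          in go (g { taken = True, active = on } : rest) ls
        [] -> go stack ls -- stray #else: ignored in this sketch
      Endif -> go (drop 1 stack) ls
```

For the dump example in commit d88f20db, where the `#if` condition holds, `markLines [If True, Source "x = 1", Else, Source "x = 5", Endif]` marks `x = 1` as kept and `x = 5` as skipped. Once one section has been taken, a later `Elif True` in the same group stays disabled.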
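
Commit 6440b6fd notes that a BufSpan counts unicode chars, not bytes, so it cannot be used directly as an index into the StringBuffer. The arithmetic behind that mismatch can be sketched as below, assuming the buffer holds UTF-8 encoded text. `utf8Width` and `byteOffset` are hypothetical helper names for this illustration only; they are not GHC functions.

```haskell
import Data.Char (ord)

-- Number of bytes a character occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- Byte offset of the character at the given character index,
-- assuming the string is stored as UTF-8.
byteOffset :: Int -> String -> Int
byteOffset charIx = sum . map utf8Width . take charIx
```

In the source line `λx = 1`, the `x` sits at character index 1 but at byte offset 2, since `λ` takes two bytes in UTF-8. For pure ASCII input the two indices coincide, which is why this kind of bug tends to show up only on sources containing non-ASCII characters.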