-
3b3a5dec
by Ben Gamari at 2025-05-15T16:10:01-04:00
Don't emit unprintable characters when printing Uniques
When faced with an unprintable tag we now instead print the codepoint
number.
Fixes #25989.
(cherry picked from commit e832b1fadee66e8d6dd7b019368974756f8f8c46)
-
e1ef8974
by Mike Pilgrem at 2025-05-16T16:09:14-04:00
Translate iff in Haddock documentation into everyday English
-
b37711f9
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
GHC-CPP: first rough proof of concept
Processes
#define FOO
#ifdef FOO
x = 1
#endif
Into
[ITcppIgnored [L loc ITcppDefine]
,ITcppIgnored [L loc ITcppIfdef]
,ITvarid "x"
,ITequal
,ITinteger (IL {il_text = SourceText "1", il_neg = False, il_value = 1})
,ITcppIgnored [L loc ITcppEndif]
,ITeof]
In time, ITcppIgnored will be pushed into a comment
-
155274a4
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Tidy up before re-visiting the continuation mechanic
-
e67cc209
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Switch preprocessor to continuation passing style
Proof of concept, needs tidying up
-
1a0613de
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Small cleanup
-
28bb3dcd
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Get rid of some cruft
-
af244265
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Starting to integrate.
Need to get the pragma recognised and set
-
4df9b8db
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Make cppTokens extend to end of line, and process CPP comments
-
571a3557
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Remove unused ITcppDefined
-
04444ced
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Allow spaces between # and keyword for preprocessor directive
-
56022164
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Process CPP continuation lines
They are emitted as separate ITcppContinue tokens.
Perhaps the processing should be more like a comment, and keep on
going to the end.
BUT, the last line needs to be slurped as a whole.
-
f20ff9a2
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Accumulate CPP continuations, process when ready
Can be simplified further, we only need one CPP token
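A minimal sketch of the accumulation idea, assuming the input is already split into lines. The name `joinContinuations` is hypothetical and this is not the lexer code; it also drops the trailing backslash, which is a simplification.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical helper: merge each line ending in a backslash with the
-- line that follows it, so a directive is processed as one logical line.
joinContinuations :: [String] -> [String]
joinContinuations [] = []
joinContinuations (l : ls)
  | "\\" `isSuffixOf` l
  , (next : rest) <- ls = joinContinuations ((init l ++ next) : rest)
  | otherwise = l : joinContinuations ls
```

A backslash on the very last line has nothing to join with, so that line is passed through unchanged.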
-
35e31452
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Simplify Lexer interface. Only ITcpp
We transfer directive lines through it, then parse them from scratch
in the preprocessor.
-
c9b03ce5
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Deal with directive on last line, with no trailing \n
-
e1f18f92
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Start parsing and processing the directives
-
651b1e66
by Alan Zimmerman at 2025-05-17T09:54:42+01:00
Prepare for processing include files
-
76f05ae3
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Move PpState into PreProcess
And initParserState, initPragState too
-
f71d75df
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Process nested include files
Also move PpState out of Lexer.x, so it is easy to evolve it in a ghci
session, loading utils/check-cpp/Main.hs
-
9a5d961d
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Split into separate files
-
12fe7c28
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Starting on expression parser.
But it hangs. Time for Text.Parsec.Expr
-
eab98997
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Start integrating the ghc-cpp work
From https://github.com/alanz/ghc-cpp
-
23e5c90c
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
WIP
-
109fe6fc
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Fixup after rebase
-
b218f624
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
WIP
-
5946cc99
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Fixup after rebase, including all tests pass
-
f7c374ac
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Change pragma usage to GHC_CPP from GhcCPP
-
b63bfb1d
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Some comments
-
885d9ab6
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Reformat
-
9595ad65
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Delete unused file
-
090a3e45
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Rename module Parse to ParsePP
-
8d85e179
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Clarify naming in the parser
-
eabddb9a
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
WIP. Switching to alex/happy to be able to work in-tree
Since Parsec is not available
-
de751411
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Layering is now correct
- GHC lexer, emits CPP tokens
- accumulated in Preprocessor state
- Lexed by CPP lexer, CPP command extracted, tokens concatenated with
spaces (to get rid of token pasting via comments)
- if directive lexed and parsed by CPP lexer/parser, and evaluated
-
e640ec71
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
First example working
Loading Example1.hs into ghci, getting the right results
```
{-# LANGUAGE GHC_CPP #-}
module Example1 where
y = 3
x =
"hello"
"bye now"
foo = putStrLn x
```
-
6220e06f
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Rebase, and all tests pass except whitespace for generated parser
-
183393b6
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
More plumbing. Ready for testing tomorrow.
-
42f8b67e
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Progress. Renamed module State from Types
And at first blush it seems to handle preprocessor scopes properly.
-
c7804af7
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Insert basic GHC version macros into parser
__GLASGOW_HASKELL__
__GLASGOW_HASKELL_FULL_VERSION__
__GLASGOW_HASKELL_PATCHLEVEL1__
__GLASGOW_HASKELL_PATCHLEVEL2__
-
2bd6ae63
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Re-sync check-cpp for easy ghci work
-
5c143c42
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Get rid of warnings
-
6d51665d
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Rework macro processing, in check-cpp
Macros kept at the top level, looked up via name, multiple arity
versions per name can be stored
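One way to picture this layout is a two-level map, sketched below. All type and function names here are hypothetical, chosen for illustration, and are not taken from check-cpp.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical types for illustration only.
type MacroName = String
type Arity     = Int

data MacroDef = MacroDef
  { mdParams :: [String]  -- formal parameter names
  , mdBody   :: [String]  -- replacement tokens
  } deriving (Eq, Show)

-- Outer map: macro name. Inner map: one definition per arity.
type MacroTable = Map.Map MacroName (Map.Map Arity MacroDef)

-- Add a definition; a later definition with the same name and arity
-- replaces the earlier one (Map.union is left-biased).
defineMacro :: MacroName -> MacroDef -> MacroTable -> MacroTable
defineMacro name def =
  Map.insertWith Map.union name (Map.singleton (length (mdParams def)) def)

-- Look up by name first, then by the number of arguments supplied.
lookupMacro :: MacroName -> Arity -> MacroTable -> Maybe MacroDef
lookupMacro name arity table = Map.lookup name table >>= Map.lookup arity
```

With this shape, `FOO(A,B)` and `FOO(A,B,C)` can coexist under the single key `FOO`.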
-
93a18c1e
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
WIP. Can crack arguments for #define
Next step is to crack out args in an expansion
-
75bf2b6b
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
WIP on arg parsing.
-
8b5d99d8
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Progress. Still screwing up nested parens.
-
a529b10f
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Seems to work, but has redundant code
-
05d7ca7f
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Remove redundant code
-
fcb2387e
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Reformat
-
dfaf1a46
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Expand args, single pass
Still need to repeat until fixpoint
-
c806eb22
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Fixed point expansion
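A rough sketch of expanding to a fixed point, restricted to object-like macros (no arguments). The names and the pass limit are assumptions made for illustration; the limit is only a stand-in to guarantee termination on self-referential macros.

```haskell
import qualified Data.Map.Strict as Map

type Token  = String
type Macros = Map.Map Token [Token]  -- object-like macros only

-- One pass: replace each token that names a macro with its body.
expandOnce :: Macros -> [Token] -> [Token]
expandOnce ms = concatMap (\t -> Map.findWithDefault [t] t ms)

-- Repeat single passes until the token list stops changing,
-- giving up after 'limit' passes.
expandFix :: Int -> Macros -> [Token] -> [Token]
expandFix limit ms ts
  | limit <= 0 = ts
  | ts' == ts  = ts
  | otherwise  = expandFix (limit - 1) ms ts'
  where
    ts' = expandOnce ms ts
```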
-
86403450
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Sync the playground to compiler
-
917a66b2
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Working on dumping the GHC_CPP result
But we need to keep the BufSpan in a comment
-
3bb0bb30
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Keep BufSpan in queued comments in GHC.Parser.Lexer
-
82106a2e
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Getting close to being able to print the combined tokens
showing what is in and what is out
-
493c0253
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
First implementation of dumpGhcCpp.
Example output
First dumps all macros in the state, then the source, showing which
lines are in and which are out
------------------------------
- |#define FOO(A,B) A + B
- |#define FOO(A,B,C) A + B + C
- |#if FOO(1,FOO(3,4)) == 8
- |-- a comment
  |x = 1
- |#else
- |x = 5
- |#endif
-
a8d628b2
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Clean up a bit
-
206e4773
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Add -ddump-ghc-cpp option and a test based on it
-
65bb5bc2
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Restore Lexer.x rules, we need them for continuation lines
-
5726e351
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Lexer.x: trying to sort out the span for continuations
- We need to match on \n at the end of the line
- We cannot simply back up for it
-
7612ec92
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Inserts predefined macros. But does not dump properly
Because the cpp tokens have a trailing newline
-
62b5ea7d
by Alan Zimmerman at 2025-05-17T09:54:43+01:00
Remove unnecessary Lexer rules
We *need* the ones that explicitly match to the end of the line.
-
47d703cf
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Generate correct span for ITcpp
Dump now works, except we do not render trailing `\` for continuation
lines. This is good enough for use in test output.
-
aef0b466
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Reduce duplication in lexer
-
cc158a75
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Tweaks
-
c192915c
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Insert min_version predefined macros into state
The mechanism now works. Still need to flesh out the full set.
-
601395ff
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Trying my alternative pragma syntax.
It works, but dumpGhcCpp is broken, I suspect from the ITcpp token
span update.
-
61117c67
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Pragma extraction now works, with both CPP and GHC_CPP
For the following
{-# LANGUAGE CPP #-}
#if __GLASGOW_HASKELL__ >= 913
{-# LANGUAGE GHC_CPP #-}
#endif
We will enable GHC_CPP only
-
96540e19
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Remove some tracing
-
2bf2c60f
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Fix test exes for changes
-
a6e90845
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
For GHC_CPP tests, normalise config-time-based macros
-
6665d0fa
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
WIP
-
03283165
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
WIP again. What is wrong?
-
b56db99f
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Revert to dynflags for normal not pragma lexing
-
75d67c2a
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Working on getting check-exact to work properly
-
0908eb85
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Passes CppCommentPlacement test
-
8880d51a
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Starting on exact printing with GHC_CPP
While overriding normal CPP
-
685963fd
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Correctly store CPP ignored tokens as comments
By populating the lexeme string in it, based on the bufpos
-
29f82644
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
WIP
-
addfca69
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Simplifying
-
37a6f59f
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Update the active state logic
-
e1e11679
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Work the new logic into the mainline code
-
1f8c610f
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Process `defined` operator
-
e3948c03
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Manage lexer state while skipping tokens
There is very intricate layout-related state used when lexing. If a
CPP directive blanks out some tokens, store this state when the
blanking starts, and restore it when they are no longer being blanked.
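The save and restore step might look roughly like this. `LexState` below is a hypothetical, heavily simplified stand-in for the real lexer state, which holds far more than a layout stack.

```haskell
-- Hypothetical, simplified stand-in for the layout-related lexer state.
data LexState = LexState
  { layoutContext :: [Int]        -- stack of layout indentation columns
  , savedContext  :: Maybe [Int]  -- snapshot taken when skipping starts
  } deriving (Eq, Show)

-- Called when a directive starts blanking tokens: remember the state.
startSkipping :: LexState -> LexState
startSkipping st = case savedContext st of
  Nothing -> st { savedContext = Just (layoutContext st) }
  Just _  -> st  -- already skipping, keep the first snapshot

-- Called when tokens are accepted again: restore the remembered state.
stopSkipping :: LexState -> LexState
stopSkipping st = case savedContext st of
  Just ctx -> st { layoutContext = ctx, savedContext = Nothing }
  Nothing  -> st
```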
-
b1ffd86f
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Track the last token buffer index, for ITCppIgnored
We need to attach the source being skipped in an ITCppIgnored token.
We cannot simply use its BufSpan as an index into the underlying
StringBuffer as it counts unicode chars, not bytes.
So we update the lexer state to store the starting StringBuffer
location for the last token, and use the already-stored length to
extract the correct portion of the StringBuffer being parsed.
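To illustrate why a character count cannot index a byte buffer, the sketch below computes the UTF-8 byte offset of a character index. It assumes UTF-8 encoding and is not how the lexer does it; the helper names are hypothetical.

```haskell
import Data.Char (ord)

-- Number of bytes a character occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- Byte offset of the character at index 'charIdx' in a UTF-8 buffer.
byteOffset :: Int -> String -> Int
byteOffset charIdx = sum . map utf8Width . take charIdx
```

For pure ASCII input the two offsets agree, which is why the mismatch only shows up once the source contains non-ASCII characters.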
-
68494b79
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Process the ! operator in GHC_CPP expressions
-
cd161831
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Predefine a constant when GHC_CPP is being used.
-
fcc441ab
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
WIP
-
42240bf2
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Skip lines directly in the lexer when required
-
df58fdcb
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Properly manage location when accepting tokens again
-
0fd128b4
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Seems to be working now, for Example9
-
73ec0a2d
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Remove tracing
-
c0f73ffd
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Fix parsing '*' in block comments
Instead of replacing them with '-'
-
089cf569
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Keep the trailing backslash in a ITcpp token
-
47d41734
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Deal with only enabling one section of a group.
A group is an instance of a conditional introduced by
#if/#ifdef/#ifndef,
and ending at the final #endif, including intermediate #elif sections
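A small sketch of the bookkeeping this implies: each group records whether one of its sections has already been enabled, so later sections stay off. The types and function names are hypothetical and this is not the compiler's implementation.

```haskell
-- Hypothetical per-group record, one per open conditional.
data Group = Group
  { parentActive :: Bool  -- was the enclosing context emitting tokens?
  , taken        :: Bool  -- has a section of this group been enabled?
  , active       :: Bool  -- is the current section enabled?
  } deriving (Eq, Show)

type Stack = [Group]

isActive :: Stack -> Bool
isActive []      = True
isActive (g : _) = active g

-- #if / #ifdef / #ifndef, with the condition already evaluated.
pushIf :: Bool -> Stack -> Stack
pushIf cond st = Group outer on on : st
  where
    outer = isActive st
    on    = outer && cond

-- A following conditional section: enabled only if no earlier one was.
onElif :: Bool -> Stack -> Stack
onElif _ [] = []  -- stray directive; a real implementation would report it
onElif cond (g : rest) = g { taken = taken g || on, active = on } : rest
  where
    on = parentActive g && not (taken g) && cond

onElse :: Stack -> Stack
onElse = onElif True

onEndif :: Stack -> Stack
onEndif = drop 1
```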
-
1a3104bb
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Replace remaining identifiers with 0 when evaluating
As per the spec
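A minimal evaluator sketch showing the rule: any identifier with no known value evaluates to 0. The `Expr` type and the environment are assumptions for illustration, covering only a few operators.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical expression type covering a small subset of operators.
data Expr
  = Lit Int
  | Var String
  | Not Expr
  | And Expr Expr
  | Ge Expr Expr
  deriving Show

fromBool :: Bool -> Int
fromBool b = if b then 1 else 0

-- Identifiers missing from the environment evaluate to 0.
eval :: Map.Map String Int -> Expr -> Int
eval _   (Lit n)   = n
eval env (Var v)   = Map.findWithDefault 0 v env
eval env (Not e)   = fromBool (eval env e == 0)
eval env (And a b) = fromBool (eval env a /= 0 && eval env b /= 0)
eval env (Ge a b)  = fromBool (eval env a >= eval env b)
```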
-
23449d3b
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Snapshot before rebase
-
ce898e7a
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Skip non-processed lines starting with #
-
428a0aa4
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Export generateMacros so we can use it in ghc-exactprint
-
8bfe5dee
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Fix rebase
-
ba5cf313
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Expose initParserStateWithMacrosString
-
7cc2a4dd
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Fix buggy lexer cppSkip
It was skipping all lines, not just ones prefixed by #
-
200ba48c
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Fix evaluation of && to use the correct operator
-
c5b56896
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Deal with closing #-} at the start of a line
-
542a9e65
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Add the MIN_VERSION_GLASGOW_HASKELL predefined macro
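Assuming the macro keeps the meaning documented for the existing MIN_VERSION_GLASGOW_HASKELL (true when the compiler version is at least the four-part version given), the test is a lexicographic comparison. The function name below is hypothetical.

```haskell
-- (major, minor, patchlevel1, patchlevel2)
type Version = (Int, Int, Int, Int)

-- True when the compiler version is at least the requested version.
-- Tuples compare lexicographically, which is the ordering wanted here.
minVersionGlasgowHaskell :: Version -> Version -> Bool
minVersionGlasgowHaskell compiler wanted = compiler >= wanted
```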
-
c291b710
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Include MIN_VERSION_GLASGOW_HASKELL in GhcCpp01.stderr
-
c80ad8a9
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Use a strict map for macro defines
-
b13f79c9
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Process TIdentifierLParen
Which only matters at the start of #define
-
733ba1ef
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Do not provide TIdentifierLParen paren twice
-
14469313
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Handle whitespace between identifier and '(' for directive only
-
b59f7099
by Alan Zimmerman at 2025-05-17T09:54:44+01:00
Expose some Lexer bitmap manipulation helpers
-
b5a71225
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Deal with line pragmas as tokens
Blows up for dumpGhcCpp though
-
6a4fe098
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Allow strings delimited by a single quote too
-
acf20184
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Allow leading whitespace on cpp directives
As per https://timsong-cpp.github.io/cppwp/n4140/cpp#1
-
8a829253
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Implement GHC_CPP undef
-
484c6719
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Sort out expansion of no-arg macros, in a context with args
And make the expansion bottom out, in the case of recursion
-
b44c6320
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Fix GhcCpp01 test
The LINE pragma stuff works in ghc-exactprint when specifically
setting flag to emit ITline_pragma tokens
-
b9eb081e
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Process comments in CPP directives
-
17348375
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Correctly lex pragmas with final #-} on a newline
-
a168f964
by Alan Zimmerman at 2025-05-17T09:54:45+01:00
Do not process CPP-style comments
-
7248c292
by Alan Zimmerman at 2025-05-18T09:10:52+01:00
Allow cpp-style comments when GHC_CPP enabled