[Git][ghc/ghc][master] 7 commits: Make cmm 'import "package" name;' syntax use consistent label types
Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC

Commits:
9f85f034 by Duncan Coutts at 2026-04-30T04:52:42-04:00
Make cmm 'import "package" name;' syntax use consistent label types

There is a little-used syntactic form in cmm imports:

    import "package" foo;

Which means to import foo from the given package (unit id, specified as
a string). This syntax is somewhat reminiscent of GHC's package import
extension.

This syntax form is not used in the rts cmm code, nor any of the boot
libraries. It may not be used at all. Unclear.

Change the kind of CLabel this syntax generates to be consistent with
the others. The other cmm imports use ForeignLabel with
ForeignLabelInExternalPackage. For some reason this form was using
CmmLabel. Change that to also be ForeignLabel but with
ForeignLabelInPackage. This specifies a specific package, rather than an
unnamed external package.

- - - - -
a811f68f by Duncan Coutts at 2026-04-30T04:52:42-04:00
Change default cmm import statements to be internal

Previously a cmm statement like:

    import foo;

meant to expect the symbol from a different shared library than the
current one. Now it means to expect the symbol from the same shared
library as the current one.

We'll add explicit syntax to indicate that it's a foreign import. Most
existing uses are in fact internal (rts to rts), so few imports will need
to be annotated foreign. Examples would include cmm code in libraries
(other than the rts) that need to access RTS APIs.

In practice, this makes no difference whatsoever at the moment on any
platform other than Windows (where building Haskell libs as shared libs
does not fully work yet), since 'labelDynamic' treats all such labels
as foreign, irrespective of the foreign label source.

- - - - -
17fe5d1d by Duncan Coutts at 2026-04-30T04:52:42-04:00
Add cmm import syntax 'import DATA foo;' as better name for CLOSURE

The existing syntax is:

    import CLOSURE foo;

The new syntax is:

    import DATA foo;

This means to interpret the symbol foo as referring to data (i.e. a
global constant or variable) rather than to code (a function).

The historical syntax for this uses CLOSURE, which is rather misleading.
Presumably this was done to avoid introducing new reserved words. Be less
squeamish about new reserved words and add DATA and use that.

Keep the existing CLOSURE syntax as an alias for compatibility.

- - - - -
3a530d68 by Duncan Coutts at 2026-04-30T04:52:42-04:00
Add cmm 'import extern name;' syntax

Since the default for cmm imports is now for symbols within the same
shared object, we need a way to indicate we want a symbol from an
external shared object:

    import extern foo;       -- for a function
    import extern DATA foo;  -- for data

This adds a new reserved word 'extern'.

We don't expect to have to use this much. Most cmm imports are intra-DSO.

This makes no difference currently on ELF and MachO platforms, but does
make a difference to the linking conventions on PE (Windows). In future
it's plausible we could make distinctions on ELF or MachO, so it's worth
trying to get it right. Windows can be the guinea pig.

- - - - -
2b8e44c7 by Duncan Coutts at 2026-04-30T04:52:42-04:00
Add cmm syntax 'import "package" DATA foo;' for completeness

We already have:

    import DATA foo;       -- for data imports
    import "package" foo;  -- for imports from a given unitid

There's no reason not to have both at once:

    import "package" DATA foo;

So add that.

- - - - -
ee05e5cc by Duncan Coutts at 2026-04-30T04:52:42-04:00
Improve the commentary for the cmm import grammar.

AFAIK, this is the only place where GHC-style Cmm syntax is documented.

- - - - -
b35946ad by Duncan Coutts at 2026-04-30T04:52:42-04:00
Add a changelog.d entry for the .cmm import syntax changes

- - - - -

3 changed files:

- + changelog.d/cmm-import-syntax-changes
- compiler/GHC/Cmm/Lexer.x
- compiler/GHC/Cmm/Parser.y

Changes:

=====================================
changelog.d/cmm-import-syntax-changes
=====================================
@@ -0,0 +1,34 @@
+section: cmm
+synopsis: Changes to Cmm hand-written syntax for symbol imports.
+issues: #27162
+mrs: !15135
+
+description: {
+  In hand-written Cmm, there is syntax to declare symbol names from outside of
+  the current .cmm file (e.g. .c or .cmm files).
+
+  The existing syntax is
+
+  > import foo;         -- for a function
+  > import CLOSURE foo; -- for data
+
+  and this implicitly meant that the symbol (`foo`) could be found in an
+  external shared library, not the current one. There was no syntax to specify
+  that the symbol should be found in the current shared library, i.e. in a
+  .cmm file (or .hs file) in the current Haskell package.
+
+  The new syntax assumes local by default and allows specifying external:
+
+  > import foo;               -- for a function in the current lib
+  > import DATA foo;          -- for data in the current lib
+  > import extern foo;        -- for a function in an external lib
+  > import extern DATA foo;   -- for data in an external lib
+  > import "unitid" foo;      -- for a function in the Haskell unit "unitid"
+  > import "unitid" DATA foo; -- for data in the Haskell unit "unitid"
+
+  In practice, the only platform where this can be expected to make a
+  difference is on Windows, and only when compiling each Haskell package as a
+  separate .dll dynamic library.
+}
+
+

=====================================
compiler/GHC/Cmm/Lexer.x
=====================================
@@ -174,6 +174,8 @@ data CmmToken
   | CmmT_return
   | CmmT_returns
   | CmmT_import
+  | CmmT_extern
+  | CmmT_DATA
   | CmmT_switch
   | CmmT_case
   | CmmT_default
@@ -273,6 +275,8 @@ reservedWordsFM = listToUFM $
         ( "return",             CmmT_return ),
         ( "returns",            CmmT_returns ),
         ( "import",             CmmT_import ),
+        ( "extern",             CmmT_extern ),
+        ( "DATA",               CmmT_DATA ),
         ( "switch",             CmmT_switch ),
         ( "case",               CmmT_case ),
         ( "default",            CmmT_default ),

=====================================
compiler/GHC/Cmm/Parser.y
=====================================
@@ -372,6 +372,8 @@ import qualified Data.ByteString.Char8 as BS8
         'return'        { L _ (CmmT_return) }
         'returns'       { L _ (CmmT_returns) }
         'import'        { L _ (CmmT_import) }
+        'extern'        { L _ (CmmT_extern) }
+        'DATA'          { L _ (CmmT_DATA) }
         'switch'        { L _ (CmmT_switch) }
         'case'          { L _ (CmmT_case) }
         'default'       { L _ (CmmT_default) }
@@ -643,18 +645,42 @@ importNames
 importName
         :: { (FastString, CLabel) }
 
-        -- A label imported without an explicit packageId.
-        -- These are taken to come from some foreign, unnamed package.
+        -- A code label imported from within the same shared library.
         : NAME
-        { ($1, mkForeignLabel $1 ForeignLabelInExternalPackage IsFunction) }
+        { ($1, mkForeignLabel $1 ForeignLabelInThisPackage IsFunction) }
 
-        -- as previous 'NAME', but 'IsData'
-        | 'CLOSURE' NAME
-        { ($2, mkForeignLabel $2 ForeignLabelInExternalPackage IsData) }
+        -- A data label imported from within the same shared library.
+        | 'DATA' NAME
+        { ($2, mkForeignLabel $2 ForeignLabelInThisPackage IsData) }
 
-        -- A label imported with an explicit UnitId.
+        -- CLOSURE is a historical alias for DATA in this context.
+        | 'CLOSURE' NAME
+        { ($2, mkForeignLabel $2 ForeignLabelInThisPackage IsData) }
+
+        -- A code label imported from another unamed shared library. These may
+        -- come from a foreign shared library, or from the shared library for
+        -- an unnamed Haskell package. This corresponds on Windows/PE to
+        -- __declspec(dllimport) in C.
+        | 'extern' NAME
+        { ($2, mkForeignLabel $2 ForeignLabelInExternalPackage IsFunction) }
+
+        -- A data label imported from another unamed shared library.
+        -- This corresponds on Windows/PE to __declspec(dllimport) in C (but
+        -- cmm doesn't know about data vs function symbols so we have to say).
+        | 'extern' 'DATA' NAME
+        { ($3, mkForeignLabel $3 ForeignLabelInExternalPackage IsData) }
+
+        -- A code label imported from the shared library for a Haskell package
+        -- with the given UnitId. Such labels behave as local when used within
+        -- the specified unit, or as extern otherwise.
         | STRING NAME
-        { ($2, mkCmmCodeLabel (UnitId (mkFastString $1)) $2) }
+        { ($2, mkForeignLabel $2 (ForeignLabelInPackage (UnitId (mkFastString $1))) IsFunction) }
+
+        -- A data label imported from the shared library for a Haskell package
+        -- with the given UnitId. Such labels behave as local when used within
+        -- the specified unit, or as extern otherwise.
+        | STRING 'DATA' NAME
+        { ($3, mkForeignLabel $3 (ForeignLabelInPackage (UnitId (mkFastString $1))) IsData) }
 
 names
         :: { [FastString] }

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/9797052b974b3356c34b457558ffda3...
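
The seven import forms accepted by the new importName grammar can be summarised as a small stand-alone Haskell model. This is only an illustrative sketch, not GHC code: the types LabelSource, LabelKind and Tok, and the functions tokenize and classifyImport, are hypothetical names invented here. They mirror the label source and kind arguments the grammar actions pass to mkForeignLabel, and the tokenizer assumes whitespace-separated tokens with unit ids written as quoted strings without spaces.

```haskell
-- Hypothetical stand-ins for the label source and kind that the grammar
-- actions pass to mkForeignLabel. These are NOT GHC's real types.
data LabelSource
  = InThisPackage      -- same shared library as the importing code
  | InExternalPackage  -- some other, unnamed shared library
  | InPackage String   -- the shared library of the named unit
  deriving (Eq, Show)

data LabelKind = IsFunction | IsData
  deriving (Eq, Show)

-- Tokens that may appear between 'import' and ';' for one imported name.
data Tok = TExtern | TData | TClosure | TString String | TName String
  deriving (Eq, Show)

-- Deliberately naive tokenizer (assumption: tokens are separated by
-- whitespace and a unit id is a double-quoted string with no spaces).
tokenize :: String -> [Tok]
tokenize = map toTok . words
  where
    toTok "extern"  = TExtern
    toTok "DATA"    = TData
    toTok "CLOSURE" = TClosure
    toTok ('"' : rest)
      | not (null rest), last rest == '"' = TString (init rest)
    toTok w = TName w

-- One case per alternative of the importName production in the diff.
-- Anything else (for example 'extern CLOSURE foo') is rejected.
classifyImport :: [Tok] -> Maybe (String, LabelSource, LabelKind)
classifyImport toks = case toks of
  [TName n]                 -> Just (n, InThisPackage, IsFunction)
  [TData, TName n]          -> Just (n, InThisPackage, IsData)
  [TClosure, TName n]       -> Just (n, InThisPackage, IsData)  -- alias for DATA
  [TExtern, TName n]        -> Just (n, InExternalPackage, IsFunction)
  [TExtern, TData, TName n] -> Just (n, InExternalPackage, IsData)
  [TString u, TName n]      -> Just (n, InPackage u, IsFunction)
  [TString u, TData, TName n] -> Just (n, InPackage u, IsData)
  _                         -> Nothing
```

For instance, in this model `classifyImport (tokenize "extern DATA foo")` evaluates to `Just ("foo",InExternalPackage,IsData)`, while `tokenize "CLOSURE foo"` and `tokenize "DATA foo"` classify identically, matching the alias kept for compatibility.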