[Git][ghc/ghc][wip/ubsan] hadrian: add support for building with UndefinedBehaviorSanitizer
Cheng Shao pushed to branch wip/ubsan at Glasgow Haskell Compiler / GHC

Commits:
2cf983a0 by Cheng Shao at 2025-11-08T13:28:25+01:00
hadrian: add support for building with UndefinedBehaviorSanitizer

This patch adds a +ubsan flavour transformer to hadrian to build all
stage1+ C/C++ code with UndefinedBehaviorSanitizer. This is
particularly useful to catch potential undefined behavior in the RTS
codebase.

- - - - -

5 changed files:

- hadrian/doc/flavours.md
- hadrian/src/Flavour.hs
- rts/rts.cabal
- testsuite/driver/testglobals.py
- testsuite/driver/testlib.py

Changes:

=====================================
hadrian/doc/flavours.md
=====================================
@@ -238,6 +238,10 @@ The supported transformers are listed below:
         <td><code>thread_sanitizer</code></td>
         <td>Build the runtime system with ThreadSanitizer support</td>
     </tr>
+    <tr>
+        <td><code>ubsan</code></td>
+        <td>Build all stage1+ C/C++ code with UndefinedBehaviorSanitizer support</td>
+    </tr>
     <tr>
         <td><code>llvm</code></td>
         <td>Use GHC's LLVM backend (`-fllvm`) for all stage1+ compilation.</td>

=====================================
hadrian/src/Flavour.hs
=====================================
@@ -7,6 +7,7 @@ module Flavour
   , addArgs
   , splitSections
   , enableThreadSanitizer
+  , enableUBSan
   , enableLateCCS
   , enableHashUnitIds
   , enableDebugInfo, enableTickyGhc
@@ -33,6 +34,9 @@ import Data.Either
 import Data.Map (Map)
 import qualified Data.Map as M
 import qualified Data.Set as Set
+import GHC.Platform.ArchOS
+import Oracles.Flag
+import Oracles.Setting
 import Packages
 import Flavour.Type
 import Settings.Parser
@@ -53,6 +57,7 @@ flavourTransformers = M.fromList
     , "no_split_sections" =: noSplitSections
     , "thread_sanitizer" =: enableThreadSanitizer False
     , "thread_sanitizer_cmm" =: enableThreadSanitizer True
+    , "ubsan" =: enableUBSan
     , "llvm" =: viaLlvmBackend
     , "profiled_ghc" =: enableProfiledGhc
     , "no_dynamic_ghc" =: disableDynamicGhcPrograms
@@ -258,6 +263,66 @@ enableThreadSanitizer instrumentCmm = addArgs $ notStage0 ? mconcat
         ]
     ]
 
+-- | Whether or not -shared-libsan should be passed to clang at
+-- link-time.
+--
+-- See
+-- https://github.com/llvm/llvm-project/blob/llvmorg-21.1.5/clang/lib/Driver/Sa...,
+-- clang defaults to -shared-libsan on darwin/windows and
+-- -static-libsan on linux. In general, -static-libsan is incredibly
+-- problematic when multiple copies of the sanitizer runtimes coexist
+-- in the same address space due to being linked into multiple Haskell
+-- libraries. So we should explicitly specify `-shared-libsan` if
+-- needed.
+--
+-- A small downside of -shared-libsan is the clang-specific sanitizer
+-- runtime shared library path needs to be manually specified via
+-- @export LD_LIBRARY_PATH=$(dirname $(clang -print-libgcc-file-name
+-- -rtlib=compiler-rt))@ for ld.so to find it at runtime.
+needSharedLibSAN :: Action Bool
+needSharedLibSAN = do
+  is_clang <- flag CcLlvmBackend
+  is_default_shared_libsan <- anyTargetOs [OSDarwin, OSMinGW32]
+  pure $ is_clang && not is_default_shared_libsan
+
+-- | Build all stage1+ C/C++ code with UndefinedBehaviorSanitizer
+-- support:
+-- https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html.
+--
+-- Note that we also pass -fno-sanitize=function to clang, since
+-- "runtime call to function foo through pointer to incorrect function
+-- type" is unfortunately pretty common (e.g. evac_fn in rts) and
+-- impacts the signal to noise ratio of UBSAN warnings. gcc doesn't
+-- implement this instrumentation though.
+enableUBSan :: Flavour -> Flavour
+enableUBSan =
+  addArgs $
+    notStage0
+      ? mconcat
+        [ package rts
+            ? builder (Cabal Flags)
+            ? arg "+ubsan"
+            <> (needSharedLibSAN ? arg "+shared-libsan"),
+          builder (Ghc CompileHs)
+            ? arg "-optc-fsanitize=undefined"
+            <> (flag CcLlvmBackend ? arg "-optc-fno-sanitize=function"),
+          builder (Ghc CompileCWithGhc)
+            ? arg "-optc-fsanitize=undefined"
+            <> (flag CcLlvmBackend ? arg "-optc-fno-sanitize=function"),
+          builder (Ghc CompileCppWithGhc)
+            ? arg "-optcxx-fsanitize=undefined"
+            <> (flag CcLlvmBackend ? arg "-optcxx-fno-sanitize=function"),
+          builder (Ghc LinkHs)
+            ? arg "-optc-fsanitize=undefined"
+            <> arg "-optl-fsanitize=undefined"
+            <> (needSharedLibSAN ? arg "-optl-shared-libsan")
+            <> (flag CcLlvmBackend ? arg "-optc-fno-sanitize=function"),
+          builder (Cc CompileC)
+            ? arg "-fsanitize=undefined"
+            <> (flag CcLlvmBackend ? arg "-fno-sanitize=function"),
+          builder Testsuite ? arg "--config=have_ubsan=True"
+        ]
+
 -- | Use the LLVM backend in stages 1 and later.
 viaLlvmBackend :: Flavour -> Flavour
 viaLlvmBackend = addArgs $ notStage0 ? builder Ghc ? arg "-fllvm"

=====================================
rts/rts.cabal
=====================================
@@ -91,6 +91,19 @@ flag thread-sanitizer
     in @rts/include/rts/TSANUtils.h@.
   default: False
   manual: True
+flag ubsan
+  description:
+    Link with -fsanitize=undefined, to be enabled when building with
+    UndefinedBehaviorSanitizer.
+  default: False
+  manual: True
+flag shared-libsan
+  description:
+    Link with -shared-libsan, to guarantee only one copy of the
+    sanitizer runtimes exists in the address space. See
+    needSharedLibSAN in hadrian/src/Flavour.hs.
+  default: False
+  manual: True
 
 library
     -- rts is a wired in package and
@@ -200,6 +213,12 @@ library
       cc-options: -fsanitize=thread
       ld-options: -fsanitize=thread
 
+    if flag(ubsan)
+      ld-options: -fsanitize=undefined
+
+    if flag(shared-libsan)
+      ld-options: -shared-libsan
+
     if os(linux)
       -- the RTS depends upon libc. while this dependency is generally
       -- implicitly added by `cc`, we must explicitly add it here to ensure

=====================================
testsuite/driver/testglobals.py
=====================================
@@ -186,6 +186,9 @@ class TestConfig:
         # Are we running in a ThreadSanitizer-instrumented build?
         self.have_thread_sanitizer = False
 
+        # Are we running with UndefinedBehaviorSanitizer enabled?
+        self.have_ubsan = False
+
         # Do symbols use leading underscores?
         self.leading_underscore = False
 

=====================================
testsuite/driver/testlib.py
=====================================
@@ -1090,6 +1090,8 @@ def llvm_build ( ) -> bool:
 def have_thread_sanitizer( ) -> bool:
     return config.have_thread_sanitizer
 
+def have_ubsan( ) -> bool:
+    return config.have_ubsan
 def gcc_as_cmmp() -> bool:
     return config.cmm_cpp_is_gcc
 

View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/2cf983a083bb8b251eff4f8a0aacf40a...
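
The two testsuite hunks above only add the `have_ubsan` field and its predicate; how individual tests consume it is not part of this patch. As a rough, self-contained sketch of how such a predicate could gate a test, the snippet below uses a stand-in `TestConfig` rather than the real driver. Only `have_ubsan` mirrors the diff. The helpers `apply_config_override` and `expected_outcome` are hypothetical names introduced for illustration, and the way the real driver parses `--config=have_ubsan=True` may differ.

```python
# Illustrative stand-in for the testsuite driver; NOT the real testlib.py.
# Only the `have_ubsan` field and predicate mirror the patch. The helpers
# `apply_config_override` and `expected_outcome` are hypothetical.


class TestConfig:
    def __init__(self) -> None:
        # Default from the patch: assume a non-UBSan build.
        self.have_ubsan = False


config = TestConfig()


def have_ubsan() -> bool:
    return config.have_ubsan


def apply_config_override(cfg: TestConfig, assignment: str) -> None:
    """Apply a 'field=value' string such as 'have_ubsan=True'.

    Assumption: only the literals 'True' and 'False' are accepted here.
    """
    field, value = assignment.split("=", 1)
    if not hasattr(cfg, field):
        raise AttributeError(f"unknown config field: {field}")
    if value not in ("True", "False"):
        raise ValueError(f"expected True or False, got: {value}")
    setattr(cfg, field, value == "True")


def expected_outcome(default: str = "pass") -> str:
    """Hypothetical policy: skip a test when the build is UBSan-instrumented."""
    return "skip" if have_ubsan() else default
```

With the default configuration `expected_outcome()` returns `"pass"`; after `apply_config_override(config, "have_ubsan=True")` it returns `"skip"`. Whether a given test should be skipped, marked as expected to fail, or left alone under UBSan is a per-test decision that this sketch does not settle.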