Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC
Commits:
43fa8be8 by sheaf at 2025-11-11T11:51:18-05:00
localRegistersConflict: account for assignment LHS
This commit fixes a serious oversight in GHC.Cmm.Sink.conflicts,
specifically the code that computes which local registers conflict
between an assignment and a Cmm statement.
If we have:
assignment: <local_reg> = <expr>
node: <local_reg> = <other_expr>
then clearly the two conflict, because we cannot move one statement past
the other, as they assign two different values to the same local
register. (Recall that 'conflicts (local_reg,expr) node' is False if and
only if the assignment 'local_reg = expr' can be safely commuted past
the statement 'node'.)
The fix is to update 'GHC.Cmm.Sink.localRegistersConflict' to take into
account the following two situations:
(1) 'node' defines the LHS local register of the assignment,
(2) 'node' defines a local register used in the RHS of the assignment.
The bug is precisely that we were previously missing condition (1).
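The two conditions can be sketched with a toy model. This is only an illustration, not GHC's code: the types `Reg`, `Expr` and `Assign` and the functions `regsUsed`, `conflictsOld` and `conflictsNew` are hypothetical names, and a "node" is simplified to a single assignment, whereas the real check folds over every local register defined by a CmmNode.

```haskell
-- Toy model only: GHC's real check folds over every register that a
-- CmmNode defines; here a "node" is just a single assignment.

type Reg = String

-- | A tiny expression language: literals, register reads, addition.
data Expr
  = Lit Int
  | Var Reg
  | Add Expr Expr

-- | An assignment @reg = expr@ to a local register.
data Assign = Assign Reg Expr

-- | All registers read by an expression.
regsUsed :: Expr -> [Reg]
regsUsed (Lit _)   = []
regsUsed (Var r)   = [r]
regsUsed (Add a b) = regsUsed a ++ regsUsed b

-- | Old check: only condition (2), i.e. the node defines a register
-- used in the RHS of the assignment.
conflictsOld :: Assign -> Assign -> Bool
conflictsOld (Assign _ rhs) (Assign defd _) =
  defd `elem` regsUsed rhs

-- | Fixed check: condition (1) (the node defines the LHS register of
-- the assignment) in addition to condition (2).
conflictsNew :: Assign -> Assign -> Bool
conflictsNew (Assign r rhs) (Assign defd _) =
  defd == r || defd `elem` regsUsed rhs
```

With the assignment `Assign "r" (Lit 0)` and the node `Assign "r" (Add (Var "x") (Lit 3))`, `conflictsOld` returns False while `conflictsNew` returns True, which mirrors the situation in this ticket.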
Fixes #26550
- - - - -
79dfcfe0 by sheaf at 2025-11-11T11:51:18-05:00
Update assigned register format when spilling
When we come to spilling a register to put new data into it, in
GHC.CmmToAsm.Reg.Linear.allocRegsAndSpill_spill, we need to:
1. Spill the data currently in the register. That is, do a spill
with a format that matches what's currently in the register.
2. Update the register assignment, allocating a virtual register to
this real register, but crucially **updating the format** of this
assignment.
Due to shadowing in the Haskell code for allocRegsAndSpill_spill, we
were mistakenly re-using the old format. This could lead to a situation
where:
a. We were using xmm6 to store a Double#.
b. We want to store a DoubleX2# into xmm6, so we spill the current
content of xmm6 to the stack using a scalar move (correct).
c. We update the register assignment, but we fail to update the format
of the assignment, so we continue to think that xmm6 stores a
Double# and not a DoubleX2#.
d. Later on, we need to spill xmm6 because it is getting clobbered by
another instruction. We then decide to only spill the lower 64 bits
of the register, because we still think that xmm6 only stores a
Double# and not a DoubleX2#.
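The shadowing mistake can be reproduced in a toy model. This is a sketch under stated assumptions, not the allocator's code: `Format`, `formatBytes`, `spillAndReassign` and `spillAndReassignBuggy` are hypothetical names, and the `RealRegUsage` record here is a simplified stand-in that only borrows the name of the GHC type.

```haskell
-- Toy model only: none of these definitions are GHC's.

-- | The format of the data held in a register.
data Format = F64 | V2F64
  deriving (Eq, Show)

-- | How many bytes a spill has to save for a given format.
formatBytes :: Format -> Int
formatBytes F64   = 8
formatBytes V2F64 = 16

-- | What the allocator believes a real register currently holds.
data RealRegUsage = RealRegUsage String Format
  deriving (Eq, Show)

-- | Evict the current occupant of a register and record the new one.
-- Returns the format to use for the spill and the updated usage.
spillAndReassign :: Format -> RealRegUsage -> (Format, RealRegUsage)
spillAndReassign newFmt (RealRegUsage reg oldFmt) =
  ( oldFmt                   -- step 1: spill with the occupant's format
  , RealRegUsage reg newFmt  -- step 2: record the new occupant's format
  )

-- | Same shape, but the pattern variable 'fmt' shadows the argument
-- 'fmt', so the old format is silently reused in step 2.
spillAndReassignBuggy :: Format -> RealRegUsage -> (Format, RealRegUsage)
spillAndReassignBuggy fmt usage =
  case usage of
    RealRegUsage reg fmt -> (fmt, RealRegUsage reg fmt)
```

Starting from `RealRegUsage "xmm6" F64` and storing a `V2F64`, both versions spill with `F64`, but only `spillAndReassign` records `V2F64`. With the buggy version a later spill of xmm6 would save `formatBytes F64`, that is 8 bytes, instead of 16.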
Fixes #26542
- - - - -
7 changed files:
- compiler/GHC/Cmm/Sink.hs
- compiler/GHC/CmmToAsm/Reg/Linear.hs
- + testsuite/tests/simd/should_run/T26542.hs
- + testsuite/tests/simd/should_run/T26542.stdout
- + testsuite/tests/simd/should_run/T26550.hs
- + testsuite/tests/simd/should_run/T26550.stdout
- testsuite/tests/simd/should_run/all.T
Changes:
=====================================
compiler/GHC/Cmm/Sink.hs
=====================================
@@ -26,76 +26,74 @@ import Data.Maybe
import GHC.Exts (inline)
--- -----------------------------------------------------------------------------
--- Sinking and inlining
+--------------------------------------------------------------------------------
+{- Note [Sinking and inlining]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Sinking is an optimisation pass that
+ (a) moves assignments closer to their uses, to reduce register pressure
+ (b) pushes assignments into a single branch of a conditional if possible
+ (c) inlines assignments to registers that are mentioned only once
+ (d) discards dead assignments
--- This is an optimisation pass that
--- (a) moves assignments closer to their uses, to reduce register pressure
--- (b) pushes assignments into a single branch of a conditional if possible
--- (c) inlines assignments to registers that are mentioned only once
--- (d) discards dead assignments
---
--- This tightens up lots of register-heavy code. It is particularly
--- helpful in the Cmm generated by the Stg->Cmm code generator, in
--- which every function starts with a copyIn sequence like:
---
--- x1 = R1
--- x2 = Sp[8]
--- x3 = Sp[16]
--- if (Sp - 32 < SpLim) then L1 else L2
---
--- we really want to push the x1..x3 assignments into the L2 branch.
---
--- Algorithm:
---
--- * Start by doing liveness analysis.
---
--- * Keep a list of assignments A; earlier ones may refer to later ones.
--- Currently we only sink assignments to local registers, because we don't
--- have liveness information about global registers.
---
--- * Walk forwards through the graph, look at each node N:
---
--- * If it is a dead assignment, i.e. assignment to a register that is
--- not used after N, discard it.
---
--- * Try to inline based on current list of assignments
--- * If any assignments in A (1) occur only once in N, and (2) are
--- not live after N, inline the assignment and remove it
--- from A.
---
--- * If an assignment in A is cheap (RHS is local register), then
--- inline the assignment and keep it in A in case it is used afterwards.
---
--- * Otherwise don't inline.
---
--- * If N is assignment to a local register pick up the assignment
--- and add it to A.
---
--- * If N is not an assignment to a local register:
--- * remove any assignments from A that conflict with N, and
--- place them before N in the current block. We call this
--- "dropping" the assignments.
---
--- * An assignment conflicts with N if it:
--- - assigns to a register mentioned in N
--- - mentions a register assigned by N
--- - reads from memory written by N
--- * do this recursively, dropping dependent assignments
---
--- * At an exit node:
--- * drop any assignments that are live on more than one successor
--- and are not trivial
--- * if any successor has more than one predecessor (a join-point),
--- drop everything live in that successor. Since we only propagate
--- assignments that are not dead at the successor, we will therefore
--- eliminate all assignments dead at this point. Thus analysis of a
--- join-point will always begin with an empty list of assignments.
---
---
--- As a result of above algorithm, sinking deletes some dead assignments
--- (transitively, even). This isn't as good as removeDeadAssignments,
--- but it's much cheaper.
+This tightens up lots of register-heavy code. It is particularly
+helpful in the Cmm generated by the Stg->Cmm code generator, in
+which every function starts with a copyIn sequence like:
+
+ x1 = R1
+ x2 = Sp[8]
+ x3 = Sp[16]
+ if (Sp - 32 < SpLim) then L1 else L2
+
+we really want to push the x1..x3 assignments into the L2 branch.
+
+Algorithm:
+
+ * Start by doing liveness analysis.
+
+ * Keep a list of assignments A; earlier ones may refer to later ones.
+ Currently we only sink assignments to local registers, because we don't
+ have liveness information about global registers.
+
+ * Walk forwards through the graph, look at each node N:
+
+ * If it is a dead assignment, i.e. assignment to a register that is
+ not used after N, discard it.
+
+ * Try to inline based on current list of assignments
+ * If any assignments in A (1) occur only once in N, and (2) are
+ not live after N, inline the assignment and remove it
+ from A.
+
+ * If an assignment in A is cheap (RHS is local register), then
+ inline the assignment and keep it in A in case it is used afterwards.
+
+ * Otherwise don't inline.
+
+ * If N is an assignment to a local register, pick up the assignment
+ and add it to A.
+
+ * If N is not an assignment to a local register:
+ * remove any assignments from A that conflict with N, and
+ place them before N in the current block. We call this
+ "dropping" the assignments.
+ (See Note [When does an assignment conflict?] for what it means for
+ A to conflict with N.)
+
+ * do this recursively, dropping dependent assignments
+
+ * At an exit node:
+ * drop any assignments that are live on more than one successor
+ and are not trivial
+ * if any successor has more than one predecessor (a join-point),
+ drop everything live in that successor. Since we only propagate
+ assignments that are not dead at the successor, we will therefore
+ eliminate all assignments dead at this point. Thus analysis of a
+ join-point will always begin with an empty list of assignments.
+
+As a result of above algorithm, sinking deletes some dead assignments
+(transitively, even). This isn't as good as removeDeadAssignments,
+but it's much cheaper.
+-}
-- -----------------------------------------------------------------------------
-- things that we aren't optimising very well yet.
@@ -648,110 +646,171 @@ okToInline _ _ _ = True
-- -----------------------------------------------------------------------------
+{- Note [When does an assignment conflict?]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+An assignment 'A' conflicts with a statement 'N' if any of the following
+conditions are satisfied:
+
+ (C1) 'A' assigns to a register mentioned in 'N'
+ (C2) 'A' mentions a register assigned by 'N'
+ (C3) 'A' reads from memory written by 'N'
+
+In such a situation, it is not safe to commute 'A' past 'N'. For example,
+it is not safe to commute
+
+ A: r = 1
+ N: s = r
+
+because 'r' may be undefined or hold a different value before 'A'.
+
+Remarks:
+
+ (C3) includes all foreign calls, as they may modify the heap/stack.
+
+ (C1) includes the following two situations:
+
+ (C1a) 'N' defines the LHS register in the assignment 'A', for example:
+
+ A: r = <expr>
+ N: r = <other_expr>
+
+ (C1b) 'N' defines a register used in the RHS of 'A', for example:
+
+ A: r = s
+ N: s = <expr>
+
+ (C1c) 'suspendThread' clobbers every global register not backed by a
+ real register, as noted in #19237.
+
+Forgetting (C1a) led to bug #26550, in which we incorrectly commuted
+
+ A: _c1rB::Fx2V128 = <0.0 :: W64, 0.0 :: W64>
+ N: _c1rB::Fx2V128 = %MO_VF_Insert_2_W64(<0.0 :: W64,0.0 :: W64>,%MO_F_Add_W64(F64[R1 + 7], 3.0 :: W64),0 :: W32)
+
+-}
+
-- | @conflicts (r,e) node@ is @False@ if and only if the assignment
-- @r = e@ can be safely commuted past statement @node@.
+--
+-- See Note [When does an assignment conflict?].
conflicts :: Platform -> Assignment -> CmmNode O x -> Bool
-conflicts platform (r, rhs, addr) node
+conflicts platform assig@(r, rhs, addr) node
- -- (1) node defines registers used by rhs of assignment. This catches
- -- assignments and all three kinds of calls. See Note [Sinking and calls]
- | globalRegistersConflict platform rhs node = True
- | localRegistersConflict platform rhs node = True
+ -- (C1) node defines registers that are either the assigned register or
+ -- are used by the rhs of the assignment.
+ -- This catches assignments and all three kinds of calls.
+ -- See Note [Sinking and calls]
+ | globalRegistersConflict platform rhs node = True
+ | localRegistersConflict platform assig node = True
- -- (2) node uses register defined by assignment
+ -- (C2) node uses register defined by assignment
| foldRegsUsed platform (\b r' -> r == r' || b) False node = True
- -- (3) a store to an address conflicts with a read of the same memory
+ -- (C3) Node writes to memory that is read by the assignment.
+
+ -- (a) a store to an address conflicts with a read of the same memory
| CmmStore addr' e _ <- node
, memConflicts addr (loadAddr platform addr' (cmmExprWidth platform e)) = True
- -- (4) an assignment to Hp/Sp conflicts with a heap/stack read respectively
- | HeapMem <- addr, CmmAssign (CmmGlobal (GlobalRegUse Hp _)) _ <- node = True
- | StackMem <- addr, CmmAssign (CmmGlobal (GlobalRegUse Sp _)) _ <- node = True
- | SpMem{} <- addr, CmmAssign (CmmGlobal (GlobalRegUse Sp _)) _ <- node = True
+ -- (b) an assignment to Hp/Sp conflicts with a heap/stack read respectively
+ | CmmAssign (CmmGlobal (GlobalRegUse Hp _)) _ <- node
+ , memConflicts addr HeapMem
+ = True
+ | CmmAssign (CmmGlobal (GlobalRegUse Sp _)) _ <- node
+ , memConflicts addr StackMem
+ = True
- -- (5) foreign calls clobber heap: see Note [Foreign calls clobber heap]
+ -- (c) foreign calls clobber heap: see Note [Foreign calls clobber heap]
| CmmUnsafeForeignCall{} <- node, memConflicts addr AnyMem = True
- -- (6) suspendThread clobbers every global register not backed by a real
- -- register. It also clobbers heap and stack but this is handled by (5)
+ -- (d) native calls clobber any memory
+ | CmmCall{} <- node, memConflicts addr AnyMem = True
+
+ -- (C1c) suspendThread clobbers every global register not backed by a real
+ -- register. (It also clobbers heap and stack, but this is handled by (C3)(c) above.)
| CmmUnsafeForeignCall (PrimTarget MO_SuspendThread) _ _ <- node
, foldRegsUsed platform (\b g -> globalRegMaybe platform g == Nothing || b) False rhs
= True
- -- (7) native calls clobber any memory
- | CmmCall{} <- node, memConflicts addr AnyMem = True
-
- -- (8) otherwise, no conflict
| otherwise = False
{- Note [Inlining foldRegsDefd]
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- foldRegsDefd is, after optimization, *not* a small function so
- it's only marked INLINEABLE, but not INLINE.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+foldRegsDefd is, after optimization, *not* a small function so
+it's only marked INLINEABLE, but not INLINE.
- However in some specific cases we call it *very* often making it
- important to avoid the overhead of allocating the folding function.
-
- So we simply force inlining via the magic inline function.
- For T3294 this improves allocation with -O by ~1%.
+However in some specific cases we call it *very* often making it
+important to avoid the overhead of allocating the folding function.
+So we simply force inlining via the magic inline function.
+For T3294 this improves allocation with -O by ~1%.
-}
--- Returns True if node defines any global registers that are used in the
--- Cmm expression
+-- | Returns @True@ if @node@ defines any global registers that are used in the
+-- Cmm expression.
+--
+-- See (C1) in Note [When does an assignment conflict?].
globalRegistersConflict :: Platform -> CmmExpr -> CmmNode e x -> Bool
globalRegistersConflict platform expr node =
-- See Note [Inlining foldRegsDefd]
inline foldRegsDefd platform (\b r -> b || globalRegUsedIn platform (globalRegUse_reg r) expr)
False node
+ -- NB: no need to worry about (C1a), as the LHS of an assignment is always
+ -- a local register, never a global register.
--- Returns True if node defines any local registers that are used in the
--- Cmm expression
-localRegistersConflict :: Platform -> CmmExpr -> CmmNode e x -> Bool
-localRegistersConflict platform expr node =
- -- See Note [Inlining foldRegsDefd]
- inline foldRegsDefd platform (\b r -> b || regUsedIn platform (CmmLocal r) expr)
- False node
-
--- Note [Sinking and calls]
--- ~~~~~~~~~~~~~~~~~~~~~~~~
--- We have three kinds of calls: normal (CmmCall), safe foreign (CmmForeignCall)
--- and unsafe foreign (CmmUnsafeForeignCall). We perform sinking pass after
--- stack layout (see Note [Sinking after stack layout]) which leads to two
--- invariants related to calls:
---
--- a) during stack layout phase all safe foreign calls are turned into
--- unsafe foreign calls (see Note [Lower safe foreign calls]). This
--- means that we will never encounter CmmForeignCall node when running
--- sinking after stack layout
---
--- b) stack layout saves all variables live across a call on the stack
--- just before making a call (remember we are not sinking assignments to
--- stack):
---
--- L1:
--- x = R1
--- P64[Sp - 16] = L2
--- P64[Sp - 8] = x
--- Sp = Sp - 16
--- call f() returns L2
--- L2:
---
--- We will attempt to sink { x = R1 } but we will detect conflict with
--- { P64[Sp - 8] = x } and hence we will drop { x = R1 } without even
--- checking whether it conflicts with { call f() }. In this way we will
--- never need to check any assignment conflicts with CmmCall. Remember
--- that we still need to check for potential memory conflicts.
---
--- So the result is that we only need to worry about CmmUnsafeForeignCall nodes
--- when checking conflicts (see Note [Unsafe foreign calls clobber caller-save registers]).
--- This assumption holds only when we do sinking after stack layout. If we run
--- it before stack layout we need to check for possible conflicts with all three
--- kinds of calls. Our `conflicts` function does that by using a generic
--- foldRegsDefd and foldRegsUsed functions defined in DefinerOfRegs and
--- UserOfRegs typeclasses.
+-- | Given an assignment @local_reg := expr@, return @True@ if @node@ defines any
+-- local registers mentioned in the assignment.
--
+-- See (C1) in Note [When does an assignment conflict?].
+localRegistersConflict :: Platform -> Assignment -> CmmNode e x -> Bool
+localRegistersConflict platform (r, expr, _) node =
+ -- See Note [Inlining foldRegsDefd]
+ inline foldRegsDefd platform
+ (\b r' ->
+ b
+ || r' == r -- (C1a)
+ || regUsedIn platform (CmmLocal r') expr -- (C1b)
+ )
+ False node
+
+{- Note [Sinking and calls]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We have three kinds of calls: normal (CmmCall), safe foreign (CmmForeignCall)
+and unsafe foreign (CmmUnsafeForeignCall). We perform sinking pass after
+stack layout (see Note [Sinking after stack layout]) which leads to two
+invariants related to calls:
+
+ a) during stack layout phase all safe foreign calls are turned into
+ unsafe foreign calls (see Note [Lower safe foreign calls]). This
+ means that we will never encounter CmmForeignCall node when running
+ sinking after stack layout
+
+ b) stack layout saves all variables live across a call on the stack
+ just before making a call (remember we are not sinking assignments to
+ stack):
+
+ L1:
+ x = R1
+ P64[Sp - 16] = L2
+ P64[Sp - 8] = x
+ Sp = Sp - 16
+ call f() returns L2
+ L2:
+
+ We will attempt to sink { x = R1 } but we will detect conflict with
+ { P64[Sp - 8] = x } and hence we will drop { x = R1 } without even
+ checking whether it conflicts with { call f() }. In this way we will
+ never need to check any assignment conflicts with CmmCall. Remember
+ that we still need to check for potential memory conflicts.
+
+So the result is that we only need to worry about CmmUnsafeForeignCall nodes
+when checking conflicts (see Note [Unsafe foreign calls clobber caller-save registers]).
+This assumption holds only when we do sinking after stack layout. If we run
+it before stack layout we need to check for possible conflicts with all three
+kinds of calls. Our `conflicts` function does that by using a generic
+foldRegsDefd and foldRegsUsed functions defined in DefinerOfRegs and
+UserOfRegs typeclasses.
+-}
-- An abstraction of memory read or written.
data AbsMem
=====================================
compiler/GHC/CmmToAsm/Reg/Linear.hs
=====================================
@@ -504,8 +504,8 @@ genRaInsn block_live new_instrs block_id instr r_dying w_dying = do
platform <- getPlatform
case regUsageOfInstr platform instr of { RU read written ->
do
- let real_written = [ rr | RegWithFormat {regWithFormat_reg = RegReal rr} <- written ]
- let virt_written = [ VirtualRegWithFormat vr fmt | RegWithFormat (RegVirtual vr) fmt <- written ]
+ let real_written = [ rr | RegWithFormat {regWithFormat_reg = RegReal rr} <- written ]
+ let virt_written = [ VirtualRegWithFormat vr fmt | RegWithFormat (RegVirtual vr) fmt <- written ]
-- we don't need to do anything with real registers that are
-- only read by this instr. (the list is typically ~2 elements,
@@ -939,35 +939,39 @@ allocRegsAndSpill_spill reading keep spills alloc r@(VirtualRegWithFormat vr fmt
-- we have a temporary that is in both register and mem,
-- just free up its register for use.
- | (temp, (RealRegUsage my_reg _old_fmt), slot) : _ <- candidates_inBoth
- = do spills' <- loadTemp r spill_loc my_reg spills
+ | (temp, (RealRegUsage cand_reg _old_fmt), slot) : _ <- candidates_inBoth
+ = do spills' <- loadTemp r spill_loc cand_reg spills
let assig1 = addToUFM_Directly assig temp (InMem slot)
- let assig2 = addToUFM assig1 vr $! newLocation spill_loc (RealRegUsage my_reg fmt)
+ let assig2 = addToUFM assig1 vr $! newLocation spill_loc (RealRegUsage cand_reg fmt)
setAssigR $ toRegMap assig2
- allocateRegsAndSpill reading keep spills' (my_reg:alloc) rs
+ allocateRegsAndSpill reading keep spills' (cand_reg:alloc) rs
-- otherwise, we need to spill a temporary that currently
-- resides in a register.
- | (temp_to_push_out, RealRegUsage my_reg fmt) : _
+ | (temp_to_push_out, RealRegUsage cand_reg old_reg_fmt) : _
<- candidates_inReg
= do
- (spill_store, slot) <- spillR (RegWithFormat (RegReal my_reg) fmt) temp_to_push_out
+ -- Spill what's currently in the register, with the format of what's in the register.
+ (spill_store, slot) <- spillR (RegWithFormat (RegReal cand_reg) old_reg_fmt) temp_to_push_out
-- record that this temp was spilled
recordSpill (SpillAlloc temp_to_push_out)
- -- update the register assignment
+ -- Update the register assignment:
+ -- - the old data is now only in memory,
+ -- - the new data is now allocated to this register;
+ -- make sure to use the new format (#26542)
let assig1 = addToUFM_Directly assig temp_to_push_out (InMem slot)
- let assig2 = addToUFM assig1 vr $! newLocation spill_loc (RealRegUsage my_reg fmt)
+ let assig2 = addToUFM assig1 vr $! newLocation spill_loc (RealRegUsage cand_reg fmt)
setAssigR $ toRegMap assig2
-- if need be, load up a spilled temp into the reg we've just freed up.
- spills' <- loadTemp r spill_loc my_reg spills
+ spills' <- loadTemp r spill_loc cand_reg spills
allocateRegsAndSpill reading keep
(spill_store ++ spills')
- (my_reg:alloc) rs
+ (cand_reg:alloc) rs
-- there wasn't anything to spill, so we're screwed.
=====================================
testsuite/tests/simd/should_run/T26542.hs
=====================================
@@ -0,0 +1,52 @@
+{-# LANGUAGE MagicHash #-}
+{-# LANGUAGE UnboxedTuples #-}
+
+module Main where
+
+import GHC.Exts
+
+type D8# = (# DoubleX2#, Double#, DoubleX2#, Double#, DoubleX2# #)
+type D8 = (Double, Double, Double, Double, Double, Double, Double, Double)
+
+unD# :: Double -> Double#
+unD# (D# x) = x
+
+mkD8# :: Double -> D8#
+mkD8# x =
+ (# packDoubleX2# (# unD# x, unD# (x + 1) #)
+ , unD# (x + 2)
+ , packDoubleX2# (# unD# (x + 3), unD# (x + 4) #)
+ , unD# (x + 5)
+ , packDoubleX2# (# unD# (x + 6), unD# (x + 7) #)
+ #)
+{-# NOINLINE mkD8# #-}
+
+unD8# :: D8# -> D8
+unD8# (# v0, x2, v1, x5, v2 #) =
+ case unpackDoubleX2# v0 of
+ (# x0, x1 #) ->
+ case unpackDoubleX2# v1 of
+ (# x3, x4 #) ->
+ case unpackDoubleX2# v2 of
+ (# x6, x7 #) ->
+ (D# x0, D# x1, D# x2, D# x3, D# x4, D# x5, D# x6, D# x7)
+{-# NOINLINE unD8# #-}
+
+type D32# = (# D8#, D8#, D8#, D8# #)
+type D32 = (D8, D8, D8, D8)
+
+mkD32# :: Double -> D32#
+mkD32# x = (# mkD8# x, mkD8# (x + 8), mkD8# (x + 16), mkD8# (x + 24) #)
+{-# NOINLINE mkD32# #-}
+
+unD32# :: D32# -> D32
+unD32# (# x0, x1, x2, x3 #) =
+ (unD8# x0, unD8# x1, unD8# x2, unD8# x3)
+{-# NOINLINE unD32# #-}
+
+main :: IO ()
+main = do
+ let
+ !x = mkD32# 0
+ !ds = unD32# x
+ print ds
=====================================
testsuite/tests/simd/should_run/T26542.stdout
=====================================
@@ -0,0 +1 @@
+((0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0),(8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0),(16.0,17.0,18.0,19.0,20.0,21.0,22.0,23.0),(24.0,25.0,26.0,27.0,28.0,29.0,30.0,31.0))
=====================================
testsuite/tests/simd/should_run/T26550.hs
=====================================
@@ -0,0 +1,29 @@
+{-# LANGUAGE MagicHash #-}
+{-# LANGUAGE UnboxedTuples #-}
+
+module Main where
+
+import GHC.Exts
+
+type D3# = (# Double#, DoubleX2# #)
+
+unD# :: Double -> Double#
+unD# (D# x) = x
+
+mkD3# :: Double -> D3#
+mkD3# x =
+ (# unD# (x + 2)
+ , packDoubleX2# (# unD# (x + 3), unD# (x + 4) #)
+ #)
+{-# NOINLINE mkD3# #-}
+
+main :: IO ()
+main = do
+ let
+ !(# _ten, eleven_twelve #) = mkD3# 8
+ !(# eleven, twelve #) = unpackDoubleX2# eleven_twelve
+
+ putStrLn $ unlines
+ [ "eleven: " ++ show (D# eleven)
+ , "twelve: " ++ show (D# twelve)
+ ]
=====================================
testsuite/tests/simd/should_run/T26550.stdout
=====================================
@@ -0,0 +1,3 @@
+eleven: 11.0
+twelve: 12.0
+
=====================================
testsuite/tests/simd/should_run/all.T
=====================================
@@ -51,6 +51,11 @@ test('int64x2_shuffle_baseline', [], compile_and_run, [''])
test('T25658', [], compile_and_run, ['']) # #25658 is a bug with SSE2 code generation
test('T25659', [], compile_and_run, [''])
+# This test case uses SIMD instructions, even though the bug isn't in any way
+# tied to SIMD registers. It's useful to include it in this file so that
+# we re-use the logic for which architectures to run the test on.
+test('T26550', [], compile_and_run, ['-O1 -fno-worker-wrapper'])
+
# Ensure we set the CPU features we have available.
#
# This is especially important with the LLVM backend, as LLVM can otherwise
@@ -139,6 +144,7 @@ test('T22187', [],compile,[''])
test('T22187_run', [],compile_and_run,[''])
test('T25062_V16', [], compile_and_run, [''])
test('T25561', [], compile_and_run, [''])
+test('T26542', [], compile_and_run, [''])
# Even if the CPU we run on doesn't support *executing* those tests we should try to
# compile them.
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/6ead7d06a0d83db8a3c2931103b7723...