Re: [commit: ghc] simd: Fixup stack spills when generating AVX instructions. (b787b5d)

Do we have a longer term solution for this? I love the SIMD work, so
please don't take this as a complaint. I agree with getting it done
for now with the mangler, and the truth is it's doubtful we'll get around to
adding TNTC support to LLVM anytime soon, so the mangler is here to stay
for now.
BUT. I would like to see some thought about how, in an ideal world with
lots of free time to hack on GHC, we'd kill the mangler. Do we know
the plan for this patch? If so, it may be nice to document it
somewhere and perhaps create a trac ticket.
On 14 February 2013 23:16, Geoffrey Mainland wrote:
Repository : ssh://darcs.haskell.org//srv/darcs/ghc
On branch : simd
http://hackage.haskell.org/trac/ghc/changeset/b787b5d1e687fb28643cdd6a847ccc...
---------------------------------------------------------------
commit b787b5d1e687fb28643cdd6a847ccc26bb014a79
Author: Geoffrey Mainland
Date:   Sat Nov 26 12:45:23 2011 +0000

    Fixup stack spills when generating AVX instructions.

    LLVM uses aligned AVX moves to spill values onto the stack, which
    requires 32-byte aligned stacks. Since the stack is only 16-byte
    aligned, LLVM inserts extra instructions that munge the stack
    pointer. This is very very bad for the GHC calling convention, so we
    tell LLVM to assume the stack is 32-byte aligned. This patch rewrites
    the spill instructions that LLVM generates so they do not require an
    aligned stack.
---------------------------------------------------------------
 compiler/llvmGen/LlvmMangler.hs |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/compiler/llvmGen/LlvmMangler.hs b/compiler/llvmGen/LlvmMangler.hs
index 83a2be7..745dcc6 100644
--- a/compiler/llvmGen/LlvmMangler.hs
+++ b/compiler/llvmGen/LlvmMangler.hs
@@ -20,6 +20,10 @@ import System.IO
 import Data.List ( sortBy )
 import Data.Function ( on )

+#if x86_64_TARGET_ARCH
+#define REWRITE_AVX
+#endif
+
 -- Magic Strings
 secStmt, infoSec, newLine, textStmt, dataStmt, syntaxUnified :: B.ByteString
 secStmt = B.pack "\t.section\t"
@@ -47,6 +51,7 @@ llvmFixupAsm dflags f1 f2 = {-# SCC "llvm_mangler" #-} do
     w <- openBinaryFile f2 WriteMode
     ss <- readSections r w
     hClose r
+    let fixed = (map rewriteAVX . fixTables) ss
     let fixed = fixTables ss
     mapM_ (writeSection w) fixed
     hClose w
@@ -90,6 +95,39 @@ writeSection w (hdr, cts) = do
     B.hPutStrLn w hdr
     B.hPutStrLn w cts

+#if REWRITE_AVX
+rewriteAVX :: Section -> Section
+rewriteAVX = rewriteVmovaps . rewriteVmovdqa
+
+rewriteVmovdqa :: Section -> Section
+rewriteVmovdqa = rewriteInstructions vmovdqa vmovdqu
+  where
+    vmovdqa, vmovdqu :: B.ByteString
+    vmovdqa = B.pack "vmovdqa"
+    vmovdqu = B.pack "vmovdqu"
+
+rewriteVmovap :: Section -> Section
+rewriteVmovap = rewriteInstructions vmovap vmovup
+  where
+    vmovap, vmovup :: B.ByteString
+    vmovap = B.pack "vmovap"
+    vmovup = B.pack "vmovup"
+
+rewriteInstructions :: B.ByteString -> B.ByteString -> Section -> Section
+rewriteInstructions matchBS replaceBS (hdr, cts) =
+    (hdr, loop cts)
+  where
+    loop :: B.ByteString -> B.ByteString
+    loop cts =
+        case B.breakSubstring cts matchBS of
+          (hd,tl) | B.null tl -> hd
+                  | otherwise -> hd `B.append` replaceBS `B.append`
+                                 loop (B.drop (B.length matchBS) tl)
+#else /* !REWRITE_AVX */
+rewriteAVX :: Section -> Section
+rewriteAVX = id
+#endif /* !REWRITE_SSE */
+
 -- | Reorder and convert sections so info tables end up next to the
 -- code. Also does stack fixups.
 fixTables :: [Section] -> [Section]
_______________________________________________
ghc-commits mailing list
ghc-commits@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-commits
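
As a rough illustration of the rewrite the commit message describes, here is a minimal, self-contained sketch of replacing aligned AVX move mnemonics with their unaligned counterparts in a chunk of assembly text. It is not the code from the patch: `replaceAll` and `unalignAVXMoves` are hypothetical names, and the sketch assumes the `bytestring` package, whose `B.breakSubstring` takes the pattern to search for first and the text to search in second.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Replace every occurrence of 'match' with 'replacement'.
-- An empty 'match' leaves the input unchanged (avoids looping forever).
replaceAll :: B.ByteString -> B.ByteString -> B.ByteString -> B.ByteString
replaceAll match replacement
  | B.null match = id
  | otherwise    = go
  where
    go input =
      -- breakSubstring: pattern first, text second.
      case B.breakSubstring match input of
        (before, rest)
          | B.null rest -> before
          | otherwise   ->
              B.concat [ before
                       , replacement
                       , go (B.drop (B.length match) rest)
                       ]

-- | Hypothetical helper: turn aligned AVX moves (vmovaps/vmovapd/vmovdqa)
-- into their unaligned forms (vmovups/vmovupd/vmovdqu).
unalignAVXMoves :: B.ByteString -> B.ByteString
unalignAVXMoves =
    replaceAll (B.pack "vmovap")  (B.pack "vmovup")
  . replaceAll (B.pack "vmovdqa") (B.pack "vmovdqu")
```

Because the replacement is purely textual, a pass like this rewrites every occurrence of these mnemonics in the text it is given, not only the ones that LLVM emitted for stack spills.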

I thought I might provoke you with that patch :) Note that it is only on the simd branch at the moment.
The stack-alignment problem I'm having and the TNTC problem are the same in one respect: either we mangle LLVM's assembly output, or we get patches into LLVM. In an ideal world, we would get patches into LLVM that solve both issues. My issue would be solved if we could patch LLVM so that llc could be told to use unaligned moves for SSE/AVX register spills. I also need to revisit the Win32 stack alignment issue.
Geoff
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
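
To make the stack-alignment problem concrete, here is a small sketch of the arithmetic involved. It assumes (from the x86 instruction set, not from this thread) that a 256-bit aligned move such as `vmovaps` on a ymm register needs a 32-byte aligned memory operand. The names `isAligned` and `slots16` and the base address used in the note below are made up for illustration.

```haskell
import Data.Bits ((.&.))

-- | True when 'addr' is a multiple of 'alignment'.
-- 'alignment' is assumed to be a power of two.
isAligned :: Int -> Int -> Bool
isAligned alignment addr = addr .&. (alignment - 1) == 0

-- | Hypothetical stack slots: 'n' addresses spaced 16 bytes apart,
-- starting at 'base'.
slots16 :: Int -> Int -> [Int]
slots16 base n = [base + 16 * i | i <- [0 .. n - 1]]
```

With a base of 0x1000, every slot from `slots16` passes `isAligned 16`, but only every other slot passes `isAligned 32`, so an aligned 32-byte spill to an arbitrary 16-byte aligned slot is not safe, while an unaligned move is.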

OK. Could you add some documentation to this page on the mangler I
just created please?
http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Backends/LLVM/M...
Just a sentence is enough, but I'd like to keep track of what we are
using the mangler for in a very readable form (i.e., more than the
source code).
Cheers,
David
participants (2)
- David Terei
- Geoffrey Mainland