code generation and backends Re: Design discussion for atomic primops to land in 7.8

27 Aug 2013

      Austin also raises an important point:

It's high time we explore changes to the ABI/per-architecture calling
convention and how they might impact performance.

 I think this is something worth exploring after the 7.8 release
(but not in the next month of prerelease engineering, though I need to suss
out some of that a bit more with the parties who care)

I had started working on a backwards-compatible ABI change for LLVM a few
months back that I could easily get the LLVM folks to merge in, but I'd rather
only give the LLVM folks one set of ABI change patches within a 6-month
period, where we've had the time to experiment with breaking changes as an
option too, rather than give them a partial patch now and then a
breaking-changes calling convention patch a few months from now.

(because if we don't bundle LLVM with GHC, we need to actually tie ABI
versions to LLVM major version releases, which isn't a problem, but just
requires thoughtful cooperation and such)

cheers
-Carter

On Tue, Aug 27, 2013 at 12:51 PM, Austin Seipp wrote:
...
To do this, IMO we'd also really have to start shipping our own copy
of LLVM. The current situation (use what we have configured or in
$PATH) won't really be feasible later on.
On platforms like ARM where there is no NCG, the mismatches can become
super painful, and it makes depending on certain features of the IR or
compiler toolchain (like an advanced, ISA-aware vectorizer in LLVM
3.3+) way more difficult, aside from being a management nightmare.
Fixing it does require taking a hit on things like build times,
though. Or we could use binary releases, but we occasionally may want
to tweak and/or fix things. If we ship our own LLVM for example, it's
reasonable to assume sometime in the future we'll want to change the
ABI during a release.
This does bring other benefits. Max Bolingbroke had an old alias
analysis plugin for LLVM that made a noticeable improvement on certain
kinds of programs, but shipping it against an arbitrary LLVM is
infeasible. Stuff like this could now be possible too.
In a way, I think there's some merit to having a simple, integrated
code generator that does the correct thing, with a high performance
option as we have now. LLVM is a huge project, and there's definitely
some part of me that thinks this may not lower our complexity budget
as much as we think, only shift parts of it around ('second rate'
platforms like PPC/ARM expose way more bugs in my experience, and
tracking them across such a massive surface area can be quite
difficult.) It's very stable and well tested, but an unequivocal
dependency on hundreds of thousands of lines of deeply complex code is
a big question no matter what.
But the current NCG isn't that 'simple correct thing' either.
I think it's easily one of the least understood parts of the compiler,
with a long history; it's rarely refactored or modified (very unlike
other parts), and it's maintained only as necessary. That doesn't
bode well for its future in any case.
...
On 26/08/13 08:17, Ben Lippmeier wrote:
...
...
> Well, what's the long term plan?  Is the LLVM backend going to
    become the only backend at some point?
I wouldn't argue against ditching the NCG entirely. It's hard to
    justify fixing NCG performance problems when fixing them won't
    make the NCG faster than LLVM, and everyone uses LLVM anyway.
We're going to need more and more SIMD support when processors
    supporting the Larrabee New Instructions (LRBni) appear on
    people's desks. At that time there still won't be a good enough
    reason to implement those instructions in the NCG.
I hope to implement SIMD support for the native code gen soon. It's
not a huge task and having feature parity between LLVM and NCG would
be good.
Will you also update the SIMD support, register allocators, and calling
conventions in 2015 when AVX-512 lands on the desktop? On all supported
platforms? What about support for the x86 vcompress and vexpand
instructions with mask registers? What about when someone finally asks
for packed conversions between 16xWord8s and 16xFloat32s where you need
to split the result into four separate registers? LLVM does that
automatically.
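
To make the packed-conversion point concrete, here is a minimal C sketch. It is not from the thread: the names `widen_u8_to_f32` and `regs_needed` and the 128-bit register width are assumptions chosen for illustration. Sixteen Word8 lanes fit in one 128-bit register, but the same sixteen lanes as Float32 occupy 512 bits, so the result has to be spread over four such registers.

```c
#include <stdint.h>

/* Assumed for illustration: 128-bit vector registers. */
enum { LANES = 16, REG_BITS = 128 };

/* Scalar reference for the packed conversion: each 8-bit lane becomes
   one 32-bit float lane. A vectorising backend has to do the same job
   in registers. */
static void widen_u8_to_f32(const uint8_t in[LANES], float out[LANES])
{
    for (int i = 0; i < LANES; i++) {
        out[i] = (float)in[i];
    }
}

/* Number of REG_BITS-wide registers needed to hold `lanes` elements
   of `elem_bits` bits each, rounded up. */
static int regs_needed(int lanes, int elem_bits)
{
    return (lanes * elem_bits + REG_BITS - 1) / REG_BITS;
}
```

Under these assumptions `regs_needed(LANES, 8)` is 1 and `regs_needed(LANES, 32)` is 4, which is the one-register-in, four-registers-out split mentioned above.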
I've been down this path before. In 2007 I implemented a separate graph
colouring register allocator in the NCG to supposedly improve GHC's
numeric performance, but the LLVM backend subsumed that work and now
having two separate register allocators is more of a maintenance burden
than a help to anyone. At the time, LLVM was just becoming well known,
so it wasn't obvious that implementing a new register allocator was
largely a redundant piece of work -- but I think it's clear now. I was
happy to work on the project at the time, and I learned a lot from it,
but when starting new projects now I also try to imagine the system that
will replace the one I'm dreaming of.
Of course, you should do what interests you -- I'm just pointing out a
strategic consideration.
On Mon, Aug 26, 2013 at 3:19 PM, Simon Marlow wrote:
The existence of LLVM is definitely an argument not to put any more effort
into backend optimisation in GHC, at least for those optimisations that LLVM
can already do.
But as for whether the NCG is needed at all - there are a few ways that the
LLVM backend needs to be improved before it can be considered to be a
complete replacement for the NCG:
1. Compilation speed.  LLVM approximately doubles compilation time. Avoiding
going via the textual intermediate syntax would probably help here.
2. Shared library support (#4210, #5786).  It works (or worked?) on a couple
of platforms.  But even on those platforms it generated worse code than the
NCG due to using dynamic references for *all* symbols, whereas the NCG knows
which symbols live in a separate package and need to use dynamic references.
3. Some low-level optimisation problems (#4308, #5567).  The LLVM backend
generates bad code for certain critical bits of the runtime, perhaps due to
lack of good aliasing information.  This hasn't been revisited in the light
of the new codegen, so perhaps it's better now.
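
As a rough illustration of why missing aliasing information can hurt generated code, here is a small C sketch. It is not taken from GHC's runtime or from the tickets above; the function names are hypothetical, and `hp`/`sp` only loosely echo heap and stack pointers. When the compiler cannot prove that two pointers refer to different memory, it has to reload values after every store.

```c
/* Without aliasing information the compiler must assume that the store
   through `hp` may change `*sp`, so `*sp` is loaded again for the
   second line. */
static void fill_may_alias(long *hp, const long *sp)
{
    hp[0] = *sp + 1;
    hp[1] = *sp + 2;   /* *sp reloaded: hp[0] might be the same cell */
}

/* `restrict` promises that the two pointers never overlap, so `*sp`
   can be kept in a register across both stores. */
static void fill_no_alias(long *restrict hp, const long *restrict sp)
{
    hp[0] = *sp + 1;
    hp[1] = *sp + 2;
}
```

Both functions give the same result when the pointers are distinct. Only the first may legally be called with overlapping pointers, which is exactly why the compiler has to be conservative about it.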
Someone should benchmark the LLVM backend against the NCG with new codegen
in GHC 7.8.  It's possible that the new codegen is getting a slight boost
because it doesn't have to split up proc points, so it can do better code
generation for let-no-escapes. (It's also possible that LLVM is being
penalised a bit for the same reason - I spent more time peering at
NCG-generated code than LLVM-generated code).
These are some good places to start if you want to see GHC drop the NCG.
Cheers,
        Simon
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
--
Regards,
Austin - PGP: 4096R/0x91384671

Carter Schonwald
