
I don't know my way around the GHC source tree. How can I get the list of optimizations implemented with Hoopl? Is there overlap with LLVM's optimization passes? If so, has anyone compared the implementations at all? Should one group be stealing ideas from the other? Or apples and oranges? Thanks, Greg

Hello Greg,

Hoopl passes live in compiler/cmm; searching for DataflowLattice will turn up the lattice definitions, which are the core of the analyses and rewrites. Unfortunately, the number of true Hoopl optimizations was somewhat reduced when Simon Marlow did aggressive performance optimizations to get the new code generator shipped with GHC by default, but I think we hope to add some more interesting passes for -O3, etc.

Hoopl's and LLVM's approaches to optimization are quite different. LLVM uses an SSA representation, whereas Hoopl uses the Lerner-Grove-Chambers algorithm to do analyses and rewrites without requiring single assignment. The other barrier you're likely to run into is that GHC-generated C-- code looks very different from conventional compiler output.

Hope that helps,
Edward

Excerpts from Greg Fitzgerald's message of Mon Dec 10 14:24:02 -0800 2012:
> How can I get the list of optimizations implemented with Hoopl? Is there overlap with LLVM's optimization passes?
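To make Edward's DataflowLattice pointer concrete, here is a self-contained sketch that mirrors the *shape* of Hoopl's lattice type for a constant-propagation fact. The field names are illustrative, not GHC's actual definitions; the real type lives in the hoopl package.

```haskell
import qualified Data.Map as M

data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)

-- A lattice is a bottom element plus a join that reports whether the
-- fact grew, so the fixpoint engine knows when to stop iterating.
data DataflowLattice f = DataflowLattice
  { factName :: String
  , factBot  :: f
  , factJoin :: f -> f -> (ChangeFlag, f)
  }

-- Facts for constant propagation: a variable maps to a known constant,
-- or to Nothing, meaning Top (conflicting definitions).
type ConstFact = M.Map String (Maybe Int)

constLattice :: DataflowLattice ConstFact
constLattice = DataflowLattice
  { factName = "constant propagation"
  , factBot  = M.empty
  , factJoin = \old new ->
      let joinVal a b = if a == b then a else Nothing
          joined      = M.unionWith joinVal old new
      in (if joined == old then NoChange else SomeChange, joined)
  }

main :: IO ()
main = do
  let old            = M.fromList [("x", Just 1), ("z", Just 2)]
      new            = M.fromList [("x", Just 1), ("z", Just 3)]
      (flag, joined) = factJoin constLattice old new
  print flag                  -- SomeChange: the two facts disagreed on z
  print (M.lookup "z" joined) -- Just Nothing: z has gone to Top
```

The real Hoopl lattice additionally carries labels and works with the rewrite engine, but the join-with-change-flag idea is the same.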

On Mon, Dec 10, 2012 at 2:24 PM, Greg Fitzgerald wrote:
> Should one group be stealing ideas from the other? Or apples and oranges?
In my opinion we should only implement optimizations in Hoopl that LLVM cannot do, due to lack of high-level information that we might have gotten rid of before we reach the LLVM code generator*. I don't think we should spend time re-implementing LLVM passes in Hoopl. We have limited time on our hands (much less than the LLVM team) and we're unlikely to do a better job than them here, as we're talking about implementing standard imperative code optimizations. I think our time is better spent on 1) making sure we pass enough information to LLVM for it to do a good job and 2) working on higher-level optimizations (e.g. Core-to-Core).

* This also means that I think we should try to preserve any information LLVM might need all the way down to the code generator.

-- Johan

| In my opinion we should only implement optimizations in Hoopl that
| LLVM cannot do, due to lack of high-level information that we might have
| gotten rid of before we reach the LLVM code generator*. I don't think
Indeed. And I think there is probably quite a lot that is in reach for C--, but out of reach for LLVM. Why? Because before we pass the code to LLVM we do CPS-conversion. So code that started like this:
f() {
    x = blah
    z = blah2
    p, q = g(x)
    res = z + p - q
    return res
}
Turns into something more like this:
f() {
    x = blah
    z = blah2
    ...push z on stack...
    ...push fr1 onto stack...
    jump g(x)
}

fr1(p, q) {
    z = ...pop from stack...
    res = z + p - q
    return res
}
Notice that the stack is now *explicit* rather than implicit, and LLVM has no hope of moving the assignment to z past the call to g (which is trivial in the original). I can explain WHY we do this (there is stuff on the wiki) but the fact is that we do, and it's no accident.
Some transformations and optimisations are only possible before the CPS conversion, which means LLVM can't do them. There is real meat here.
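Simon's transformation can be sketched mechanically. The following is a toy model only, nothing like GHC's actual CPS pass (which handles liveness properly, pushes the return address fr1, and much more): it splits a block at its first call, saves the still-live variables to explicit Sp slots, and moves the remainder into a continuation block. All names and the miniature IR are made up for illustration.

```haskell
import Data.List (isInfixOf)

data Stmt
  = Assign String String          -- var = expr (expressions kept as text)
  | Jump String String            -- jump fn(arg), a tail call
  | Call [String] String String   -- p,q = g(x): results, function, argument
  | Return String
  deriving (Eq, Show)

data Block = Block String [String] [Stmt]  -- name, parameters, body
  deriving (Eq, Show)

-- Crude liveness: does this statement's text mention the variable?
mentions :: String -> Stmt -> Bool
mentions v (Assign _ e) = v `isInfixOf` e
mentions v (Jump _ a)   = v `isInfixOf` a
mentions v (Call _ _ a) = v `isInfixOf` a
mentions v (Return e)   = v `isInfixOf` e

-- Split a block at its first non-tail call: statements before the call
-- stay, live variables are saved to explicit stack slots, and the code
-- after the call becomes a continuation block taking the call's results.
cpsSplit :: Block -> [Block]
cpsSplit (Block name params body) = go [] body
  where
    go pre (Call results fn arg : rest) =
      let live   = [ v | Assign v _ <- pre, any (mentions v) rest ]
          slot i = "Sp[" ++ show (8 * i :: Int) ++ "]"
          pushes = [ Assign (slot i) v | (i, v) <- zip [0 ..] live ]
          pops   = [ Assign v (slot i) | (i, v) <- zip [0 ..] live ]
      in [ Block name params (pre ++ pushes ++ [Jump fn arg])
         , Block (name ++ "_ret") results (pops ++ rest) ]
    go pre (s : rest) = go (pre ++ [s]) rest
    go pre []         = [Block name params pre]

main :: IO ()
main = mapM_ print (cpsSplit f)
  where
    f = Block "f" []
          [ Assign "x" "blah"
          , Assign "z" "blah2"
          , Call ["p", "q"] "g" "x"
          , Assign "res" "z + p - q"
          , Return "res" ]
```

Running it on the example splits f into a block ending in `jump g(x)` with `z` saved to `Sp[0]`, plus a continuation `f_ret(p, q)` that reloads `z`, just like the C-- above.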
Moreover, one of Simon M's last acts was to switch to the "new codegen path", which uses the new C-- representation etc. That leaves us *ready* to do some traditional-style optimisations on C--, but we have not *actually done* so. The field is wide open.
For example, I had a masters student three years ago (Marcej Wos) who implemented the classic tail-call-becomes-loop optimisation, and got a surprisingly large benefit. His code has rotted now, but I must dig out his Masters project report, which described it all rather well.
In short, go for it. But as Johan says, concentrate on things that LLVM can't get.
Simon
| -----Original Message-----
| From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-
| bounces@haskell.org] On Behalf Of Johan Tibell
| Sent: 10 December 2012 22:40
| To: Greg Fitzgerald
| Cc: GHC Users Mailing List
| Subject: Re: Hoopl vs LLVM?
|
| On Mon, Dec 10, 2012 at 2:24 PM, Greg Fitzgerald wrote:

Cool info!
Would love to see that report if you can dig it up :)
-Carter
On Tue, Dec 11, 2012 at 2:16 PM, Simon Peyton-Jones wrote:
<snip>

On Tue, Dec 11, 2012 at 11:16 AM, Simon Peyton-Jones wrote:
Notice that the stack is now *explicit* rather than implicit, and LLVM has no hope of moving the assignment to z past the call to g (which is trivial in the original). I can explain WHY we do this (there is stuff on the wiki) but the fact is that we do, and it's no accident.
I'd definitely be interested in understanding why, as it, like you say, makes it harder for LLVM to understand what our code does and optimize it well. -- Johan

On 11/12/12 21:33, Johan Tibell wrote:
On Tue, Dec 11, 2012 at 11:16 AM, Simon Peyton-Jones wrote:
> Notice that the stack is now *explicit* rather than implicit, and LLVM has no hope of moving the assignment to z past the call to g (which is trivial in the original). I can explain WHY we do this (there is stuff on the wiki) but the fact is that we do, and it's no accident.
I'd definitely be interested in understanding why, as it, like you say, makes it harder for LLVM to understand what our code does and optimize it well.
The example that Simon gave is a good illustration:

f() {
    x = blah
    z = blah2
    p, q = g(x)
    res = z + p - q
    return res
}

In this function, for example, a Hoopl pass would be able to derive something about the value of z from its assignment (blah2), and use that information in the assignment to res, e.g. for constant propagation, or more powerful partial-value optimisations. However, the code that LLVM sees will look like this:

f() {
    x = blah
    z = blah2
    Sp[8] = z
    jump g(x)
}

fr1(p, q) {
    z = Sp[8]
    res = z + p - q
    return res
}

Now all that LLVM knows is that z was read from Sp[8]; it has no more information about its value.

Cheers, Simon
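The information loss can be demonstrated with a toy straight-line constant-propagation pass over (variable, expression) pairs. This is purely illustrative and has nothing to do with GHC's real code: on the pre-CPS block it learns z's value from its defining assignment, while in the post-CPS continuation z comes from an opaque stack load and the fact is gone.

```haskell
import qualified Data.Map as M
import Data.Char (isDigit)

-- Walk the statements in order, accumulating known constants.
-- A load from an explicit stack slot ("Sp[..]") yields no fact at all.
propagate :: [(String, String)] -> M.Map String Int
propagate = foldl step M.empty
  where
    step facts (v, e)
      | not (null e), all isDigit e = M.insert v (read e) facts  -- literal
      | take 3 e == "Sp["           = M.delete v facts   -- opaque stack load
      | Just n <- M.lookup e facts  = M.insert v n facts -- copy of known var
      | otherwise                   = M.delete v facts   -- unknown expression

main :: IO ()
main = do
  -- pre-CPS: the analysis sees z's defining assignment
  print (M.lookup "z" (propagate [("z", "42"), ("res", "z")]))
  -- post-CPS continuation: z is read back from Sp[8], the fact is lost
  print (M.lookup "z" (propagate [("z", "Sp[8]"), ("res", "z")]))
```

The same asymmetry holds for any analysis that relies on seeing a variable's defining expression: once the definition is hidden behind a stack slot, the fact cannot be recovered locally.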

On Wed, Dec 12, 2012 at 4:35 AM, Simon Marlow wrote:
Now, all that LLVM knows is that z was read from Sp[8], it has no more information about its value.
Are you saying Hoopl can deduce the original form from the CPS version? Or that LLVM can't encode the original form? Or is it that, within GHC, LLVM is brought in so late in the game that neither Hoopl nor LLVM can be of much use? -Greg

On 12.12 09:06, Greg Fitzgerald wrote:
On Wed, Dec 12, 2012 at 4:35 AM, Simon Marlow wrote:
> Now, all that LLVM knows is that z was read from Sp[8], it has no more information about its value.
Are you saying Hoopl can deduce the original form from the CPS-version? Or that LLVM can't encode the original form? Or within GHC, LLVM is thrown in late in the game, where neither Hoopl nor LLVM can be of much use.
-Greg
I think what happens is that in GHC the Hoopl optimizations are performed before the CPS transformation, whereas LLVM runs after it. So Hoopl has access to the easier version of the above code and LLVM to the more difficult one.

Cheers, Michal

On 12/12/12 17:06, Greg Fitzgerald wrote:
On Wed, Dec 12, 2012 at 4:35 AM, Simon Marlow wrote:
> Now, all that LLVM knows is that z was read from Sp[8], it has no more information about its value.
Are you saying Hoopl can deduce the original form from the CPS-version? Or that LLVM can't encode the original form? Or within GHC, LLVM is thrown in late in the game, where neither Hoopl nor LLVM can be of much use.
We can run Hoopl passes on the pre-CPS code, but LLVM only sees the post-CPS code.

Cheers, Simon

On Wed, Dec 12, 2012 at 4:35 AM, Simon Marlow wrote:
On 11/12/12 21:33, Johan Tibell wrote:
I'd definitely be interested in understanding why, as it, like you say, makes it harder for LLVM to understand what our code does and optimize it well.
The example that Simon gave is a good illustration:
<snip>
My question was more: why do we CPS transform? I guess it's because we manage our own stack? -- Johan

On 12/12/12 17:37, Johan Tibell wrote:
On Wed, Dec 12, 2012 at 4:35 AM, Simon Marlow wrote:
> On 11/12/12 21:33, Johan Tibell wrote:
I'd definitely be interested in understanding why, as it, like you say, makes it harder for LLVM to understand what our code does and optimize it well.
The example that Simon gave is a good illustration:
<snip>
My question was more: why do we CPS transform? I guess it's because we manage our own stack?
Right. In fact, LLVM does its own CPS transform (but doesn't call it that) when the code contains non-tail function calls. We give LLVM code with tail calls only.

The choice about whether to manage our own stack is *very* deep, and has ramifications all over the system. Changing it would mean a completely new backend and replacing a lot of the RTS, that is, if you could find a good scheme for tracking pointers in the stack - I'm not sure LLVM is up to the job without more work. It could probably be done, but it's a huge undertaking and it's not at all clear that you could do any better than GHC currently does. We generate very good code from idiomatic Haskell; where we fall down is in heavy numerical and loopy code, where LLVM does a much better job.

Cheers, Simon

| > My question was more: why do we CPS transform? I guess it's because we
| > manage our own stack?
|
| Right. In fact, LLVM does its own CPS transform (but doesn't call it
| that) when the code contains non-tail function calls. We give LLVM code
| with tail-calls only.

Right on the money. Incidentally, for the heavy numerical work, the inner loops that GHC generates often don't have calls or allocations, and so they DO get through to LLVM, which CAN optimise them. That's why DPH and vector programs benefit particularly from LLVM.

Simon

Thank you all for your replies.
On Tue, Dec 11, 2012 at 11:16 AM, Simon Peyton-Jones wrote:
And I think there is probably quite a lot that is in reach for C--, but out of reach for LLVM. Why? Because before we pass the code to LLVM we do CPS-conversion.
Is there a case for making a C/C++ compiler target C-- instead of LLVM? Or do its optimizations cater more to functional programming or lazy evaluation? Thanks, Greg

Right now there isn't really an overlap with LLVM's passes. In general the feeling of many of the GHC developers is that we don't want to spend time duplicating what LLVM already does. That said, GHC is an open-source project, so someone may be inclined to do so, and there isn't a lot of reason not to accept such patches (Hoopl can be slow though, so such passes would probably be enabled only at -O2 or -O3). We just feel that aligning GHC with LLVM is the more productive approach going forward, especially when, for example, it's the only backend on ARM.

In terms of comparison, the high-level architecture is different, as Edward pointed out. Otherwise LLVM mostly implements fairly well-known optimisation algorithms, so there is not much to steal there.
On 10 December 2012 14:24, Greg Fitzgerald wrote:
I don't know my way around the GHC source tree. How can I get the list of optimizations implemented with Hoopl? Is there overlap with LLVM's optimization passes? If so, has anyone compared the implementations at all? Should one group be stealing ideas from the other? Or apples and oranges?
Thanks, Greg
participants (8)
- Carter Schonwald
- David Terei
- Edward Z. Yang
- Greg Fitzgerald
- Johan Tibell
- Michal Terepeta
- Simon Marlow
- Simon Peyton-Jones