How far are compilers allowed to go with optimizations?

Hi all,

Some time ago my friend and I had a discussion about optimizations performed by GHC. We wrote a bunch of nested list comprehensions that selected objects based on their properties (e.g. finding products with a price higher than 50). Then we rewrote our queries in a more efficient form by using tree-based maps as indices over the data. My friend knows how to turn nested list comprehensions into queries that use maps and claims this could be automated. He argued that it would be fine if the compiler did such an optimization automatically, since we can guarantee that both versions have exactly the same semantics, and if they have the same semantics we are free to use the faster version. From a functional point of view this is a good point, but I nevertheless objected, claiming that if the compiler performed such a high-level optimization - replacing the underlying data structure with a different one and turning one algorithm into a completely different one - the programmer wouldn't be able to reason about the space behaviour of the program. I concluded that such a solution should not be built into a compiler but instead provided as an EDSL.

I would like to hear your opinion on this. How far can a compiler go in transforming code? I don't want to concentrate on whether such an optimization is technically feasible to implement. What I'm interested in is whether this kind of high-level optimization would be acceptable at all.

Janek
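To make the comparison concrete, here is a minimal sketch of the two query styles being discussed. The types and names (Product, price, PriceIndex) are invented for illustration and are not the original code.

import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

data Product = Product { productId :: Int, price :: Double } deriving Show

-- Naive version: a list comprehension that scans the whole list, O(n) per query.
expensiveList :: Double -> [Product] -> [Product]
expensiveList limit ps = [ p | p <- ps, price p > limit ]

-- Indexed version: build a tree-based map keyed on price once, then answer
-- the same query with a range split, O(log n) plus the size of the result.
type PriceIndex = Map Double [Product]

buildIndex :: [Product] -> PriceIndex
buildIndex = Map.fromListWith (++) . map (\p -> (price p, [p]))

expensiveIndexed :: Double -> PriceIndex -> [Product]
expensiveIndexed limit ix = concat (Map.elems above)
  where
    -- Map.split discards the key itself, which matches the strict "> limit" test.
    (_, above) = Map.split limit ix

The question in the thread is whether a compiler should be allowed to rewrite the first form into something like the second on its own.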

You don't reason about the bits churned out by a compiler but about the actual code you write. If you want to preserve such information through the compilation process, you probably want to run the compiler without any optimization flags at all. With the way you are thinking about it at the moment, you'd have to reason about every program based on every compiler version it can go through. What happens instead is that people prove their program correct and other people prove the compiler correct. What it does behind the scenes doesn't matter: it's meant to preserve the exact semantics, and if it does extra stuff on top that doesn't change those (like optimisation), you should be happy, as the program will now run more efficiently in practice and not just on paper. On 06/02/13 11:45, Jan Stolarek wrote:
the programmer wouldn't be able to reason about the space behaviour of the program

This is pretty much a core idea behind Data Parallel Haskell - it
transforms nested data parallel programs into flat ones. That's
crucial to actually making it perform well and is an algorithmic
change to your program. If you can reason about your program, and
perhaps have an effective cost model for reasoning about how it *will*
behave, then I don't see why not.*
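To give a rough picture of what that flattening transformation does, here is a toy sketch in plain Haskell using the vector package. This is not the actual DPH library or its vectoriser, just an illustration of representing nested data as flat data plus a segment descriptor.

import qualified Data.Vector.Unboxed as U

-- Nested representation: sum each inner list. Simple, but the ragged
-- structure gives poor locality and is awkward to chunk across cores.
sumEachNested :: [[Double]] -> [Double]
sumEachNested = map sum

-- Flat representation: all elements in one unboxed vector, plus the
-- length of each original sub-list.
data Segmented = Segmented
  { segLens :: U.Vector Int
  , segData :: U.Vector Double
  }

flatten :: [[Double]] -> Segmented
flatten xss = Segmented (U.fromList (map length xss))
                        (U.fromList (concat xss))

-- The same per-segment sums, now expressed over flat, regular data that
-- a backend could split into even chunks for parallel execution.
sumEachFlat :: Segmented -> U.Vector Double
sumEachFlat (Segmented lens xs) = U.fromList (go 0 (U.toList lens))
  where
    go _   []       = []
    go off (l : ls) = U.sum (U.slice off l xs) : go (off + l) ls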
Now, on a slight tangent, in practice, I guess it depends on your
target market. C programs don't necessarily expose the details to make
such rich optimizations possible. And Haskell programmers generally
rely on optimizations to make an order-of-magnitude performance difference in constant factors, although perhaps not in direct big-O
terms (and no, this isn't necessarily bad) - you will rarely see such
a change from various optimizing C compilers. The kinds and scope of
optimization opportunities are simply different. We're also not immune
to violations of intuition because of Compiler Things that have
nothing to do with data structures; even 'functionally equivalent'
programs can have drastically different space characteristics, just
due to sharing changes from CSE, eta reduction, or how the compiler is
feeling on any particular day of the week (I kid, I kid.) But overall,
we really do have vast amounts of robust, general optimization
opportunities - so I don't see why we shouldn't try to exploit them when
reasonable.
* Note that intuitions about how the compiler performs tasks like
inlining/rewrites/boxing are not necessarily the most theoretically
nice or even sane ways to reason about things - but that is what we do now. In practice you do need to know about performance characteristics
with pretty much any language and a million other specific things, and
I'm not sure Haskell is necessarily worse off here than anything else.
On Wed, Feb 6, 2013 at 5:45 AM, Jan Stolarek wrote:
[--snip--]
-- Regards, Austin

On Wed, Feb 6, 2013 at 1:18 PM, Austin Seipp wrote:
Now, on a slight tangent, in practice, I guess it depends on your target market. C programs don't necessarily expose the details to make such rich optimizations possible. And Haskell programmers generally rely on optimizations to make an order-of-magnitude performance difference in constant factors, although perhaps not in direct big-O terms (and no, this isn't necessarily bad)
I think some GHC optimizations do change at least space usage, even in big-O terms. For example, strictness analysis can make folds go from O(n) to O(1) space, I think. Regards, Erik
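As a small, hedged illustration of that point: the exact behaviour depends on the GHC version and optimisation level, but a plain left fold over a large list is the classic example.

import Data.List (foldl')

-- Without optimisation, foldl builds a chain of (+) thunks, so this runs
-- in O(n) space; with -O, GHC's strictness analysis usually sees that the
-- accumulator is demanded and evaluates it eagerly, giving O(1) space.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- foldl' forces the accumulator explicitly, so this is O(1) space at any
-- optimisation level.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumLazy [1 .. 1000000], sumStrict [1 .. 1000000])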

You're right, somehow it didn't occur to me that DPH is doing exactly the same thing. Well, I think this is a convincing argument. Janek

As a software developer, who typically inherits code to work on rather
than simply writing new, I see a potential of aggressive compiler
optimizations causing trouble. It goes like this:
Programmer P inherits some application/system to improve upon. Someday
he spots some piece of rather badly written code. So he sets out and
rewrites that piece, happy with the improvement it brings to clarity
and likely also to efficiency.
The code goes into production and, disaster. The new "improved"
version runs 3 times slower than the old, making it practically
unusable. The new version has to be rolled back with loss of uptime
and functionality and management is not happy with P.
It just so happened that the old code triggered some aggressive
optimization unbeknownst to everyone, **including the original
developer**, while the new code did not. (This optimization maybe even
was triggered only on a certain version of the compiler, the one that
happened to be in use at the time.)
I fear being P some day.
Maybe this is something that would never happen in practice, but how
to be sure...
/Johan
2013/2/6 Jan Stolarek
[--snip--]

On Sat, Feb 09, 2013 at 09:56:12AM +0100, Johan Holmquist wrote:
[--snip--]
An interesting point, but I think here "P" is still at fault. If we're talking about important software, surely there will be regression tests, both for correctness and for performance? Surely there will be a canary period, parallel running of the old and new system, etc.? regards, iustin

On 02/09/2013 09:56 AM, Johan Holmquist wrote: [--snip--]
It just so happened that the old code triggered some aggressive optimization unbeknownst to everyone, **including the original developer**, while the new code did not. (This optimization maybe even was triggered only on a certain version of the compiler, the one that happened to be in use at the time.)
I fear being P some day.
Maybe this is something that would never happen in practice, but how to be sure...
It's definitely a valid point, but isn't that an argument *for* testing for performance regressions rather than *against* compiler optimizations? Actually, it'd be really nice if you could have statically verifiable big-O performance assertions in the code. I'm guessing a lot of work has been done in this area. Anyone have any pointers to such work? Regards,
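For what a performance regression test could look like in practice, here is a minimal sketch using the criterion package. The query function is purely illustrative, and criterion itself only measures and reports timings; comparing the numbers against a stored baseline (and failing the build on a regression) is left to the surrounding CI tooling.

import Criterion.Main

-- A stand-in for whatever performance-critical function is under test.
query :: Int -> Int
query n = sum [ x * x | x <- [1 .. n], odd x ]

main :: IO ()
main = defaultMain
  [ bgroup "query"
      [ bench "n=10000"  $ nf query 10000
      , bench "n=100000" $ nf query 100000
      ]
  ]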

I guess I fall more to the "reason about code" side of the scale rather than the "testing the code" side. Testing seems to induce false hopes about finding all defects, even to the point where the tester is blamed for not finding a bug rather than the developer for introducing it. [Bardur]
It's definitely a valid point, but isn't that an argument *for* testing for performance regressions rather than *against* compiler optimizations?
We could test for regressions and pass. Then we upgrade to a new version of the compiler and the test would no longer pass. And vice versa. Maybe that's your point too. :) [Iustin]
Surely there will be a canary period, parallel running of the old and new system, etc.?
Is that common? I have not seen it and I do think my workplace is a rather typical one. Also, would we really want to preserve the old "bad" code just because it happened to trigger some optimization? Don't get me wrong, I am all for compiler optimizations and the benefits they bring as well as testing. Just highlighting some potential downsides. Regards /Johan

On 02/09/2013 10:50 AM, Johan Holmquist wrote:
I guess I fall more to the "reason about code" side of the scale rather than the "testing the code" side. Testing seems to induce false hopes about finding all defects, even to the point where the tester is blamed for not finding a bug rather than the developer for introducing it.
Oh, I'm definitely also on that side, but you have to do the best you can with the tools you have :).
[Bardur]
It's definitely a valid point, but isn't that an argument *for* testing for performance regressions rather than *against* compiler optimizations?
We could test for regressions and pass. Then we upgrade to a new version of the compiler and the test would no longer pass. And vice versa. Maybe that's your point too. :)
Indeed :).
[Iustin]
Surely there will be a canary period, parallel running of the old and new system, etc.?
Is that common? I have not seen it and I do think my workplace is a rather typical one.
I don't know about "common", but I've seen it done a few times. However, it's mostly been in situations where major subsystems have been rewritten and you _really_ want to make sure things still work as they should in production. Sometimes you can get away with just making the new-and-shiny code path a configure-time option and keeping the old-and-beaten code path. (Tends to be messy code-wise until you can excise the old code path, but what're you gonna do?)
Also, would we really want to preserve the old "bad" code just because it happened to trigger some optimization?
These things depend a lot on the situation at hand -- if it's something 99% of your users will hit, then yes, probably... until you can figure out why the new-and-shiny code *doesn't* get optimized appropriately.
Don't get me wrong, I am all for compiler optimizations and the benefits they bring as well as testing. Just highlighting some potential downsides.
It's all tradeoffs :). Regards,

Inline.
On Sat, Feb 9, 2013 at 1:50 AM, Johan Holmquist wrote:
I guess I fall more to the "reason about code" side of the scale rather than the "testing the code" side. Testing seems to induce false hopes about finding all defects, even to the point where the tester is blamed for not finding a bug rather than the developer for introducing it.
Performance-critical code should absolutely be analyzed regularly. Trivial changes such as adding a field to a structure can have a huge impact on performance. This is true in C or Haskell. When you find a performance regression, use a profiler to help understand what caused it.
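As a toy illustration of that point (invented types, not code from the thread): in Haskell, whether a field is strict and unpacked or lazy changes the in-memory representation, so adding one field can noticeably change both the size of each value and what it can retain.

-- Both Ints can be stored unboxed directly in the constructor.
data Compact = Compact {-# UNPACK #-} !Int {-# UNPACK #-} !Int

-- The extra lazy field adds a pointer per value, and that pointer may
-- keep an unevaluated thunk (and everything it references) alive.
data Wider = Wider {-# UNPACK #-} !Int {-# UNPACK #-} !Int [String]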
Surely there will be a canary period, parallel running of the old and new system, etc.?
Is that common? I have not seen it and I do think my workplace is a rather typical one.
Yes, at least in some circles. I've seen it commonly done in one of two forms:
a) The new version is rolled out to take a duplicated stream of production traffic; the new service's results, including timing, are checked against the old version and then discarded. After some bake time (days or more) without regressions, serve results from the new version and use the old version as the reference.
b) The new version is rolled out to a subset of the servers first, (sometimes) initially biased towards serving employees or the local intranet. Run for a day or two before rolling out to more servers. Rinse, repeat.
Also, would we really want to preserve the old "bad" code just because it happened to trigger some optimization?
The old code doesn't sound so bad if your goal is performance. The old code should also give a lot of detail, in terms of Core and assembly, that should help someone fix the new code. -n

On Sat, Feb 9, 2013 at 3:56 AM, Johan Holmquist wrote:
The code goes into production and, disaster. The new "improved" version runs 3 times slower than the old, making it practically unusable. The new version has to be rolled back with loss of uptime and functionality and management is not happy with P.
It just so happened that the old code triggered some aggressive optimization unbeknownst to everyone, **including the original developer**, while the new code did not. (This optimization maybe even was triggered only on a certain version of the compiler, the one that happened to be in use at the time.)
This leads ultimately to not allowing compilers to optimize at all. I suspect that's a bad plan. Keep in mind that a modern web application may be heavily enough used that it doesn't even need to be a "hyper-optimization"; even small changes in performance can scale to large performance differences. Also... what happens when it's not just manual optimization but a bug fix that triggers this?
Maybe this is something that would never happen in practice, but how to be sure...
If this really scares you, disable all compiler optimization. Now you can be sure even at large scales where even small changes can have huge effects... and now you'd better be good at hand optimization. And writing code in assembly language so you can get that optimization. This sounds like going backwards to me. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

As a software developer, who typically inherits code to work on rather than simply writing new, I see a potential of aggressive compiler optimizations causing trouble.
I would be grateful if someone could explain the difference between "aggressive optimisation" and "obviously sensible compilation" to me in a way that doesn't boil down to "what I'm used to". These days, C compilers do things like automatic vectorization that were regarded as "aggressive optimisation" back when C was new and Real Programmers used 'cat' as their editor of choice. (Been there, done that, don't fit the T-shirt any more.)
Programmer P inherits some application/system to improve upon. Someday he spots some piece of rather badly written code. So he sets out and rewrites that piece, happy with the improvement it brings to clarity and likely also to efficiency.
The code goes into production and, disaster. The new "improved" version runs 3 times slower than the old, making it practically unusable. The new version has to be rolled back with loss of uptime and functionality and management is not happy with P.
There are three fairly obvious comments here.
(1) The real villain is the original programmer who didn't leave comments explaining _why_ the code was written in a strange way.
(2) It is far more likely that clean code will trigger an optimisation than dirty code. Case in point: I have some 4-way merge code that was written in C with heavy use of M4 to create inlined unfolded stuff that kept registers full and produced serious computational goodness. It gave a really nice performance boost in the old days. Modern compilers take one glance at it, turn up their noses, and switch several kinds of optimisation off, so that it actually runs slower than naive code. (It still _could_ be compiled to go blindingly fast, it's just that compilers say "too many labels, too hard" and stop trying.)
(3) A performance regression test would catch this, no?
It just so happened that the old code triggered some aggressive optimization unbeknownst to everyone, **including the original developer**, while the new code did not.
So what _was_ the original developer's reason for writing strange code?

I was about to leave this topic so as not to swamp the list with something that appears to go nowhere, but now I feel that I must answer the comments, so here goes.

By "aggressive optimisation" I mean an optimisation that drastically improves the run-time performance of (some part of) the program. So I guess automatic vectorisation could fall under this term.

In my scenario the original programmer did not have any reason for the strange code -- it just happened. I'm sure we all write bad/suboptimal code sometimes, hence there would be no comment describing this in the code. And compiling without optimisations would not prevent the above -- it would only help if the original programmer(s) had also compiled without optimisations, which was obviously not the case.

We seem to agree that the only way to know what a change will do to the performance of a program is testing (using whatever tools are available). That is what makes me a bit uncomfortable, because then we don't really understand it. Disabling all optimisations would definitely be a step backwards and not where we want to go.

Then again, maybe this problem is mostly of theoretical concern and not at all common in practice. I guess it **is** far more likely that clean code will trigger optimisations. But we cannot really know for sure...

Regards Johan

On 2/11/13 11:47 AM, Johan Holmquist wrote:
I was about to leave this topic so as not to swamp the list with something that appears to go nowhere, but now I feel that I must answer the comments, so here goes.
By "agressive optimisation" I mean an optimisation that drastically reduces run-time performance of (some part of) the program. So I guess automatic vectorisation could fall into this term.
In that case, I'd say automatic vectorization counts. List fusion also counts, but I'm pretty sure you don't want to get rid of that either. IMHO, the only thing that distinguishes "aggressive" optimizations from the rest is that programmers find them to be finicky. There are two general causes of perceived finickiness:

(1) The optimization really is finicky and is triggered or not in highly unpredictable ways. This is often the case when an optimization is new, because it hasn't been battle-tested enough to make it reliable in the face of the diversity we see in real-world code. The early days of list fusion were certainly like that, as were the early days of optimizations that depend on alias detection. IME, these things tend to get straightened out within a few version cycles. If they don't, then the optimization truly is finicky and therefore is something bad that should be avoided. I haven't found that situation to be very common, however.

(2) The optimization is reliable (enough), but the programmer doesn't understand it well enough and thus inadvertently breaks it when doing "innocuous" code transformations. Again, this is generally the case any time a new optimization shows up. The only way to fix this, really, is to wait for people to learn new habits and to internalize a new model of how the compiler works. Good explanations of the technology often help that process along, but they don't obviate the need for the process.

Both of those situations are triggered by an optimization being new, and both of them tend to be resolved once the optimization is no longer new. Thus, I don't think it makes much sense to disallow compilers from making "aggressive" optimizations, because it is only through doing so that those optimizations can be rendered non-aggressive.

In all of this, I'm reminded of a recent article on code metrics: http://www.neverworkintheory.org/?p=488 The key idea there is to use automatically generated code refactorings in order to see how different versions of "the same" code were rated. Perhaps it'd be worth doing this sort of thing in order to identify which optimizations are stable under particular code transformations.

-- Live well, ~wren
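For a concrete (if simplistic) example of the list fusion case: with optimisation, GHC's foldr/build fusion rules can often combine the enumeration, filter, map and sum below into a single loop that allocates no intermediate lists, while breaking the pipeline up in a way the rules don't see (or compiling without -O) makes each stage allocate its own list.

pipeline :: Int -> Int
pipeline n = sum (map (* 2) (filter even [1 .. n]))

main :: IO ()
main = print (pipeline 1000000)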

On Mon, Feb 11, 2013 at 6:47 PM, Johan Holmquist wrote:
By "agressive optimisation" I mean an optimisation that drastically reduces run-time performance of (some part of) the program. So I guess automatic vectorisation could fall into this term.
Even something like running the program on a different CPU or different OS version can "drastically" improve or harm the performance of a particular program, without any change in the compiler itself. If you care about performance, the only real recourse is to have benchmarks / performance tests that verify the things you care about, and run them regularly in your target environment so that any performance-critical changes are noticed right away. -- mithrandi, i Ainil en-Balandor, a faer Ambar

On Wed, Feb 6, 2013 at 6:45 AM, Jan Stolarek wrote:
I nevertheless objected, claiming that if the compiler performed such a high-level optimization - replacing the underlying data structure with a different one and turning one algorithm into a completely different one - the programmer wouldn't be able to reason about the space behaviour of the program. I concluded that such a solution should not be built into a compiler but instead provided as an EDSL.
For what it's worth, the main dividing line between -O1 and -O2 in gcc is that -O2 may change space or time behavior in unexpected ways. (This may explain e.g. https://plus.google.com/u/0/102208456519922110915/posts/DZsZ6mvA4T6) -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
participants (12)
- Austin Seipp
- Bardur Arantsson
- Brandon Allbery
- Erik Hesselink
- Iustin Pop
- Jan Stolarek
- Johan Holmquist
- Mateusz Kowalczyk
- Nathan Howell
- ok@cs.otago.ac.nz
- Tristan Seligmann
- wren ng thornton