
John Meacham wrote:
On Wed, Jan 18, 2006 at 06:18:29PM +0300, Bulat Ziganshin wrote:
:) Even the C version performs only 20 million additions per second, because this program is limited mostly by memory throughput: it accesses 24 bytes of memory per addition. GHC just can't produce simple loops, even for "imperative" code. JHC can be much better in that area; I strongly recommend that Sven try it.
Jhc doesn't have 'true' arrays yet, partially because I have not decided how points-to analysis should work for them. (I will probably just union all their points-to information since they most likely will be filled by the same routine).
GHC's indirect calls are really killing its performance in tight loops. I think there is room for collaboration between the various compilers there: since we are all moving to a c-- back end (in theory), we could work on a common c-- -> C translator that searches out such unneeded indirections and zaps them before they get to gcc, which doesn't handle them well at all.
A simple tail-recursive loop shouldn't contain any indirect jumps in GHC; we are careful to compile jumps to known locations into absolute jumps (though, of course, we don't do points-to analysis). My impression is that it is the lack of real low-level loop optimisation in GHC that is really hurting with these examples. We don't get to take advantage of GCC's loop optimiser because the C code we generate doesn't look enough like a loop. That's something we can improve on in some cases, perhaps, but when I looked at it I didn't find any quick hacks to improve things.

Cheers,
Simon
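
For readers following the thread, here is a minimal sketch of what "a simple tail-recursive loop" might look like in Haskell. It is not the program under discussion (that code does not appear in this message); the names `sumTo` and `go` are made up for illustration. The point is only that the recursive call to `go` sits in tail position and targets a statically known function, which is the case described above as compiling to a direct jump.

```haskell
-- Hypothetical example: sum the integers from 1 to n using an
-- explicit accumulator instead of building a list.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    -- 'go' calls itself in tail position, so the jump target is known
    -- at compile time. 'seq' forces the accumulator on every step so
    -- that no chain of unevaluated additions builds up; the counter
    -- 'i' is already forced by the guard.
    go :: Int -> Int -> Int
    go acc i
      | i > n     = acc
      | otherwise =
          let acc' = acc + i
          in acc' `seq` go acc' (i + 1)
```

Loaded into GHCi, `sumTo 10` evaluates to 55. Whether a loop of this shape then runs as fast as the equivalent C loop depends on the back end, which is the question the thread is about.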