
I can never resist messages like these, even when I'm meant to be
doing other things.  It's very helpful when people offer fairly
precise performance-bug reports.  Thanks!

| I am wondering whether there is a particular reason why the
| optimiser doesn't pull the
|
|   (1) a = NO_CCS PArray! [wild1 mba#];

This one is a definite bug.  It turns out that the head of the
before-ghci-branch doesn't have this bug, so I'm disinclined to
investigate it further.

|   (2) case w of wild3 {
|         I# e# ->
|
| As for (2), the loop would be nice and straight if that
| unboxing were outside of the loop - as it is, we break the
| pipeline once per iteration it seems

This one is a bit harder.  Basically we want to make a wrapper for a
recursive function if it's sure to evaluate its free variables.
In fact the 'liberate-case' pass (which isn't switched on in 4.08)
is meant to do just this.  It's in simplCore/LiberateCase.lhs,
and it's not very complicated.  I've just tried it and it doesn't
seem to have the desired effect, but I'm sure that's for a boring
reason.  If anyone would like to fix it, go ahead!

(You can't just say '-fliberate-case' on the command line to make
it go; you have to add -fliberate-case at a sensible point to the
minusOflags in driver/Main.hs.)

Incidentally, you'll find that -ddump-simpl gives you a dump that
is pretty close to STG and usually much more readable.  Most
performance bugs show up there.  -dverbose-simpl gives you more
clues about what is happening where.

| Also if somebody is looking at the attached source, I was
| wondering why, when I use the commented out code in
| `newPArray', I get a lot worse code (the STG code is in a
| comment at the end of the file).  In particular, the lambda
| abstraction is not inlined, whereas `fill' gets inlined into
| the code of which the dump is above.  Is it because the
| compiler has a lot harder time with explicit recursion than
| with fold/build?  If so, the right RULES magic should allow
| me to do the same for my own recursively defined
| combinators, shouldn't it?

I couldn't figure out exactly what you meant.  The only commented
out code is STG code.  Maybe send a module with the actual source
you are bothered about.

S
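
To make point (2) concrete, here is a small hand-written sketch of the
shape of the problem and of the shape one would hope to end up with.
It is an illustration only, not output from LiberateCase.lhs.  The
names loopInside and loopOutside are made up for this note, and the
code uses present-day GHC syntax (MagicHash, isTrue#) rather than
anything specific to 4.08.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, isTrue#, (+#), (>=#))

-- Hypothetical "before" shape: 'w' is free in the local loop 'go',
-- so the boxed Int is scrutinised on every iteration.
loopInside :: Int -> Int -> Int
loopInside w (I# n#) = I# (go 0# 0#)
  where
    go :: Int# -> Int# -> Int#
    go acc# i#
      | isTrue# (i# >=# n#) = acc#
      | otherwise =
          case w of                         -- unboxing inside the loop
            I# e# -> go (acc# +# e#) (i# +# 1#)

-- Hypothetical "after" shape: the case on 'w' has been moved out to
-- a wrapper, and the loop only ever sees the unboxed 'e#'.
loopOutside :: Int -> Int -> Int
loopOutside w (I# n#) =
  case w of                                 -- unboxing once, outside
    I# e# ->
      let go :: Int# -> Int# -> Int#
          go acc# i#
            | isTrue# (i# >=# n#) = acc#
            | otherwise           = go (acc# +# e#) (i# +# 1#)
      in I# (go 0# 0#)
```

Both functions add w to an accumulator n times, so in GHCi
loopInside 3 4 and loopOutside 3 4 both give 12.  They are not
interchangeable in general, though: loopOutside always evaluates w,
whereas loopInside never touches it when n <= 0.  That difference is
why the transformation is only wanted when the function is sure to
evaluate the free variable.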