
John Meacham writes:
I always use them in combination, simply because optimization can drastically change the memory usage/profile. Profiling the unoptimized version seems rather moot.
Exactly. If this is indeed the intention, I guess the stack overflow is a bug?
Yes, strictly speaking. It may be a spurious case of the code generated with -O -prof being slightly different from that generated by -O alone - this happens because having cost centres floating around in the intermediate code disables some transformations that would normally be applicable. Nevertheless, if you can provide a test case we'll look into it.
BTW, does -O2 still not do much more than -O? It seems to reduce the memory footprint of some of my apps pretty noticeably.
Not sure. The docs seem to indicate the speed improvement is negligible.
It's normally negligible. -O2 turns on one more optimisation pass, which *might* have a significant impact on your program if it hits the inner loop, but in most cases just costs you extra compilation time for not much benefit.
I thought I saw some hints in the GHC docs, including using -fvia-C, but I couldn't find them, and I'm not sure if they would be still current.
Memory footprint is a problem, I wonder if GHC makes any effort to pack strict data types? I.e.
data D1 = A | B
data D2 = A2 | B2 | C2
data D3 = D !D1 !D2 -- could fit inside e.g. a Word8?
We don't do any useful optimisation of the representation of D3 here, but we could (it's been on my ToDo list since I implemented -funbox-strict-fields some time ago). In a similar vein, the strictness analyser doesn't take advantage of strict enumerated types - it could map them to Int#, for example. Semitagging (an optimisation in the works along with optimistic evaluation) will provide some of the benefit that a more efficient representation would yield here.
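For context, here is a minimal sketch of the kind of field that unboxing does help with: a strict field of a single-constructor type such as Int. The `Point` type and `sumPoint` are hypothetical names, not from this thread. The UNPACK pragma requests for one field what -funbox-strict-fields requests for every strict field in a module.

```haskell
-- Hypothetical example type (not from the thread).
-- Each strict Int field is marked for unpacking, so the constructor can
-- hold the raw machine integers instead of pointers to boxed Ints.
data Point = Point {-# UNPACK #-} !Int {-# UNPACK #-} !Int
  deriving (Eq, Show)

-- Pattern matching is unchanged; only the runtime representation differs.
sumPoint :: Point -> Int
sumPoint (Point x y) = x + y
```

Fields of enumeration types such as D1 and D2 have several constructors, so they do not fit this pattern as written.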
Is there an elegant way to achieve this manually (if I know I'll need large arrays of D3s, for instance -- can I map them to arrays of Word8 or a similar type?)
If you map D1 and D2 to Int by hand, then use
data D3 = D !Int !Int
you'll get the speed benefit, but not all the space (each field will still take up a 32-bit word).
Cheers,
Simon
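For the large-array case asked about above, one possible manual approach is to encode each D3 as a single byte and store the bytes in an unboxed array. This is only a sketch under stated assumptions: `packD3`, `unpackD3`, `fromListD3` and `indexD3` are hypothetical helper names, the encoding (first field times 3 plus second field) is one arbitrary choice, and the derived Enum instances are added here for illustration.

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word8)

data D1 = A | B deriving (Eq, Show, Enum, Bounded)
data D2 = A2 | B2 | C2 deriving (Eq, Show, Enum, Bounded)
data D3 = D !D1 !D2 deriving (Eq, Show)

-- D3 has 2 * 3 = 6 possible values, so one byte is more than enough.
-- Encoding (an assumption of this sketch): index of D1 * 3 + index of D2.
packD3 :: D3 -> Word8
packD3 (D x y) = fromIntegral (fromEnum x * 3 + fromEnum y)

-- Inverse of packD3; only valid for bytes in the range 0..5.
unpackD3 :: Word8 -> D3
unpackD3 w = D (toEnum q) (toEnum r)
  where
    (q, r) = fromIntegral w `divMod` 3

-- Build an unboxed byte array from a list of D3 values.
fromListD3 :: [D3] -> UArray Int Word8
fromListD3 xs = listArray (0, length xs - 1) (map packD3 xs)

-- Read one element back as a D3.
indexD3 :: UArray Int Word8 -> Int -> D3
indexD3 arr i = unpackD3 (arr ! i)
```

Each element then costs one byte in the array, at the price of an encode or decode step on every access.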