
Hi all,
I've decided to try to implement the proposal included at the end of
this message. To do so I need to write a function
hasPointerSizedRepr :: TyCon -> Bool
This function would check that the TyCon is either
* a newtype whose representation type has a pointer-sized representation, or
* an algebraic data type with one field that has a pointer-sized
representation.
I'm kinda lost in all the data types that GHC defines to represent
types. I've gotten no further than
hasPointerSizedRepr :: TyCon -> Bool
hasPointerSizedRepr tc@(AlgTyCon {}) = case algTcRhs tc of
    DataTyCon { data_cons = [data_con] }
        -> ...
    NewTyCon { data_con = data_con }  -- a newtype has exactly one DataCon, not a list
        -> ...
    _ -> False
hasPointerSizedRepr _ = False
I could use some pointers (no pun intended!) at this point. The
function ought to return True for all the following types:
data A = A Int#
newtype B = B A
data C = C !B
data D = D !C
data E = E !()
data F = F !D
One part that confuses me is figuring out the representation type of a
data constructor after unpacking. For example, the function should not
return true if called on G in this example:
data G = G !H
data H = H {-# UNPACK #-} !I
data I = I !Int !Int
because if we unpacked H into G's constructor it would take up two
words, due to I being unpacked.
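To make the word counting above concrete, here is a minimal sketch of a toy model. It does not use GHC's TyCon or DataCon types; Field, Decls, payloadWords and example are hypothetical names invented for illustration. It only counts how many words a single constructor's payload occupies once unpacked fields are expanded in place, and it would loop on recursive types.

```haskell
-- Toy model only: these are NOT GHC's TyCon/DataCon types.
data Field
  = Word1            -- a field stored as one word: a pointer, or a primitive like Int#
  | Unpacked String  -- a strict field whose single constructor is unpacked in place

-- Every type here has exactly one constructor, described by its fields.
type Decls = [(String, [Field])]

-- Number of words in the constructor payload after unpacking.
-- Returns Nothing if a referenced type is not declared.
payloadWords :: Decls -> String -> Maybe Int
payloadWords decls name = do
  fields <- lookup name decls
  sum <$> mapM fieldWords fields
  where
    fieldWords Word1        = Just 1
    fieldWords (Unpacked t) = payloadWords decls t

example :: Decls
example =
  [ ("A", [Word1])          -- data A = A Int#
  , ("I", [Word1, Word1])   -- data I = I !Int !Int
  , ("H", [Unpacked "I"])   -- data H = H {-# UNPACK #-} !I
  , ("G", [Unpacked "H"])   -- data G = G !H, *if* H were unpacked into G
  ]
```

In this model payloadWords example "A" is Just 1, while both "H" and the hypothetically unpacked "G" come out as Just 2, which is the case the function has to reject.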
Does DataCon contain the unpacked representation of the data
constructor or only the before-optimizations representation?
Cheers,
Johan
On Thu, Feb 16, 2012 at 4:25 PM, Johan Tibell wrote:
Hi all,
I've been thinking about this some more and I think we should definitely unpack primitive types (e.g. Int, Word, Float, Double, Char) by default.
The worry is that reboxing will cost us, but I realized today that at least one other language, Java, already does this, and even though it hurts performance in some cases, it seems to be a win on average. In Java all primitive fields get auto-boxed/unboxed when stored in polymorphic fields (e.g. in a HashMap, which stores keys and values as Object pointers). This seems analogous to our case, except we might also unbox when calling lazy functions.
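As a small illustration of where reboxing can show up, consider the sketch below. Counter and history are hypothetical names, and whether a box is actually allocated depends on what the optimizer does with the surrounding code.

```haskell
-- Hypothetical example type; the names are illustrative only.
-- With the pragma the field is stored as a raw machine integer, not a pointer.
data Counter = Counter {-# UNPACK #-} !Int

-- A list of Int holds pointers to boxed Ints, so putting the unpacked
-- field into the list may force the compiler to allocate a fresh box.
history :: Counter -> [Int] -> [Int]
history (Counter n) ns = n : ns
```

If primitive fields were unpacked by default, the same declaration without the pragma would behave this way too.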
Here's an idea of how to test this hypothesis:
1. Get a bunch of benchmarks.
2. Change GHC to make UNPACK a no-op for primitive types (as library authors have already worked around the lack of unpacking by using this pragma).
3. Run the benchmarks.
4. Change GHC to always unpack primitive types (regardless of the presence of an UNPACK pragma).
5. Run the benchmarks.
6. Compare the results.
Number (1) might be what's holding us back right now, as we feel that we don't have a good benchmark set. I suggest we try with nofib first and see if there's a difference, and then move on to e.g. the shootout benchmarks.
I imagine that ignoring UNPACK pragmas selectively wouldn't be too hard. Where is the relevant code?
Cheers, Johan