
"Simon Marlow"
writes: You can save storage by using strict constructor fields and using the -funbox-strict-fields flag. eg.
This would be switched on automatically with -O, no?
Actually no... but perhaps it's time to turn it on. The reason it's not on by default is that there are cases where it can make performance *worse* - storing the value unboxed occasionally requires us to reconstruct the boxed version eg. for passing to a polymorphic function or storing in an ordinary list.
data FloatList = FloatNil | FloatCons !Float FloatList
..while without strictness, the compiler wouldn't be able to unbox it, and then we'd have three words per cons, *plus* two words per Float, right?
Yes, spot on.
requires 3 words for each float in the list, because its actual representation is
data FloatList = FloatNil | FloatCons Float# FloatList
I'm going to sketch my data structure, and try to reason a bit about it. Let's see
I have an (Array Int Foo), where Foo is a set of nullary constructors (in other words, cheap?). I have a list of Bar = Bar (Array Int Foo) Int (where the array is the same for all elements), and construct a list of tuples (Int, Bar).
So the array shouldn't be too costly, I think - but perhaps an UArray would reduce cost from three words to one word per element?
It already costs one word per element (think of nullary constructors as free - we only have one copy of each nullary constructor which everybody points to). If the array is the same for each element, why store it in the list at all?
The list of Bars should require three words per cons cell, and three words for each Bar, and then two words for the Ints (which could be saved by an exclamation mark).
Yes.
The list of tuples should again cost three words per cons, three per tuple, plus two words per Int (with the Bars already existing and only moved around).
Summing up, the array is 3n, the first list is (3+3+2)n = 7n, and the second list is (3+3+2)n = 7n, for a grand total of 17n, or multiplied with four to get the bytes, 68n bytes.
This isn't too far from what I observe, so I hope my analysis is somewhat correct. :-)
Yes, sounds right.
---
Now for improvement upon it: I'll try to use an UArray, and see if that reduces the array size somewhat.
I don't think it'll make a difference.
For the list of Bar's, I can strictify the Ints to save 2 words. But if I defined
data BarList = Bar (Array Int Foo) !Int (BarList) |BarNil,
I should be able to get away with 4 words per Bar, shouldn't I (disregarding the actual array, which is shared anyway)?
And for the tuple-list, I could do
data TupleList = Tuple !Int Bar TupleList | TupleNil
and get 4n, which is equally good.
---
This is assuming the compiler doesn't do any of this on its own...
The compiler won't do any of this on its own. Remember that you can't pass a BarList to standard list functions like length, for example, because the types (and representations) are different. The compiler won't make data constructor fieds strict - except in simple cases it would need a sophisticated whole-program analysis to ensure that adding the annotation wouldn't change the behaviour of the program, and AFAIK no-one has attempted this. Cheers, Simon

"Simon Marlow"
Actually no... but perhaps it's time to turn it on.
Yeah, I noticed a slight saving by turning an Int into an !Int, but only when using -funbox-... option.
So the array shouldn't be too costly, I think - but perhaps an UArray would reduce cost from three words to one word per element?
It already costs one word per element (think of nullary constructors as free - we only have one copy of each nullary constructor which everybody points to).
*slap forehead* Ah, of course. However, with a small number of constructors, I might be able to squeeze things into less than a word (i.e. pointer at 32 bits). I could e.g. use an UArray over Word32 and represent a sequence of my nullaries in each word. (> If the array is the same for each element, why store it in the list at
all?
I realise this, but eventually I want to have my Bar's referring to a (much smaller than my n) set of arrays.)
This isn't too far from what I observe, so I hope my analysis is somewhat correct. :-)
Yes, sounds right.
Cool. That means there's a potential here; I'll get to it.
This is assuming the compiler doesn't do any of this on its own...
The compiler won't do any of this on its own. Remember that you can't pass a BarList to standard list functions like length, for example,
Yes, I see. Would it be possible to have a standard strict list, i.e. something equivalent of data SList a = SNil | SCons !a SList (which could be a member of the same class as the normal lists, and have the usual functions (length, ++, isPrefixOf...) overloaded)? -kzm -- If I haven't seen further, it is by standing in the footprints of giants

Hi, from the GHC documentation (5.17.1 in the libraries section), I get the impression that (!) is a member of the IArray class. While I'm messing about with kinds and stuff getting this properly instantiated, I get an error claiming that EST.lhs:44: Class `IArray' does not have a method `!' In the instance declaration for `IArray EST n' Experimenting with GHCi, I finally worked out Prelude> :i IArray.IArray -- ArrayBase.IArray is a class class (ArrayBase.HasBounds a) => ArrayBase.IArray a :: (* -> * -> *) e where { ArrayBase.unsafeArray :: forall i. (PrelArr.Ix i) => (i, i) -> [(Int, e)] -> a i e; ArrayBase.unsafeAt :: forall i. (PrelArr.Ix i) => a i e -> Int -> e; ArrayBase.unsafeReplace :: forall i. (PrelArr.Ix i) => a i e -> [(Int, e)] -> a i e {- has default method -}; ArrayBase.unsafeAccum :: forall e' i. (PrelArr.Ix i) => (e -> e' -> e) -> a i e -> [(Int, e')] -> a i e {- has default method -}; ArrayBase.unsafeAccumArray :: forall e' i. (PrelArr.Ix i) => (e -> e' -> e) -> e -> (i, i) -> [(Int, e')] -> a i e {- has default method -}; } So, unless I'm misinterpreting something, the documentation is incorrect? What I'd like to have, is Int-indexed access to my classes. ATM, I've solved it with my own class "Indexed" providing a (?) operator. This is (obviously) not consistent with the standard types, and in addition collides slightly with the implicit parameters syntax (works if I leave a space between the ? and the index) -kzm -- If I haven't seen further, it is by standing in the footprints of giants
participants (2)
-
ketil@ii.uib.no
-
Simon Marlow