
I'm building some stuff here that is basically a glorified and specialized version of quicksort. It is very quick, and works like a charm for my purposes, except that it consumes way too much space, about 100 bytes times the size of the input list. Which means that, for my data sets, the program exceeds available RAM and uses swap, which in turn kills performance.
Is there any guideline how much space the various constructs use? E.g. how much for cons cells (or whatever that make up lists), how much for a data type with only nullary data constructors -- should I use Word8s instead? -- how much for tuples, and so on. I realize I can (and I do somewhat) use profiling to determine heap use, but it'd be nice to be able to do back-of-envelope calculations and have some idea of what's the right approach.
Ok, roughly speaking, in GHC every object requires one word plus one word for each field (a word is 4 bytes on a 32-bit machine). In other words, a cons cell will require 3 words. An Int requires two words. A Char usually requires no words (chars up to 256 are cached in the RTS). Nullary constructors require no dynamic storage. The smallest dynamic object is 2 words: ie. a Word8 requires 2 words, even though the useful data it contains is only a single byte. Of course, all of this just applies to fully-evaluated structures - any suspensions (thunks) will screw up the calculations, and certainly the amount of temporary allocation required to *generate* a particular structure is very rarely predictable. You can save storage by using strict constructor fields and using the -funbox-strict-fields flag. eg. data FloatList = FloatNil | FloatCons !Float FloatList requires 3 words for each float in the list, because its actual representation is data FloatList = FloatNil | FloatCons Float# FloatList (Float# is an unboxed float). We heartily recommend strict constructor fields: apart from saving space and time, they help avoid space leaks. Using -funbox-strict-fields is a good way to get the benefits of unboxed types without digging around in GHC's low-level primitive types & operations. Does that help? Cheers, Simon

"Simon Marlow"
You can save storage by using strict constructor fields and using the -funbox-strict-fields flag. eg.
This would be switched on automatically with -O, no?
data FloatList = FloatNil | FloatCons !Float FloatList
..while without strictness, the compiler wouldn't be able to unbox it, and then we'd have three words per cons, *plus* two words per Float, right?
requires 3 words for each float in the list, because its actual representation is
data FloatList = FloatNil | FloatCons Float# FloatList
I'm going to sketch my data structure, and try to reason a bit about it. Let's see I have an (Array Int Foo), where Foo is a set of nullary constructors (in other words, cheap?). I have a list of Bar = Bar (Array Int Foo) Int (where the array is the same for all elements), and construct a list of tuples (Int, Bar). So the array shouldn't be too costly, I think - but perhaps an UArray would reduce cost from three words to one word per element? The list of Bars should require three words per cons cell, and three words for each Bar, and then two words for the Ints (which could be saved by an exclamation mark). The list of tuples should again cost three words per cons, three per tuple, plus two words per Int (with the Bars already existing and only moved around). Summing up, the array is 3n, the first list is (3+3+2)n = 7n, and the second list is (3+3+2)n = 7n, for a grand total of 17n, or multiplied with four to get the bytes, 68n bytes. This isn't too far from what I observe, so I hope my analysis is somewhat correct. :-) --- Now for improvement upon it: I'll try to use an UArray, and see if that reduces the array size somewhat. For the list of Bar's, I can strictify the Ints to save 2 words. But if I defined data BarList = Bar (Array Int Foo) !Int (BarList) |BarNil, I should be able to get away with 4 words per Bar, shouldn't I (disregarding the actual array, which is shared anyway)? And for the tuple-list, I could do data TupleList = Tuple !Int Bar TupleList | TupleNil and get 4n, which is equally good. --- This is assuming the compiler doesn't do any of this on its own...
Does that help?
As usual, your postings are enlightening. I haven't gotten around to try things out yet, so I don't know how it turns out in practice. -kzm -- If I haven't seen further, it is by standing in the footprints of giants

The need to recompile everything after upgrading was recently mentioned, I think, on this list. After upgrading from .1 to .2 (trying to get profiling on track), ghci gave me a slightly cryptic message about an unknow symbol ("stg_gc_enter_1"). No big deal, and after I removed the .o's, everything seems fine. But perhaps ghc could tag files with its version, so that ghci (and possibly ghc when used for linking?) could be a bit more explicit? As you probably can tell, I'm a great fan of ghc's "...probably because f is called with too few parameters in..." type messages :-) -kzm -- If I haven't seen further, it is by standing in the footprints of giants
participants (2)
-
ketil@ii.uib.no
-
Simon Marlow