
On 9/28/12 2:48 PM, Mario Blažević wrote:
On 12-09-26 08:07 PM, wren ng thornton wrote:
On 9/25/12 1:57 PM, Sjoerd Visscher wrote:
Maybe we could make a literal [a,b,c] turn into unpack [a,b,c]# where [a,b,c]# is a statically-allocated vector?
I'm kinda surprised this isn't already being done. Just doing this seems like it'd be a good undertaking, regardless of whether we get overloaded list literals. Just storing the literal as a C-like array and inflating it to a list/array/vector at runtime seems like it should be a big win for code that uses a lot of literals.
Why?
I'm surprised that this is an issue at all. If list literals you are talking about are constant, wouldn't GHC apply constant folding and construct the list only the first time it's needed?
The problem is: if the list is stored naively in the .data segment (as apparently it is), then we have to store all the pointer structure as well as the data. This hugely bloats the disk footprint for programs. That is, all the reasons why String=[Char] is bad at runtime are also reasons why this representation is bad at objectcode time. For most lists, the pointer structure is a considerable portion of the total memory cost. During runtime this overhead is (or at least may be) unavoidable due to the dynamic nature of program execution; but there's no reason to have this overhead in the compiled format of the program since it's trivial to generate it from a compact representation (e.g., storing lists as C-style arrays + lengths). The only conceivable benefit of storing lists on disk using their heap representation is to allow treating the .data segment as if it were part of the heap, i.e., to have zero-cost inflation and to allow GC to ignore that "part of the heap". However, for lists, I can't imagine this actually being beneficial in practice. This sort of thing is more beneficial for large structures of static data (e.g., sets, maps,...). But then for large static data, we still usually want a non-heap representation (e.g., cache-oblivious datastructures), since we're liable to only look at the data rather than to change it. It's only when we have lots of static "mutable" data that it makes sense to take heap snapshots. -- Live well, ~wren