
On Fri, Oct 08, 2004 at 12:43:45PM -0400, Robert Dockins wrote:
x = [3,5,7] primes = 2:x odds = 1:x You can't do sharing like this if your lists are doubly-linked; lots of cool algorithms depend on this sharing.
That constraint makes various other things painful. I suppose there is no one-size-fits-all solution.
Then perhaps it is worth considering having multiple implementations and choosing between them with pragmas and/or command line switches (with a sensible default naturally). Maybe doubly linked lists are not a great idea, but if we had a good implementation with, eg. O(1) access to both ends of the list but poor sharing, we can choose to use it only in cases where queue semantics are important and sharing is not. It would be nice to be able to monkey about with that kind of "under the hood" functionality w/o having to make any code changes. You could also do fun things like have chained-buffer list implementations for [Word8], [Char] etc.
What I thought would be interesting to implement (and should be possible with ghc) would be as lists were evaluated, if they were packed into UArrays by the thunk-evaluation code. basically, the update mechanism, rather than replace a node with a constant Char thunk, would write the char to a UArray-like structure (of a certain block size) and replace the code pointer to one that knows how to read the list out of said UArray structure. That way, lists would behave like normal, but once evaluated, would be optimized into efficient arrays. also, the work of the update-analysis phase could be leveraged to avoid this when we know the list is not shared. A variation would be to not do this in the update code, but rather the GC scavaging code, if it notices chains of unboxable values in HNF. This is more advantagous as we would avoid the work if the lists are GCed and has no run-time cost. At the time of the GC, we might also have more information, like the full size of the list (if it is fully evaluated) which could help us choose buffer sizes more efficiently or pack it into 8 bit chars if we find out it is all ascii. John -- John Meacham - ⑆repetae.net⑆john⑈