
On 10 May 2004 14:56, JP Bernardy wrote:
--- Simon Marlow
wrote: Ross's point about requiring append to be strict in order to guarantee O(n) toList is a good one. In fact, the current implementation isn't quite correct: either fromList should be strict, or append should check for an empty Many constructor.
That's what exists in the current published version (on my "web page"): fromList guarantees that the Many constructor never has [] as argument.
True, thanks for pointing it out.
Anyways, 'toList' will still be O(m) where m = number of subtrees. Moreover, I hardly expect more that O(n) empty subtrees in a sequence.
O(number of elements) is a tighter constraint than O(number of subtrees). If you build a list with 10 elements using 100 appends, then toList will only require O(10) rather than O(100). If you do more than one toList on the same Seq, you'll notice the difference. But, the tradeoff is strictness in append. It may be that we need multiple versions of Seq, but I suspect that not much would be gained by having all three: 1. ordinary lists with lazy append 2. Seq with O(1) lazy append 3. Seq with O(1) strict append and O(n) toList If we have 1 & 3, then I'm guessing that 2 would be very rarely required. Cheers, Simon

Hi all, Sorry I'm adding my 2c so late, but the problem is an instance of a general tradeoff between data structural abstraction and "lazyness", which is explained at http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/principles.html (search for "lazy"). In short, we can only have concrete lazy data structures, since *any* optimisation (re-arrangement) will destroy some lazyness. So the problem of DData.Seq is that it has no clear design guidelines that specify whether it is to be thought of as a concrete structures (i.e. users know the Data Constructors, as for the built-in Haskell lists), or an abstract structure (which is the only way to provide worst-case time bounds independent from the construction of the sequence, only dependent from its size). On Sun, 9 May 2004, Ross Paterson wrote:
You don't need a balanced tree, but the O(n) bound requires the invariant that no proper subtree is empty. Otherwise, a tree containing n elements can be of unbounded size, and take an unbounded time to convert to a list.
I suspect there is no satisfactory general-purpose sequence implementation (DData.Seq isn't trying to be one -- it does no balancing.) A balanced tree would make most things O(log n), but other implementations give you O(1) for various subsets of operations (and the smaller the subset, the smaller the constant factor, as a rule of thumb). It seems essential to offer a range of implementations.
On Tue, 11 May 2004, Simon Marlow wrote:
But, the tradeoff is strictness in append. It may be that we need multiple versions of Seq, but I suspect that not much would be gained by having all three:
1. ordinary lists with lazy append 2. Seq with O(1) lazy append 3. Seq with O(1) strict append and O(n) toList
If we have 1 & 3, then I'm guessing that 2 would be very rarely required.
Well, Dessy has only 1 and 3 and that for good reason. In fact, this discussion does somehow miss the point, since none of these three are most "general-purpose Sequences", with a lot of unnecessary O(N) operations (e.g. for 'last' and (!!)). I think the most general implementation are Democratic Sequences (with O(log N) bounds for all operations that allow that) followed by Deques (with O(1) bounds for all tip-based operations, which I think are all that allow that). Write-Only Sequences (as DData.Seq) are already an "optimised case" that should only be used when needed, since the user has to take into account, that every "read access" implies some kind of O(N) "conversion" that is not shared among those calls. When using Dessy, one has the advantage of a good design: each data structure is optimal in some sense. Streams are optimal for laziness, all others are different optima in time (and space) bounds. Dessy is at the edge of what is possible, so you never have to ask questions like "could we make this a little more lazy?". I would appreciate if someone could appreciate this ;-) Robert
participants (2)
-
Robert Will
-
Simon Marlow