
On 18 May 2004 09:24, Robert Will wrote:
On Tue, 11 May 2004, Simon Marlow wrote:
But, the tradeoff is strictness in append. It may be that we need multiple versions of Seq, but I suspect that not much would be gained by having all three:
1. ordinary lists, with lazy append
2. Seq, with O(1) lazy append
3. Seq, with O(1) strict append and O(n) toList
If we have 1 & 3, then I'm guessing that 2 would be very rarely required.
Well, Dessy has only 1 and 3 and that for good reason.
In fact, this discussion somewhat misses the point, since none of these three is a truly "general-purpose" Sequence: each has a lot of unnecessarily O(N) operations (e.g. 'last' and (!!)).
The reason to prefer a sequence with an O(N) 'last' over one with O(log N) 'last' would be if the O(N) version had lower constant factor overhead and you didn't need to use the O(N) operations. Measurements please! Cheers, Simon
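To make the trade-off concrete, here is a minimal sketch (not from the thread, and not Dessy's actual implementation) of how an O(1) append with an O(n) toList can work: append simply builds a Cat node without traversing either argument, and the linear cost is deferred to toList. With Haskell's default laziness this corresponds to the lazy variant (option 2); adding strictness annotations to Cat's fields would give the strict variant (option 3).

```haskell
-- A join-list: append is O(1) because it only allocates one node;
-- recovering an ordinary list is O(n).
data Seq a = Nil | Unit a | Cat (Seq a) (Seq a)

append :: Seq a -> Seq a -> Seq a
append Nil ys = ys
append xs Nil = xs
append xs ys = Cat xs ys        -- O(1): no traversal of either argument

-- O(n) overall: the accumulator avoids quadratic (++) behaviour.
toList :: Seq a -> [a]
toList s = go s []
  where
    go Nil       acc = acc
    go (Unit x)  acc = x : acc
    go (Cat l r) acc = go l (go r acc)

fromList :: [a] -> Seq a
fromList = foldr (\x s -> append (Unit x) s) Nil
```

Note that 'last' on this representation is still O(n) in the worst case (a left-leaning tree of Cat nodes), which is exactly the kind of "unnecessary O(N) operation" Robert objects to; a balanced tree or finger-tree representation would bring it down to O(log n).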

----- Original Message -----
From: "Simon Marlow"
The reason to prefer a sequence with an O(N) 'last' over one with O(log N) 'last' would be if the O(N) version had lower constant factor overhead and you didn't need to use the O(N) operations.
That argument can be generalised: IF you don't need certain operations that are overly inefficient in implementation X, THEN any advantage of X suffices to make it your preference over an implementation with no overly inefficient operations at all. Given an appropriate precondition (the IF part), almost any data structure can become one's (objective!) preference.

I think, however, that the point of general-purpose data structures is to impose as few preconditions as possible (in the spirit of the motto "premature optimisation..."). Truly general-purpose data structures can be used before, or even without, thinking about all those preconditions, and they can be kept even if the preconditions change (adaptability, portability).

The built-in lists (and other more concrete data types) have advantages beyond constant factors, but they come with preconditions that one has to learn and master. We all know that functional programming is easier to learn than imperative programming, yet it is actually harder for someone who already knows imperative programming. Similarly, I think that programming with (democratic) abstract data structures is easier to learn and to do, but actually harder for someone used to preconditioned concreteness.

Nothing against algebraic lists and other concrete data structures; I just say that they shouldn't come first when learning, and shouldn't be the default when programming.

Robert


G'day all.
Quoting Robert Will
That argument can be generalised: IF you don't need some operations which are overly inefficient in implementation X THEN any advantage of X will suffice to make it your preference over an implementation without any overly inefficient operations.
Or more succinctly: the best implementation is the one that most closely matches the characteristics of your application and data. Of course, it's usually not obvious in advance what those characteristics are. Cheers, Andrew Bromage

ajb@spamcop.net writes:
The best implementation is the one that matches the characteristics of your application/data the closest.
Of course, it's usually not obvious what those characteristics are in advance.
There is a data-structure benchmarking kit called "auburn" (http://www.cs.york.ac.uk/fp/auburn), sadly no longer actively maintained. One of the things it can do is generate "datatype usage graphs" (dugs) for any given abstract datatype. These record information like the relative proportions of appends, reads, writes, etc. in a particular usage scenario. A series of competing implementations of the abstract datatype can be benchmarked against a series of randomly generated dugs. Furthermore, auburn has an automated decision procedure for deciding which implementation would be best for your application, based on the previous benchmarking. Regards, Malcolm
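The core idea behind a datatype usage graph can be sketched by hand: wrap an ADT's operations so that each call is tallied, then compare the resulting operation mix against each implementation's cost model. This is only an illustration of the concept; auburn's actual interface differs, and the names below are invented for this sketch.

```haskell
import Data.IORef
import qualified Data.Map.Strict as Map

-- A tally of how often each operation of an abstract datatype is used,
-- loosely imitating the information recorded in a "dug".
type Profile = IORef (Map.Map String Int)

newProfile :: IO Profile
newProfile = newIORef Map.empty

-- Record one use of the named operation.
tick :: Profile -> String -> IO ()
tick p op = modifyIORef' p (Map.insertWith (+) op 1)

-- Relative proportion of each operation in the recorded workload;
-- a workload dominated by 'append' would favour a catenable sequence,
-- one dominated by 'last' or (!!) a balanced-tree representation.
proportions :: Profile -> IO [(String, Double)]
proportions p = do
  m <- readIORef p
  let total = fromIntegral (sum (Map.elems m))
  return [ (op, fromIntegral n / total) | (op, n) <- Map.toList m ]
```

Benchmarking each candidate implementation against workloads with these proportions is then the "measurements please" step Simon asks for above.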
participants (4)
- ajb@spamcop.net
- Malcolm Wallace
- Robert Will
- Simon Marlow