Re: DData revision

On Sunday, 14 March 2004, at 18:31, you wrote:
[...]
And why does Seq.concat have the type Seq (Seq a) -> Seq a instead of [Seq a] -> Seq a which would be more consistent with the other modules?
I gave it some thought, and concluded that it is more important for Seq to be consistent with itself than with other modules. This really should be solved with type classes, yet we chose not to introduce new classes with DData.
Moreover, Seq is an instance of Monoid, so 'mconcat' can be used.
But why does LambdaSeq.concat have the type [LambdaSeq a] -> LambdaSeq a instead of LambdaSeq (LambdaSeq a) -> LambdaSeq a then?
[...]
Wolfgang
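[The 'mconcat' suggestion above can be sketched as follows. This uses Data.Sequence from the containers library as a stand-in for DData's Seq; the DData module's exact names may differ.]

```haskell
import Data.Sequence (Seq, fromList)
import Data.Foldable (fold, toList)

nested :: Seq (Seq Int)
nested = fromList [fromList [1, 2], fromList [3, 4]]

-- 'fold' (generalised 'mconcat') flattens via the Monoid instance,
-- playing the role of Seq.concat :: Seq (Seq a) -> Seq a:
flattened :: Seq Int
flattened = fold nested

main :: IO ()
main = print (toList flattened)  -- [1,2,3,4]
```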

--- Wolfgang Jeltsch wrote:
[...]
But why does LambdaSeq.concat have the type [LambdaSeq a] -> LambdaSeq a instead of LambdaSeq (LambdaSeq a) -> LambdaSeq a then?
I kept the old sequences (implemented as lambdas) as LambdaSeq, but have not worked on it any more, so it is rather out of date. I thought to remove it eventually. Do you think it is worth keeping?

Cheers, JP.
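[The "implemented as lambdas" representation JP describes is in the spirit of difference lists (Hughes lists). A hypothetical sketch follows; the real LambdaSeq module's names and API may differ.]

```haskell
-- A sequence represented as a function that prepends its elements
-- to whatever list it is given (a "difference list").
newtype LSeq a = LSeq ([a] -> [a])

empty :: LSeq a
empty = LSeq id

single :: a -> LSeq a
single x = LSeq (x :)

-- Concatenation is function composition: O(1) regardless of lengths.
append :: LSeq a -> LSeq a -> LSeq a
append (LSeq f) (LSeq g) = LSeq (f . g)

toList :: LSeq a -> [a]
toList (LSeq f) = f []

main :: IO ()
main = print (toList (append (single 1) (append (single 2) (single (3 :: Int)))))
```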

On Monday, 15 March 2004, at 13:11, JP Bernardy wrote:
[...]
I kept the old sequences (implemented as lambdas) as LambdaSeq, but have not worked on it any more, so it is rather out of date. I thought to remove it eventually. Do you think it is worth keeping?
If LambdaSeq doesn't have any advantage over Seq (or only a little advantage), I would remove it.

Wolfgang

On Mon, 15 Mar 2004, JP Bernardy wrote:
I kept the old sequences (implemented as lambdas) as LambdaSeq, but have not worked on it any more, so it is rather out of date. I thought to remove it eventually. Do you think it is worth keeping?
No; for the rationale, see http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/Stream.hs

By the way: a Sequence implementation based on OrdList is not too attractive either, since it provides no sensible worst-case bounds: users don't get real abstraction. More info:
http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/#impl (keyword: ropes)
http://www.stud.tu-ilmenau.de/~robertw/dessy/fun/democratic

Bob

On Fri, 19 Mar 2004, Robert Will wrote:
By the way: a Sequence implementation based on OrdList is not too attractive either...
I have had a look at OrdList (in GHC 5.00 and JP's DData), and it really has no advantage over the simple catenable sequences; in fact it only uses data cells where the other one uses closures. Also, the implementor has not been too clever, since in the function OLtoList he uses the closure trick to create the resulting list, but just the same would have been achieved by a call to foldr!

Anyway, such Sequence implementations are more a bug fix than an abstract data structure implementation: they work around the bad behaviour of "stack-based list".++. I believe that using an accumulator to avoid ++'s bad behaviour is the second most used optimisation in functional programs. (Using an accumulator is just the same as using closures, since a function with one additional argument is just the same as a function returning a closure.) Wadler (1987!) has even formalised that optimisation, so far that it could be applied automatically by a compiler: http://homepages.inf.ed.ac.uk/wadler/papers/vanish/vanish.pdf If Haskell implementations used this, programmers could really save some effort and programs would be clearer.

Anyway, I strongly advise against including such an ad-hoc data structure in an exposed library. (Besides, with my new simple balancing algorithms, building worst-case efficient sequences is a trivial exercise; Dessy alone provides five different implementations...)

Robert

I believe that using an accumulator to avoid ++'s bad behaviour is the second most used optimisation in functional programs. [...] Wadler [...]
If Haskell implementations used this, programmers could really save some effort and programs would be clearer.
GHC does implement this kind of optimization to remove ++ and :. These work pretty well (see Simon Peyton Jones' page for papers about it, and follow citations to find other recent work on the topic). In practice, though, it is useful to have data structures, algorithms, etc. that already contain the optimization. Some of the major reasons are:

1) The analyses that drive the optimizations are fragile and hard to predict. When the optimization triggers, you can get a large speedup or a dramatic reduction in memory consumption, but when you make a small, insignificant change to your program, the optimization can't be applied and you get a large change in performance. Lesson: predictable performance is an important property.

2) This particular class of optimizations contains a space-time tradeoff. If the data structure is shared, we can save time by building one copy and using it multiple times, or we can save space by building a fresh copy each time. How is the compiler to know which you want to apply? For some programs you want to save space; for others, you want to save time. Indeed, for small datasets you want to save time, while for bigger datasets you often have to choose to save space at the cost of time. Lesson: some programming decisions can't be automated.

-- Alastair Reid www.haskell-consulting.com

On Sun, 21 Mar 2004, Alastair Reid wrote:
If Haskell implementations used [the ++ optimisation], programmers could really save some effort and programs would be clearer.
In practice, though, it is useful to have data structures, algorithms, etc. that already contain the optimization.
Of course it is. But we do need data structures that perform well on a wide range of operations, not just one! That's why the CatenableSeq is useless in practice: it is easier to use closures or accumulators manually, which, for example, is done automatically when using 'foldr'.
Lesson: predictable performance is an important property.
Yeah, and exactly this was a feature of Wadler's proposal: his transformation is predictable.
Lesson: some programming decisions can't be automated.
Yeah, and in such cases it would be best if the programmer could choose between two different implementations of the same abstract structure. However, the choice between CatenableSeq (or manual use of closures) and any other implementation will almost always fall on the other implementation.

Robert
participants (4)
- Alastair Reid
- JP Bernardy
- Robert Will
- Wolfgang Jeltsch