
G'day all.
Quoting JP Bernardy
Rencent posts about DData and such make me think... Shouldn't it be integrated in the hierarchical libraries?
Possibly, possibly not. DData is a really good set of data structures. However, it does duplicate some stuff which is already there.
Besides, it looks like Edison has been dropped from GHC.
Daan has expressed some interest in unifying DData and Edison, which isn't a bad idea. Edison has been dropped because: - the version that was there isn't actively supported, - the version that _is_ actively supported isn't ready for "prime time" yet, and - nobody is entirely sure that Edison is quite what people want anyway.
More generally, is there a plan for a unified collections framework in the hierarchical libraries?
No, but I would love some concrete suggestions. Here's my attempt to get the ball rolling... 1. There are a lot of things which are logically collections but existing libraries do not consider as (e.g. PackedString, Array, Graph). In a truly unified library, these would be collections. The downside is that fiddling with commonly used code isn't something which should be undertaken lightly, and you'd better be certain that what you're replacing it with is an improvement. 2. There is a lot of merit in Edison's approach where every data structure supports every operation that is possible, even if it's inefficient. Chris Okasaki did this because he often found himself, say, using a stack ADT (which only supports push, pop, top, isEmpty etc) but occasionally needing to peer at the second or third element from the top of the stack. You don't want to have to change data structures just because of this. The main unintended consequences of this are that a) the interface for a data structure looks overwhelming, and b) the support for choosing an appropriate data structure is quite poor. Why would you choose a LazyPairingHeap over a LeftistHeap anyway? As a programmer, I just want to pick a PriorityQueue and get one with reasonable performance characteristics across the board, only going further if I need it. This has led many people to believe that Edison is too heavyweight. 3. Data types with destructive update semantics (some of them thread-safe, some of them not) are becoming increasingly important. A truly unified approach should take this into account, easing the transition from a data type without destructive update to one with as much as possible, should a programmer find this is needed. The unified approach to Arrays is a good example of how this might work in practice. It seems to me that would make sense to decree that all data structures live in a monad which is naturally its "home". This "home" might be the identity monad for many data structures, and for those cases, a set of additional functions would be provided which don't require the monad to be passed. 4. "Proxy containers" are important. On-disk data structures, in-memory compressed representations and fake containers which compute elements on the fly are all good examples of this. To support these, it should be straightforward (as much as possible) to turn your data structure into a unified-collection-architecture-compliant collection. In addition, algorithms should be data-structure-agnostic if at all possible. [Aside: The C++ STL is often touted as a good example of a data structure library where algorithms are data-structure-agnostic, however, this is only partly true. Pretty much all proxy containers are not STL-compliant. This has led to the odd situation where std::vector<int> is a container but std::vector<bool> is not.] The "left-fold-like combinator", which has received some press lately, seems like a good place to start looking. Cheers, Andrew Bromage