
Hi all,
On Tue, Feb 10, 2009 at 3:26 PM, Malcolm Wallace
If you have ideas for student projects that you think would benefit the Haskell community, now is the time to start discussing them on mailing lists of your choice. We especially encourage students to communicate with the wider community (...)
I have a project that I've tinkered with doing for a while, but have never actually gotten around to, and I'm wondering whether there would be interest from the community in using it. Actually, two related projects: (1) A library for working with hashable data structures, including a hashable finger tree with sequence, map and set types based on it. Surprisingly, there does not seem to be much library support for working with hashes in Haskell beyond the bare-bones, imperative Data.HashTable, even though pure functional programming and hashes Go Well Together. I'm pulling these bounds out of my intuition, but hashable collections should give amortized O(1) comparison for equality and amortized O(log n) comparison for order, and you could use them as set values and map keys without efficiency blowing up in your face. The library should probably also include support for hashtables based on the various kinds of pure and monad-encapsulated arrays we have in Haskell, and maybe include support for interning values; unsafePerformIO, weak references and friends were originally introduced with the intention to support interned values in a way that programmers can tailor to their own needs, but it would be nice to have some default library support. (2) Based on this, a serialization library that would recognize if it has already written a particular value from memory to the same file, and write a pointer to the first occurrence of that value instead of serializing the actual value again, making it efficient to serialize versioned data structures. The idea is to make something similar to HAppS' MACID, but where MACID serializes the different kinds of updates that you can do for your data, this library would simply do the update in memory, then serialize the whole updated value to disk, but actually *writing* only the *new* parts of the data structure. (I think this could be simpler to use, and would be in a sense more elegant and 'haskelly' than the MACID approach. On the other hand, we already *have* MACID.) So, anybody out there who looks at these things and thinks "this could be useful"? :-) Thanks, - Benja