[Haskell-cafe] Re: [Haskell] Google Summer of Code 2009

12 Feb 2009

      Hi all,

On Tue, Feb 10, 2009 at 3:26 PM, Malcolm Wallace
 wrote:
...
If you have ideas for student projects that you think would benefit the
Haskell community, now is the time to start discussing them on mailing
lists of your choice.  We especially encourage students to communicate
with the wider community (...)
I have a project that I've tinkered with doing for a while, but have
never actually gotten around to, and I'm wondering whether there would
be interest from the community in using it. Actually, two related
projects:

(1) A library for working with hashable data structures, including a
hashable finger tree with sequence, map and set types based on it.
Surprisingly, there does not seem to be much library support for
working with hashes in Haskell beyond the bare-bones, imperative
Data.HashTable, even though pure functional programming and hashes Go
Well Together. I'm pulling these bounds out of my intuition, but
hashable collections should give amortized O(1) comparison for
equality and amortized O(log n) comparison for order, and you could
use them as set values and map keys without efficiency blowing up in
your face. The library should probably also include support for
hashtables based on the various kinds of pure and monad-encapsulated
arrays we have in Haskell, and maybe include support for interning
values; unsafePerformIO, weak references and friends were originally
introduced with the intention to support interned values in a way that
programmers can tailor to their own needs, but it would be nice to
have some default library support.

(2) Based on this, a serialization library that would recognize if it
has already written a particular value from memory to the same file,
and write a pointer to the first occurrence of that value instead of
serializing the actual value again, making it efficient to serialize
versioned data structures. The idea is to make something similar to
HAppS' MACID, but where MACID serializes the different kinds of
updates that you can do for your data, this library would simply do
the update in memory, then serialize the whole updated value to disk,
but actually *writing* only the *new* parts of the data structure. (I
think this could be simpler to use, and would be in a sense more
elegant and 'haskelly' than the MACID approach. On the other hand, we
already *have* MACID.)

So, anybody out there who looks at these things and thinks "this could
be useful"? :-)

Thanks,
- Benja

[Haskell-cafe] Re: [Haskell] Google Summer of Code 2009

Benja Fallenstein