
Robert Dockins wrote:
Robert Dockins wrote: So, what you want is a sequence of sequences that can be
On Jul 30, 2006, at 5:28 PM, Brian Hulley wrote: transparently converted to a "flattened" sequence and vice versa?
Yes as long as the conversion between them takes no time at all - the sequence of sequences and flattened sequence must coexist simultaneously. The concrete data structure is a sequence of sequences and the flattened sequence is a particular view of it.
And you also want to keep track of the total number of lines and characters within each line. Additionally, you want to keep track of the maximum number of characters in any one line.
Edison has support for transparently keeping track of the size of a sequence.
http://www.eecs.tufts.edu/~rdocki01/docs/edison/Data-Edison-Seq- SizedSeq.html
I used this in an initial version of an edit buffer when I just used a SizedSeq wrapping a BinaryRandList to store the text as a sequence of chars. But the lack of ability to also index by line number and keep track of max line length was the problem that led me to use a finger tree instead. Of course I could have used a sequence of chars, a sequence of line lengths, and a bag of line lengths to keep track of everything, and kept them in sync, but after reading the FingerTree paper I was seduced by the idea of being able to represent all this stuff at once in a single data structure.
It may well be possible to create a slightly generalized wrapper that keeps track of arbitrary "measures". (If they can be computed by a function which is associative, commutative and has a unit). Humm, sort of an incremental fold.... I like it.
I got this from the FingerTree paper. A finger tree supports any measurement that is a Monoid (so it needs to be associative but not commutative (if it had to be commutative it would be impossible to use a sequence as a set or map, which I needed for my Trie structure)).
Well, I guess I'd suggest you attempt to identify specific problems with already existing packages and attempt to work with those who maintain such packages before reinventing something as basic (and difficult to get right!) as data structure abstractions.
The problem is that some people will be using Data.Edison.Seq at the moment and will naturally not want it to change. However I'd suggest that all the common operations be factored out into separate classes eg: class Foldable f where fold :: (a -> b -> b) -> b -> f a -> b foldL :: ... class Reduce f where -- based on FingerTree paper reduceR :: (a -> b -> b) -> (f a -> b -> b) reduceL :: (b -> a -> b) -> (b -> f a -> b) class TakeDrop f where take :: Int -> f a -> f a takeWhile :: (a -> Bool) -> f a -> f a drop ... class Filter f where filter :: (a -> Bool) -> f a -> f a partition :: (a -> Bool) -> f a -> (f a, f a) class Indexable f where length :: f a -> Int at :: Int -> f a -> f a -- (*) splitAt :: Int -> f a -> (f a, f a) (*) Data.ByteString.index puts the Int arg second. It's not at all clear to me which is best, because I often wish that the Int arg of take and drop was second also so I could write take g $! x+1 instead of (take $! x + 1) g though it's consistent with the arg order for takeWhile etc. I know you don't agree with the no-exception-camel-case idea, but I still would argue that this is essential if you want to have a consistent naming convention. I find it extremely confusing that a word like "reducer" is supposed to be read as "reduceR" because the word "reducer" means to me "something which reduces". It seems to me that a restructuring of the usual fold, reduce ops into classes is a great opportunity to perfect the naming of these functions to make life easier for generations to come... :-)
Such maintainers may be willing to accept patches and/or implement requested features in order to reduce fragmentation in this space *hint, hint* :-)
Point taken, although in the case of the above refactoring idea, I think it really is a Haskell-wide task because although there appears to be a defacto standard use of names like take, drop, splitAt etc, it's not nearly so clear which ops belong together and which should be separated out, and I personally don't have enough experience of Haskell yet to be able to recommend a solution.
<soapbox type="Edison plug"> I personally think that Edison is a great piece of work, and I took up maintainership because I felt it was a great shame that no one was using it. My ultimate goal is to make Edison the package that everyone thinks of first when they discover they need a Haskell datastructure for some purpose. Even if Edison does not fill that need, I want every Haskeller to compare his needs against what Edison provides before striking out on his own, and I want that to be a decision made with some hesitation. Over time I hope to make the cases where Edison doesn't cut the mustard fewer and further between.
So, if you've ever looked at Edison, or ever do so in the future, and decide it isn't what you need, please let me know why so I can make it better for the next time. After all, squeaky wheels get the grease, but only if I can hear the squeaking! </soapbox>
Best regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com