
Robert Dockins wrote: [snip]
7) Finally, I somehow feel like there should be a nice categorical formulation of these datastructure abstractions which would help to drive a refactoring of the API typeclasses in a principled way, rather than on an ad-hoc I-sort-of-think-these-go-together sort of way.
For the last few months (!!!) I've been thinking about the relationship between measured sequences and plain sequences and also whether or not every sequence should by indexable by Int. I'm wondering if something like the following might be a possible factoring of the ops relating to indexing/measurements: -- from http://www.soi.city.ac.uk/~ross/papers/FingerTree.html class Monoid v => Measured v a where measure :: a -> v instance Measured () a where measure _ = () -- then (also based mostly on FingerTree ideas) class (Monoid v, Ord i) => IndexMeasure v i where -- no fundep index :: v -> i class BasicSeq c a | c -> a where length :: c -> Int empty :: c isEmpty :: c -> Bool atL :: c -> a atR :: c -> a pushL :: a -> c -> c viewL :: Monad m => c -> m (a, c) -- pushR, viewR class (Measured v a, Measured v c, BasicSeq c a) => Measurable c v a | c -> v where -- precondition: pred is True for v `mappend` (measure c) splitWithInternal :: (v -> Bool) -> v -> c -> (c, a, c) splitWith :: (v -> Bool) -> c -> (c,c) splitWith p t | isEmpty t = (empty, empty) | p (measure t) = let (l,x,r) = splitWithInternal p mempty t in (l, pushL x r) | otherwise = (empty, empty) splitAt :: IndexMeasure v i => i -> c -> (c,c) splitAt i = splitWith (\v -> i < index v) size :: IndexMeasure v i => c -> i size c = index (measure c) -- take, drop, takeWith, dropWith, in terms of split and splitWith atWith :: (v -> Bool) -> c -> a atWith p t = (\(_,x,_)->x) (splitWithInternal p mempty t) at :: IndexMeasure v i => i -> c -> a at i = atWith (\v -> i < index v) where splitWith p s returns (l,r) such that the measure of l `mappend` the measure of the first element of r satisfies p (FingerTree paper has explanation of this - I assume monotonic p for any useful use). The idea of the above design would be to allow multiple indexes for the same sequence (though the element type is the same in each case so possibly this could be confusing though could be prevented by using a fundep in the IndexMeasure class), as well as allowing sequences with an arbitrary measure that isn't an index (just by having no instances of IndexMeasure) eg: data TextBuffer = ... newtype Line = Line Int newtype CharPos = CharPos Int data TextBufferMeasure = ... instance IndexMeasure TextBufferMeasure Line where ... instance IndexMeasure TextBufferMeasure CharPos where ... instance Measureable TextBuffer TextBufferMeasure Char where ... Line lineCount = size textbuf CharPos charCount = size textbuf (before, after) = splitAt (CharPos 56) textbuf Of course this doesn't solve the problem of using nested sequences, but it at least allows general measurement with predicate search to coexist with simple indexing and size-with-respect-to-index where these are applicable to the relevant concrete sequence. Anyway just a very rough idea at the moment. I'm looking forward to seeing a nice categorical factoring ;-) Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com