
Bertram Felgenhauer wrote:
splitSeq' :: Ord a => Map a () -> [(a,b)] -> ([(a,[b])], Map a [b]) splitSeq' bp [] = ([], map (const []) bp) splitSeq' bp ((a,b):xs) = case lookup a bp bp of Just _ -> let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m) _ -> let (bp', m) = insert a bp m' (l, m') = splitSeq' bp' xs in ((a, b : (fromJust $ lookup a bp' m')) : l, m)
splitSeq' takes a blueprint for a map with all keys seen so far, and a list tail. It returns the result list for all new keys, and a map (corresponding to the given blueprint) with the tails of the seen elements.
The in the base case it just fills the blueprint with empty lists and returns to the caller.
If a new element is seen, insert is used to create a new blueprint, including the new key a, which is passed to the recursive call of splitSeq'. The resulting map from that call is threaded back into insert, which gives us a new map without the a key which matches the structure of the original blueprint.
Very interesting! So the map with the seen tails matches the blueprint and therefore throws away information (the future keys) from the future. This is what effectively allows the key-tree structure to be rebalanced. If one would return the tails-map with all future keys, it would be _|_ because the key-order in the tree depends on the future keys and changes everytime when a new key is found. I thought that there can only be a static solution, i.e. like the one Ross Paterson presented where the structure (I mean the outermost constructors) of the returned tree are determined before the future. This obviously excludes rebalancing. I found a static solution by trying to fit the key-tails pairs into an infinite tails-map which is filled "first comes first": map ! 1 := (first key seen, tails) map ! 2 := (second key seen, tails) By keeping another key-position-map around which records where each key has been inserted, so that the future knows where to put the tails. The point is that the structure of tails-map is known before the future comes as its keys are just 1,2,3,... and so on. It remains to construct such an infinite random access list, but this is turns out to be even easier than finite random access lists (when using non-uniform recursion from Okasaki's chapter 10) :)
data Imp a = Imp a (Imp (a,a)) deriving (Show)
instance Functor Imp where fmap h ~(Imp x xs) = Imp (h x) (fmap (\(x,y) -> (h x, h y)) xs)
update :: (a -> a) -> Position -> Imp a -> Imp a update f 1 ~(Imp x xs) = Imp (f x) xs update f n ~(Imp x xs) = Imp x $ update f2 (n `div` 2) xs where f2 ~(y,z) = if even n then (f y, z) else (y, f z)
Note that we can use irrefutable patterns without hesitation as there is only one constructor. Folding over an infinite thing is strange but here we are
fold :: (a -> b -> b) -> Imp a -> b fold f ~(Imp x xs) = f x (fold f2 xs) where f2 (x,y) z = f x (f y z)
It's not so strange anymore when we realize that this can be used to convert it into an infinite list
toList = fold (:)
The implementation of fromList is fun as well, so I won't tell it. As fold and unfold can be defined generically for Mu f where f is a functor, i wonder how to deal with it in the case of non-uniform recursion. For splitStreams, the key-position-map is needed in both directions, so we quickly define a bijective map
type BiMap a b = (Map.Map a b, Map.Map b a)
switch :: BiMap a b -> BiMap b a switch (x,y) = (y,x)
with the usual operations (empty, member, size etc.) omitted here. Now comes splitStreams itself:
splitStreams :: Ord a => [(a,b)] -> [(a,[b])] splitStreams xs = takeWhile (not . null . snd) $ toList $ splitStreams' empty xs
splitStreams' :: Ord a => BiMap a Position -> [(a,b)] -> Imp (a,[b]) splitStreams' bimap [] = fmap (\i -> (findWithDefault undefined i $ switch bimap,[])) $ fromList [1..] splitStreams' bimap ((a,b):xs) = update fun pos $ splitStreams' bimap' xs where fun ~(_,bs) = (a,b:bs) sz = size bimap pos = findWithDefault (sz+1) a bimap bimap' = (if member a bimap then id else insert a (sz+1)) bimap
Note that update actually generates fresh constructors, so the structure of our tails-Imp is not really static. But this is no problem as the form of the constructors is completely known: there is only one. Regards, apfelmus