Doubly-linked zipper list w/ insert implementation

The other day I decided to implement a ring buffer with a current element
(i.e. a doubly-linked zipper list). In order to allow inserts (and, in the
future, deletes and updates), I have a special sentinel element called
"Join" in the structure. When inserting, I find the join first, insert and
then rebuild the buffer using circular programming techniques. This also
allows the buffer to be converted back to a list. The current element can
be changed by rotating right or left, which never fails. Rotating n
positions takes n steps.

I'm posting it here for comments and feedback. How could the structure be
smarter? Would storing a unique ID with each element make more sense? Any
comments on the space behavior under insert and rotates? I wanted to
"maximize" sharing.

Thanks in advance.

Justin

p.s. The original motivation for writing this was to model cellular
automata. The CA world is "circular", so that got me thinking about a
structure that made connecting the ends easy to do.

-- cut here ---

module Ring (Ring, create, insert, current, rotR, rotL, toListL, toListR)
-- Thanks to Keith Wansbrough for his posting describing a doubly-linked
-- list in Haskell for inspiration here.
--
-- http://groups.google.com/group/comp.lang.functional/msg/8c65fdd16f7e91e1
where

import Data.List (foldl')

-- For testing
import Test.QuickCheck
import Control.Monad (replicateM_)
import System.Random (randomRIO)
import System.Environment (getArgs)

data Ring a = Ring (Ring a) a (Ring a)
            | Join (Ring a) (Ring a)

instance (Show a) => Show (Ring a) where
  show r = "{" ++ show' (findLeftOfJoin r)
    where
      show' (Join _ _) = "}"
      show' (Ring l v _)
        | isJoin l  = show v ++ show' l
        | otherwise = show v ++ "," ++ show' l

create v =
  let me   = Ring join v join
      join = Join me me
  in me

insert r a =
  let (left, right) = insert' left right start a
      start         = findLeftOfJoin r
  in left
  where
    insert' left right (Join _ _) val =
      let last = Ring join val right
          join = Join left last
      in (last, join)
    insert' left right (Ring l v _) val =
      let this         = Ring newL val right
          (newL, newR) = insert' left this l v
      in (this, newR)

fromList [] = error "Can't create empty ring"
fromList ls =
  let (left, right) = fromList' left right ls
  in left
  where
    -- Compute this ring, given left and right pointers. Return
    -- left and right pointers for this segment.
    fromList' left right [] =
      let join = Join left right
      in (join, join)
    fromList' left right (x:xs) =
      let this   = Ring l x right
          (l, r) = fromList' left this xs
      in (this, r)

toList = toListL

-- View of list from left hand side
toListL = toList' . findLeftOfJoin
  where
    toList' (Join _ _)   = []
    toList' (Ring l v _) = v : toList' l

toListR = toList' . findRightOfJoin
  where
    toList' (Join _ _)   = []
    toList' (Ring _ v r) = v : toList' r

current (Ring _ v _) = v
current _            = error "Join is never current (curr)"

rotR r amt
  | amt > 0   = goRight r amt
  | amt < 0   = goLeft r (negate amt)
  | otherwise = r

rotL r amt
  | amt > 0   = goLeft r amt
  | amt < 0   = goRight r (negate amt)
  | otherwise = r

goRight r 0                           = r
goRight (Ring _ _ r@(Ring _ _ _)) amt = goRight r (amt - 1)
goRight (Ring _ _ (Join _ r))     amt = goRight r (amt - 1)
goRight (Join _ _) _                  = error "Join is never current (goRight)"

goLeft r 0                           = r
goLeft (Ring l@(Ring _ _ _) _ _) amt = goLeft l (amt - 1)
goLeft (Ring (Join l _) _ _)     amt = goLeft l (amt - 1)
goLeft (Join _ _) _                  = error "Join is never current (goLeft)"

isRing (Ring _ _ _) = True
isRing _            = False

isJoin (Join _ _) = True
isJoin _          = False

findLeftOfJoin (Join l _)   = l
findLeftOfJoin (Ring l _ _) = findLeftOfJoin l

findRightOfJoin (Join _ r)   = r
findRightOfJoin (Ring _ _ r) = findRightOfJoin r
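As a quick sanity check of the rotation API, here is a small QuickCheck
property (an illustrative addition, not part of the original post;
prop_rotRoundTrip is an assumed name). It uses only the exported functions
and the Test.QuickCheck import already present, so it can be pasted into
the module above and run with quickCheck prop_rotRoundTrip:

  prop_rotRoundTrip :: Int -> Bool
  prop_rotRoundTrip n =
    let r = foldl insert (create (0 :: Int)) [1 .. 5]   -- a six-element ring
        k = n `mod` 6                                   -- non-negative rotation amount
    in current (rotL (rotR r k) k) == current r         -- right then left by k is a no-op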

Justin Bailey wrote:
The other day I decided to implement a ring buffer with a current element (i.e. a doubly-linked zipper list). In order to allow inserts (and, in the future, deletes and updates), I have a special sentinel element called "Join" in the structure. When inserting, I find the join first, insert and then rebuild the buffer using circular programming techniques. This also allows the buffer to be converted back to a list. The current element can be changed by rotating right or left, which never fails. Rotating n positions takes n steps.
I'm posting it here for comments and feedback. How could the structure be smarter? Would storing a unique ID with each element make more sense? Any comments on the space behavior under insert and rotates? I wanted to "maximize" sharing. Thanks in advance.
Do you really need to realize the cycle by sharing? I mean, sharing doesn't
go well with insertion / updates / deletion since each of these operations
breaks it and needs to restore it everywhere. In other words, your insert
takes O(n) time. I'd simply drop the sharing and use two double ended
queues (or something like that) instead:

  data Ring a = Ring (DeQueue a) a (DeQueue a)

  -- pseudo-code missing lots of cases. I want views!
  left (Ring (l' :< ls :> l) x (r :< rs :> r')) =
      Ring (ls :> l :> x) r (rs :> r' :> l')

This way, you can implement update operations in O(1) time instead of O(n).
With a fancy random access queue like Data.Sequence, you can even have
rotations like rotL xs n in O(log n) time. (I keep mixing up the meaning of
rotL and rotR, does L push the current element to the left or does it
rotate the ring clockwise?)

Regards,
apfelmus
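For concreteness, here is a minimal sketch of the two-queue idea built on
Data.Sequence (illustrative code, not apfelmus's; the names and the choice
of which direction rotR moves the focus are assumptions). Insertion at the
focus is O(1) and a single rotation is O(1) amortized:

  import qualified Data.Sequence as Seq
  import Data.Sequence (Seq, ViewL (..), viewl, (|>), (<|))

  -- Elements to the left of the focus, the focus, elements to its right.
  data Ring a = Ring (Seq a) a (Seq a)

  singleton :: a -> Ring a
  singleton x = Ring Seq.empty x Seq.empty

  -- Insert a new element as the current one; the old focus moves to its right.
  insertAtFocus :: a -> Ring a -> Ring a
  insertAtFocus x (Ring ls c rs) = Ring ls x (c <| rs)

  -- Shift the focus one step to the right, wrapping around at the end.
  rotR :: Ring a -> Ring a
  rotR r@(Ring ls x rs) =
    case viewl rs of
      y :< rs' -> Ring (ls |> x) y rs'
      EmptyL   -> case viewl ls of
                    y :< ls' -> Ring Seq.empty y (ls' |> x)   -- wrap past the ends
                    EmptyL   -> r                             -- one-element ring

rotL is symmetric (using viewr), and rotating by n positions at once can be
done in O(log n) with Seq.splitAt instead of n single steps.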

On Nov 7, 2007 10:16 AM, apfelmus wrote:
Do you really need to realize the cycle by sharing? I mean, sharing doesn't go well with insertion / updates / deletion since each of these operations breaks it and needs to restore it everywhere. In other words, your insert takes O(n) time. I'd simply drop the sharing and use two double ended queues (or something like that) instead
Very good point, and much easier to implement with Data.Sequence to boot.
All that circular programming made my brain hurt. Thanks for your feedback.

Justin

(Btw, this ring stuff could be relevant for Xmonad, I don't know whether
the workspace/window-ring implementation there is O(1). Not that it matters
for <1000 windows, of course :)

Justin Bailey wrote:
apfelmus wrote:
Do you really need to realize the cycle by sharing? I mean, sharing doesn't go well with insertion / updates / deletion since each of these operations breaks it and needs to restore it everywhere. In other words, your insert takes O(n) time. I'd simply drop the sharing and use two double ended queues (or something like that) instead
Very good point, and much easier to implement with Data.Sequence to boot. All that circular programming made my brain hurt.
There's also a direct and rather lightweight possibility to implement rings
in the spirit of the classic O(1) lazy amortized functional queue
implementation. This post will try to explain it.

Here's the basic idea for implementing queues in Haskell: we have a front
list to fetch items (head, tail) and a rear list to insert items (snoc)
into the queue.

  data Queue a = Queue [a] [a]

  empty = Queue [] []
  head (Queue (x:f) r) = x
  tail (Queue (x:f) r) = Queue f r
  snoc (Queue f r) x   = Queue f (x:r)

Of course, this doesn't quite work yet, at some point we have to feed the
items from the rear list into the front list. For example, the last
possibility to do so is when the front list becomes empty.

  balance (Queue [] r) = Queue (reverse r) []
  balance q            = q

  tail (Queue (x:f) r) = balance $ Queue f r
  snoc (Queue f r) x   = balance $ Queue f (x:r)

(Calling balance maintains the invariant that the front list is never empty
except when the whole queue is empty, too.)

Now, how much time will a single snoc or tail operation take? In the worst
case, tail triggers a reverse and takes O(n) time whereas snoc always takes
constant time. That's a big blow to our goal of O(1) time for both.

But luckily, queues don't come out of "thin air", they all have to be
constructed from the empty queue by a sequence of applications of snoc and
tail. Can the heavy O(n) cost of a worst case tail be spread over the many
good cases of tail and snoc in that sequence? Yes, it can. To that end, we
increase the price of each snoc by 1 time "coin". So, each item of the rear
list gets inserted with one extra coin as "credit". With these credits, we
can pay the whole length (rear list) cost of a reverse operation when it
occurs, making tail O(1) again. This is also called _amortization_ and O(1)
the _amortized_ cost of tail.

The above works fine if the queue is used in a single-threaded way i.e. as
_ephemeral_ data structure. But it doesn't work anymore when a queue is
used multiple times in a _persistent_ setting. Assuming that tail q
triggers a reverse, the first evaluation of q1 in

  let q1 = tail q
      q2 = tail q
      q3 = tail q
      ...
  in ... q1 .. q2 .. q3

will use up all credits and q2, q3,... don't have any to spend and are back
to worst-case behavior.

In the persistent setting, lazy evaluation comes to the rescue. The idea is
to create the (yet unevaluated) call to reverse earlier, namely when the
rear list has more elements than the front list.

  balance (Queue f r)
    | length r >= length f = Queue (f ++ reverse r) []
  balance q = q

(We assume that length has been made O(1) by storing the lengths
explicitly.)

Now, the O(length r) reverse will not be evaluated before having "tailed"
through the previous front list with length f == length r items. Thus, we
can spread the cost of reverse as "debits" over these elements. When
finally executing reverse, its debits have already been paid off and tail
is O(1) again. And once executed, lazy evaluation memoizes the result, so
that sharing doesn't duplicate the work. (Note that strict languages
without side effects are doomed to be slower when persistence matters.
Ha! ;)

So much for a too short introduction to the classic purely functional queue
implementation. For a detailed exposition and much more, see also

  Chris Okasaki. Purely Functional Data Structures. (Thesis)
  http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

or his book with the same title which arose from this thesis.

Now, rings can be implemented in a similar style.
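Before turning to that, the "lengths stored explicitly" queue can be
written out as follows (an illustrative sketch in the spirit of Okasaki's
banker's queue, not code from the thread; uncons is a total variant of the
head/tail above):

  data Queue a = Queue !Int [a] !Int [a]   -- front length, front, rear length, rear

  empty :: Queue a
  empty = Queue 0 [] 0 []

  -- Suspend the reverse lazily as soon as the rear would overtake the front.
  balance :: Queue a -> Queue a
  balance q@(Queue lf f lr r)
    | lr >= lf  = Queue (lf + lr) (f ++ reverse r) 0 []
    | otherwise = q

  snoc :: Queue a -> a -> Queue a
  snoc (Queue lf f lr r) x = balance (Queue lf f (lr + 1) (x : r))

  uncons :: Queue a -> Maybe (a, Queue a)
  uncons (Queue _  []    _  _) = Nothing
  uncons (Queue lf (x:f) lr r) = Just (x, balance (Queue (lf - 1) f lr r))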
  data Ring a = Ring [a] a [a]

  rotL (Ring ls x (r:rs)) = balance $ Ring (x:ls) r rs
  rotR (Ring (l:ls) x rs) = balance $ Ring ls l (x:rs)

(For simplicity, we only deal with the case where the left and right list
are non-empty.)

How to balance? In contrast to queues, doing a full reverse when one list
is empty doesn't even work in the ephemeral case since a rotR following a
rotL will undo the reverse with yet another expensive reverse. But we can
apply the same idea as for persistent queues and balance as soon as one
list becomes like 2 times (or 3 or whatever) as large as the other one

  balance (Ring ls x rs)
    | length ls > 2*length rs = r'
    | length rs > 2*length ls = r'
    where
      n  = length ls + length rs
      k  = n `div` 2
      r' = Ring (take k     $ ls ++ reverse (drop (n-k) rs)) x
                (take (n-k) $ rs ++ reverse (drop k ls))
  balance r = r

This will make rotL and rotR run in O(1) amortized time.

Exercises:
 1) Complete the implementation of rotL and rotR. Besides dealing with
    possibly empty ls and rs, it's also possible to assume them non-empty
    and use special constructors for rings with <= 2 elements.
 2) Use the balancing scheme for rings to implement double-ended queues,
    i.e. queues with both cons and snoc.
 3) Read Okasaki's book and prove my O(1) claims :)

Regards,
apfelmus
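For readers attempting exercise 1, here is one possible completion (an
illustrative sketch, not apfelmus's solution): empty sides are handled by
moving the wrap point before rotating, which stays cheap because, once
balanced, a side can only be empty when the ring holds at most a couple of
elements.

  data Ring a = Ring [a] a [a]   -- left neighbours (nearest first), focus, right neighbours

  rotL, rotR :: Ring a -> Ring a
  rotL r@(Ring [] _ [])   = r                              -- one-element ring
  rotL (Ring ls x [])     = rotL (Ring [] x (reverse ls))  -- move the wrap point, then rotate
  rotL (Ring ls x (y:rs)) = balance (Ring (x:ls) y rs)

  rotR r@(Ring [] _ [])   = r
  rotR (Ring [] x rs)     = rotR (Ring (reverse rs) x [])
  rotR (Ring (y:ls) x rs) = balance (Ring ls y (x:rs))

  -- apfelmus's balancing scheme, restated: rebalance once one side is more
  -- than twice as long as the other.
  balance :: Ring a -> Ring a
  balance r@(Ring ls x rs)
    | length ls > 2 * length rs || length rs > 2 * length ls = r'
    | otherwise = r
    where
      n  = length ls + length rs
      k  = n `div` 2
      r' = Ring (take k       (ls ++ reverse (drop (n - k) rs))) x
                (take (n - k) (rs ++ reverse (drop k ls)))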

Justin Bailey wrote:
The other day I decided to implement a ring buffer with a current element (i.e. a doubly-linked zipper list).
[...]
p.s. The original motivation for writing this was to model cellular automata. The CA world is "circular", so that got me thinking about a structure that made connecting the ends easy to do.
Note that depending on your concrete setting, you may not need a fancy ring
structure for cellular automata. And with simple automata like

  c'_i = c_(i-1) `xor` c_i `xor` c_(i+1)

it may even be easier to generate fresh rings for each step in the
automaton:

  data Context a = Context [a] a [a]

  -- rotate left
  rotL (Context ls x (r:rs)) = Context (x:ls) r rs

  -- description of a cellular automaton
  type Rule a = Context a -> a

  example :: Rule Bool
  example (Context (cm:_) c (cp:_)) = cm `xor` c `xor` cp

  -- run a cellular automaton on an initial band of cells
  -- which is considered to be cyclic, i.e. a "cylinder"
  automate :: Rule a -> [a] -> [[a]]
  automate f xs = iterate (take n . map f . mkContexts) xs
    where
      -- length of the cell band
      n = length xs
      mkContexts xs = iterate rotL $
        Context (cycle $ reverse xs) (head xs) (tail $ cycle xs)

Here, mkContexts xs initializes a new infinite cyclic "ring" for xs and
rotates it left ad infinitum.

Regards,
apfelmus
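A small usage sketch (illustrative, not from the thread), assuming the
Context/Rule/automate definitions above are in scope and that xor comes
from Data.Bits (whose Bits instance for Bool provides it): print a few
generations of the xor rule on an 8-cell cyclic band with one live cell.

  import Data.Bits (xor)   -- provides xor for Bool, used by `example`

  main :: IO ()
  main = mapM_ (putStrLn . map cell) (take 8 (automate example band))
    where
      band = True : replicate 7 False   -- 8-cell cyclic band, one live cell
      cell True  = '#'
      cell False = '.'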

On Nov 10, 2007 12:24 PM, apfelmus wrote:
Note that depending on your concrete setting, you may not need a fancy ring structure for cellular automata. And with simple automata like
I realized that I never updated my automata once a row was created, and
ended up using an unboxed array with an index to represent the ring. I just
do some math when I want to "rotate" left or right and the index falls off
the edge. The rules are much more complex, though. I am using a genetic
algorithm technique to "evolve" 7-bit rules which can classify whether an
initial row was mostly black or mostly white. This is loosely related to a
class assignment.

I'm finding that taking 100 initial rules, determining their fitness on 100
initial automata, and doing that for 100 generations is taking a loooong
time. Our teacher's implementation, in C, does it in about a minute. Mine
takes hours :( . I think it's because the C algorithm does a lot of
bit-twiddling to iterate the automata, while I'm using lists of integers
(1, 0).

Anyways, thanks for your thoughts!

Justin
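For readers curious what that representation looks like, here is a minimal
sketch of the idea described above (names and types are illustrative
guesses, not Justin's code): an unboxed array of cells plus a focus index,
with the index wrapped by mod whenever it falls off either edge.

  import Data.Array.Unboxed (UArray, listArray, bounds, (!))

  data Ring = Ring (UArray Int Int) Int   -- cells, index of the current cell

  fromList :: [Int] -> Ring
  fromList xs = Ring (listArray (0, length xs - 1) xs) 0

  -- Rotate the focus; positive amounts go right, negative go left.
  rotate :: Int -> Ring -> Ring
  rotate k (Ring a i) = Ring a ((i + k) `mod` n)
    where n = snd (bounds a) + 1

  current :: Ring -> Int
  current (Ring a i) = a ! i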