
Hey guys,

This is probably more of a question about functional programming than it is about Haskell, but hopefully you can help me out. I'm new to thinking about things in a functional way, so I'm not sure what the best way to do some things is.

I'm in a data structures course right now, and the assignments for the course are done in Java. I figured it'd be fun to try to implement them in Haskell as well.

The part of the assignment I'm working on is to implement a RedBlueStack, a stack where you can push items in as either Red or Blue. You can pop Red and Blue items individually, but the stack has to keep track of the overall order of items. i.e. popping Blue in [Red, Red, Blue, Red, Blue] would give [Red, Red, Blue]

All push and pop operations on the stack have to be done in O(1) time. It was easy to do in Java, since you can just keep track of everything with a few pointers, but it took a while to get the Haskell version working. Maybe I'm just not used to the functional way of doing things.

I originally had this:

    data RBSItem a = Red a | Blue a

    data RedBlueStack a = RBS { red     :: [RBSItem a]
                              , blue    :: [RBSItem a]
                              , overall :: [RBSItem a]
                              }

But there was no way to keep popping in O(1) time, because I would have to walk through the overall list when removing items. I eventually came up with:

    data ShowColour = PlusRed | MinusRed | PlusBlue | MinusBlue
        deriving Eq  -- Eq is needed for the comparisons in remUseless

    data RedBlueStack a = RBS { red   :: [a]
                              , blue  :: [a]
                              , order :: [ShowColour]
                              }

    popRed :: RedBlueStack a -> RedBlueStack a
    popRed (RBS (r:rs) b o) = RBS rs b (MinusRed : o)
    popRed rbs@(RBS [] _ _) = rbs

    -- As an aside here, would it be better to put "popRed = id" for the
    -- catch-all in popRed instead of what I have? I don't know proper
    -- Haskell coding style yet.

    remUseless :: [ShowColour] -> [ShowColour]
    remUseless [] = []
    remUseless order@(x:xs)
        | x == MinusRed  = remShowColour PlusRed xs
        | x == MinusBlue = remShowColour PlusBlue xs
        | otherwise      = order
        where
        remShowColour r (c:cs)
            | c == r    = cs
            | otherwise = c : remShowColour r cs
        remShowColour _ [] = error "Incorrect Stack Order"

So instead of walking through the overall list, I just have to add a MinusRed or MinusBlue to the order list. This makes the pop and push functions operate in O(1) time, but it seems a bit excessive, because whenever the overall order of the stack is needed (e.g. printing the stack), I need to clean up the order list.

I was just wondering whether this is the best way to implement something like this: keeping track of the changes made instead of making the changes. If anyone has ideas for other ways of implementing this, I'd love to see them.

I also didn't take into account that Haskell is lazy. Will that have any effect on the running time? Probably not for a simple program like this, but I'm guessing it would for larger programs and more complex data structures and algorithms?

Thanks,
Matt
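For example, after two Red pushes and one Red pop (assuming push prepends a PlusRed marker, the way popRed prepends MinusRed), the order log is [MinusRed, PlusRed, PlusRed], and cleanup cancels the pair. A small session sketch, assuming the definitions above:

    -- remUseless cancels the leading MinusRed against the first PlusRed:
    -- >>> remUseless [MinusRed, PlusRed, PlusRed]
    -- [PlusRed]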

Try writing

    data RBStack = RBS [RBSItem] [RBSItem]

where the first list is all the same colour and the start of the second list is a different colour. The rest should follow naturally, and you will get amortised O(1) push and pop (you occasionally have to juggle the lists).
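One possible reading of that hint, as a rough sketch (the names and details here are guesses, not Jamie's code):

    data Colour = Red | Blue deriving (Eq, Show)

    -- Invariant: the front run is all one colour; the second list
    -- starts with the opposite colour.
    data RBStack a = RBS [(Colour, a)] [(Colour, a)]

    push :: Colour -> a -> RBStack a -> RBStack a
    push c x (RBS [] rest) = RBS [(c, x)] rest
    push c x (RBS front@((c', _) : _) rest)
        | c == c'   = RBS ((c, x) : front) rest
        | otherwise = RBS [(c, x)] (front ++ rest)  -- the "juggling"

    -- Popping the front colour is cheap; popping the other colour means
    -- digging past the front run.
    popFront :: RBStack a -> RBStack a
    popFront (RBS (_ : front) rest) = RBS front rest
    popFront s = s

The (++) in the juggling step is where the cost hides, which is exactly the problem apfelmus works out below.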
By the way, for this kind of question you'll get help much faster if
you ask on #haskell.
Jamie

Jamie Brandon wrote:

Try writing

    data RBStack = RBS [RBSItem] [RBSItem]

where the first list is all the same colour and the start of the second list is a different colour. The rest should follow naturally and you will get amortised O(1) push and pop (you occasionally have to juggle the lists).
I'm afraid this does not give constant amortized time. Let me reformulate the data type you have in mind as follows:

    data Stack2 r b = Empty | S [r] (Stack2 b r)
        deriving (Eq, Show)

We're using the type system to distinguish between red (r) and blue (b) elements. The list [r] corresponds to your first list and means that the stack has red elements on top. The rest is a stack with blue elements on top. Using different types for red and blue is extremely cool :) because the compiler will complain about buggy code that deletes red elements instead of blue ones and the like. It already helped me to find a bug in my implementation below.

All stack operations like push and pop will be performed on the red elements. We can switch between red and blue by making the list empty:

    recolor :: Stack2 r b -> Stack2 b r
    recolor (S [] t) = t
    recolor t        = S [] t

In other words, a stack with blue elements on top is a red stack with an empty list of red elements on top :). We impose the /invariant/ that only the topmost list may be empty.

Pushing a red element onto the stack is straightforward:

    push :: r -> Stack2 r b -> Stack2 r b
    push r Empty    = S [r] Empty
    push r (S rs t) = S (r:rs) t

and so is pushing blue elements, thanks to recoloring:

    pushB :: b -> Stack2 r b -> Stack2 r b
    pushB b = recolor . push b . recolor

The topmost element may be either blue or red:

    top :: Stack2 r b -> Maybe (Either r b)
    top Empty       = Nothing
    top (S [] t)    = fmap (\(Left b) -> Right b) (top t)
    top (S (r:_) t) = Just (Left r)

Most importantly, we want to remove elements. Removing a red element is easy if there are red elements on the top:

    pop :: Stack2 r b -> Stack2 r b
    pop (S (_:rs) t) = S rs t

otherwise we have to remove them from behind the blue elements, while taking care that our /invariant/ still holds:

    pop (S [] (S bs t)) = S [] . s bs . pop $ t
        where
        -- s is like S but takes care of the invariant
        s bs (S [] Empty)      = S bs Empty
        s bs (S [] (S bs' t')) = S (bs ++ bs') t'
        s bs t                 = t
    pop _ = Empty

Exercise: Find and correct the bug in this implementation of pop! Hint: let the type checker tell you where it is. Quiz question: How to remove blue elements?

Unfortunately, this whole implementation is not O(1) time. The problem is our use of ++. Consider the stack

    S [R] $ S [B] $ S [R] $ S [B] $ S [R] $ S [B] $ S [R] $ S [B] $ Empty

Removing all the blue elements from this stack will give

    S ((([R] ++ [R]) ++ [R]) ++ [R]) Empty

and we see the feared left-parenthesized application of list concatenation. Removing all red elements but one and asking for top will take quadratic time, which doesn't amortize to O(1).

In other words, while cool, the above implementation is not really what you want, Matthew. Quite a disappointing result for such a long e-mail ;). But don't worry: in a subsequent post, I'll turn the above ideas into a better solution, and I'll also explain why implementing this data structure seems more difficult in Haskell than in Java.

Regards,
apfelmus
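To see the pieces above in action, a tiny session (an illustration of mine, assuming the definitions are loaded; recall that pushB pushes a blue element):

    -- >>> top (push 'r' (pushB True Empty))
    -- Just (Left 'r')
    --
    -- After popping the red element, recoloring puts blue on top:
    -- >>> top (recolor (pop (push 'r' (pushB True Empty))))
    -- Just (Left True)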

apfelmus wrote:

    data Stack2 r b = Empty | S [r] (Stack2 b r) deriving (Eq, Show)
In the previous post, I considered an implementation of red-blue stacks with the data type above. Unfortunately, it failed to perform in O(1) time because list concatenation needs linear time:

    xs ++ ys takes O(length xs) time

But in Java, it's easy to append two (doubly) linked lists in constant time; I mean, just link the tail of the first list to the head of the second. Why do we need linear time in Haskell? As Derek already said, the thing is that in Haskell, all data structures are *persistent* by default. Appending two linked lists in Java mutates the first list, and its old version is no longer available. Doubly linked lists are said to be an *ephemeral* data structure. In Haskell, xs ++ ys does not change xs or ys at all; both are still around.

Persistent data structures are harder to come up with than ephemeral ones, but there are some beautiful techniques available. For more, see Okasaki's book:

    Chris Okasaki. Purely Functional Data Structures.
    http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
    (This is the thesis on which the book is based.)

In particular, chapter 7.2.1 presents a simple implementation of lists that supports head, tail and (++) in constant time. So, we could "rescue" our red-blue stack by

    data StackL r b = Empty | S (List r) (StackL b r)

using some more efficient data structure List a instead of [a]. That's exactly what we are going to do now, but with a twist: our lists won't store any elements at all!

    newtype List a = Length Int deriving (Eq, Show, Num)

Instead, we're only storing the length of the list, so that

    empty list  corresponds to  0
    tail        corresponds to  n-1
    ++          corresponds to  +

Clearly, this is a very efficient implementation of "lists", with "concatenation" in constant time^1. (Enable GeneralizedNewtypeDeriving so that the compiler will implement the instance Num (List a) for us.)

The implementation is just as before, except that we don't have any elements. But we can think of () as the element type.

    recolorL :: StackL r b -> StackL b r
    recolorL (S 0 t) = t
    recolorL t       = S 0 t

    pushL :: () -> StackL r b -> StackL r b
    pushL () Empty    = S 1 Empty
    pushL () (S rs t) = S (rs+1) t

    popL :: StackL r b -> StackL r b
    popL (S 0 (S bs t)) = S 0 . s bs . popL $ t
        where
        s bs (S 0 Empty)      = S bs Empty
        s bs (S 0 (S bs' t')) = S (bs + bs') t'
        s bs t                = S bs t
    popL (S rs t) | rs > 0 = S (rs-1) t  -- the guard keeps an effectively
                                         -- empty stack from going negative
    popL _ = Empty

    topL :: StackL r b -> Maybe (Either () ())
    topL Empty   = Nothing
    topL (S 0 t) = fmap (\(Left b) -> Right b) (topL t)
    topL (S _ _) = Just (Left ())

Note that we still enjoy the benefits of using different types for red and blue elements. For instance, the compiler won't allow us to add List r and List b, though both are plain integers. The r in List r is called a *phantom type* because, well, just like a phantom, the r isn't really there.

Now, this is all well and good, but we wanted to store actual elements, didn't we? While the implementation above can't do that, it's perfectly able to keep track of the order of elements. We just need to combine it with an external element storage to get a full red-blue stack:

    data RBStack r b = RBS [r] [b] (StackL r b)

    recolor (RBS rs bs n) = RBS bs rs (recolorL n)
    push r  (RBS rs bs n) = RBS (r:rs) bs (pushL () n)
    pop     (RBS rs bs n) = RBS (drop 1 rs) bs (popL n)

    top (RBS rs bs n) = fmap f (topL n)
        where
        f (Left  _) = Left  (head rs)
        f (Right _) = Right (head bs)

Last but not least, I would like to add that the above implementation is of course inspired by the technique of "numerical representation", i.e. the analogy between the representation of a number n and a container with n elements. So, the trick was basically to replace the Peano numbers [()] with the more efficient representation Int, while the special nature of the problem allowed us to store the elements externally. For more about designing purely functional data structures with numerical representations, see of course Okasaki's book and also:

    Ralf Hinze, Ross Paterson. Finger Trees: A Simple General-purpose Data Structure.
    http://www.soi.city.ac.uk/~ross/papers/FingerTree.html

Regards,
apfelmus

Footnotes:
^1 Actually, addition of big integers is linear in the number of digits, i.e. logarithmic in the size of the integer.
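A tiny session with the combined structure (my illustration, not part of the post; I'm taking RBS [] [] Empty as the empty red-blue stack, and pushing blue through recolor exactly like pushB in the previous post):

    -- >>> let pushB b = recolor . push b . recolor
    -- >>> top (pop (push 'r' (pushB 'b' (RBS [] [] Empty))))
    -- Just (Right 'b')
    --
    -- The red element pushed last is popped, and top then reports the
    -- remaining blue element.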

This is a very good post and a clever idea. Thanks!
Luke

apfelmus wrote:
[..]
Persistent data structures are harder to come up with than ephemeral ones, [...]
Yes, in some cases it's quite hard to find a persistent solution for a data structure that is rather trivial compared to its ephemeral counterpart. My question is: Is there a case where finding a persistent solution that performs equally well is *impossible*, rather than just harder? I mean, might there be a case where (forced) persistence (as we have in pure Haskell) is a definite disadvantage in terms of big-O notation? Do some problems even move from P to NP in a persistent setting?

Stephan

--
It used to be: I think, therefore I am. Today we know: it also works without that. - Dieter Nuhr

On 26 Sep 2008, at 19:18, Stephan Friedrichs wrote:
My question is: Is there a case where finding a persistent solution that performs equally well is *impossible*, rather than just harder? [...] Do some problems even move from P to NP in a persistent setting?
I'm fairly confident one could come up with a proof that you'll never go from P to NP because of it, along the lines of treating all memory as a list. Operations to modify memory are all in P (although slow), so any algorithm that relies on mutation to be in P will stay in P (although with a higher polynomial factor).

Bob

On Fri, Sep 26, 2008 at 7:18 PM, Stephan Friedrichs wrote:

My question is: Is there a case where finding a persistent solution that performs equally well is *impossible*, rather than just harder? [...] Do some problems even move from P to NP in a persistent setting?
The only result I'm aware of is that of Nicholas Pippenger, where he shows that there are algorithms which are slower by a factor of log n if one is not allowed to use mutation:

    http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.670

Interestingly enough, this particular result does not carry over to Haskell. The particular algorithm that he uses can actually be implemented optimally using lazy evaluation, as shown in the following paper:

    http://progtools.comlab.ox.ac.uk/members/oege/publications/jfp97

So a pure strict language is less efficient than a strict language with mutation and a pure lazy language. Although intuitively a pure lazy language should also be less efficient than a strict language with mutation, I'm not aware of any such results.

Cheers,
Josef

Josef Svenningsson wrote:
Stephan Friedrichs wrote:
My question is: Is there a case where finding a persistent solution that performs equally well is *impossible*, rather than just harder? [...]
The only result I'm aware of is that of Nicholas Pippenger where he shows that there are algorithms which are slower by a factor of log n if one is not allowed to use mutation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.670
Note, however, that Pippenger is forced to make the additional assumption that the computation is online, i.e. that it operates on an infinite list:

    data Inf a = a :> (Inf a)

    doPerms :: Int -> [Int] -> Inf a -> Inf a
    doPerms ~= \n ps -> concat . map (perm ps) . group n

I am not aware of any result that was able to lift this restriction. In section 3, Pippenger essentially discusses that every ephemeral data structure that needs T(n) time can be made persistent in T(n)*log T(n) time, basically by making the storage explicit, i.e. simulating RAM with a pure array like a binary tree. So, we can at least say that problems in P will stay there.
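To make that simulation concrete, a minimal sketch (my illustration, not from the post): mutable memory becomes a persistent map, so every read or write costs O(log n) instead of O(1), which is where the T(n)*log T(n) bound comes from.

    import qualified Data.Map as M

    -- Persistent "RAM": addresses are Ints, and old memory states
    -- remain valid after every write.
    type RAM a = M.Map Int a

    writeRAM :: Int -> a -> RAM a -> RAM a
    writeRAM = M.insert

    readRAM :: Int -> RAM a -> Maybe a
    readRAM = M.lookup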
Interestingly enough, this particular result does not carry over to Haskell. The particular algorithm that he uses can actually be implemented optimally using lazy evaluation, as show in the following paper: http://progtools.comlab.ox.ac.uk/members/oege/publications/jfp97
So a pure strict language is less efficient than a strict language with mutation and a pure lazy language. Although intuitively a pure lazy language should also be less efficient than a strict language with mutation I'm not aware of any such results.
Yes, lazy evaluation makes persistent data structures much easier, sometimes even possible at all. It only gives amortized bounds, though.

Regards,
apfelmus

Matthew Eastman said:
i.e. popping Blue in [Red, Red, Blue, Red, Blue] would give [Red, Red, Blue]
Hmm, did you mean [Red,Blue] or [Red,Red,Red,Blue]? Judging by your implementation of remUseless, I'm guessing the latter.

Here is a more straightforward approach than apfelmus'. I store colours separately, but count insertions so that I can easily reconstruct the overall ordering. To save myself some work, I've generalised to an arbitrary set of colours, though for O(1) behaviour, I'm assuming the set of colours is bounded finite. Unfortunately, this is still not quite O(1), due to the use of an Integer which can grow without bound. In practice, though, I don't think any of us will live long enough to notice.

\begin{code}
import qualified Data.Map as M
import Data.List
import Data.Maybe
import Control.Arrow

data CStack c a = CStack !Integer (M.Map c [(Integer,a)])

empty :: CStack c a
empty = CStack 0 M.empty

push :: Ord c => c -> a -> CStack c a -> CStack c a
push c x (CStack i m) = CStack (i+1) (M.insertWith (++) c [(i,x)] m)

popc :: Ord c => c -> CStack c a -> Maybe (a, CStack c a)
popc c (CStack i m) = do
  cs <- M.lookup c m
  (_,x) <- listToMaybe cs
  return (x, CStack i (M.adjust tail c m))

pop :: Ord c => CStack c a -> Maybe ((c,a), CStack c a)
pop = undefined -- left as an exercise :-)

toList :: CStack c a -> [(c,a)]
toList (CStack _ m) = map snd (foldr merge [] (map dist (M.toList m))) where
  dist (c,xs) = map (second ((,) c)) xs
  merge (xxs@((i,x):xs)) (yys@((j,y):ys))
    | i > j = (i,x) : merge xs yys
    | i < j = (j,y) : merge xxs ys
  merge xs [] = xs
  merge [] ys = ys

instance (Eq a, Eq c) => Eq (CStack c a) where
  x == y = toList x == toList y

instance (Show a, Show c) => Show (CStack c a) where
  show = show . toList

data RBColour = Red | Blue deriving (Eq,Ord,Show)
type RedBlueStack a = CStack RBColour a
\end{code}
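For instance (a session of my own, assuming the module above):

    -- >>> toList (push Blue 'b' (push Red 'a' (empty :: RedBlueStack Char)))
    -- [(Blue,'b'),(Red,'a')]
    -- >>> fmap fst (popc Red (push Blue 'b' (push Red 'a' empty)))
    -- Just 'a'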

Matthew Brecknell wrote:
Matthew Eastman said:
i.e. popping Blue in [Red, Red, Blue, Red, Blue] would give [Red, Red, Blue]
Hmm, did you mean [Red,Blue] or [Red,Red,Red,Blue]? Judging by your implementation of remUseless, I'm guessing the latter.
Yes, I meant the latter: popping Blue in [Red, Red, Blue, Red, Blue] should give [Red, Red, Red, Blue]. Sorry for the confusion, I shouldn't be writing emails at midnight, I guess!

apfelmus wrote:

...

Our lists won't store any elements at all!

    newtype List a = Length Int deriving (Eq,Show,Num)

Instead, we're only storing the length of the list, so that

    empty list  corresponds to  0
    tail        corresponds to  n-1
    ++          corresponds to  +

...

Regards,
apfelmus

Wow! That's a really clever way to think about a list. The way you push blue elements is pretty interesting too, switching the positions of the lists and doing a regular push. Very insightful posts.

I'm slowly reading through Okasaki's thesis now. I'm not sure how much of it I'm understanding, but it seems pretty interesting. I had no idea that functional (I suppose "persistent" is the correct word) data structures were so different from ephemeral ones.

Thomas Davie wrote:

In this interpretation, here's what I think is an O(1) implementation:

...

    rbPop :: RBColour -> RBStack a -> RBStack a
    rbPop c Empty = error "Empty Stack, can't pop"
    rbPop c (More c' v asCs nextNonC)
        | c == c'   = asCs
        | otherwise = rbPop c nextNonC

...

Your pop doesn't seem to be O(1), since you have to walk through the nextNonC stack if the colours don't match.

Thanks for the help everyone,
Matt

Matthew Eastman wrote:

Thomas Davie wrote:

In this interpretation, here's what I think is an O(1) implementation: [...]

Your pop doesn't seem to be O(1), since you have to walk through the nextNonC stack if the colours don't match.
Yep, this is still O(1) though, as you can guarantee that nextNonC will start with something of the correct colour. Thus the worst case here is that we walk once to the nextNonC element, and then do a different O(1) operation.

Bob

On Thu, 2008-09-25 at 00:11 -0400, Matthew Eastman wrote:
It was easy to do in Java since you can just keep track of everything with a few pointers, but it took a while to get the Haskell version working. Maybe I'm just not used to the functional way of doing things. [...]
Note that purely functional data structures are inherently persistent (i.e. you can access old copies as well as new copies). This is a significant extra constraint. Your Java type is almost certainly ephemeral, the opposite of persistent. Rewriting your Java code to be persistent while still maintaining the asymptotic complexities of the relevant operations is non-trivial. You can always achieve persistence by (deep) copying, but copying is an O(n) operation at best.

Consider a simple example. The requirements are a sequence of elements with O(1) add and remove at the beginning and end. This is easy: one solution is a doubly-linked list with head and tail pointers. Now if I add the requirement that it be persistent, i.e. if I have list1 = [a,b,c,d] and I make list2 = list1.RemoveLast(), then list1 should still be [a,b,c,d] and list2 should be [a,b,c]. Now the problem is quite a bit more difficult. Try it. Also try to prove (informally or formally) that the asymptotic complexities hold and that the persistence guarantee holds.

Okasaki's thesis and/or book, "Purely Functional Data Structures", goes into the differences and how to produce data structures with good complexity characteristics in a purely functional language. The implementations described are rather different from the usual implementations used for ephemeral data structures. The real-time deques described in the thesis are one solution to the above problem, in this case a purely functional one.

In a nutshell, persistent data structures are inherently more difficult to build than ephemeral ones*, which are what are usually described, and in a purely functional language all data structures are persistent.

* Proof: If I have a persistent data structure, I can make an ephemeral one with the same asymptotic complexity behaviour by simply having a mutable reference holding the persistent data structure.
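That closing proof is easy to make concrete (a minimal sketch of my own, not Derek's code):

    import Data.IORef

    -- Wrap any persistent structure s in a mutable cell. Updates overwrite
    -- the reference and discard old versions, so the wrapper is ephemeral
    -- while keeping the same per-operation asymptotics.
    newtype Ephemeral s = Ephemeral (IORef s)

    newEphemeral :: s -> IO (Ephemeral s)
    newEphemeral s = Ephemeral <$> newIORef s

    updateE :: Ephemeral s -> (s -> s) -> IO ()
    updateE (Ephemeral ref) f = modifyIORef' ref f

    queryE :: Ephemeral s -> (s -> a) -> IO a
    queryE (Ephemeral ref) f = f <$> readIORef ref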

On 25 Sep 2008, at 06:11, Matthew Eastman wrote:
The part of the assignment I'm working on is to implement a RedBlueStack, a stack where you can push items in as either Red or Blue. You can pop Red and Blue items individually, but the stack has to keep track of the overall order of items. i.e. popping Blue in [Red, Red, Blue, Red, Blue] would give [Red, Red, Blue] [...]
I wanted to add my own 2p to this discussion. I'm not dead certain I understand what is meant by the statement above, so I'm going to make a guess: when we pop an item, the top item on the stack should end up being the next item of the same colour as the one we popped.

In this interpretation, here's what I think is an O(1) implementation:

    data RBStack a = Empty | More RBColour a (RBStack a) (RBStack a)

    data RBColour = Red | Blue deriving Eq

    rbPush :: RBColour -> a -> RBStack a -> RBStack a
    rbPush c x Empty = More c x Empty Empty
    rbPush c x e@(More c' v asCs nextNonC)
        | c == c'   = More c x e nextNonC
        | otherwise = More c x nextNonC e

    rbPop :: RBColour -> RBStack a -> RBStack a
    rbPop c Empty = error "Empty Stack, can't pop"
    rbPop c (More c' v asCs nextNonC)
        | c == c'   = asCs
        | otherwise = rbPop c nextNonC

The idea is that an RBStack contains its colour, an element, and two other stacks: the first is the substack we should get by popping an element of the same colour; the second is the substack we get by looking for the next item of the other colour.

When we push, we compare colours with the top element of the stack, and we swap around the same-coloured/differently-coloured stacks appropriately.

When we pop, we jump to the first element of the right colour, and then we jump to the next element of the same colour.

I hope I haven't missed something.

Bob

Thomas Davie wrote:
I wanted to add my own 2p to this discussion. I'm not dead certain I understand what is meant by the statement above, so I'm going to make a guess: when we pop an item, the top item on the stack should end up being the next item of the same colour as the one we popped. [...] I hope I haven't missed something.
This looks O(1), but I don't understand your proposal well enough to say whether it matches what Matthew had in mind. Fortunately, understanding can be replaced with equational laws :) So, I think Matthew wants the following specification: a red-blue stack is a data structure

    data RBStack a

with the following operations

    data Color = Red | Blue

    empty :: RBStack a
    push  :: Color -> a -> RBStack a -> RBStack a
    pop   :: Color -> RBStack a -> RBStack a
    top   :: RBStack a -> Maybe (Color, a)

subject to the following laws:

    -- pop removes elements of the same color
    pop Red  . push Red  x = id
    pop Blue . push Blue x = id

    -- pop doesn't interfere with elements of the other color
    pop Blue . push Red  x = push Red  x . pop Blue
    pop Red  . push Blue x = push Blue x . pop Red

    -- top returns the last color pushed, or nothing otherwise
    (top . push c x) stack = Just (c,x)
    top empty              = Nothing

    -- pop on the empty stack does nothing
    pop c empty = empty

These laws uniquely determine the behavior of a red-blue stack. Unfortunately, your proposal does not seem to match the second group of laws:

      (pop Blue . push Red r . push Blue b) Empty
    = pop Blue (push Red r (More Blue b Empty Empty))
    = pop Blue (More Red r Empty (More Blue b Empty Empty))
    = pop Blue (More Blue b Empty Empty)
    = Empty

but

      (push Red r . pop Blue . push Blue b) Empty
    = push Red r (pop Blue (More Blue b Empty Empty))
    = push Red r Empty
    = More Red r Empty Empty

The red element got lost in the first case.

Regards,
apfelmus
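These laws could even be made executable. A sketch of how the first two groups might look as QuickCheck properties (my own code, not from the post; it assumes some implementation exporting empty, push and pop as above, plus a hypothetical observation function obs :: RBStack a -> [(Color, a)] listing the stack's contents in order, and an Eq instance for Color):

    import Test.QuickCheck

    instance Arbitrary Color where
        arbitrary = elements [Red, Blue]

    -- Build a stack from a list of pushes, so no Arbitrary instance
    -- is needed for RBStack itself.
    build :: [(Color, Int)] -> RBStack Int
    build = foldr (uncurry push) empty

    prop_cancel :: Color -> Int -> [(Color, Int)] -> Bool
    prop_cancel c x xs = obs (pop c (push c x s)) == obs s
        where s = build xs

    prop_commute :: Int -> [(Color, Int)] -> Bool
    prop_commute r xs =
        obs ((pop Blue . push Red r) s) == obs ((push Red r . pop Blue) s)
        where s = build xs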

On 27 Sep 2008, at 20:16, apfelmus wrote:
    -- pop removes elements of the same color
    pop Red  . push Red  x = id
    pop Blue . push Blue x = id

[...] These laws uniquely determine the behavior of a red-blue stack. [...] The red element got lost in the first case.
I don't think my proposal even meets the first set of laws -- I interpreted the question differently:

    (pop Red . push Red 1) (More Blue 2 Empty (More Red 3 Empty Empty))
        == More Red 3 Empty Empty

Bob

I'm a Haskell beginner, so the following code might not meet Haskell coding standards. I think it is a correct O(1) implementation. Sorry if I simply recoded an already-posted solution that I did not understand correctly.

--- code ---------

    module Main where

    data Col a = Red a | Blue a

    data RBStack a = RBS [Col Int] -- order
                         [a]       -- blues
                         [a]       -- reds

    empty = RBS [] [] []

    push (Blue e) (RBS [] [] [])             = RBS [Blue 1] [e] []
    push (Blue e) (RBS ((Blue n):ns) bs rs)  = RBS ((Blue (n+1)):ns) (e:bs) rs
    push (Blue e) (RBS ns bs rs)             = RBS ((Blue 1):ns) (e:bs) rs
    push (Red e)  (RBS [] [] [])             = RBS [Red 1] [] [e]
    push (Red e)  (RBS ((Red n):ns) bs rs)   = RBS ((Red (n+1)):ns) bs (e:rs)
    push (Red e)  (RBS ns bs rs)             = RBS ((Red 1):ns) bs (e:rs)

    popBlue (RBS [] _ _)      = error "no blue, empty stack"
    popBlue (RBS [Red _] _ _) = error "no blue"
    popBlue (RBS ((Red nr):(Blue 1):[]) [b] rs)             = RBS [Red nr] [] rs
    popBlue (RBS ((Red nr):(Blue 1):(Red nr'):s) (b:bs) rs) = RBS ((Red (nr+nr')):s) bs rs
    popBlue (RBS ((Red nr):(Blue nb):s) (b:bs) rs)          = RBS ((Red nr):(Blue (nb-1)):s) bs rs
    popBlue (RBS ((Blue 1):s) (b:bs) rs)                    = RBS s bs rs
    popBlue (RBS ((Blue nb):s) (b:bs) rs)                   = RBS (Blue (nb-1):s) bs rs

    popRed (RBS [] _ _)       = error "no red, empty stack"
    popRed (RBS [Blue _] _ _) = error "no red"
    popRed (RBS ((Blue nb):(Red 1):[]) bs [r])              = RBS [Blue nb] bs []
    popRed (RBS ((Blue nb):(Red 1):(Blue nb'):s) bs (r:rs)) = RBS ((Blue (nb+nb')):s) bs rs
    popRed (RBS ((Blue nb):(Red nr):s) bs (r:rs))           = RBS ((Blue nb):(Red (nr-1)):s) bs rs
    popRed (RBS ((Red 1):s) bs (r:rs))                      = RBS s bs rs
    popRed (RBS ((Red nr):s) bs (r:rs))                     = RBS (Red (nr-1):s) bs rs

    pop (RBS [] _ _)               = error "empty stack"
    pop rbs@(RBS ((Red _):_) _ _)  = popRed rbs
    pop rbs@(RBS ((Blue _):_) _ _) = popBlue rbs

    pp (RBS [] [] [])               = ""
    pp (RBS ((Red 1):s) bs (r:rs))  = "r " ++ pp (RBS s bs rs)
    pp (RBS ((Red n):s) bs (r:rs))  = "r " ++ pp (RBS ((Red (n-1)):s) bs rs)
    pp (RBS ((Blue 1):s) (b:bs) rs) = "b " ++ pp (RBS s bs rs)
    pp (RBS ((Blue n):s) (b:bs) rs) = "b " ++ pp (RBS ((Blue (n-1)):s) bs rs)

    altPushRed 0 = empty
    altPushRed n = push (Red n) (altPushBlue (n-1))

    altPushBlue 0 = empty
    altPushBlue n = push (Blue n) (altPushRed (n-1))

    main = do
      let s  = altPushRed 4
          s1 = popBlue $ popBlue $ s
          s2 = popRed  $ popRed  $ s
          s3 = pop     $ pop     $ s
      putStrLn ("s  = " ++ pp s)
      putStrLn ("s1 = " ++ pp s1)
      putStrLn ("s2 = " ++ pp s2)
      putStrLn ("s3 = " ++ pp s3)

It won't be O(1), but this is how I would do it. It uses alternating lists of red and blue elements. It has to access at most three elements from this list for any one operation, so as long as we don't have huge blocks of red or blue elements, performance should be quite good. The worst case I can think of is if we have an extremely large number of one colour followed by a single element of the other, then pop that single element off the stack. This would require two lists (before and after the single element) to be combined with ++, taking time linear in the size of the first list.

Anyway, here's some code:

    module RBStack where

    data RBColour = Red | Blue
      deriving (Show, Eq)

    data RBStack a = RBStack {
      headColour :: RBColour,
      stackElems :: [[a]]
    } deriving (Show, Eq)

    otherCol :: RBColour -> RBColour
    otherCol Red  = Blue
    otherCol Blue = Red

    empty :: RBStack a
    empty = RBStack Red []

    push :: RBColour -> a -> RBStack a -> RBStack a
    push col val stack
      | null (stackElems stack) = RBStack col [[val]]
      | headColour stack == col = RBStack col ((val:e):es)
      | otherwise               = RBStack col ([val]:e:es)
      where (e:es) = stackElems stack

    popColour :: RBColour -> RBStack a -> (Maybe a, RBStack a)
    popColour col stack
      | null (stackElems stack) = (Nothing, stack)
      | headColour stack == col =
          (Just (head e), if null (tail e)
                            then RBStack (otherCol col) es
                            else RBStack col ((tail e):es))
      | otherwise =
          if null es
            then (Nothing, empty)
            else let (f:fs) = es in
              (Just (head f), if null (tail f)
                                then (if null fs
                                        then RBStack (otherCol col) [e]
                                        else RBStack (otherCol col)
                                               ((e ++ head fs):(tail fs)))
                                else RBStack (otherCol col) (e:(tail f):fs))
      where (e:es) = stackElems stack

    pop :: RBStack a -> (Maybe (RBColour, a), RBStack a)
    pop stack
      | null (stackElems stack) = (Nothing, stack)
      | otherwise =
          (Just (col, head e), if null (tail e)
                                 then RBStack (otherCol col) es
                                 else RBStack col ((tail e):es))
      where
        (e:es) = stackElems stack
        col = headColour stack

    peek :: RBStack a -> Maybe (RBColour, a)
    peek = fst . pop

There was a bug in there with popping the non-head colour off the stack: when the requested colour wasn't present, popColour returned the empty stack instead of leaving the stack untouched. Updated popColour below (the rest of the module is unchanged), please test thoroughly:

    popColour :: RBColour -> RBStack a -> (Maybe a, RBStack a)
    popColour col stack
      | null (stackElems stack) = (Nothing, stack)
      | headColour stack == col =
          (Just (head e), if null (tail e)
                            then RBStack (otherCol col) es
                            else RBStack col ((tail e):es))
      | otherwise =
          if null es
            then (Nothing, stack)  -- was: (Nothing, empty)
            else let (f:fs) = es in
              (Just (head f), if null (tail f)
                                then (if null fs
                                        then RBStack (otherCol col) [e]
                                        else RBStack (otherCol col)
                                               ((e ++ head fs):(tail fs)))
                                else RBStack (otherCol col) (e:(tail f):fs))
      where (e:es) = stackElems stack
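A quick check of the fixed behaviour (my session, assuming the module above):

    -- >>> popColour Red (push Blue 'b' (push Red 'a' empty))
    -- (Just 'a',RBStack {headColour = Blue, stackElems = ["b"]})
    --
    -- >>> popColour Blue (push Red 'a' empty)  -- no blue present
    -- (Nothing,RBStack {headColour = Red, stackElems = ["a"]})
    --
    -- Before the fix, the second call returned an empty stack, losing 'a'.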
participants (11): apfelmus, Derek Elkins, Jamie Brandon, jean verdier, Josef Svenningsson, Luke Palmer, Matthew Brecknell, Matthew Eastman, Stephan Friedrichs, Thomas Davie, Timothy Goddard