Re: Understanding tail recursion and trees

Hi, Thanks to Miguel for pointing out my silly error. So at least my understanding of tail recursion is correct :) So then the question becomes: what *is* the best way to write this function? One version I can think of is
ecount :: [Tree] -> Integer -> Integer ecount [] acc = acc ecount (Leaf _ : ts) acc = ecount ts $! (acc + 1) ecount (Branch t1 t2 : ts) acc = ecount (t1 : t2 : ts) acc
which essentially maintains an explicit stack and runs on all trees. Are there better ways to do this? Thanks again and sorry for my mistake, Edsko

On Thu, May 1, 2008 at 9:32 AM, Edsko de Vries
So then the question becomes: what *is* the best way to write this function?
I guess it would be simpler to have the counter on the data type and a smart branch constructor:
data Tree = Leaf Integer | Branch Integer Tree Tree
count :: Tree -> Integer count (Leaf _) = 1 count (Branch c _ _) = c
branch :: Tree -> Tree -> Tree branch left right = Branch (count left + count right) left right
gentreeL :: Integer -> Tree gentreeL 0 = Leaf 0 gentreeL n = branch (gentreeL (n-1)) (Leaf 0)
etc. -- Felipe.

On Thu, May 1, 2008 at 9:44 AM, Felipe Lessa
On Thu, May 1, 2008 at 9:32 AM, Edsko de Vries
wrote: So then the question becomes: what *is* the best way to write this function?
I guess it would be simpler to have the counter on the data type and a smart branch constructor:
Under more careful analysis, it seems I just moved the stack overflow from the counter function to the generator =). Modifying the data type to
data Tree = Leaf Integer | Branch !Integer Tree Tree
also won't work in this example (although it seems to fail earlier). However, I'd expect the data type above to work nicely with most real applications (that doesn't construct the entire tree in one go), such as Data.Map[1]. But the answer to your original question really seems to be using an explicit stack created in the heap. This technique is used in Data.Map in a certain case[2] and, although has received a lot of attention on a thread that sparked some time ago (I think it was [3]) for being merely a workaround over the limited stack, it seems to me it's a compulsory workaround for the time being when working with problems like yours. HTH, [1] http://haskell.org/ghc/docs/latest/html/libraries/containers/src/Data-Map.ht... [2] http://haskell.org/ghc/docs/latest/html/libraries/containers/src/Data-Map.ht... [3] http://www.haskell.org/pipermail/haskell-cafe/2008-February/039104.html -- Felipe.

Felipe Lessa wrote:
On Thu, May 1, 2008 at 9:44 AM, Felipe Lessa
wrote: On Thu, May 1, 2008 at 9:32 AM, Edsko de Vries
wrote: So then the question becomes: what *is* the best way to write this function?
I guess it would be simpler to have the counter on the data type and a smart branch constructor:
Under more careful analysis, it seems I just moved the stack overflow from the counter function to the generator =). Modifying the data type to
data Tree = Leaf Integer | Branch !Integer Tree Tree
also won't work in this example (although it seems to fail earlier). However, I'd expect the data type above to work nicely with most real applications (that doesn't construct the entire tree in one go), such as Data.Map[1].
But the answer to your original question really seems to be using an explicit stack created in the heap. This technique is used in Data.Map in a certain case[2] and, although has received a lot of attention on a thread that sparked some time ago (I think it was [3]) for being merely a workaround over the limited stack, it seems to me it's a compulsory workaround for the time being when working with problems like yours.
I think that consuming heap instead of stack is the best we can do. I may be wrong, but it seems to be more or less impossible to traverse a tree in constant memory. Well, if the tree structure doesn't have back links (apart from left, right). The thing is we have to remember nodes to return and remember if we went to the left or to the right. The left or right biased tree is just a list-like structure, where we don't have to remember anything, we can just proceed. That's why it easy to traverse them in constant memory. Maybe, in a C program we could traverse a tree without back links in constant memory by XORing pointers and left-right booleans. That would employ the property of xor that (a xor b) xor a = b. But I'm not sure. Well, anyway, that's not about Haskell.

On Fri, 2008-05-02 at 00:10 +0400, Daniil Elovkov wrote:
Felipe Lessa wrote:
On Thu, May 1, 2008 at 9:44 AM, Felipe Lessa
wrote: On Thu, May 1, 2008 at 9:32 AM, Edsko de Vries
wrote: So then the question becomes: what *is* the best way to write this function?
I guess it would be simpler to have the counter on the data type and a smart branch constructor:
Under more careful analysis, it seems I just moved the stack overflow from the counter function to the generator =). Modifying the data type to
data Tree = Leaf Integer | Branch !Integer Tree Tree
also won't work in this example (although it seems to fail earlier). However, I'd expect the data type above to work nicely with most real applications (that doesn't construct the entire tree in one go), such as Data.Map[1].
But the answer to your original question really seems to be using an explicit stack created in the heap. This technique is used in Data.Map in a certain case[2] and, although has received a lot of attention on a thread that sparked some time ago (I think it was [3]) for being merely a workaround over the limited stack, it seems to me it's a compulsory workaround for the time being when working with problems like yours.
I think that consuming heap instead of stack is the best we can do.
I may be wrong, but it seems to be more or less impossible to traverse a tree in constant memory. Well, if the tree structure doesn't have back links (apart from left, right).
The thing is we have to remember nodes to return and remember if we went to the left or to the right. The left or right biased tree is just a list-like structure, where we don't have to remember anything, we can just proceed. That's why it easy to traverse them in constant memory.
Maybe, in a C program we could traverse a tree without back links in constant memory by XORing pointers and left-right booleans. That would employ the property of xor that (a xor b) xor a = b. But I'm not sure.
Well, anyway, that's not about Haskell.

On Thu, May 1, 2008 at 4:10 PM, Daniil Elovkov
Felipe Lessa wrote:
On Thu, May 1, 2008 at 9:44 AM, Felipe Lessa
wrote: On Thu, May 1, 2008 at 9:32 AM, Edsko de Vries
wrote: So then the question becomes: what *is* the best way to write this function?
I guess it would be simpler to have the counter on the data type and a smart branch constructor:
Under more careful analysis, it seems I just moved the stack overflow from the counter function to the generator =). Modifying the data type to
data Tree = Leaf Integer | Branch !Integer Tree Tree
also won't work in this example (although it seems to fail earlier). However, I'd expect the data type above to work nicely with most real applications (that doesn't construct the entire tree in one go), such as Data.Map[1].
But the answer to your original question really seems to be using an explicit stack created in the heap. This technique is used in Data.Map in a certain case[2] and, although has received a lot of attention on a thread that sparked some time ago (I think it was [3]) for being merely a workaround over the limited stack, it seems to me it's a compulsory workaround for the time being when working with problems like yours.
I think that consuming heap instead of stack is the best we can do.
I may be wrong, but it seems to be more or less impossible to traverse a tree in constant memory. Well, if the tree structure doesn't have back links (apart from left, right).
I think Huet's Zipper is intended to solve this sort of problem.
data Path = Top | BranchL Path Tree | BranchR Tree Path
type Zipper = (Path, Tree)
openZipper :: Tree -> Zipper
openZipper t = (Top, t)
Conceptually the zipper is a tree with one subtree selected. You can
move the selection point with the (partial) functions defined below.
downL, downR, up :: Zipper -> Zipper
downL (p, Branch l r) = (BranchL p r, l)
downR (p, Branch l r) = (BranchR l p, r)
up (BranchL p r, l) = (p, Branch l r)
up (BranchR l p, r) = (p, Branch l r)
Note that these functions just shuffle existing subtrees around.
Depending on your traversal pattern, these can run in roughly constant
space.
Using the zipper, we can define functions that traverse the entire
tree and return a new tree:
number :: Tree -> Tree
number t = down Top t 0
where
down p (Leaf _) n = up p (Leaf n) $! n + 1
down p (Branch l r) n = down (BranchL p r) l n
up Top t n = t
up (BranchL p r) l n = down (BranchR l p) r n
up (BranchR l p) r n = up p (Branch l r) n
For something like counting, we can simplify considerably because we
don't need to retain the already-traversed portion of the tree.
acountZ :: Tree -> Integer
acountZ t = down t [] 0
where
down (Branch l r) p i = down l (r:p) i
down (Leaf _) (p:ps) i = down p ps $! i + 1
down (Leaf _) [] i = i + 1
--
Dave Menendez

Hi,
I think Huet's Zipper is intended to solve this sort of problem.
data Path = Top | BranchL Path Tree | BranchR Tree Path type Zipper = (Path, Tree)
openZipper :: Tree -> Zipper openZipper t = (Top, t)
Conceptually the zipper is a tree with one subtree selected. You can move the selection point with the (partial) functions defined below.
downL, downR, up :: Zipper -> Zipper downL (p, Branch l r) = (BranchL p r, l) downR (p, Branch l r) = (BranchR l p, r) up (BranchL p r, l) = (p, Branch l r) up (BranchR l p, r) = (p, Branch l r)
Note that these functions just shuffle existing subtrees around. Depending on your traversal pattern, these can run in roughly constant space.
Using the zipper, we can define functions that traverse the entire tree and return a new tree:
number :: Tree -> Tree number t = down Top t 0 where down p (Leaf _) n = up p (Leaf n) $! n + 1 down p (Branch l r) n = down (BranchL p r) l n
up Top t n = t up (BranchL p r) l n = down (BranchR l p) r n up (BranchR l p) r n = up p (Branch l r) n
Please correct me if I'm wrong, but doesn't the the size of the zipper grow every time we move down the left branch? I.e., by the time we reach the leaf, we'll have a zipper (BranchL (BranchL ..)) of size the depth of the tree? Or am I missing something? Edsko

On Sat, May 3, 2008 at 12:30 PM, Edsko de Vries
I think Huet's Zipper is intended to solve this sort of problem.
data Path = Top | BranchL Path Tree | BranchR Tree Path type Zipper = (Path, Tree)
openZipper :: Tree -> Zipper openZipper t = (Top, t)
Conceptually the zipper is a tree with one subtree selected. You can move the selection point with the (partial) functions defined below.
downL, downR, up :: Zipper -> Zipper downL (p, Branch l r) = (BranchL p r, l) downR (p, Branch l r) = (BranchR l p, r) up (BranchL p r, l) = (p, Branch l r) up (BranchR l p, r) = (p, Branch l r)
Note that these functions just shuffle existing subtrees around. Depending on your traversal pattern, these can run in roughly constant space.
Using the zipper, we can define functions that traverse the entire tree and return a new tree:
number :: Tree -> Tree number t = down Top t 0 where down p (Leaf _) n = up p (Leaf n) $! n + 1 down p (Branch l r) n = down (BranchL p r) l n
up Top t n = t up (BranchL p r) l n = down (BranchR l p) r n up (BranchR l p) r n = up p (Branch l r) n
Please correct me if I'm wrong, but doesn't the the size of the zipper grow every time we move down the left branch? I.e., by the time we reach the leaf, we'll have a zipper (BranchL (BranchL ..)) of size the depth of the tree? Or am I missing something?
If there are no other references to the tree, then each time a new
node in the zipper is created, the corresponding node in the original
tree can be discarded. Typically, the garbage won't be collected
immediately, so there will be some growth in actual memory usage, but
this will be bounded.
If there are other references to the tree, then the nodes in the
original tree cannot be reclaimed. Even so, the zipper never grows in
size beyond the longest path from root to leaf in the tree.
--
Dave Menendez
participants (5)
-
Daniil Elovkov
-
David Menendez
-
Derek Elkins
-
Edsko de Vries
-
Felipe Lessa