[Haskell-begin] Looking for cunning ways to update a tree

Hello all, This is a really helpful list: I've learned half a dozen new things just by reading this month's traffic. Anyway...I have the following bit of code that updates a tree structure given a route to a leaf: data Tree a = Leaf a | Node (Tree a) (Tree a) deriving (Show, Eq) data PathSelector = GoLeft | GoRight deriving (Show, Eq) type Path = [PathSelector] selectChild (Node left _) GoLeft = left selectChild (Node _ right ) GoRight = right updateNode (Node _ right) GoLeft newLeft = Node newLeft right updateNode (Node left _) GoRight newRight = Node left newRight updateLeaf new (Leaf previous) = Leaf new updateTree :: Tree a -> Path -> a -> Tree a updateTree tree path newValue = case path of [] -> updateLeaf newValue tree (p:ps) -> updateNode tree p (updateTree' (selectChild tree p) ps newValue) I wanted to rewrite updateTree without using explicit recursion. Unfortunately, the best I could come up with is: upDownRecurse :: (a -> b -> a) -> (a -> c) -> (a -> b -> c -> c) -> a -> [b] -> c upDownRecurse down bottoming up = upDownRecurse' where upDownRecurse' acc [] = bottoming acc upDownRecurse' acc (x:xs) = up acc x (upDownRecurse' (down acc x) xs) updateTree' :: Tree a -> Path -> a -> Tree a updateTree' tree path newValue = upDownRecurse selectChild (updateLeaf newValue) updateNode tree path So what's the sexier way of doing this? Cheers, -- Matt

Hey Matt, On 24 jul 2008, at 20:29, Quergle Quergle wrote:
Hello all,
This is a really helpful list: I've learned half a dozen new things just by reading this month's traffic. Anyway...I have the following bit of code that updates a tree structure given a route to a leaf:
[...]
I wanted to rewrite updateTree without using explicit recursion. Unfortunately, the best I could come up with is:
[...]
So what's the sexier way of doing this?
The trick here is to define a fold for a tree. A fold is a function that does all the recursion, and you can then define other functions in terms of that fold. The type of the fold function is based on the structure of your data. So, for the fold of the tree, you basically first have to take two functions:
leaf :: a -> r node :: r -> r -> r
The leaf function will act on Leaf's, and take a value of type a and turn it into a result. The node function will first compute the result for the recursive parts (so this is where the recursion happens), and then needs to combine those results. The full type of our function foldTree looks like this:
foldTree :: (a -> r) -> (r -> r -> r) -> Tree a -> r
And the implementation looks like this:
foldTree leaf node (Leaf a) = leaf a foldTree leaf node (Node a b) = node (foldTree leaf node a) (foldTree leaf node b)
Now, to find a value in the tree using your Path type, we want the following type:
findTree :: Tree a -> Path -> a
So suppose we give our foldTree two arguments to handle both the Node and the Leaf, we will end up with a function that has type:
Tree a -> r
Now, what will we choose for r? Of course, it has to be Path a -> a! We now need a function (a -> Path a -> a) and a function (Path a -> a) -> (Path a -> a) -> (Path a -> a). When we remove unnecessary parentheses, the type is (Path a -> a) -> (Path a -> a) -> Path a -> a The first one is easy, we just ignore the second argument:
findLeaf a _ = a
The second one takes three arguments: the left value, the right value and the path. Based on the path we need to choose wheter to choose the left or the right value:
findNode left right (p:ps) = case p of GoLeft -> left ps GoRight -> right ps
Now we can take these parts and compose them into the findTree function:
findTree t p = foldTree findLeaf findNode t p
And because Haskell will save us from unnecessary typing, we could also write it shorter:
findTree = foldTree const findNode
Note that we didn't use any recursion in our findTree! It would be cool if you could try to come up with a definition of updateTree in terms of foldTree. Have fun, -chris P.S.: Here's the full code:
foldTree :: (a -> r) -> (r -> r -> r) -> Tree a -> r foldTree leaf node (Leaf a) = leaf a foldTree leaf node (Node a b) = node (rec a) (rec b) where rec = foldTree leaf node
findTree :: Tree a -> Path -> a findTree = foldTree const findNode where findNode left right (p:ps) = case p of GoLeft -> left ps GoRight -> right ps

Hi Chris,
On Fri, Jul 25, 2008 at 12:44 AM, Chris Eidhof
The trick here is to define a fold for a tree. A fold is a function that does all the recursion, and you can then define other functions in terms of that fold. The type of the fold function is based on the structure of your data.
<snip>
Note that we didn't use any recursion in our findTree! It would be cool if you could try to come up with a definition of updateTree in terms of foldTree.
Thanks for pointing me at foldTree. I'm afraid the best I was able to come up with is the rather clunky:
updateTree'' :: Tree a -> Path -> a -> Tree a updateTree'' tree = (foldTree leaf node tree) True where leaf currentValue True [] newValue = Leaf newValue leaf currentValue True path@(_:_) newValue = error ("Invalid path at leaf -- still has elements: " ++ show path) leaf currentValue False _ newValue = Leaf currentValue node left right True (GoLeft:ps) newValue = Node (left True ps newValue) (right False ps newValue) node left right True (GoRight:ps) newValue = Node (left False ps newValue) (right True ps newValue) node _ _ True [] _ = error "Invalid path at node -- no elements left" node left right False (p:ps) newValue = Node (left False ps newValue) (right False ps newValue) node left right False [] newValue = Node (left False [] newValue) (right False [] newValue)
I guess one downside to using foldTree is that it seems I'm obliged to visit every node in the tree, so the updated tree I get back is a deep copy of the one passed in, whereas ideally it'd share as much of the tree as is unchanged by the update. Cheers, -- Matt

Hey Matt, On 27 jul 2008, at 12:57, Quergle Quergle wrote:
Thanks for pointing me at foldTree. I'm afraid the best I was able to come up with is the rather clunky:
updateTree'' :: Tree a -> Path -> a -> Tree a updateTree'' tree = (foldTree leaf node tree) True where leaf currentValue True [] newValue = Leaf newValue leaf currentValue True path@(_:_) newValue = error ("Invalid path at leaf -- still has elements: " ++ show path) leaf currentValue False _ newValue = Leaf currentValue node left right True (GoLeft:ps) newValue = Node (left True ps newValue) (right False ps newValue) node left right True (GoRight:ps) newValue = Node (left False ps newValue) (right True ps newValue) node _ _ True [] _ = error "Invalid path at node -- no elements left" node left right False (p:ps) newValue = Node (left False ps newValue) (right False ps newValue) node left right False [] newValue = Node (left False [] newValue) (right False [] newValue)
The version I came up with was basically the same. Here are some minor things of your code: path@(_:_) can be rewritten as path, because you already know it's not an empty list. The last two lines of node can be merged into one line, because as soon as we are at a False, we don't care about the path anymore: node left right False _ newValue = Node (left False [] newValue) (right False [] newValue)
I guess one downside to using foldTree is that it seems I'm obliged to visit every node in the tree, so the updated tree I get back is a deep copy of the one passed in, whereas ideally it'd share as much of the tree as is unchanged by the update.
Yes, indeed. However, folds are a nice way to factor out recursion and can make life a whole lot easier. For example, it is easy to define map and filter (functions that work on a list) in terms of fold (which also works on a list). Similarly, it isn't too hard to define mapTree :: Tree a -> Tree b in terms of foldTree. In a way, folding is a really essential traversal of a data structure, so it can be seen as a primitive. A really nice exercise is to try and define filter, map and sum in terms of fold, and compare those with the versions that have explicit recursion. If you take, for example, map and sum: map f [] = [] map f (x:xs) = f x : map f xs sum [] = 0 sum (x:xs) = x + sum xs You'll see that they have almost the same structure (visually). If you factor out the differing bits, you'll probably come up with the definition of fold! Have fun, -chris

Hi Chris,
Thanks for the further pointers!
On Mon, Jul 28, 2008 at 11:12 AM, Chris Eidhof
Yes, indeed. However, folds are a nice way to factor out recursion and can make life a whole lot easier...In a way, folding is a really essential traversal of a data structure, so it can be seen as a primitive.
I'll admit to having seen folds before and being soundly impressed by how they capture a general pattern of recursion. Indeed, what actually prompted me to start this thread was that I had a function (updateTree) which did explicit recursion across a list, but I couldn't see a way to express it as (say) a fold. I guess I was wondering if there was any way to factor out the "updateTree" pattern of recursion in terms of simpler primitives (even though, for example, using a tree fold might be the more sensible choice for the particular example). -- Matt

Hi Matt, Quergle Quergle wrote:
I have the following bit of code that updates a tree structure given a route to a leaf:
data Tree a = Leaf a | Node (Tree a) (Tree a) deriving (Show, Eq) data PathSelector = GoLeft | GoRight deriving (Show, Eq) type Path = [PathSelector]
selectChild (Node left _) GoLeft = left selectChild (Node _ right ) GoRight = right
updateNode (Node _ right) GoLeft newLeft = Node newLeft right updateNode (Node left _) GoRight newRight = Node left newRight
updateLeaf new (Leaf previous) = Leaf new
updateTree :: Tree a -> Path -> a -> Tree a updateTree tree path newValue = case path of [] -> updateLeaf newValue tree (p:ps) -> updateNode tree p (updateTree' (selectChild tree p) ps newValue)
I wanted to rewrite updateTree without using explicit recursion. Unfortunately, the best I could come up with is:
upDownRecurse :: (a -> b -> a) -> (a -> c) -> (a -> b -> c -> c) -> a -> [b] -> c upDownRecurse down bottoming up = upDownRecurse' where upDownRecurse' acc [] = bottoming acc upDownRecurse' acc (x:xs) = up acc x (upDownRecurse' (down acc x) xs)
updateTree' :: Tree a -> Path -> a -> Tree a updateTree' tree path newValue = upDownRecurse selectChild (updateLeaf newValue) updateNode tree path
So what's the sexier way of doing this?
I would approach that problem sligthly differently, by writing combinators for tree updaters. A combinator is a function which works on function and produces new functions, like (.), which composes two functions. Combinators can be very helpful to produce complex functions out of simple ones. A tree updater is a function of type (Tree a -> Tree a). A simple tree updater works only on Leafs, and sets the value stored in the leaf: onLeaf :: a -> Tree a -> Tree a onLeaf new (Leaf previous) = Leaf new onLeaf new (Node left right) = error "not a leaf" Note that the type of onLeaf can be read as (a -> (Tree a -> Tree a)), emphasizing that it takes a new value, and returns a tree updater. Given such a tree updater and Tree, we can do the following things: (1) apply the updater to the tree (2) apply the updater to the left subtree, leave the right unchanged (3) apply the updater to the right subtree, leave the left unchanged Case (1) is just function application, but for (2) and (3) we can define the following combinators: onLeft :: (Tree a -> Tree a) -> Tree a -> Tree a onLeft updater (Leaf x) = error "not a node" onLeft updater (Node a b) = Node (updater a) b onRight :: (Tree a -> Tree a) -> Tree a -> Tree a onRight updater (Leaf x) = error "not a node" onRight updater (Node a b) = Node a (updater b) Note that the types of onLeft and onRight can be read (Tree a -> Tree a) -> (Tree a -> Tree a), emphasizing their function as "tree updater transformer". They take a tree updater, and return a different tree updater. Now, we want to transform a call to updateTree such as updateTree tree [GoLeft, GoLeft, GoRight] 42 into onLeft (onLeft (onRight (onLeaf 42))). As a first step, note that the nested tree updater above can be written as onLeft . onLeft . onRight . onLeaf $ 42 using the (.) and ($) combinators. Try to implement onPath :: Path -> (Tree a -> Tree a) -> Tree a -> Tree a and updateTree in terms of onPath and onLeaf. (onPath path updater) is supposed to apply updater to the subtree denoted by path. You can also try to write some more combinators, like onNode :: (Tree a -> Tree a) -> (Tree a -> Tree a) -> Tree a -> Tree a applyLeaf :: (a -> a) -> Tree a -> Tree a onTree :: (Tree a -> Tree a) -> (Tree a -> Tree a) -> Tree a -> Tree a everywhere :: (Tree a -> Tree a) -> Tree a -> Tree a (onNode f g) works for Nodes, and applies f to the left, and g to the right subtree. (applyLeaf f) works for Leafs, and applies f to the value stored in the leaf. (onTree f g) works for Leafs and Nodes, and applies f to the tree if it is a Leaf, and g if it is a Node. (everywhere f) works for whole trees, and applies f to each subtree, and each subtree of each subtree, and so on. It is possible to write onLeaf, onLeft and onRight in terms of these more general combinators. What is the minimum choice of combinators you need to define with pattern matching on Tree, so that all other combinators can be defined using only other combinators? Tillmann
participants (3)
-
Chris Eidhof
-
Quergle Quergle
-
Tillmann Rendel