Reconstructing a tree from a list of its paths (to leaves)

Hello, I am manipulating labeled multiway trees, some kind of "lightweight" XML notation. One thing I would like to be able to do is manipulating a tree as a list of (Path, Value). Generating such a list is easy but I am a little bit surprised to find it harder to reconstruct a tree, given such a list assuming some sensible properties (list is ordered, for example). I got the intuition this has already been tackled in one way or another in a functional setting in Haskell (I code in Java but using mostly functional constructs), but don't know where to look. Thanks for your pointers and help, Regards, Arnaud

On 10/04/12 09:55, Arnaud Bailly wrote:
Hello, I am manipulating labeled multiway trees, some kind of "lightweight" XML notation. One thing I would like to be able to do is manipulating a tree as a list of (Path, Value). Generating such a list is easy but I am a little bit surprised to find it harder to reconstruct a tree, given such a list assuming some sensible properties (list is ordered, for example).
I got the intuition this has already been tackled in one way or another in a functional setting in Haskell (I code in Java but using mostly functional constructs), but don't know where to look.
The haskell solution would be to consider first how to turn a single (Path,Value) into a tree. Then you just combine these trees for all the paths by taking their union. I attached some code. Twan

Hello Twan,
I have a rather clear idea of how I would do it (in Haskell or in
Java), but I feel this is something quite common that should have
already been handled in other contexts (eg. XML/XPath stuff, JSon, CSS
?), hence my question which was admittedly not clear enough.
Anyway, thanks a lot for your detailed answer and your code.
Best regards,
Arnaud
On Tue, Apr 10, 2012 at 12:26 PM, Twan van Laarhoven
On 10/04/12 09:55, Arnaud Bailly wrote:
Hello, I am manipulating labeled multiway trees, some kind of "lightweight" XML notation. One thing I would like to be able to do is manipulating a tree as a list of (Path, Value). Generating such a list is easy but I am a little bit surprised to find it harder to reconstruct a tree, given such a list assuming some sensible properties (list is ordered, for example).
I got the intuition this has already been tackled in one way or another in a functional setting in Haskell (I code in Java but using mostly functional constructs), but don't know where to look.
The haskell solution would be to consider first how to turn a single (Path,Value) into a tree. Then you just combine these trees for all the paths by taking their union. I attached some code.
Twan
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On 10/04/2012, at 7:55 PM, Arnaud Bailly wrote:
I am manipulating labeled multiway trees, some kind of "lightweight" XML notation. One thing I would like to be able to do is manipulating a tree as a list of (Path, Value). Generating such a list is easy but I am a little bit surprised to find it harder to reconstruct a tree, given such a list assuming some sensible properties (list is ordered, for example).
Given a tree, there is a unique set of (Path, Value) pairs. Given a set of (Path, Value) pairs, there might be no trees, one, or infinitely many. For example, suppose we have data XM = XE String [XM] | XL String as a simplification of XML and paths :: XM -> [([Int],String)] paths (XL s) = [([], s)] paths (XE _ ks) = [(i:p,v) | (i,k) <- zip [1..] ks, (p,v) <- paths k] as the function to reduce a tree to a list of (path,value) pairs. paths (XE "foo" [XE "bar" [XL "zabbo"], XE "ugh" [XL "troppo"]]) ==> [([1,1],"zabbo"),([2,1],"troppo")] in which "foo", "bar", and "ugh" have been irretrievably lost. So you need to be rather more explicit about your "sensible properties". (The list being ordered is not one of them; sorting is a solved problem.)

You are right, of course. By "sensible properties" I simply meant the
list of (Path, Value) is assumed to represent a tree (eg. it has been
generated by a traversal of some original tree). By "ordered" I meant
Path(s) segments are lexicographically ordered and (Path, Value) are
enumerated from a tree using depth-first traversal.
Thanks,
Arnaud
On Wed, Apr 11, 2012 at 2:15 AM, Richard O'Keefe
On 10/04/2012, at 7:55 PM, Arnaud Bailly wrote:
I am manipulating labeled multiway trees, some kind of "lightweight" XML notation. One thing I would like to be able to do is manipulating a tree as a list of (Path, Value). Generating such a list is easy but I am a little bit surprised to find it harder to reconstruct a tree, given such a list assuming some sensible properties (list is ordered, for example).
Given a tree, there is a unique set of (Path, Value) pairs. Given a set of (Path, Value) pairs, there might be no trees, one, or infinitely many.
For example, suppose we have
data XM = XE String [XM] | XL String
as a simplification of XML and
paths :: XM -> [([Int],String)]
paths (XL s) = [([], s)] paths (XE _ ks) = [(i:p,v) | (i,k) <- zip [1..] ks, (p,v) <- paths k]
as the function to reduce a tree to a list of (path,value) pairs.
paths (XE "foo" [XE "bar" [XL "zabbo"], XE "ugh" [XL "troppo"]]) ==> [([1,1],"zabbo"),([2,1],"troppo")]
in which "foo", "bar", and "ugh" have been irretrievably lost.
So you need to be rather more explicit about your "sensible properties". (The list being ordered is not one of them; sorting is a solved problem.)

On 11/04/2012, at 4:23 PM, Arnaud Bailly wrote:
You are right, of course. By "sensible properties" I simply meant the list of (Path, Value) is assumed to represent a tree (eg. it has been generated by a traversal of some original tree). By "ordered" I meant Path(s) segments are lexicographically ordered and (Path, Value) are enumerated from a tree using depth-first traversal.
My main point was that there is a "sensible property" that you did not mention, and you still have not mentioned it, namely that *ALL* non-structural information about a node must be represented in the Value that corresponds to it (and of course, that all the nodes are actually listed somewhere). Given your constraints, the sequence of (Path,Value) pairs must begin with a ([],Value) representing the non-structural information about the root, then a sequence of (1:x11,v11) ... (1:x1k,v1k) .... (n:xn1,vn1) ... (n:xnm,vnm) pairs which you partition as <(x11,v11) ... (x1k,v1k)> ... <(xn1,vn1) ... (xnm,vnm)> and then process recursively to make the children. The initial lexicographic ordering is _not_ an important property because you can ensure that by sorting. As long as the paths are numbered correctly, the kind of traversal that was used to generate the initial list is not important either. But the no-missing-nodes and no-missing-non-structural- information properties are crucial.
participants (3)
-
Arnaud Bailly
-
Richard O'Keefe
-
Twan van Laarhoven