How to tie knots in XML-style data structures

I am reading an XML file, where some nodes have attributes that refer to other nodes. For example: <blah name="first" type="5"> <yada blah="second"></yada> </blah> <blah name="second" type="3"> </blah> Here, the point is that the inner <yada> refers to a <blah> that is not its parent by means of a "blah" attribute. But there might also be a mistake in the XML file, which I would then want to report: the inner <yada> might refer to a <blah> which does not exist. Using an XML parsing library, this XML file can easily be read into a data type such as the following: data Blah = Blah {blahName :: String, blahType :: Int, blahYadas :: [Yada]} data Yada = Yada {yadaBlah :: String} But for my application, it is useful to have the Yadas refer back to actual Blahs, instead of just their name Strings. So I'd like to parse the XML into the following data structure: data Blah' = Blah' {blah'Name :: String, blah'Type :: Int, blah'Yadas :: [Yada']} data Yada' = Yada' {yada'Blah :: Blah} (the only change is the type of yada'Bla from String to Blah) 1. How can I write a function that takes a Blah, and outputs a Blah', and reports a nice error when there is no <blah> with the name specified in the <yada>? Can it be done at all without using partial functions like fromJust (when using Maybe to "report" errors)? 2. Is there a better way to define the types Blah, Yada, Blah' and Yada' (e.g. with less repetition)? After all, all I'm changing is the type of one field, but I end up having to redefine all my types.

Hi Auke,
I have admittedly only skimmed your email, but your problem reminds me
of a talk by Joachim Breitner.
A German transcript of the talk is available here:
http://www.joachim-breitner.de/publications/MonadFix_HaL10_2015-12-04.pdf
I hope this is useful to you!
Simon
2017-01-02 15:08 GMT+01:00 Auke Booij
I am reading an XML file, where some nodes have attributes that refer to other nodes. For example:
<blah name="first" type="5"> <yada blah="second"></yada> </blah> <blah name="second" type="3"> </blah>
Here, the point is that the inner <yada> refers to a <blah> that is not its parent by means of a "blah" attribute. But there might also be a mistake in the XML file, which I would then want to report: the inner <yada> might refer to a <blah> which does not exist.
Using an XML parsing library, this XML file can easily be read into a data type such as the following:
data Blah = Blah {blahName :: String, blahType :: Int, blahYadas :: [Yada]} data Yada = Yada {yadaBlah :: String}
But for my application, it is useful to have the Yadas refer back to actual Blahs, instead of just their name Strings. So I'd like to parse the XML into the following data structure:
data Blah' = Blah' {blah'Name :: String, blah'Type :: Int, blah'Yadas :: [Yada']} data Yada' = Yada' {yada'Blah :: Blah}
(the only change is the type of yada'Bla from String to Blah)
1. How can I write a function that takes a Blah, and outputs a Blah', and reports a nice error when there is no <blah> with the name specified in the <yada>? Can it be done at all without using partial functions like fromJust (when using Maybe to "report" errors)?
2. Is there a better way to define the types Blah, Yada, Blah' and Yada' (e.g. with less repetition)? After all, all I'm changing is the type of one field, but I end up having to redefine all my types. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Hi Auke,
2. Is there a better way to define the types Blah, Yada, Blah' and Yada' (e.g. with less repetition)? After all, all I'm changing is the type of one field, but I end up having to redefine all my types.
I think one way to accomplish your task would be to employ functors as basic building blocks that you can parameterise to express either Blah or Blah' types. The complete code with my attempt is in the attachment. To get more information on Fix type I'd recommend to read on recursion schemes/F-algebras, e.g. https://github.com/willtim/recursion-schemes/raw/master/slides-final.pdf.
1. How can I write a function that takes a Blah, and outputs a Blah', and reports a nice error when there is no <blah> with the name specified in the <yada>? Can it be done at all without using partial functions like fromJust (when using Maybe to "report" errors)?
I included my attempt at defining such function in the attached code. The idea is that conversion works in the Either monad that allows to report errors. Monadic context is threaded through whole conversion, so if any error occurs it will appear in the result. However, the conversion function, resolveYadas, is still pure from the outside and you can easily find out whether there were any errors by analyzing the Either result. Sample run: λ> resolveYadas sampleEnv $ BlahF "first" 10 [YadaF "second"] Right (Fix (C (BlahF {blahName = "first", blahType = 10, blahYadas = [YadaF {yadaBlah = Fix (C (BlahF {blahName = "second", blahType = 3, blahYadas = []}))}]}))) λ> resolveYadas sampleEnv $ BlahF "first" 10 [YadaF "third"] Left "No Blah with name \"third\"" Hope this helps! Regards, Sergey

On 3 January 2017 at 00:08, Sergey Vinokurov
The complete code with my attempt is in the attachment.
Thanks a lot for your work, Sergey. Your suggested solution contains many useful elements. I am still not quite satisfied, however, as your solution does not let me convert the following: sampleEnv :: Map String Blah sampleEnv = M.fromList [ ("first", BlahF "first" 5 [YadaF "second"]) , ("second", BlahF "second" 3 [YadaF "first"]) ] (the point here is that the "tie knots" in the thread subject means that we end up with potentially cyclic values)
information on Fix type I'd recommend to read on recursion schemes/F-algebras, e.g. https://github.com/willtim/recursion-schemes/raw/master/slides-final.pdf.
In retrospect this indeed sounds like the right solution, thanks for the pointer.

I have found a way to do this which is good enough for my purposes -
attached. I suspect this could also be modified to use Sergey's Fix
trick for code deduplication
On 3 January 2017 at 00:08, Sergey Vinokurov
Hi Auke,
2. Is there a better way to define the types Blah, Yada, Blah' and Yada' (e.g. with less repetition)? After all, all I'm changing is the type of one field, but I end up having to redefine all my types.
I think one way to accomplish your task would be to employ functors as basic building blocks that you can parameterise to express either Blah or Blah' types.
The complete code with my attempt is in the attachment. To get more information on Fix type I'd recommend to read on recursion schemes/F-algebras, e.g. https://github.com/willtim/recursion-schemes/raw/master/slides-final.pdf.
1. How can I write a function that takes a Blah, and outputs a Blah', and reports a nice error when there is no <blah> with the name specified in the <yada>? Can it be done at all without using partial functions like fromJust (when using Maybe to "report" errors)?
I included my attempt at defining such function in the attached code. The idea is that conversion works in the Either monad that allows to report errors. Monadic context is threaded through whole conversion, so if any error occurs it will appear in the result. However, the conversion function, resolveYadas, is still pure from the outside and you can easily find out whether there were any errors by analyzing the Either result.
Sample run: λ> resolveYadas sampleEnv $ BlahF "first" 10 [YadaF "second"] Right (Fix (C (BlahF {blahName = "first", blahType = 10, blahYadas = [YadaF {yadaBlah = Fix (C (BlahF {blahName = "second", blahType = 3, blahYadas = []}))}]}))) λ> resolveYadas sampleEnv $ BlahF "first" 10 [YadaF "third"] Left "No Blah with name \"third\""
Hope this helps!
Regards, Sergey
participants (3)
-
Auke Booij
-
Sergey Vinokurov
-
Simon Jakobi