Re: [Haskell-cafe] HXT: how to get sibling element

Oh, yes!
With such a poorly structured source I can try tagsoup (or take a look at xml-conduit).
Nevertheless, to understand HXT better it would be interesting to solve this problem in HXT. Or is that impossible?
15.03.2012, 20:08, "Asten, W.G.G. van (Wilfried, Student B-TI)"
: You might want to check out the xml-conduit package. It has preceding-sibling and following-sibling axes. I am not sure exactly how the package works, but it seems to be a good starting point.
2012/3/15 Никитин Лев
: I absolutely agree with you, but unfortunately it is not my XML file. It is extracted from an HTML page on a public web server, and I cannot change the format of that page. Sorry, I should have explained this in my first message.
But then what about getting the sibling text? (Getting a sibling is a separate, interesting task regardless of my concrete case.)
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

HXT's ArrowNavigatableTree module also provides a following-sibling axis:
http://hackage.haskell.org/packages/archive/hxt/9.2.2/doc/html/Control-Arrow...
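
To make the idea of a following-sibling axis concrete, here is a minimal sketch in plain Haskell that does not use HXT at all. The `Node` type and the functions `followingSiblings` and `labelled` are hypothetical names invented for this illustration, and the sample markup is assumed, since the real page was not posted in this thread.

```haskell
-- A toy document tree; NOT HXT's XmlTree, only an illustration.
data Node = Elem String [Node] | Text String
  deriving (Eq, Show)

-- Pair every child with the siblings that come after it, in document
-- order. This is the set of nodes a following-sibling axis walks.
followingSiblings :: [Node] -> [(Node, [Node])]
followingSiblings []       = []
followingSiblings (n : ns) = (n, ns) : followingSiblings ns

-- For each <span> child, return its text together with the text node
-- that directly follows it. Children that do not match are skipped.
labelled :: Node -> [(String, String)]
labelled (Elem _ kids) =
  [ (k, v) | (Elem "span" [Text k], Text v : _) <- followingSiblings kids ]
labelled (Text _) = []

-- Assumed sample, roughly:
-- <div><span>Title</span><span>Author</span>: Lev<span>Year</span>: 2012</div>
sample :: Node
sample = Elem "div"
  [ Elem "span" [Text "Title"]
  , Elem "span" [Text "Author"], Text ": Lev"
  , Elem "span" [Text "Year"],   Text ": 2012"
  ]

main :: IO ()
main = mapM_ print (labelled sample)
```

The first `span` has no text node directly after it, so it yields no pair. In HXT the same walk is done on a navigatable tree rather than on a plain list of children.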
Wilfried

Thanks to all. I've done it!

===============
-- TupleSections is needed for the ("Title",) section in getTitle.
{-# LANGUAGE TupleSections #-}

import Text.XML.HXT.Core
import Text.XML.HXT.Curl
import Text.XML.HXT.HTTP
import Control.Arrow.ArrowNavigatableTree

pageURL = "http://localhost/test.xml"

main = do
  r <- runX (configSysVars [withCanonicalize no, withValidate no, withTrace 0, withParseHTML no]
             >>> readDocument [withErrors no, withWarnings no, withHTTP []] pageURL
             >>> getChildren >>> isElem >>> hasName "div"
             >>> (getTitle <+> getSections))
  putStrLn "Articles:"
  putStrLn "<"
  mapM_ putStrLn $ map (\i -> (fst i) ++ " is " ++ (snd i) ++ "\n") r
  putStrLn ">"

-- The first <span> child holds the title.
getTitle = listA (getChildren >>> isElem >>> hasName "span")
           >>> arr head >>> getChildren >>> getText >>> arr trim >>> arr ("Title",)

-- Every later <span> is a section name; its value is the text node
-- that follows it as a sibling.
getSections = addNav
              >>> listA (getChildren >>> withoutNav (isElem >>> hasName "span"))
              >>> arr tail >>> unlistA
              >>> ((getChildren >>> remNav >>> getText)
                   &&& (listA followingSiblingAxis >>> arr head >>> remNav >>> getText >>> arr (rc . trim)))

-- Strip leading whitespace.
ltrim [] = []
ltrim (' ':x) = ltrim x
ltrim ('\n':x) = ltrim x
ltrim ('\r':x) = ltrim x
ltrim ('\t':x) = ltrim x
ltrim x = x

rtrim = reverse . ltrim . reverse

trim = ltrim . rtrim

-- Remove a leading ": " separator if present.
rc (':':' ':x) = x
rc x = x
==========================
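
As a side note, the hand-written trimming helpers could probably be replaced by functions from base. The following is only a sketch of that alternative, not part of the solution above. One difference to be aware of: `isSpace` matches more characters than the four handled by `ltrim` (for example form feed and non-breaking space), so the results can differ on unusual input.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)
import Data.Maybe (fromMaybe)

-- Strip whitespace from both ends of a string.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Remove a leading ": " separator if present; otherwise return the input.
rc :: String -> String
rc s = fromMaybe s (stripPrefix ": " s)

main :: IO ()
main = putStrLn (rc (trim "  : some text \n"))
```

The names `trim` and `rc` are kept the same as in the program above so that the two definitions are easy to compare.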
participants (2)
- Wilfried van Asten
- Никитин Лев