Re: [Haskell-beginners] remove XML tags using Text.Regex.Posix

30 Sep 2009


HXT should be able to do what you're after quite easily from what I've seen.

On Wed, Sep 30, 2009 at 1:58 PM, Magnus Therning  wrote:
...
On Tue, Sep 29, 2009 at 12:25:07PM -0700, Robert Ziemba wrote:
...
I have been working with the regular expression package (Text.Regex.Posix).
 My hope was to find a simple way to remove a pair of XML tags from a short
string.
I have something like this "<tag>Data</tag>" and would like to extract
'Data'.  There is only one tag pair, no nesting, and I know exactly what the
tag is.
My first attempt was this:
  "<tag>123</tag>" =~ "[^<tag>].+[^</tag>]"::String
result:  "123"
Upon further experimenting I realized that it only works with more than 2
digits in 'Data'.  It occurred to me that my thinking on how this regular
expression works was not correct - but I don't understand why it works at
all for 3 or more digits.
Can anyone help me understand this result and perhaps suggest another
strategy?  Thank you.
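
A note on why the pattern behaves this way, offered as a sketch rather than a definitive answer: in POSIX syntax, [^<tag>] is a bracket expression that matches exactly one character that is not '<', 't', 'a', 'g' or '>'. It does not mean "anything that is not the tag". The pattern therefore needs one such character, then at least one arbitrary character (.+), then one character outside the second set, which is why nothing shorter than three characters can match. The helper names original and extract below are made up for illustration, and the example assumes the regex-posix package is installed.

```haskell
import Text.Regex.Posix ((=~))

-- The pattern from the question. Each [^...] consumes exactly ONE character
-- that is not in the listed set, so a match needs at least three characters.
original :: String -> String
original s = s =~ "[^<tag>].+[^</tag>]"

-- One possible fix: match the literal tags and capture what lies between.
-- The 4-tuple result is (before, whole match, after, capture groups).
extract :: String -> Maybe String
extract s =
  case s =~ "<tag>([^<]*)</tag>" :: (String, String, String, [String]) of
    (_, _, _, [inner]) -> Just inner
    _                  -> Nothing

main :: IO ()
main = do
  print (original "<tag>123</tag>")   -- "123"
  print (original "<tag>12</tag>")    -- "" (no three-character match exists)
  print (original "<tag>Data</tag>")  -- "" ('a' and 't' are in the excluded sets)
  print (extract "<tag>12</tag>")     -- Just "12"
  print (extract "<tag>Data</tag>")   -- Just "Data"
  print (extract "no tags here")      -- Nothing
```

The capture-group version only relies on the tag being known in advance and on the data containing no '<', which matches the situation described above.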
Personally I would have used tagsoup for this sort of thing.  Keep in mind the
eternal words
 Some people, when confronted with a problem, think 'I know, I'll use
 regular expressions.' Now they have two problems.
      -- Jamie Zawinski
As you so nicely demonstrated yourself ;-)
/M
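
For readers who want to try the tagsoup suggestion, here is a minimal sketch, assuming the tagsoup package is installed. The name stripTags is hypothetical; parseTags and innerText are from Text.HTML.TagSoup.

```haskell
import Text.HTML.TagSoup (innerText, parseTags)

-- Parse the string into a list of tags and keep only the text content.
-- stripTags is an illustrative name, not part of tagsoup.
stripTags :: String -> String
stripTags = innerText . parseTags

main :: IO ()
main = putStrLn (stripTags "<tag>Data</tag>")  -- prints Data
```

This drops every tag in the input, not only the one named tag, which is fine for the single-pair case described in the question.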
--
Magnus Therning                        (OpenPGP: 0xAB4DFBA4)
magnus＠therning．org          Jabber: magnus＠therning．org
http://therning.org/magnus         identi.ca|twitter: magthe

Lyndon Maydwell