Regular Expression with PCRE

Hey everyone, I'm hoping someone can point me in the right direction.

The regex-pcre package exports (=~) and (=~~) as two useful infix functions. They're great! The only problem is that they only test for a *positive* match against a regex. I have a file that contains HTML comments (it was generated in Word) and I really just want the barest text. I already have a function that strips out all the tags, and I have a function that finds all the links and sticks those in another file for later perusal.

What I'd like is advice on how to implement the (!~) and (!~~) operators. They should have the same types as (=~) and (=~~). I'm stuck, though. Here's the source for both of those functions; they come from the Text.Regex.PCRE module.

    (=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target)
         => source1 -> source -> target
    (=~) x r = let q :: Regex
                   q = makeRegex r
               in match q x

    (=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, Monad m)
          => source1 -> source -> m target
    (=~~) x r = do (q :: Regex) <- makeRegexM r
                   matchM q x

What I figured I could do was find a function that was the inverse of "match" and "matchM", but I can't find any in the docs. I really hope I don't have to implement *that*, too. I'm still new at this, and that seems like it would be over my head.
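For what it's worth, here is a minimal sketch of one way the negated operators could look. It rests on an assumption that is not in the question: a "non-match" only has an obvious meaning when the result type is Bool, so the hypothetical (!~) and (!~~) below are specialised to String inputs instead of sharing the fully polymorphic types of (=~) and (=~~). It needs the regex-pcre package; dropCommentLines is an illustrative helper, not part of any library.

```haskell
module NegatedMatch ((!~), (!~~), dropCommentLines) where

import Text.Regex.PCRE ((=~))

-- Hypothetical operator (not exported by regex-pcre):
-- True when the text does NOT match the pattern.
-- Asking (=~) for a Bool result turns it into a plain match test,
-- so negating it is just 'not'.
(!~) :: String -> String -> Bool
text !~ pat = not (text =~ pat)

-- Hypothetical counterpart to (=~~), fixed to Maybe for simplicity:
-- Just () when the text does not match, Nothing when it does.
-- Unlike (=~~), an invalid pattern still raises an error here,
-- because this goes through (=~) rather than makeRegexM.
(!~~) :: String -> String -> Maybe ()
text !~~ pat = if text !~ pat then Just () else Nothing

-- Illustrative use: keep only the lines that do not open an HTML comment.
-- Multi-line comments would need more care than this.
dropCommentLines :: String -> String
dropCommentLines = unlines . filter (!~ "<!--") . lines
```

Because the operator is an ordinary function, it also works as a section, as in filter (!~ "<!--") above. Generalising the types to the RegexMaker / RegexContext constraints should be possible by fixing target to Bool, but that is left out of the sketch.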

Have you considered using one of the many amazing HTML parsers on Hackage? If the goal is to just get the HTML comments, that might be a much more effective use of your time.

-- Carter Tazio Schonwald

On Friday, March 16, 2012 at 4:55 PM, Joseph Bozeman wrote:
Hey everyone, I'm hoping someone can point me in the right direction.
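To make the parser suggestion concrete, here is a small sketch using tagsoup. The choice of tagsoup is an assumption, since no particular library was named in the reply, and bareText is a hypothetical helper name. The idea is that a parser separates text, tags, and comments into distinct tokens, so keeping only the text tokens drops comments without any regex.

```haskell
module BareText (bareText) where

import Text.HTML.TagSoup (innerText, parseTags)

-- Hypothetical helper: parse the HTML into a list of tags and keep
-- only the text nodes. Comments and markup are separate constructors
-- in tagsoup's Tag type, so they are left out of the result.
bareText :: String -> String
bareText = innerText . parseTags
```

This keeps the text inside every element, including ones such as style or script blocks, so Word-generated HTML may still need some filtering on top of it.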
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe