Matching word boundaries in Text.Regexp

I would like to match word boundaries in a regular expression, but this doesn't seem to work with Text.Regex in GHC 6.4.2. The regular expression looks something like "\\b(send|receive)\\b", which should match either the keyword send or the keyword receive but not the word sending. Using \< and \> to match the beginning and end of a word doesn't work either. Thanks for any help, Bernd

Bernd Holzmüller wrote:
> I would like to match word boundaries in a regular expression, but this doesn't seem to work with Text.Regex in GHC 6.4.2.
> The regular expression looks something like "\\b(send|receive)\\b", which should match either the keyword send or the keyword receive but not the word sending. Using \< and \> to match the beginning and end of a word doesn't work either.
> Thanks for any help, Bernd

What you want to do is not POSIX regular expression syntax. What you are asking for is Perl(-Compatible-Regular-Expressions, aka PCRE): http://perldoc.perl.org/perlre.html

This is provided in Haskell. You will first need to ensure you have the PCRE library. You may already have libpcre; if not, you can get it from http://www.pcre.org/ where it is developed.

I have the newest wrapper for calling this from Haskell: http://haskell.org/haskellwiki/Libraries_and_tools/Data_structures#Regular_e...

You will need the regex-base and regex-pcre packages from http://darcs.haskell.org/packages/

    darcs get --partial http://darcs.haskell.org/packages/regex-base/
    darcs get --partial http://darcs.haskell.org/packages/regex-pcre/

It works on both String and Data.ByteString (great performance), and with a bit of .cabal file editing (to point to libpcre) it should compile and run with GHC 6.4.2 (which I have done on OS X).

To easily compile the packages you will also need the Data.ByteString module, which is provided by Don's fps package: http://www.cse.unsw.edu.au/~dons/fps.html

    darcs get --partial http://www.cse.unsw.edu.au/~dons/code/fps

The Text.Regex module is the old Posix API. You don't want that. You want the new API exported by Text.Regex.Base and Text.Regex.PCRE, which uses (=~) and (=~~) and the classes RegexOptions, RegexMaker, RegexLike and RegexContext.

If you upgrade to GHC 6.6, then regex-base and Data.ByteString are already installed and you would only need regex-pcre.

Please continue to ask for help on this mailing list or on haskell-cafe.

(The regex-pcre wrapper was based on the older JRegex libpcre wrapper, which for reference is at http://repetae.net/john/computer/haskell/JRegex/ )
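
As a rough illustration of the new API, here is a minimal sketch that applies the pattern from the question through Text.Regex.PCRE. It assumes regex-pcre and libpcre are installed. The names keywordPattern, mentionsKeyword and keywordsIn are made up for this example and are not part of any library.

```haskell
module KeywordMatch where

import Text.Regex.PCRE ((=~), getAllTextMatches)

-- Pattern from the original question. \b is the PCRE word-boundary
-- assertion; the backslashes are doubled because this is a Haskell
-- string literal.
keywordPattern :: String
keywordPattern = "\\b(send|receive)\\b"

-- Hypothetical helper: True if the input contains "send" or "receive"
-- as a whole word.
mentionsKeyword :: String -> Bool
mentionsKeyword input = input =~ keywordPattern

-- Hypothetical helper: every whole-word keyword occurrence, in order.
keywordsIn :: String -> [String]
keywordsIn input = getAllTextMatches (input =~ keywordPattern)
```

With these definitions, mentionsKeyword "sending" should be False, while keywordsIn "I was sending mail; now send it and receive a reply" should give ["send","receive"]. The result type requested from (=~) selects what comes back (a Bool, a list of matches, and so on), which is the job of the RegexContext class mentioned above.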
participants (2)
- Bernd Holzmüller
- Chris Kuklewicz