
Thanks, that was extremely helpful.
My bad for reading the documentation so sloppily -- I
somehow glossed over the bit saying that backreferences work as one would
expect.
To atone for this,
http://patch-tag.com/repo/haskell-learning/browse/regexStuff/pcreReplace.hs
shows working =~ s/../../ -style substitution for both a pcre regex and a
posix-style regex (one that the pcre engine also accepts) in the same
example, which runs on the pcre regex engine. (See testPcre and testPosix.)
FWIW, I still think that there should be a library subRegex function
for all regex flavors, and not just Posix.
If there are gotchas about how capture references work in different
flavors, I might backpedal on this, but I'm not aware of any.
2009/3/16 ChrisK
Thomas Hartman wrote:
testPcre = ( subRegex (mkRegex "(?
quoting from the man page for regcomp:
REG_NEWLINE Compile for newline-sensitive matching. By default, newline is a completely ordinary character with no special meaning in either REs or strings. With this flag, `[^' bracket expressions and `.' never match newline, a `^' anchor matches the null string after any newline in the string in addition to its normal function, and the `$' anchor matches the null string before any newline in the string in addition to its normal function.
This is carried over to Text.Regex with

mkRegexWithOpts
  :: String  -- The regular expression to compile
  -> Bool    -- True <=> '^' and '$' match the beginning and end of individual lines respectively, and '.' does not match the newline character.
  -> Bool    -- True <=> matching is case-sensitive
  -> Regex   -- Returns: the compiled regular expression

Makes a regular expression, where the multi-line and case-sensitive options can be changed from the default settings.
Or with regex-posix directly the flag is "compNewline": http://hackage.haskell.org/packages/archive/regex-posix/0.94.1/doc/html/Text...
The defaultCompOpt is (compExtended .|. compNewline).
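A small sketch of what the first Bool changes, assuming the regex-compat package is installed and that it exposes mkRegexWithOpts and matchRegex from the Text.Regex module. The helper name dotMatches and the pattern "a.b" are made up for illustration only.

```haskell
import Data.Maybe (isJust)
import Text.Regex (matchRegex, mkRegexWithOpts)

-- Hypothetical helper: does "a.b" match the input under the given
-- multi-line setting? The second option keeps matching case-sensitive.
dotMatches :: Bool -> String -> Bool
dotMatches multiLine =
  isJust . matchRegex (mkRegexWithOpts "a.b" multiLine True)

main :: IO ()
main = do
  -- Newline-sensitive (REG_NEWLINE on): '.' should refuse to match '\n'.
  print (dotMatches True "a\nb")
  -- Newline flag off: '\n' is an ordinary character, so '.' can match it.
  print (dotMatches False "a\nb")
```

If the flag behaves as the man page describes, the two results should differ, which is why the first argument matters for any pattern that has to see across a newline.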
You want to match a \n that is not next to any other \n.
So you want to turn off REG_NEWLINE.
import Text.Regex.Compat
r :: Regex
r = mkRegexWithOpts "(^|[^\n])\n($|[^\n])" False True -- False is important here
The ^ and $ take care of matching a lone newline at the start or end of the whole text. In the middle of the text the pattern is equivalent to [^\n]\n[^\n].
When substituting you can use the \1 and \2 captures to restore the matched non-newline character if one was present.
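Putting the pattern and the captures together, a minimal sketch of the substitution might look like the following. It assumes regex-compat's subRegex (imported here from Text.Regex) and replaces each lone newline with a space; the names loneNewline and joinLoneNewlines and the choice of a space are illustrative, not taken from the original code.

```haskell
import Text.Regex (Regex, mkRegexWithOpts, subRegex)

-- The pattern from above, compiled with the newline flag off
-- (first Bool is False) and case-sensitive matching (second Bool is True).
loneNewline :: Regex
loneNewline = mkRegexWithOpts "(^|[^\n])\n($|[^\n])" False True

-- Hypothetical helper: replace every lone newline with a single space.
-- \1 and \2 put back the neighbouring non-newline characters, if any,
-- that the match consumed on either side.
joinLoneNewlines :: String -> String
joinLoneNewlines s = subRegex loneNewline s "\\1 \\2"

main :: IO ()
main = mapM_ (print . joinLoneNewlines) ["one\ntwo", "one\n\ntwo"]
```

One thing to watch for: each match consumes the character on either side of the newline, so when two lone newlines are separated by a single character the second one may be skipped. I have not checked how every regex-compat version handles that case.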
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe