
Attached is the test text file.
I tested my regexp like this:
Prelude> :m + Text.Regex.PCRE
Prelude Text.Regex.PCRE> z <- readFile "test.html"
Prelude Text.Regex.PCRE> let (b, m ,a, ss) = z =~ ".*? b
...
n of the Triumvirate</td>\r\n
Judging from the values of b and m, the match is strangely shifted forward by 1 character (ss, the submatch, is even worse: it is shifted by 2 characters). Re-matching against a, and so on, gives correct results; only the first match is broken. Testing with regex-posix (with a suitably modified regexp), everything is OK.
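One thing that might be worth checking is whether the text before the first match contains non-ASCII characters. This is only a guess, not something confirmed above: if the library reports offsets in bytes while the String is indexed in characters, each such character would shift the result. The sketch below uses only base. The name countNonAscii and the sample string are hypothetical stand-ins for illustration; in GHCi the function would be applied to b.

```haskell
import Data.Char (isAscii)

-- | Count the characters outside the ASCII range in a string.
-- Each such character takes more than one byte in UTF-8, so character
-- offsets and byte offsets diverge after it.
countNonAscii :: String -> Int
countNonAscii = length . filter (not . isAscii)

main :: IO ()
main = do
  -- Hypothetical stand-in for the text before the first match (the `b` above).
  let before = "caf\233 <td>"
  print (countNonAscii before)
```

If the count for b matched the size of the shift, that would point towards an offset mismatch; if it is zero, this guess can be ruled out.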
$ ghc-pkg describe regex-pcre
name: regex-pcre
version: 0.94.4
id: regex-pcre-0.94.4-d45e00c9e113c7c9352d0785497e1dca
license: BSD3
copyright: Copyright (c) 2006, Christopher Kuklewicz
maintainer: TextRegexLazy@personal.mightyreason.com
stability: Seems to work, passes a few tests
homepage: http://hackage.haskell.org/package/regex-pcre
package-url: http://code.haskell.org/regex-pcre/
synopsis: Replaces/Enhances Text.Regex
description: The PCRE backend to accompany regex-base, see www.pcre.org
category: Text
author: Christopher Kuklewicz
exposed: True
exposed-modules: Text.Regex.PCRE Text.Regex.PCRE.Wrap
                 Text.Regex.PCRE.String Text.Regex.PCRE.Sequence
                 Text.Regex.PCRE.ByteString Text.Regex.PCRE.ByteString.Lazy
hidden-modules:
trusted: False
import-dirs: /home/magicloud/.cabal/lib/regex-pcre-0.94.4/ghc-7.6.1
library-dirs: /home/magicloud/.cabal/lib/regex-pcre-0.94.4/ghc-7.6.1
hs-libraries: HSregex-pcre-0.94.4
extra-libraries: pcre
extra-ghci-libraries:
include-dirs:
includes:
depends: array-0.4.0.1-cbe8814e07792e8f0d66cac77a2c0b6b
         base-4.6.0.0-9108e251636b0c8499261c52a7809ea1
         bytestring-0.10.0.1-11d4f52c4f4ed9833f768577b77050c5
         containers-0.5.2.1-b183418bc7f43ce98b6916ef296c2669
         regex-base-0.93.2-1ee07f806ad6b0c911226883d15b64f2
hugs-options:
cc-options:
ld-options:
framework-dirs:
frameworks:
haddock-interfaces: /home/magicloud/.cabal/share/doc/regex-pcre-0.94.4/html/regex-pcre.haddock
haddock-html: /home/magicloud/.cabal/share/doc/regex-pcre-0.94.4/html
pkgroot: "/home/magicloud/.ghc/x86_64-linux-7.6.1"

--
竹密岂妨流水过 山高哪阻野云飞
And for G+, please use magiclouds#gmail.com.