Prelude> :m + Text.Regex.PCRE
Prelude Text.Regex.PCRE> z <- readFile "test.html"
Prelude Text.Regex.PCRE> let (b, m ,a, ss) = z =~ "<a href=\"(.*?)\">.*?<img class=\"article-image\"" :: (String, String, String, [String])
Prelude Text.Regex.PCRE> b
...
n of the Triumvirate</td>\r\n <td class=\"small\">David Rapoza</td>\r\n <td class=\"small\">\r\n <i>Return to Ravnica</i>\r\n </td>\r\n <td class=\"small\">10/31/2012</td>\r\n </tr><tr>\r\n <td class=\"small\"><"
Prelude Text.Regex.PCRE> m
"a href=\"/magic/magazine/article.aspx?x=mtg/daily/activity/1088\"><img class=\"article-image\" "
Judging from the values of b and m, the match is shifted forward by one character: b ends with the "<" that should open the match, and m starts at "a href". The submatch list ss is even worse, shifted by two characters. Matching again on a, and on the following remainders, gave correct results. Only the first match was broken.
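One possible explanation, offered only as a guess and not confirmed anywhere in this report: the String backend may hand the text to PCRE as encoded bytes and then apply the returned byte offsets as character offsets. If so, every non-ASCII character before the first match would push the result forward. The sketch below uses only base and assumes UTF-8 encoding. The names utf8Width and byteSkew are hypothetical helpers, not part of regex-pcre.

```haskell
import Data.Char (ord)

-- Number of bytes a character occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- How far a UTF-8 byte offset runs ahead of the character offset
-- at the end of the given prefix (0 for pure ASCII text).
byteSkew :: String -> Int
byteSkew = sum . map (\c -> utf8Width c - 1)
```

Evaluating byteSkew b in the session above would show whether the prefix contains enough non-ASCII characters to account for the observed shift. A result of 0 would rule this guess out.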
I tested the same input with regex-posix (with a modified regexp), and everything was OK.
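A possible workaround while staying with regex-pcre, sketched under the assumption that the problem is specific to String input: run the same pattern over a ByteString, so that the offsets PCRE returns and the data being sliced are both counted in bytes. The name firstLink is a hypothetical helper, and this has not been checked against test.html.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Regex.PCRE ((=~))

-- Same pattern as in the GHCi session.
linkPattern :: String
linkPattern = "<a href=\"(.*?)\">.*?<img class=\"article-image\""

-- Split the input around the first match:
-- (text before, matched text, text after, submatches).
firstLink :: B.ByteString
          -> (B.ByteString, B.ByteString, B.ByteString, [B.ByteString])
firstLink z = z =~ linkPattern
```

In GHCi this would be used as firstLink `fmap` B.readFile "test.html". The results are raw bytes, so any non-ASCII text in them still has to be decoded separately.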
$ ghc-pkg describe regex-pcre
name: regex-pcre
version: 0.94.4
id: regex-pcre-0.94.4-d45e00c9e113c7c9352d0785497e1dca
license: BSD3
copyright: Copyright (c) 2006, Christopher Kuklewicz
stability: Seems to work, passes a few tests
synopsis: Replaces/Enhances Text.Regex
description: The PCRE backend to accompany regex-base, see
www.pcre.org
category: Text
author: Christopher Kuklewicz
exposed: True
exposed-modules: Text.Regex.PCRE Text.Regex.PCRE.Wrap
Text.Regex.PCRE.String Text.Regex.PCRE.Sequence
Text.Regex.PCRE.ByteString Text.Regex.PCRE.ByteString.Lazy
hidden-modules:
trusted: False
import-dirs: /home/magicloud/.cabal/lib/regex-pcre-0.94.4/ghc-7.6.1
library-dirs: /home/magicloud/.cabal/lib/regex-pcre-0.94.4/ghc-7.6.1
hs-libraries: HSregex-pcre-0.94.4
extra-libraries: pcre
extra-ghci-libraries:
include-dirs:
includes:
depends: array-0.4.0.1-cbe8814e07792e8f0d66cac77a2c0b6b
base-4.6.0.0-9108e251636b0c8499261c52a7809ea1
bytestring-0.10.0.1-11d4f52c4f4ed9833f768577b77050c5
containers-0.5.2.1-b183418bc7f43ce98b6916ef296c2669
regex-base-0.93.2-1ee07f806ad6b0c911226883d15b64f2
hugs-options:
cc-options:
ld-options:
framework-dirs:
frameworks:
haddock-interfaces: /home/magicloud/.cabal/share/doc/regex-pcre-0.94.4/html/regex-pcre.haddock
haddock-html: /home/magicloud/.cabal/share/doc/regex-pcre-0.94.4/html
pkgroot: "/home/magicloud/.ghc/x86_64-linux-7.6.1"
--
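Dense bamboo cannot keep the stream from flowing through;
high mountains cannot stop the wild clouds from flying past.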
And for G+, please use magiclouds#gmail.com.