subRegex https? with anchor href tags

Hi,

I am trying to replace every occurrence of http://URL or https://URL with:

<a href="http://URL"> http://URL </a>
<a href="https://URL"> https://URL </a>

respectively in a text file. I tried with Text.Regex with an example:

Prelude> let s = "Google website is http://www.google.com in US"
Prelude> import Text.Regex
Prelude Text.Regex> let s3 = subRegex (mkRegex "https?:[^\\s\n\r]+") s "</a>"
Prelude Text.Regex> s3
"Google website is

I then tried a simpler example:

Prelude Text.Regex> subRegex (mkRegex "e") "hello" "\\1"
"h*** Exception: Ix{Int}.index: Index (1) out of range ((0,0))

What could I be missing? I am using GHCi 6.12.3 on Fedora 14.

Appreciate any help in this regard.

Thanks,

SK

--
Shakthi Kannan
http://www.shakthimaan.com

On Saturday 12 November 2011, 11:30:39, Shakthi Kannan wrote:
> I then tried a simpler example:
>
> Prelude Text.Regex> subRegex (mkRegex "e") "hello" "\\1"
> "h*** Exception: Ix{Int}.index: Index (1) out of range ((0,0))
>
> What could I be missing?

Maybe the backreferences numbering starts at 0? Worth a try.

On Saturday 12 November 2011, 14:23:30, Shakthi Kannan wrote:
> Hi,
>
> --- On Sat, Nov 12, 2011 at 5:44 PM, Daniel Fischer wrote:
> | Maybe the backreferences numbering starts at 0?

Not backreferences, but who cares?

> | Worth a try.
> \--
> \0 represents the entire string match:

The entire *match*, that is, the part of the input matched by the regexp. The other entries correspond to parts matched by certain subregexen in the match.

> http://cvs.haskell.org/Hugs/pages/libraries/base/Text-Regex.html

May I suggest using the docs at Hackage? Hugs hasn't had a release since 2006, and I don't think its docs are up to date. Unless you're actually using Hugs, in which case I suggest switching to GHC.

http://hackage.haskell.org/package/regex-compat
> Prelude Text.Regex> subRegex (mkRegex "e") "hello" "\\0"
> "hello"

Heh, I didn't see it immediately either ;)

Prelude Text.Regex> subRegex (mkRegex "e") "hello" ">\\0<"
"h>e<llo"

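The behaviour discussed above can be reproduced with a minimal sketch, assuming the regex-compat package is installed. In the replacement string, `\\0` stands for the whole match, while `\\1` only exists when the pattern contains a parenthesised group. The names `wholeMatch` and `firstGroup` are made up for illustration.

```haskell
import Text.Regex (mkRegex, subRegex)

-- \0 refers to the whole match, so the pattern needs no group.
wholeMatch :: String
wholeMatch = subRegex (mkRegex "e") "hello" ">\\0<"

-- \1 refers to the first parenthesised group, which must be
-- present in the pattern; without it the index is out of range.
firstGroup :: String
firstGroup = subRegex (mkRegex "(e)") "hello" ">\\1<"

main :: IO ()
main = do
  putStrLn wholeMatch  -- prints h>e<llo
  putStrLn firstGroup  -- prints h>e<llo
```

Using `\\1` with the pattern "e" fails because that pattern defines no group 1, which matches the index error in the original question.
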
On Saturday 12 November 2011, 16:10:58, Shakthi Kannan wrote:
> Hi,
>
> --- On Sat, Nov 12, 2011 at 8:20 PM, Daniel Fischer wrote:
> | Prelude Text.Regex> subRegex (mkRegex "https?[^[:space:]]+") "The best
> | is http://haskell.org\n" "<a href=\"\\0\">there</a>"
> | "The best is <a href=\"http://haskell.org\">there</a>\n"
> \--
>
> Is there any way we can escape the " within <a href></a>, and not get the \" in the output?
The \" are just because that's Haskell's way of showing Strings. Strings are shown enclosed in quotes, and quotes (and other characters) appearing in the String are escaped. You can see what would get written to the file by outputting the String with putStrLn:

Prelude Text.Regex> putStrLn $ subRegex (mkRegex "https?://([:alpha:]| [0-9]|.[[:alpha:]0-9])+") "The best is http://haskell.org." "<a href=\"\\0\">there</a>"
The best is <a href="http://haskell.org">there</a>.

See, just like it should be.
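The difference between the shown form and the written form can be checked without any regex at all. The following is a small sketch using only the Prelude; the name `anchor` is made up for illustration.

```haskell
-- A String containing double quotes, as an HTML anchor would.
anchor :: String
anchor = "<a href=\"http://haskell.org\">there</a>"

main :: IO ()
main = do
  -- print goes through show: the value is wrapped in quotes and
  -- the inner quotes are escaped, exactly as GHCi displays it.
  print anchor     -- prints "<a href=\"http://haskell.org\">there</a>"
  -- putStrLn writes the characters themselves, which is what
  -- writeFile would put into a file.
  putStrLn anchor  -- prints <a href="http://haskell.org">there</a>
```

The backslashes exist only in the displayed representation, not in the String itself.
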
> I am producing HTML in the output, and the \" doesn't work with the anchor tag. Expected output is:
>
> "The best is <a href="http://haskell.org">there</a>\n"
>
> Thanks for your prompt help!
>
> SK
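Putting the pieces of the thread together, the original task might be sketched as follows, assuming regex-compat is installed. The function name `linkify` is hypothetical, and the pattern is an adaptation of the `[^[:space:]]` pattern quoted above with `://` added. It is not a complete URL grammar: trailing punctuation directly after a URL would be included in the match.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Hypothetical helper: wrap every http:// or https:// URL in an
-- anchor tag. \0 in the replacement stands for the matched URL.
linkify :: String -> String
linkify input =
  subRegex (mkRegex "https?://[^[:space:]]+") input "<a href=\"\\0\">\\0</a>"

main :: IO ()
main =
  -- putStrLn shows the text as it would be written to a file.
  putStrLn (linkify "Google website is http://www.google.com in US")
  -- prints Google website is <a href="http://www.google.com">http://www.google.com</a> in US
```

For a whole text file, one option is `interact linkify`, which reads standard input and writes the result to standard output.
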

Hi,
--- On Sat, Nov 12, 2011 at 8:59 PM, Daniel Fischer