-- Wraps numbered code listings within the page body with a div
-- in order to be able to apply some more specific styling.
wrapNumberedCodelistings (Page meta body) =
    Page meta newBody
  where
    newBody = regexReplace "<table\\s+class=\"sourceCode[^>]+>[\\s\\S]*?</table>" wrap body
    wrap x = "<div class=\"sourceCodeWrap\">" ++ x ++ "</div>"

-- Replaces the whole match for the given regex using the given function
regexReplace :: String -> (String -> String) -> String -> String
regexReplace regex replace text = go text
  where
    go text = case text =~~ regex of
        Just (before, match, after) ->
            before ++ replace match ++ go after
        _ -> text

rico

On Wed, Jun 6, 2012 at 7:11 AM, Arlen Cuss <a@unnali.com> wrote:

I'd be more inclined to look at a solution involving manipulating the HTML structure, rather than trying a regexp-based approach, which will probably end up disappointing. (See this: http://stackoverflow.com/a/1732454/499609)

I hope another Haskeller can speak to a library that would be good for this kind of purpose.
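A minimal sketch of what the structural approach could look like, assuming the tagsoup package (it is not named in this thread, and other HTML libraries would do as well). The helper names isSourceCode, wrapTables and wrapHtml are made up for illustration. The idea is to work on the parsed tag stream and emit a wrapper div around every table whose class list contains "sourceCode":

```haskell
import Text.HTML.TagSoup (Tag (..), parseTags, renderTags)

-- True if the attribute list has a class attribute containing "sourceCode".
isSourceCode :: [(String, String)] -> Bool
isSourceCode attrs = maybe False (elem "sourceCode" . words) (lookup "class" attrs)

-- Wrap every matching <table> ... </table> in <div class="sourceCodeWrap">.
wrapTables :: [Tag String] -> [Tag String]
wrapTables = go 0
  where
    -- depth is the number of open <table> tags inside a wrapped table;
    -- 0 means we are currently outside of one.
    go :: Int -> [Tag String] -> [Tag String]
    go _ [] = []
    go 0 (t@(TagOpen "table" attrs) : rest)
      | isSourceCode attrs =
          TagOpen "div" [("class", "sourceCodeWrap")] : t : go 1 rest
    go depth (t@(TagOpen "table" _) : rest)
      | depth > 0 = t : go (depth + 1) rest
    go 1 (t@(TagClose "table") : rest) = t : TagClose "div" : go 0 rest
    go depth (t@(TagClose "table") : rest)
      | depth > 1 = t : go (depth - 1) rest
    go depth (t : rest) = t : go depth rest

-- Hypothetical convenience wrapper working on the HTML string.
wrapHtml :: String -> String
wrapHtml = renderTags . wrapTables . parseTags
```

Parsing and re-rendering may not reproduce the input byte for byte (entity escaping, for example), so the result is worth checking against pandoc's real output.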

To suit what you're doing now, though: if you change .*? to [\s\S]*?, it should work on multiline strings. If you can work out how to pass the 's' modifier to Text.Regex.PCRE, that should also do it.
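For the 's' modifier, one possible sketch, assuming the regex-pcre package: compile the pattern explicitly with compDotAll instead of going through (=~~). The name regexReplaceDotAll is made up here; it mirrors the regexReplace function from the original question.

```haskell
import Text.Regex.PCRE (Regex, compDotAll, defaultExecOpt, makeRegexOpts, matchM)

-- Like regexReplace, but the pattern is compiled with PCRE's DOTALL
-- option, so that '.' also matches newline characters.
regexReplaceDotAll :: String -> (String -> String) -> String -> String
regexReplaceDotAll pat replace = go
  where
    -- Compile once and reuse the compiled regex for every match.
    re :: Regex
    re = makeRegexOpts compDotAll defaultExecOpt pat

    go text = case matchM re text :: Maybe (String, String, String) of
      Just (before, match, after) -> before ++ replace match ++ go after
      Nothing -> text
```

PCRE also accepts the inline flag (?s) at the start of a pattern, which switches on the same behaviour without any extra API. As with the original function, a pattern that can match the empty string would make the recursion loop forever.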

—Arlen

On Wednesday, 6 June 2012 at 3:05 PM, Rico Moorman wrote:

> Hello,
>
> I have a given piece of multiline HTML (which is generated using pandoc btw.) and I am trying to wrap certain elements (tags with a given class) with a <div>.
>
> I already took a look at the Text.Regex.PCRE module which seemed a reasonable choice because I am already familiar with similar regex implementations in other languages.
>
> I came up with the following function, which takes a regex and replaces all matches within the given string using the provided function (which I would use to wrap the element):
>
> import Text.Regex.PCRE ((=~~))
>
> -- Replaces the whole match for the given regex using the given function
> regexReplace :: String -> (String -> String) -> String -> String
> regexReplace regex replace text = go text
>   where
>     go text = case text =~~ regex of
>         Just (before, match, after) ->
>             before ++ replace match ++ go after
>         _ -> text
>
> The problem with this function is that it does not work on multiline strings. I would like to call it like this:
>
> newBody = regexReplace "<table class=\"sourceCode\".*?table>" wrap body
> wrap x = "<div class=\"sourceCodeWrap\">" ++ x ++ "</div>"
>
> Is there any way to easily pass some kind of multiline modifier to the regex in question?
>
> Or is this approach completely off and would something else be more appropriate/haskelly for the problem at hand?
>
> Thank you very much in advance.
