regex-pcre is not working with UTF-8

18 Aug 2012

Hello.

It seems that regex-pcre has a bug when dealing with UTF-8:

   Prelude> :m + Text.Regex.PCRE

   Prelude Text.Regex.PCRE> "país:Brasil" =~ "país:(.*)" :: (String,String,String,[String])
   ("","pa\237s:Brasil","",["rasil"])

Notice that the captured group is missing its leading 'B': it is "rasil" instead of "Brasil".

With regex-posix this does not happen:

   Prelude> :m + Text.Regex.Posix

   Prelude Text.Regex.Posix> "país:Brasil" =~ "país:(.*)" :: (String,String,String,[String])
   ("","pa\237s:Brasil","",["Brasil"])
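
The off-by-one in the PCRE result would be consistent with match offsets being counted in UTF-8 bytes but then applied to the String as character positions. That is only a guess and is not confirmed here. The sketch below uses nothing from regex-pcre; the helper names (utf8Width, utf8Length) are made up for illustration. It only shows that the two ways of counting disagree by one for this input, because 'í' takes two bytes in UTF-8:

```haskell
import Data.Char (ord)

-- Number of bytes one character occupies when encoded as UTF-8.
-- (Hypothetical helper, not part of regex-pcre.)
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where n = ord c

-- Length of a String in UTF-8 bytes rather than in characters.
utf8Length :: String -> Int
utf8Length = sum . map utf8Width

-- The subject string and the part of it matched before the group.
-- '\237' is 'í', written as an escape as GHCi prints it.
subject, prefix :: String
subject = "pa\237s:Brasil"
prefix  = "pa\237s:"

main :: IO ()
main = do
  let charOffset = length prefix      -- 5 characters
      byteOffset = utf8Length prefix  -- 6 bytes
  putStrLn ("character offset of the group: " ++ show charOffset)
  putStrLn ("byte offset of the group:      " ++ show byteOffset)
  putStrLn (drop charOffset subject)  -- prints "Brasil"
  putStrLn (drop byteOffset subject)  -- prints "rasil"
```

Dropping by the byte offset reproduces the "rasil" seen above, while dropping by the character offset gives the expected "Brasil".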

I hope this bug can be fixed soon.

Is there a bug tracker where I can report it? If so, where is it?

Romildo

José Romildo Malaquias
