
Hi Alistair,
On Fri, 02 Feb 2007 21:01:04 +0900, Alistair Bayley wrote:
> What is the state of UTF8 support in Haskell libraries (base or user-contributed)? I had a need for a UTF8 en & de-coder for Takusen, and after looking around couldn't find anything particularly satisfactory, so ended up writing (yet another) one.

regex-posix doesn't support UTF8, because it is built on the POSIX regex library. So this problem can't be fixed just by writing a correct UTF8 en- & de-coder.

If someone is interested in supporting UTF8, I recommend using Oniguruma.

http://www.geocities.jp/kosako3/oniguruma/

Oniguruma also supports UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE, etc. It is also portable: it's available on both Unix and Windows. So I think it is the best C regex library to choose as a backend.

--
shelarcy <shelarcy capella.freemail.ne.jp>
http://page.freett.com/shelarcy/