I don't actually need the UTF-16 surrogate codes in these strings. I would rather filter them out before writing the strings to a file.
What would be a simple filter for this?

Albert Y. C. Lai trebla at vex.net wrote:

On 11-12-04 07:08 AM, dokondr wrote:
> In GHC 7.0.3 / Mac OS X when trying to:
>
> writeFile "someFile" "(Hoping You Have A iPhone When I Do This) Lol Sleep Is When You Close These ---> \55357\56384"
>
> I get:
> commitBuffer: invalid argument (Illegal byte sequence)
>
> The string I am trying to write can also be seen here:
> http://twitter.com/#!/search/Hoping%20You%20Have%20A%20iPhone%20When%20I%20Do%20This%20lang%3Aen

\55357 and \56384 would be surrogates D83D and DC40 for use in UTF-16 only. Haskell's Char is not a UTF-16 code unit (unlike early versions of Java and probably current ones). GHC is correct in rejecting them.

Haskell's Char is a Unicode character directly. If you want the character U+1F440 "EYES", write \128064 directly (or \x1f440, or \x1F440).

Use http://www.unicode.org/charts/ to find out what you are getting into. You can enter a hexadecimal number or choose a category.
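
For the filtering question, here is a minimal sketch, not taken from the thread. The names isSurrogate, isHigh, isLow, dropSurrogates and joinSurrogates are hypothetical helpers. It assumes the surrogates arrive as separate Char values in the range U+D800..U+DFFF, as in the string above. One function simply drops them; the other combines each valid high/low pair into the character it stands for and drops any unpaired surrogate.

```haskell
import Data.Char (chr, ord)

-- | True for code points in the UTF-16 surrogate range U+D800..U+DFFF.
isSurrogate :: Char -> Bool
isSurrogate c = c >= '\xD800' && c <= '\xDFFF'

-- | High (leading) surrogates: U+D800..U+DBFF.
isHigh :: Char -> Bool
isHigh c = c >= '\xD800' && c <= '\xDBFF'

-- | Low (trailing) surrogates: U+DC00..U+DFFF.
isLow :: Char -> Bool
isLow c = c >= '\xDC00' && c <= '\xDFFF'

-- | Drop every surrogate code point, paired or not.
dropSurrogates :: String -> String
dropSurrogates = filter (not . isSurrogate)

-- | Combine each high/low surrogate pair into the single character it
-- encodes in UTF-16; surrogates that are not part of a pair are dropped.
joinSurrogates :: String -> String
joinSurrogates (hi : lo : rest)
  | isHigh hi && isLow lo =
      chr (0x10000 + (ord hi - 0xD800) * 0x400 + (ord lo - 0xDC00))
        : joinSurrogates rest
joinSurrogates (c : rest)
  | isSurrogate c = joinSurrogates rest
  | otherwise     = c : joinSurrogates rest
joinSurrogates [] = []
```

With these definitions, dropSurrogates "These ---> \55357\56384" gives "These ---> ", while joinSurrogates on the same input gives "These ---> \128064", that is, the text followed by U+1F440. Either result contains no surrogates and can be passed to writeFile, provided the output encoding can represent the remaining characters.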

On Sun, Dec 4, 2011 at 11:43 PM, Erik Hesselink <hesselink@gmail.com> wrote:
Yes, you can set the text encoding on the handle you're reading this
text from [1]. The default text encoding is determined by the
environment, which is why I asked about LANG.

If you're entering literal strings, see Albert Lai's answer.

Erik

[1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/System-IO.html#g:23
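
As a rough illustration of setting the encoding on a handle instead of changing LANG, the sketch below uses hSetEncoding and utf8 from System.IO. The helper name writeFileUtf8 is hypothetical, and the example shows the writing side; the same hSetEncoding call applies to a handle opened for reading. It assumes the string holds real Unicode characters: a lone surrogate such as \55357 may still be rejected by the encoder, so filter or combine surrogates first.

```haskell
import System.IO

-- | Write a string to a file as UTF-8, independent of the locale
-- encoding that GHC would otherwise pick up from the environment.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path str =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8   -- override the default (locale) encoding
    hPutStr h str
```

For example, writeFileUtf8 "someFile" "Close These ---> \128064" writes the text with U+1F440 encoded as UTF-8.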

On Sun, Dec 4, 2011 at 19:13, dokondr <dokondr@gmail.com> wrote:
> Is there any other way to solve this problem without changing LANG
> environment variable?
>
>
> On Sun, Dec 4, 2011 at 8:27 PM, Erik Hesselink <hesselink@gmail.com> wrote:
>>
>> What is the value of your LANG environment variable? Does it still
>> give the error if you set it to e.g. "en_US.UTF-8"?
>>
>> Erik
>>
>> On Sun, Dec 4, 2011 at 13:12, dokondr <dokondr@gmail.com> wrote:
>> > Correct url of a "bad" string:
>> >
>> > http://twitter.com/#!/search/Hoping%20You%20Have%20A%20iPhone%20When%20I%20Do%20This%20lang%3Aen
>> >
>> >
>> > On Sun, Dec 4, 2011 at 3:08 PM, dokondr <dokondr@gmail.com> wrote:
>> >>
>> >> Hi,
>> >> In  GHC 7.0.3 / Mac OS X when trying to:
>> >>
>> >> writeFile  "someFile" "(Hoping You Have A iPhone When I Do This) Lol
>> >> Sleep
>> >> Is When You Close These ---> \55357\56384"
>> >>
>> >> I get:
>> >> commitBuffer: invalid argument (Illegal byte sequence)
>> >>
>> >> The string I am trying to write can also be seen here:
>> >>
>> >>
>> >> http://twitter.com/#!/search/Hoping%20You%20Have%20A%20iPhone%20When%20I%20Do%20This%20lang%3Aen
>> >>
>> >> It looks like 'writeFile' can not write unicode characters.
>> >> Any workarounds?
>> >>
>> >> Thanks!
>> >> Dmitri
>> >>
>> >
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > Haskell-Cafe@haskell.org
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe