Cast from and to CChar

Christian Buschmann

28 Oct 2003

3:47 a.m.

Hi!

I've got the following problem. If I enter the following line in ghci:

    Prelude Foreign.C> castCCharToChar $ castCharToCChar 'ü'

I would expect it to return 'ü', but it returns '\252'. Is this the correct behaviour? Or am I doing something wrong? Or are there problems with language-specific characters and CChar?

Thanks,
Christian B.


Marcin 'Qrczak' Kowalczyk

28 Oct

5:31 a.m.

In a message of Mon, 27-10-2003, 20:47, Christian Buschmann writes:

...

    Prelude Foreign.C> castCCharToChar $ castCharToCChar 'ü'

I would expect that this returns 'ü', but it returns '\252'.

This is the same - instance Show Char displays non-ASCII characters that way. You get the same effect if you just type 'ü'.

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
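To illustrate the reply above, here is a minimal sketch (the names roundTrip and shownInGhci are made up for this example, not part of any library). It assumes a GHC where castCharToCChar and castCCharToChar are exported from Foreign.C.String, and it only makes the point that the round trip gives back the same Char, while ghci prints results through show, which escapes non-ASCII characters.

```haskell
import Foreign.C.String (castCCharToChar, castCharToCChar)

-- Hypothetical helper: the expression from the original question.
roundTrip :: Char -> Char
roundTrip = castCCharToChar . castCharToCChar

-- Hypothetical helper: the text ghci displays for a Char result,
-- i.e. the output of 'show' (quotes plus a decimal escape for non-ASCII).
shownInGhci :: Char -> String
shownInGhci = show
```

In ghci, roundTrip 'ü' == 'ü' evaluates to True, and both shownInGhci 'ü' and shownInGhci (roundTrip 'ü') give the six-character text '\252'.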

Christian Buschmann

7:30 p.m.

Marcin 'Qrczak' Kowalczyk wrote:

...

This is the same - instance Show Char displays non-ASCII characters that way. You get the same effect if you just type 'ü'.

Thanks to you and John for the hints. But is there a good reason that show does it that way? Wouldn't it be better to output 'ü' as 'ü' instead of as a code? I thought that Char in Haskell works with Unicode?

Christian

Marcin 'Qrczak' Kowalczyk

29 Oct

2:36 a.m.

In a message of Tue, 28-10-2003, 12:30, Christian Buschmann writes:

...

But what is the good reason that show is doing it that way? Wouldn't it be better to output the 'ü' as a 'ü' instead of a code?

Then don't use show; output characters directly (putChar, putStr).
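A small sketch of the difference, with hypothetical names (asShown, asRaw, demo). One assumption to flag: hSetEncoding and utf8 from System.IO exist only in GHC releases newer than this thread; at the time, putStr simply wrote the low byte of each Char. What the raw line looks like also depends on your terminal.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- 'show' renders a Char as Haskell source syntax: quotes and escapes.
asShown :: Char -> String
asShown = show

-- The character itself as a one-element String, suitable for putStr.
asRaw :: Char -> String
asRaw c = [c]

-- Prints the escaped form first, then the character itself.
demo :: IO ()
demo = do
  hSetEncoding stdout utf8   -- assumption: a UTF-8 terminal
  putStrLn (asShown 'ü')
  putStrLn (asRaw 'ü')
```

Note that print is just putStrLn . show, so it behaves like the first line, not the second.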

...

I thought that Char in Haskell is working with Unicodes?

In theory yes, but there is no library infrastructure yet for conversion between Unicode and external byte encodings. For most programs, which work with a single encoding, this doesn't hurt, although they are misusing Char values if the encoding is not ISO-8859-1.

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
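The ISO-8859-1 remark can be made concrete with a short sketch (viaCChar and survives are hypothetical names). It relies on the behaviour of the cast functions in GHC's base library, which keep only the low 8 bits of the code point: characters up to U+00FF survive the trip through CChar, anything above does not.

```haskell
import Foreign.C.String (castCCharToChar, castCharToCChar)

-- Hypothetical helper: push a Char through an 8-bit CChar and back.
viaCChar :: Char -> Char
viaCChar = castCCharToChar . castCharToCChar

-- True when the character comes back unchanged.
survives :: Char -> Bool
survives c = viaCChar c == c
```

For example, survives 'ü' is True (U+00FC fits in 8 bits), while survives '€' is False: U+20AC is cut down to its low byte 0xAC and comes back as '¬'.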

John Meacham

28 Oct

5:49 a.m.

On Mon, Oct 27, 2003 at 08:47:41PM +0100, Christian Buschmann wrote:

...

Hi! I've got following problem. If I enter in ghci following line:

    Prelude Foreign.C> castCCharToChar $ castCharToCChar 'ü'

I would expect that this returns 'ü', but it returns '\252'. Is this the correct behaviour? Or am I doing something wrong? Or are there any problems with language specific characters and CChar?

The problem is that a CChar is (most likely) 8 bits, while a Haskell Char is a 32-bit Unicode value. The correct way to pass Unicode values to and from C code depends on what you are trying to do.

If you know your system is UTF-8 (many are) and don't mind being somewhat unportable, then the easiest thing to do is to hard-code that: http://repetae.net/john/computer/haskell/UTF8.hs will do it (code stolen from someone else).

If you want to use the proper locale settings, then things get trickier, but this should do it for many apps, although it requires the FFI: http://repetae.net/john/computer/haskell/CWString.hsc

Otherwise, you may need to write your own FFI code which uses 'iconv' to do the proper character set conversion. This is a well-known deficiency in the Haskell libraries at the moment...

John

-- 
---------------------------------------------------------------------------
John Meacham - California Institute of Technology, Alum. - john@foo.net
---------------------------------------------------------------------------
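As a rough idea of what "hard-coding UTF-8" involves, here is a minimal encoder sketch. It is not the code from the UTF8.hs file linked above (whose contents are not reproduced here); encodeChar and encodeString are hypothetical names, and the sketch does no validation (for instance, it does not reject surrogate code points).

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one Char as its UTF-8 byte sequence (1 to 4 bytes).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. lead 6, cont 0]
  | n < 0x10000 = [0xE0 .|. lead 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. lead 18, cont 12, cont 6, cont 0]
  where
    n = ord c

    -- Payload bits of the leading byte: everything above the given shift.
    lead :: Int -> Word8
    lead s = fromIntegral (n `shiftR` s)

    -- Continuation byte: marker bits 10 followed by six payload bits.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Encode a whole String by concatenating the per-character sequences.
encodeString :: String -> [Word8]
encodeString = concatMap encodeChar
```

With this, encodeChar 'ü' yields the two bytes 0xC3 0xBC, which is why a single 8-bit CChar cannot carry the character on a UTF-8 system; the resulting bytes would be handed to C as an array rather than one CChar per Char.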

