Re: To show or not to show french accents


francis.girard＠free.fr

18 Dec 2003

12:55 p.m.

Good afternoon,

Well, I think there should probably be some internationalisation mechanism that tells the "show" function (to name one), according to some configuration, how to interpret a byte as a character.

Frankly, I see no good reason why we should be satisfied with the dinosaur 7 bits, except perhaps because 7 bits is sufficient for English.

I am talking about respect for non-English-speaking people.

But if nobody cares ...

Cheers,

Francis Girard
LE CONQUET
France

Max Kirillov wrote:

...

On Tue, Dec 16, 2003 at 07:49:26AM +0100, francis.girard@free.fr wrote:

...
Good morning,

The following haskell program :

--<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
module Main where

accentLetters :: String
accentLetters = "éàô"

main :: IO ()
main = do
  putStr (show accentLetters)
-->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

after being compiled will give the result :

"\233\224\244"

But, exactly the same program, without the "show" function will give the result:

éàô

Is there some way to have "show" show all the printable characters, even those represented by a value greater than the US-ASCII 7 bits (127) ?

The specific octet may or may not be a printable character, depending on your charset. For instance, your letters are printable in koi8-r (showing the upper-case Russian letters I, YU, T), but not in cp866 (at least, recode cp866..koi8-r fails on them).

The "show" function represents your over-127 bytes in a portable way that "read" can parse back and, I think, it is right to do so.
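A minimal sketch of the two properties described above, assuming GHC's standard Show and Read instances for String: the output of "show" stays within 7-bit ASCII, and "read" recovers the original string from it. The helper names roundTrips and asciiOnly are hypothetical and exist only for this illustration.

```haskell
module Main where

import Data.Char (isAscii)

-- Hypothetical helper names, used only for this illustration.

-- | True when `read` recovers the original string from the output of `show`.
roundTrips :: String -> Bool
roundTrips s = read (show s) == s

-- | True when the output of `show` contains only 7-bit ASCII characters.
asciiOnly :: String -> Bool
asciiOnly = all isAscii . show

main :: IO ()
main = do
  let accentLetters = "éàô"
  putStrLn (show accentLetters)     -- prints "\233\224\244"
  print (roundTrips accentLetters)  -- prints True
  print (asciiOnly accentLetters)   -- prints True
```

Because the escaped form is plain ASCII, it survives whatever charset the terminal or file happens to use, which is not true of the raw characters.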

-- 
Max
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Carsten Schultz

18 Dec 2003

1:07 p.m.


Hello!

On Thu, Dec 18, 2003 at 01:55:27PM +0100, francis.girard@free.fr wrote:

...

Well, I think there should probably be some internationalisation mechanism that tells the "show" function (to name one), according to some configuration, how to interpret a byte as a character.

My understanding is that `show' should work with `read' and possibly produce output that can be parsed by the Haskell parser. It is not a pretty printing function.

...

Frankly, I see no good reason why we should be satisfied with the dinosaur 7 bits, except perhaps because 7 bits is sufficient for English.

I am talking about respect for non english speaking people.

But if nobody cares ...

I, too, speak a language that can't be fully expressed in ASCII, but I do not think that the behaviour of `show' should be changed in this respect.

Greetings,

Carsten

-- 
Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin
http://carsten.fu-mathe-team.de/
PGP/GPG key on the pgp.net key servers, fingerprint on my home page.

francis.girard＠free.fr

3:40 p.m.


Good evening,

OK. I don't know Haskell well enough to argue.

But I can't resist pointing out that reading a single byte having the value 233 (that is 'é') is certainly simpler than reading the four characters "\233", parsing them, and translating them into a single byte having the value 233, representing whatever character your character table assigns to it.

But I don't care that much, and I'm sorry for this.

Best regards,

Francis Girard
LE CONQUET
France

Jon Fairbairn

5:40 p.m.


On 2003-12-18 at 16:40+0100 francis.girard@free.fr wrote:

...

Good evening,

OK. I don't know Haskell enough to argue.

But I can't resist pointing out that reading a single byte having the value 233 (that is 'é')

The problem is that if you are reading single bytes, 233 is not necessarily é. It might be 'shch' if you are in Russia, or iota if you are in Greece. While it's (almost) completely reasonable to expect 233 to display as é in Western Europe, it's completely unreasonable to hold that expectation across borders.
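To make the point concrete, here is a small sketch of how the same byte decodes under three single-byte charsets. The choice of ISO 8859-1, ISO 8859-5 and ISO 8859-7 is an assumption (the message does not name the charsets), and the Charset type and decodeByte function are hypothetical names that cover only the ranges needed for this illustration.

```haskell
module Main where

import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Hypothetical charset tags; the message does not say which charsets it means.
data Charset = Latin1 | Iso8859_5 | Iso8859_7
  deriving (Show, Eq)

-- | Decode one byte to a Unicode character.
-- Only ASCII, all of Latin-1, the basic Cyrillic letters of ISO 8859-5
-- and the lower-case Greek letters of ISO 8859-7 are handled.
decodeByte :: Charset -> Word8 -> Maybe Char
decodeByte cs w
  | b < 0x80 = Just (chr b)                  -- ASCII is shared by all three
  | otherwise = case cs of
      Latin1 -> Just (chr b)                 -- Latin-1 bytes equal code points
      Iso8859_5
        | b >= 0xB0 && b <= 0xEF -> Just (chr (b - 0xB0 + 0x0410))
        | otherwise -> Nothing
      Iso8859_7
        | b >= 0xE1 && b <= 0xF9 -> Just (chr (b + 0x02D0))
        | otherwise -> Nothing
  where
    b = fromIntegral w :: Int

main :: IO ()
main = mapM_ report [Latin1, Iso8859_5, Iso8859_7]
  where
    -- Print code points rather than characters, so the output does not
    -- depend on the terminal's own encoding.
    report cs =
      putStrLn (show cs ++ ": byte 233 -> " ++ show (fmap ord (decodeByte cs 233)))
```

Under these assumptions byte 233 yields code point 233 (é), 1097 (Cyrillic shcha) and 953 (Greek iota) respectively.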

...

is certainly simpler than reading the four characters "\233", parse it, and translate it into a single byte

but it isn't a single byte internally. Indeed, if you are in Russia you could reasonably expect reading a single byte 233 to be converted to the internal code 1097 (if I got the arithmetic right). Since Haskell specifies Unicode, if you are operating in a Russian locale that's what ought to happen.

What I don't understand is why you want show for this. As I mentioned earlier, to output strings and get accented characters, all you have to do is to output the string with putStr, and voilà, les signes diacritiques.

Jón

-- 
Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

Dimitry Golubovsky

19 Dec 2003

2:43 a.m.


I would support the point of view that show should output escapes when showing characters outside ASCII. This is a sort of "transport" format (together with read), so it must be a common denominator for all possible input encodings. UTF-8 might be an alternative, but it would need to be equally supported by all Haskell implementations.

-- 
Dmitry M. Golubovsky
South Lyon, MI

francis.girard＠free.fr

7:06 a.m.


Hello,

...

What I don't understand is why you want show for this. As I mentioned earlier, to output strings and get accented characters, all you have to do is to output the string with putStr, and voilà, les signes diacritiques.

Sometimes, I want to write cheap and dirty test programs that "show" data structures involving some strings. Again, the cheap and dirty way to do this is to make the data structure an instance of "Show" using the "deriving" keyword. But then you are sometimes barely able to read the output string. Therefore I have to redefine "show" myself ... which is a lot less cheap and a lot more dirty.
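One possible workaround, sketched here under the assumption that GHC's derived Show output is being used, is to keep "deriving Show" and post-process its output for display only. The function unescape and the Person type are hypothetical names for this illustration. The result is meant for human eyes and is no longer guaranteed to be parseable by "read".

```haskell
module Main where

import Data.Char (chr, isDigit, isPrint)

-- | Replace decimal escapes such as \233 by the character they denote,
-- but only for printable characters above 127. Every other escape
-- (\n, \", \\, control characters) is copied unchanged.
unescape :: String -> String
unescape ('\\' : c : rest)
  | isDigit c =
      let (digits, rest') = span isDigit (c : rest)
          n = read digits :: Int
      in if n > 127 && n <= 0x10FFFF && isPrint (chr n)
           then chr n : unescape (dropGap rest')
           else '\\' : digits ++ unescape rest'
  | otherwise = '\\' : c : unescape rest   -- keeps pairs like \\ and \" intact
unescape (c : rest) = c : unescape rest
unescape [] = []

-- | `show` puts \& after a numeric escape that is followed by a digit;
-- once the escape is gone the separator is no longer needed.
dropGap :: String -> String
dropGap ('\\' : '&' : rest) = rest
dropGap s = s

-- | Hypothetical record type for the demonstration.
data Person = Person { name :: String, town :: String }
  deriving Show

main :: IO ()
main = putStrLn (unescape (show (Person "Amélie" "Orléans")))
```

Whether the accented characters then appear correctly depends on the encoding of the output handle, exactly as with putStr.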

...

The problem is that if you are reading single bytes, 233 is not necessarily é. It might be 'shch' if you are in Russia,

What the byte should represent is not relevant here. All I wanted to point out is that it is easier to read a character as one byte (or two) than to read four characters and translate them into a numerical value, NO MATTER WHAT THE BYTE IS SUPPOSED TO REPRESENT. This was in answer to the objection that the output of "show" is meant to be parsed by "read". I simply answered that this was not a valid argument.

...

arithmetic right). Since Haskell specifies unicode, if you are operating in a Russian locale that's what ought to happen.

I naively thought that Unicode would solve this kind of problem. But we're still stuck with these pesky 7 bits ...

Regards,

Francis Girard
LE CONQUET
France
