On 1/22/08, Ian Lynagh <igloo@earth.li> wrote:
On Tue, Jan 22, 2008 at 03:16:15PM +0000, Magnus Therning wrote:
> On 1/22/08, Duncan Coutts <duncan.coutts@worc.ox.ac.uk> wrote:
> >
> >
> > On Tue, 2008-01-22 at 09:29 +0000, Magnus Therning wrote:
> > > I vaguely remember that in GHC 6.6 code like this
> > >
> > >   length $ map ord "a string"
> > >
> > > being able to generate a different answer than
> > >
> > >   length "a string"
> >
> > That seems unlikely.
>
>
> Unlikely yes, yet I get the following in GHCi (ghc 6.6.1, the version
> currently in Debian Sid):
>
> > map ord "a"
> [97]
> > map ord "ö"
> [195,182]

In 6.6.1:

Prelude Data.Char> map ord "ö"
[195,182]
Prelude Data.Char> length "ö"
2

there are actually 2 bytes there, but your terminal is showing them as
one character.

Yes, of course, stupid me.  But it is still the UTF-8 representation of "ö", not Latin-1, and this brings me back to my original question: is this an intentional change in 6.8?

> map ord "ö"
[246]
> map ord "åɓｚ𝐀"
[229,595,65370,119808]

6.8 produces Unicode code points rather than a particular encoding.
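
To make the relationship between the two outputs concrete, here is a rough sketch of a hand-written UTF-8 encoder.  The name utf8Bytes is made up for illustration and is not part of GHC's libraries; the string is written with a numeric escape ("\246" is "ö") so the result does not depend on how the source file or terminal is encoded.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)

-- Hypothetical helper: encode one Unicode code point as UTF-8 byte values.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont 0]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont 6, cont 0]
  | otherwise   = [0xF0 .|. (n `shiftR` 18), cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. ((n `shiftR` s) .&. 0x3F)

main :: IO ()
main = do
  print (map ord "\246")             -- the code point, as 6.8 reports it
  print (concatMap utf8Bytes "\246") -- the UTF-8 bytes, as seen in 6.6.1
```

If I have the arithmetic right, the first line prints [246] and the second [195,182], which would explain why length gave 2 in 6.6.1: the String held the two UTF-8 bytes rather than the single code point.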

/M