
18 Nov
2009
10:28 a.m.
Hi.

The Unicode Standard (version 4.0, section 3.9, D31, p. 76) says:

"""Because surrogate code points are not included in the set of Unicode scalar values, UTF-32 code units in the range 0000D800 .. 0000DFFF are ill-formed"""

However, GHC does not reject these code units:

    Prelude> print '\x0000D800'
    '\55296'

Is this the correct behaviour?

Note that Python (2.5.4, UCS4 build, Linux Debian) also accepts these code units.

Thanks
Manlio
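
To make the question concrete, here is a minimal sketch of the check that the quoted rule seems to imply. The names isSurrogateCodePoint and scalarValue are hypothetical helpers written for this post, not part of GHC or the base library, and the sketch says nothing about how GHC itself should behave.

```haskell
import Data.Char (chr)

-- Hypothetical helper: True when the code point lies in the
-- surrogate range U+D800..U+DFFF quoted from the standard.
isSurrogateCodePoint :: Int -> Bool
isSurrogateCodePoint n = n >= 0xD800 && n <= 0xDFFF

-- Hypothetical strict constructor: returns Nothing for surrogates
-- and for values outside 0..0x10FFFF, otherwise the Char.
scalarValue :: Int -> Maybe Char
scalarValue n
  | n < 0 || n > 0x10FFFF  = Nothing
  | isSurrogateCodePoint n = Nothing
  | otherwise              = Just (chr n)
```

With these definitions loaded, scalarValue 0xD800 evaluates to Nothing, while scalarValue 0x41 evaluates to Just 'A'.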