
That's not what the official Unicode site says in its FAQ:
http://unicode.org/faq/utf_bom.html#bom4 and
http://unicode.org/faq/utf_bom.html#bom5
Cheers,
-Tako
On Mon, Apr 4, 2011 at 15:18, malcolm.wallace wrote:
A BOM is not part of UTF-8, because UTF-8 is byte-oriented. But applications should be prepared to read and discard it, because some applications erroneously generate it.
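To make "read and discard" concrete, here is a minimal sketch in Python. It is not taken from any library discussed in this thread, and the function names strip_utf8_bom and decode_utf8_lenient are hypothetical. It assumes the whole input is already in memory as bytes and drops the three-byte sequence EF BB BF only when it appears at the very start.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    # codecs.BOM_UTF8 is the byte string b"\xef\xbb\xbf".
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_utf8_lenient(data: bytes) -> str:
    """Decode UTF-8 input, tolerating (and discarding) an optional leading BOM."""
    return strip_utf8_bom(data).decode("utf-8")
```

Only a leading BOM is removed; the same bytes later in the input decode to U+FEFF and are left alone. Python's built-in "utf-8-sig" codec behaves the same way when decoding, so the helper above is mainly illustrative.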
Regards, Malcolm
On 04 Apr, 2011, at 02:09 PM, Antoine Latter wrote:
On Mon, Apr 4, 2011 at 7:30 AM, Max Bolingbroke wrote:
On 4 April 2011 11:34, Daniel Fischer wrote:
If there's only a single encoding recognised, UTF-8 surely should be the one (though perhaps Windows users might disagree; iirc, Windows uses UCS-2 as its standard encoding).
Windows APIs use UTF-16, but the encoding of files (which is the relevant point here) is almost uniformly UTF-8 - though of course you can find legacy apps making other choices.
Would we need to specifically allow for a Windows-style leading BOM in UTF-8 documents? I can never remember if it is truly a part of UTF-8 or not.
Cheers, Max
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe