
On 2005-01-30, Marcin 'Qrczak' Kowalczyk wrote:
> Aaron Denney writes:
>>> It provides variants of UTF-16/32 with and without a BOM, but UTF-8 only has the variant with a BOM. This makes UTF-8 a stateful encoding.
>>
>> I think you mean "UTF-8 only has the variant without a BOM".
>
> ...
>
> IMHO it would be fair if it had two variants of UTF-8 encoding scheme, just like it has three variants of UTF-16/32, so it would be unambiguous whether "UTF-8" in a particular context allows BOM or not.
Ah. Okay. It's not that the BOM is always to be there, but that it's always ambiguous, which was not clear from your initial description. Better yet would be to have the standard never allow the BOM. Since some things can't handle it, on output we should never emit it, but still must handle it on input. Bah.

-- 
Aaron Denney
-><-
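
For illustration only, here is a minimal Python sketch of the "never emit it, but still handle it on input" policy. It relies on Python's standard `utf-8` and `utf-8-sig` codecs; the helper names `decode_utf8` and `encode_utf8` are hypothetical and do not come from this thread.

```python
import codecs


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 input, tolerating one optional leading BOM."""
    # The 'utf-8-sig' codec drops a leading EF BB BF if it is present
    # and otherwise behaves like plain 'utf-8'.
    return data.decode("utf-8-sig")


def encode_utf8(text: str) -> bytes:
    """Encode to UTF-8 without adding a BOM."""
    # Plain 'utf-8' adds no BOM of its own. A U+FEFF that is already
    # part of `text` would still be encoded like any other character.
    return text.encode("utf-8")


with_bom = codecs.BOM_UTF8 + "héllo".encode("utf-8")
without_bom = "héllo".encode("utf-8")

# Both inputs decode to the same text.
assert decode_utf8(with_bom) == decode_utf8(without_bom) == "héllo"

# Output never starts with a BOM unless the text itself began with U+FEFF.
assert not encode_utf8("héllo").startswith(codecs.BOM_UTF8)
```

The asymmetry is deliberate: the reader is lenient because a BOM may or may not be there, while the writer is strict because some consumers can't handle one.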