Although the intent of the spec is to represent characters, I
contend it does not succeed in doing so. Is it wise to assume
more semantics than are actually there?
It is not. One reason many experts protested the acceptance of this RFC was its incomplete specification; as a result, many current implementations *do* assume more semantics, and not always compatibly with each other.
Punycode is "out there" now, but it's a mess and a minefield.
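As a rough illustration of the gap between code points and characters, here is a minimal sketch using Python's built-in "punycode" codec. The choice of Python and of "é" as the sample character are my own assumptions for the example, not something taken from the RFC discussion. It shows that two code-point sequences a reader would call the same character encode differently unless some layer above the encoding decides to normalize them.

```python
import unicodedata

# Two code-point sequences that both render as "é".
composed = "\u00e9"       # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# The bare "punycode" codec encodes code points as given; it does not normalize.
puny_composed = composed.encode("punycode")
puny_decomposed = decomposed.encode("punycode")

print(puny_composed, puny_decomposed)
print(puny_composed == puny_decomposed)   # False: same "character", different encodings

# The two only agree after an explicit normalization step (NFC chosen here).
normalized = unicodedata.normalize("NFC", decomposed)
print(normalized.encode("punycode") == puny_composed)
```

Whether and how to normalize is exactly the kind of extra semantics an implementation has to bring along on its own.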
-- brandon s allbery allbery.b@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms