
20 Aug 2009
2:50 a.m.
On Thu, Aug 20, 2009 at 4:28 PM, Colin Paul Adams wrote:
> But how do you get Latin-1 bytes from a Unicode string? This would need a transcoding process.
The first 256 code points of Unicode coincide with Latin-1. Therefore, if you truncate each Unicode code point to 8 bits, you effectively end up with Latin-1 text (except that any code point above U+00FF is reduced to its low byte and comes out as an unrelated character). If your terminal then interprets these bytes as UTF-8 (or as anything other than Latin-1, really), the result will be gibberish or worse.

Stuart
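
To make the truncation concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the code under discussion in this thread). The helper name truncate_to_8_bits is made up for this example. It keeps the low 8 bits of each code point, which matches a Latin-1 encoding as long as nothing is above U+00FF.

```python
def truncate_to_8_bits(text: str) -> bytes:
    """Hypothetical helper: keep only the low 8 bits of each code point."""
    return bytes(ord(ch) & 0xFF for ch in text)


# Every code point here is <= U+00FF, so truncation equals Latin-1 encoding.
latin = "café"
raw = truncate_to_8_bits(latin)
assert raw == latin.encode("latin-1")

# U+20AC (the euro sign) is above U+00FF; only its low byte 0xAC survives,
# and read back as Latin-1 that byte is U+00AC, an unrelated character.
mangled = truncate_to_8_bits("\u20ac").decode("latin-1")
assert mangled == "\u00ac"

# A reader that expects UTF-8 cannot decode the lone byte 0xE9 at all;
# errors="replace" substitutes U+FFFD so the failure is visible.
as_utf8 = raw.decode("utf-8", errors="replace")
print(repr(as_utf8))
```

The second case shows why this is not a real transcoding step: the information in the upper bits is simply lost, so the original character cannot be recovered from the output.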