
On 2014-04-25 16:25, Christopher Allen wrote:
> I'm going to disagree for a different reason. The transition to Python 3 improved unicode support in some respects, but utterly gutted the previously excellent codec support. Now you can't really handle arbitrary source/destination encodings of text without treating everything as if they were bytes. Really bad.

Perhaps I am misunderstanding, but, from my experience, Python 3 still has excellent codec support:

https://docs.python.org/3.4/library/codecs.html

When reading from a file, the source encoding can be passed to the `open` function so that it handles transcoding for you. When writing to a file, the destination encoding can similarly be specified to `open`. When dealing with other sources/destinations, data must be read/written as bytes, but content can be encoded/decoded as necessary using the functions in the codecs module.

Haskell has excellent codec support thanks to ICU:

http://hackage.haskell.org/package/text-icu

The contents of the `Data.Text.ICU.Convert` module can be used to convert between codecs. For reference, here is a list of supported codecs:

http://demo.icu-project.org/icu-bin/convexp

Cheers,

Travis
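
For illustration, here is a minimal sketch of the Python 3 workflow described above. The helper names `transcode_file` and `transcode_bytes`, the sample text, and the choice of Shift_JIS and UTF-8 are assumptions made up for this example, not anything from the standard library; only `open`, `codecs.decode`, and `codecs.encode` are real.

```python
import codecs
import os
import tempfile


def transcode_file(src_path, dst_path, src_encoding, dst_encoding):
    """Copy a text file, letting `open` decode the source and encode the destination."""
    # newline="" disables newline translation so only the encoding changes.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
            for line in src:
                dst.write(line)


def transcode_bytes(data, src_encoding, dst_encoding):
    """Re-encode bytes from a non-file source (socket, pipe, ...) via the codecs module."""
    text = codecs.decode(data, src_encoding)   # bytes -> str
    return codecs.encode(text, dst_encoding)   # str -> bytes


# Hypothetical sample data: Japanese text stored as Shift_JIS.
sample = "日本語のテキスト\n"
sjis_bytes = sample.encode("shift_jis")

# Byte-level conversion, as needed for sources other than files.
utf8_bytes = transcode_bytes(sjis_bytes, "shift_jis", "utf-8")

# File-level conversion, where `open` handles the codecs.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "in.txt")
    dst_path = os.path.join(tmp, "out.txt")
    with open(src_path, "wb") as f:
        f.write(sjis_bytes)
    transcode_file(src_path, dst_path, "shift_jis", "utf-8")
    with open(dst_path, "rb") as f:
        file_result = f.read()
```

In both cases the text only exists as bytes at the boundaries; in between it is an ordinary `str`, so the source and destination encodings can be chosen independently.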