
9 Jul 2007, 4:37 p.m.
On 7/9/07, Henning Thielemann wrote:
> HXT returns a list of warnings for invalid UTF-8 byte sequences: http://www.fh-wedel.de/~si/HXmlToolbox/hdoc_arrow/Text-XML-HXT-DOM-Unicode.h...
> Is your decoder lazy?
Yes, the decoder is lazy. Regarding error handling, I noticed that Python has three modes for decoding UTF-8: strict, replace, and ignore.

strict:  error "bad encoding"
replace: ('\xfffd' :)
ignore:  id

I could add these if there is interest.

--
Eric Mertens
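
For illustration, here is a minimal sketch of how the three modes above might plug into a lazy decoder. This is not the decoder discussed in this thread: the names `OnError`, `decodeWith`, `strict`, `replace`, and `ignore` are hypothetical, and the policy of skipping only the offending lead byte before resuming is an assumption. Each handler receives the lazily decoded remainder of the input and decides what the invalid sequence contributes to the output.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical handler type: applied to the decoded rest of the input
-- whenever an invalid byte sequence is encountered.
type OnError = String -> String

strict, replace, ignore :: OnError
strict  = error "bad encoding"  -- fails only when the bad position is forced
replace = ('\xfffd' :)          -- emit U+FFFD and carry on
ignore  = id                    -- drop the bad byte silently

-- Lazy UTF-8 decoder parameterised by the error handler (a sketch).
decodeWith :: OnError -> [Word8] -> String
decodeWith onErr = go
  where
    go [] = []
    go (b:bs)
      | b < 0x80              = chr (fromIntegral b) : go bs
      | b >= 0xC2 && b < 0xE0 = multi 1 (fromIntegral b .&. 0x1F) 0x80 bs
      | b >= 0xE0 && b < 0xF0 = multi 2 (fromIntegral b .&. 0x0F) 0x800 bs
      | b >= 0xF0 && b < 0xF5 = multi 3 (fromIntegral b .&. 0x07) 0x10000 bs
      | otherwise             = onErr (go bs)

    -- Read n continuation bytes; reject truncated, overlong, surrogate,
    -- and out-of-range sequences. On error, resume right after the lead byte.
    multi :: Int -> Int -> Int -> [Word8] -> String
    multi n acc minCp bs =
      case splitAt n bs of
        (conts, rest)
          | length conts == n
          , all isCont conts
          , let cp = foldl step acc conts
          , cp >= minCp
          , cp <= 0x10FFFF
          , not (cp >= 0xD800 && cp <= 0xDFFF)
          -> chr cp : go rest
        _ -> onErr (go bs)

    isCont c = c .&. 0xC0 == 0x80
    step a c = (a `shiftL` 6) .|. fromIntegral (c .&. 0x3F)
```

Because each character is produced as soon as its bytes are available, `take 2 (decodeWith strict [0x68, 0x69, 0xFF])` yields the two valid characters without raising the error; in this sketch the strict mode only fails if the consumer forces the output past the invalid byte.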