
On 27/04/2014, at 9:30 PM, Ben Franksen wrote:
> The main problem with special Unicode characters, as I see it, is that it is no longer possible to distinguish characters unambiguously just by looking at them.
"No longer"? Hands up all the people old enough to have used "coding forms". Yes, children, there was a time when programmers wrote their programs on printed paper forms (sort of like A4 tipped sideways) so that the keypunch girls (not my sexism, historical accuracy) knew exactly which column each character went in. And at the top of each sheet was a row of boxes for you to show how you wrote 2 Z 7 1 I ! 0 O and the like. For that matter, I recall a PhD thesis from the 80s in which the author spent a page grumbling about the difficulty of telling commas and semicolons apart...
> Apart from questions of maintainability, this is also a potential security problem: it enables an attacker to slip in malicious code simply by importing a module whose name looks like a well known safe module. In a big and complex piece of software, such an attack might not be spotted for some time.
Again, considering the possibilities of "1" "i" "l", I don't think we actually have a new problem here. Presumably this can be addressed by tools: "here are some modules, tell me exactly what they depend on", not entirely unlike ldd(1). Of course, the "goto fail" bug shows that it's not enough to _have_ tools like that; you have to use them and review the results periodically.
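To make the ldd(1) analogy concrete, here is a rough sketch of such a checker (Python, purely for illustration; the ASCII-only policy and the NFKC test are my assumptions, not a description of any existing compiler or package manager). It lists what some source files import and flags any imported name that is not plain ASCII or that changes under normalisation:

```python
# Sketch only: list what each source file imports and flag names that
# are not plain ASCII or that change under NFKC normalisation.
import ast
import sys
import unicodedata

def imports_of(path):
    """Yield the module names imported by one Python source file."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield alias.name
        elif isinstance(node, ast.ImportFrom) and node.module:
            yield node.module

def suspicious(name):
    """Heuristic: non-ASCII or normalisation-unstable names deserve a look."""
    return not name.isascii() or unicodedata.normalize("NFKC", name) != name

if __name__ == "__main__":
    for path in sys.argv[1:]:
        for name in sorted(set(imports_of(path))):
            flag = "   <-- check this" if suspicious(name) else ""
            print(f"{path}: imports {name}{flag}")
```

Run it over a tree of sources (e.g. python depcheck.py src/*.py, where depcheck.py is just a hypothetical name for the script above) and review anything it flags; as the goto fail episode shows, having the tool is pointless unless that review actually happens.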