
2011/4/4 Colin Adams
> Not from looking with your eyes perhaps. Does that matter? Your text editor, and the compiler, can surely figure it out for themselves.

I am not aware of any algorithm that can reliably infer the character encoding used by just looking at the raw data. Why would people bother with stuff like <?xml version="1.0" encoding="UTF-8"?> if automatically figuring out the encoding was easy?
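
A minimal sketch of why guessing is unreliable, written in Python for brevity. The byte string and the list of candidate encodings are arbitrary choices for illustration; the point is only that the same raw bytes decode without error under several encodings and give different text each time.

```python
# Two raw bytes: this happens to be the UTF-8 encoding of "é" (U+00E9).
raw = b"\xc3\xa9"

# Candidate encodings, picked arbitrarily for this illustration.
candidates = ["utf-8", "latin-1", "koi8_r"]

decoded = {}
for name in candidates:
    try:
        decoded[name] = raw.decode(name)
    except UnicodeDecodeError:
        # The bytes are not valid in this encoding, so it can be ruled out.
        decoded[name] = None

for name, text in decoded.items():
    print(name, repr(text))
```

None of the three candidates is ruled out here, so the bytes alone do not say which reading was intended. A detector can only rank the candidates by plausibility, and that is a heuristic rather than a proof.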

> There aren't many Unicode encoding formats

From casually scanning some articles about encodings, I can count at least 70 character encodings [1].

> and there aren't very many possibilities for the leading characters of a Haskell source file, are there?

Since a Haskell program is a sequence of Unicode code points, the programmer can choose from up to 1,112,064 characters. Many of these can legitimately be part of the interface of a module, as function names, operators or names of types.
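
For reference, the 1,112,064 figure is the size of the Unicode code space minus the surrogate range, which is reserved for UTF-16 and cannot appear as a character. A quick check of the arithmetic, in Python (the variable names are mine):

```python
# Unicode code points run from U+0000 to U+10FFFF inclusive.
code_space = 0x10FFFF + 1

# U+D800..U+DFFF are surrogates, reserved for UTF-16, not characters.
surrogates = 0xDFFF - 0xD800 + 1

scalar_values = code_space - surrogates
print(scalar_values)
```

This is an upper bound: many of those values are unassigned, and only some of the assigned ones are valid in identifiers or operators.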