
On 26 Feb 2008, at 18:42, Chris Kuklewicz wrote:
> The goal is that more complicated situations are reflected in more complicated "ghc" or "main" invocations. The least complicated usage defaults to being identical cross-platform and regardless of terminal I/O.
>
> I think the best default would be UTF8 for all text handles. This can be easily documented, it can be easily understood, and will produce the fewest surprises.
> (...)
> ** Unless influenced by command-line switches, these default to UTF8.

I think that making the behavior of programs change, depending on compiler options, will produce a lot of surprises. I think that being only able to set the default encoding from within the program is a better idea, because it keeps the specification of the behavior of the program inside the source.

Reinier

Reinier Lamers wrote:
> On 26 Feb 2008, at 18:42, Chris Kuklewicz wrote:
>> The goal is that more complicated situations are reflected in more complicated "ghc" or "main" invocations. The least complicated usage defaults to being identical cross-platform and regardless of terminal I/O.
>>
>> I think the best default would be UTF8 for all text handles. This can be easily documented, it can be easily understood, and will produce the fewest surprises.
>> (...)
>> ** Unless influenced by command-line switches, these default to UTF8.
>
> I think that making the behavior of programs change, depending on compiler options, will produce a lot of surprises. I think that being only able to set the default encoding from within the program is a better idea, because it keeps the specification of the behavior of the program inside the source.
>
> Reinier

I thought about that. I started with realizing that *all* code written for GHC is written knowing Handles only return Word8 sized Latin1 characters.

So there are several ways one might proceed, some of which are:

1) No command line switches, default to Latin1. To get unicode you call a special 'turnOnUnicodeHandleGoodness' IO operation. This is good since it does not break old code.

2) No command line switches, default to something new. This requires all old code to be conditionally retrofitted with a 'turnOffUnicode' IO operation. This breaks much of the code that has been written, and is thus bad.

3) Add a "ghc --turn-on-unicode" command line switch. This makes all old code build just fine, since it lacks the switch to activate the new behavior.

4) Add a "ghc --turn-off-unicode" command line switch. This is nice since it lets new code use the new Handle encoding by default, but not nice in requiring that old code built using ghc-6.10 use an additional option.

I also think the following are likely to be true:

*) Cabal is already controlling the ghc compiler switches for most code.
*) The experience of the ghc-6.6 to ghc-6.8 transition involved updating most cabal files to allow old code to work with the new compiler.
*) Other changes, unrelated to the unicode handles, will require most old packages to update their cabal files to work with ghc-6.10.
*) The additional work to update the cabal file to add the "--turn-off-unicode" command line switch to ghc would be 1 word to 1 line.

So I think that making ghc default to option (3) above saves nearly zero work when updating old cabal files compared to option (4). The benefit of option (4) compared to (3) is that no boilerplate will be needed to obtain the new handle encoding. And I simply prefer that the better handle encoding be the default; move the implementation forward.

Now if GHC does not have a command line switch then either with (2) you have to conditionally (perhaps with #ifdef) update almost every bit of code on hackage or with (1) you have all future programs burdened with boilerplate, which some people may forget. So I will enjoy having switches as well as the IO commands.
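
For illustration, here is a minimal sketch of what the in-program approach of options (1) and (2) could look like. It assumes the hSetEncoding and hGetEncoding functions from System.IO, which only appeared in GHC releases later than the ones discussed in this thread. The names turnOnUnicode and turnOffUnicode are hypothetical stand-ins for the IO operations mentioned above, not an existing API.

```haskell
import System.IO

-- Hypothetical stand-in for the 'turnOnUnicodeHandleGoodness' operation
-- of option (1). Assumption: built on System.IO.hSetEncoding, which GHC
-- only gained after this thread was written.
turnOnUnicode :: IO ()
turnOnUnicode = mapM_ (`hSetEncoding` utf8) [stdin, stdout, stderr]

-- Hypothetical counterpart for option (2): old code would call this to
-- get the Latin1 behaviour back on the standard handles.
turnOffUnicode :: IO ()
turnOffUnicode = mapM_ (`hSetEncoding` latin1) [stdin, stdout, stderr]

main :: IO ()
main = do
  turnOnUnicode
  -- hGetEncoding returns Nothing for a handle in binary mode.
  enc <- hGetEncoding stdout
  putStrLn ("stdout encoding: " ++ maybe "binary" show enc)
  -- Characters outside Latin1 are now written as multi-byte UTF8.
  putStrLn "λ → ü"
```

Either call is exactly the kind of boilerplate the paragraph above is concerned about: some program has to remember to run it at the top of main, whichever default is chosen.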