
The algorithm in the new module (GHC.IO.Encoding.CodePage.API) is rather
intricate, so I've commented it quite thoroughly. The changes to other
modules are minimal: we simply now use a real code page encoding instead of
brokenly using latin1 when GHC doesn't have the code page built in, so
there isn't much of a change to document.
Max
On 24 April 2013 08:12, Simon Peyton-Jones wrote:
Great stuff.

One thing: have you left enough documentation in the code that, when someone comes along in 3 years' time, they can understand the problem and how you have dealt with it? Lots of "Note [Blah]" stuff? Or something.

Thanks

Simon

From: ghc-devs-bounces@haskell.org [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Max Bolingbroke
Sent: 23 April 2013 21:29
To: ghc-devs@haskell.org
Subject: DBCS encoding support on Windows

Hi GHCers,

I've implemented support in GHC for extra Windows code pages on the branch "dbcs" of the base library.

The problem this solves is that currently users of Haskell on a Windows machine running in a locale which uses a double-byte code page such as CP936 (GBK) or CP950 (Big5) cannot properly interact with the Windows console in their native language. Unfortunately, code page support is a prerequisite for getting this to work correctly because, for all Microsoft's fine talk about Unicode being the future, the Windows console does not seem to support it properly - code pages are the only way to go for console input and output.

As the standard Windows locale encodings in many regions, these code pages are also the predominant method of encoding text files in many countries, so they are useful outside the console.

The solution is along the lines suggested in http://hackage.haskell.org/trac/ghc/ticket/3977, i.e. we create an iconv-like interface to Windows' MultiByteToWideChar and WideCharToMultiByte APIs by the judicious use of binary search. In my branch, these APIs will be used whenever we don't have a built-in native Haskell TextEncoding for the code page (we used to fall back on using latin1 for such code pages).
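
To illustrate the binary-search idea, here is a minimal sketch. It is not the code on the branch. It assumes a converter that can only report how many output units a given run of input bytes needs (much as MultiByteToWideChar returns the required size when asked with a zero-length output buffer), and it searches for the longest input prefix that still fits an output buffer of a given capacity. The names unitsNeeded and largestFittingPrefix and the toy double-byte encoding are hypothetical, chosen so the example runs without any Windows API.

```haskell
import Data.Word (Word8)

-- Hypothetical toy double-byte encoding (an assumption for illustration):
-- bytes below 0x80 stand alone; any other byte is a lead byte that also
-- consumes the following trail byte. This counts the characters started
-- within the given bytes, standing in for a real converter's size query.
unitsNeeded :: [Word8] -> Int
unitsNeeded [] = 0
unitsNeeded (b : rest)
  | b < 0x80  = 1 + unitsNeeded rest
  | otherwise = 1 + unitsNeeded (drop 1 rest)

-- Largest prefix length n such that the first n bytes of the input need at
-- most `capacity` output units. Assumes capacity >= 0 and relies on
-- unitsNeeded never decreasing as the prefix grows.
largestFittingPrefix :: Int -> [Word8] -> Int
largestFittingPrefix capacity input = go 0 (length input)
  where
    fits n = unitsNeeded (take n input) <= capacity
    -- Invariant: the prefix of length lo fits; no prefix longer than hi fits.
    go lo hi
      | lo >= hi  = lo
      | fits mid  = go mid hi
      | otherwise = go lo (mid - 1)
      where
        mid = (lo + hi + 1) `div` 2
```

Because a lone lead byte already counts as one character here, the prefix found never splits a double-byte character unless the input itself ends with a dangling lead byte. Each probe costs one size query, so a buffer of n bytes needs about log2 n queries rather than n.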

Unless there are any objections I'll merge this into the base library main branch next week.

Cheers,
Max
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs