Re: WinHugs and Unicode

On Fri, Sep 16, 2005 at 04:58:03PM +0100, Neil Mitchell wrote:
In Windows, both the WinHugs "fake" console and the standard console are already Unicode-compliant on NT/2K/XP, and there are wrapper functions such as wprintf, which is Unicode, and _tprintf, which is either ASCII or Unicode depending on some #defines.
If you change the user-default ANSI code page, can it print out non-ASCII Chars, and read them in? (Identifiers are restricted to Latin-1, though.)
How does Hugs deal with Unicode, i.e. are the filenames etc. stored by the program as Unicode, or is it just the Haskell elements that are Unicode?
Module filenames, no. Filenames in H98 library calls are String, which is Unicode. String literals are byte-encoded Unicode, though.
Attached is a patch which includes the Unicode-compliant header, along with commented-out definitions for Unicode in WinHugs. If you uncomment those definitions it will not work, but then I can work on converting files one by one, and when they are all done hopefully a Unicode-enabled Hugs on Windows will result :)
This process should not alter the non-Unicode code paths at all, because of various defines in Windows.
I fear you may be opening a can of worms, and you may end up having to change things everywhere.

Shouldn't do. Windows defines A and W versions of every function that takes a string, i.e. SendMessage is a #define which calls either SendMessageA or SendMessageW depending on whether UNICODE is #defined. LPTSTR is the best way to declare strings; it maps to either the ASCII LPSTR or the Unicode LPWSTR. Throughout the WinHugs code it's a bit of a haphazard mess of both LPSTR and LPTSTR. In order to get it all synced, you need to use _T("str") for string literals, which switches automatically between Unicode and ASCII. The include in this patch enables that.

With this patch, the Unicode version will not work. But it does put in the necessary include, so that from now on files can be converted one by one, and it's not too hard.

To be honest, I'm not that fussed about Unicode (my language has a small alphabet), so this patch could be ignored until someone who cares about it steps up.

Thanks

Neil
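
The A/W switching described above can be sketched portably. This is only an illustration of the mechanism, not WinHugs code and not the real Win32 headers: demo_tchar, DEMO_T, demo_strlen and DEMO_UNICODE are hypothetical stand-ins for TCHAR, _T, a function macro such as SendMessage, and UNICODE, chosen so the sketch compiles on any C compiler.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins (assumptions for illustration only).
   On Windows the real names come from <windows.h> and <tchar.h>. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;          /* plays the role of TCHAR          */
#define DEMO_T(s) L##s               /* plays the role of _T("...")      */
#define demo_strlen demo_strlen_w    /* generic name -> "W" version      */
#else
typedef char demo_tchar;
#define DEMO_T(s) s
#define demo_strlen demo_strlen_a    /* generic name -> "A" version      */
#endif

/* "A" version: operates on narrow (byte) strings. */
size_t demo_strlen_a(const char *s) { return strlen(s); }

/* "W" version: operates on wide strings. */
size_t demo_strlen_w(const wchar_t *s) { return wcslen(s); }

/* Code written only against the generic names compiles unchanged in
   both builds; this is what converting a file one by one aims for. */
size_t title_length(void) {
    const demo_tchar *title = DEMO_T("WinHugs");
    return demo_strlen(title);
}
```

Building with -DDEMO_UNICODE flips title_length over to the wide-character path without touching its source. A file that mixes a hard-coded char * with the generic names would stop compiling in that build, which is the kind of LPSTR/LPTSTR mismatch the conversion has to clean up.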
participants (2)
- Neil Mitchell
- Ross Paterson