
Hi,

Following up on the discussion in Haskell-Cafe about ways to bring better Unicode support to GHC: I can take care of putting this into the GHC runtime, but I need some advice, as I am completely new to this.

What primarily needs to be done is to replace the FFI calls made from GHC.Unicode (iswupper, iswlower, etc.) with functions implemented directly in the runtime, or in an external library independent of libc.

I tried to do this in two ways. First, I made a shared object containing substitutes for these functions (draft code, based on what I submitted for Hugs some time ago) and LD_PRELOADed it. Everything went fine, and my small test program worked both compiled to a binary and in GHCi. (Without the substitution library, it simply dumped core when towlower/towupper was called on a non-ASCII character.) The bad thing is that LD_PRELOAD does not work on all systems.

So I tried to put the code directly into the runtime (where I believe it should be; the Unicode properties table is packed and won't eat much space). To avoid conflicts with the libc functions, I renamed the foreign functions in GHC.Unicode by prefixing them with u_ (so now they are u_iswupper, etc.). I placed the new file into ghc/rts and the include file into ghc/includes. I could not get rid of the messages about missing prototypes for the u_... functions, but in the end I was able to build ghc.

When I compiled my test program with the rebuilt ghc, it worked without the LD_PRELOADed library. GHCi, however, would not start, complaining that it could not see these u_... symbols. I noticed some other entry points into the runtime, such as revertCAFs or getAllocations, declared in the Haskell part of GHCi just like any other foreign calls, so I followed the same style, with only partial success. Where am I going wrong?

Another thing: this might be done without touching the sources at all, just as an external library. What would be the best place for the library so that it is loaded (or linked in) automatically?
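To make the idea of libc-independent replacements concrete, here is a minimal sketch in C of what such u_... functions could look like. It is not the draft code mentioned above. The table layout, the helper u_find, the int-based signatures and the handful of ranges listed are all assumptions made for illustration; a real table would be generated from the Unicode Character Database and cover far more ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packed table: each entry describes a contiguous range of
   code points sharing one case property and one case-mapping offset.
   Entries must be sorted by 'first' and must not overlap. */
typedef struct {
    unsigned int first;  /* first code point of the range */
    unsigned int last;   /* last code point of the range (inclusive) */
    int is_upper;        /* nonzero if the range is uppercase */
    int is_lower;        /* nonzero if the range is lowercase */
    int to_other;        /* offset added to reach the other case */
} u_range;

/* Illustrative subset only: basic Latin and basic Cyrillic letters. */
static const u_range u_table[] = {
    { 0x0041, 0x005A, 1, 0,  32 },  /* Latin A..Z */
    { 0x0061, 0x007A, 0, 1, -32 },  /* Latin a..z */
    { 0x0410, 0x042F, 1, 0,  32 },  /* Cyrillic capital A..YA */
    { 0x0430, 0x044F, 0, 1, -32 },  /* Cyrillic small a..ya */
};

/* Binary search for the range containing c; NULL if c is not listed. */
static const u_range *u_find(unsigned int c)
{
    size_t lo = 0, hi = sizeof u_table / sizeof u_table[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c < u_table[mid].first)
            hi = mid;
        else if (c > u_table[mid].last)
            lo = mid + 1;
        else
            return &u_table[mid];
    }
    return NULL;
}

int u_iswupper(int c)
{
    const u_range *r = u_find((unsigned int)c);
    return r != NULL && r->is_upper;
}

int u_iswlower(int c)
{
    const u_range *r = u_find((unsigned int)c);
    return r != NULL && r->is_lower;
}

int u_towlower(int c)
{
    const u_range *r = u_find((unsigned int)c);
    if (r == NULL || !r->is_upper)
        return c;                /* unlisted or not uppercase: unchanged */
    assert(r->to_other > 0);     /* uppercase ranges map upwards here */
    return c + r->to_other;
}

int u_towupper(int c)
{
    const u_range *r = u_find((unsigned int)c);
    if (r == NULL || !r->is_lower)
        return c;                /* unlisted or not lowercase: unchanged */
    return c + r->to_other;
}
```

With this table, u_towlower(0x0416) yields 0x0436 regardless of the C locale, and code points outside the listed ranges are returned unchanged rather than crashing.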
I tried to find information in the GHC Commentary, but it did not give me much.

Dimitry Golubovsky
Middletown, CT

--- Dimitry Golubovsky
Hi,
Following up the discussion in Haskell-Cafe about ways to bring better Unicode support in GHC.
A radical suggestion from an earlier discussion was to make String a typeclass, with Unicode, ASCII, etc. all being representations of it. The question this raises is whether it would require a change to the Haskell language standard. The trickle-down implications from String into Char, and potentially List, might make the impact of this approach too high. Personally, I think it is something that should be considered in the long run.

Shawn Garbett