
Hello guys,

I am working on an updated version of HDirect, and I am going to use the CWString API to marshal the (wchar_t *) type to String. I found some inconsistencies in the API.

- The castCWcharToChar and castCharToCWchar functions are defined only for POSIX systems and they aren't exported. At the same time, castCCharToChar and castCharToCChar have the same meaning, and they are defined and exported on all platforms.
- The CWchar type looks a little strange compared to the CChar, CString and CWString types. In my opinion CWChar looks more consistent.

Since the CWString API has not appeared in any previous GHC release, I think now is the time to fix that. Any opinions?

Cheers,
Krasimir

On Tue, Nov 30, 2004 at 12:41:04AM -0800, Krasimir Angelov wrote:
> Hello guys,
>
> I am working on an updated version of HDirect, and I am going to use the CWString API to marshal the (wchar_t *) type to String. I found some inconsistencies in the API.
>
> - The castCWcharToChar and castCharToCWchar functions are defined only for POSIX systems and they aren't exported. At the same time, castCCharToChar and castCharToCChar have the same meaning, and they are defined and exported on all platforms.

The problem is that these operations are very unsafe: there is no guaranteed isomorphism, or even injection, between wchar_ts and Chars. People who really know what they are doing can do the conversion themselves via fromIntegral/ord/chr, but I don't think we should encourage such unsafe usage by providing functions for it when it is simple for users to work around it themselves.

> - The CWchar type looks a little strange compared to the CChar, CString and CWString types. In my opinion CWChar looks more consistent.

I originally had it as CWChar, but it was changed to CWchar to conform to the already written FFI spec, which defines the wchar_t equivalent as CWchar.

> Since the CWString API has not appeared in any previous GHC release, I think now is the time to fix that. Any opinions?

Any changes would have to be propagated to the FFI spec, which is pretty stable.

John

--
John Meacham - ⑆repetae.net⑆john⑈
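
For reference, the fromIntegral/ord/chr workaround mentioned in this message might look like the minimal sketch below. The names cwcharToChar and charToCWchar are hypothetical (they are not exports of Foreign.C.String), and the code assumes that wchar_t holds Unicode code points on the target platform, which, as the message points out, is not guaranteed.

```haskell
import Data.Char (chr, ord)
import Foreign.C.Types (CWchar)

-- Hypothetical helpers, not part of the FFI libraries.
-- Assumption: wchar_t values are Unicode code points on this platform.

-- Reinterpret a wchar_t value as a Haskell Char.
-- chr raises an error if the value is outside the Char range.
cwcharToChar :: CWchar -> Char
cwcharToChar = chr . fromIntegral

-- Reinterpret a Haskell Char as a wchar_t value.
-- Silently wraps if wchar_t is too narrow for the code point.
charToCWchar :: Char -> CWchar
charToCWchar = fromIntegral . ord
```

Neither function consults the locale or the platform's wchar_t encoding, which is exactly the unsafety under discussion.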

--- John Meacham wrote:
> The problem is that these operations are very unsafe: there is no guaranteed isomorphism, or even injection, between wchar_ts and Chars. People who really know what they are doing can do the conversion themselves via fromIntegral/ord/chr, but I don't think we should encourage such unsafe usage by providing functions for it when it is simple for users to work around it themselves.

As I understand it, castCWcharToChar is unsafe if the language doesn't support Unicode (the Char type is too small), and castCharToCWchar is unsafe if wchar_t has 16 bits on the target OS while the language supports Unicode. In both cases the String<->CWString translation is safe. When I have a wchar_t in C, I have two options:

- map the type in Haskell to CWchar without any conversion
- use chr.fromIntegral or fromIntegral.ord

The first variant is more portable. Please correct me if I am wrong.

Are castCCharToChar and castCharToCChar deprecated? I think castCharToCChar is unsafe when the language supports Unicode.

>> - The CWchar type looks a little strange compared to the CChar, CString and CWString types. In my opinion CWChar looks more consistent.
> I originally had it as CWChar, but it was changed to CWchar to conform to the already written FFI spec, which defines the wchar_t equivalent as CWchar.

Maybe the FFI spec should be fixed, but I don't think the issue is that important, because it is a matter of taste. I can live with CWchar for now.

Cheers,
Krasimir
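
The width concern raised in this message (a Char that does not fit in a 16-bit wchar_t) can be made explicit with a range check. The following is only a sketch: charToCWcharChecked and cwcharToCharChecked are hypothetical names, and the check covers numeric range only. It says nothing about whether wchar_t is actually Unicode-encoded on the platform.

```haskell
import Data.Char (chr, ord)
import Foreign.C.Types (CWchar)

-- Hypothetical helper: Nothing when the code point does not fit
-- in this platform's wchar_t (for example, a 16-bit wchar_t).
charToCWcharChecked :: Char -> Maybe CWchar
charToCWcharChecked c
  | n <= toInteger (maxBound :: CWchar) = Just (fromInteger n)
  | otherwise                           = Nothing
  where
    n = toInteger (ord c)

-- Hypothetical helper: Nothing when the value lies outside the
-- range of Haskell's Char (0 .. 0x10FFFF).
cwcharToCharChecked :: CWchar -> Maybe Char
cwcharToCharChecked w
  | n >= 0 && n <= 0x10FFFF = Just (chr (fromInteger n))
  | otherwise               = Nothing
  where
    n = toInteger w
```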

On Tue, Nov 30, 2004 at 02:40:20AM -0800, Krasimir Angelov wrote:
> --- John Meacham wrote:
>> The problem is that these operations are very unsafe: there is no guaranteed isomorphism, or even injection, between wchar_ts and Chars. People who really know what they are doing can do the conversion themselves via fromIntegral/ord/chr, but I don't think we should encourage such unsafe usage by providing functions for it when it is simple for users to work around it themselves.
>
> As I understand it, castCWcharToChar is unsafe if the language doesn't support Unicode (the Char type is too small), and castCharToCWchar is unsafe if wchar_t has 16 bits on the target OS while the language supports Unicode. In both cases the String<->CWString translation is safe. When I have a wchar_t in C, I have two options:
>
> - map the type in Haskell to CWchar without any conversion
> - use chr.fromIntegral or fromIntegral.ord
>
> The first variant is more portable. Please correct me if I am wrong.

The problem is that even if the language supports the full Unicode range, there is no guarantee that a single wchar_t maps (simply and in a pure functional fashion) to a Haskell Char. Just because wchar_t is 16 bits does not mean it represents a 16-bit subset of Unicode; regional systems may have specialized wchar_ts for their language which are not Unicode. The encoding of wchar_t is pretty much completely unspecified unless __STDC_ISO10646__ is defined, in which case it is straight Unicode and the casting routines could be defined simply (my CWString library detects and optimizes this case). The only common systems where this is the case are Linux glibc-based systems.

> Are castCCharToChar and castCharToCChar deprecated? I think castCharToCChar is unsafe when the language supports Unicode.

These have never really been safe to use. A C char may have a completely different encoding than Char, which these functions won't honor. "Deprecated" may not be the proper word, but whenever possible one should use the higher-level conversion routines, which behave properly in the current locale. The cast functions should only be used when you have system- or application-specific knowledge that CChar is always ASCII and not dependent on the current locale.

Note that in general there will never be a guaranteed one-to-one mapping between chars, wchar_ts and Haskell Chars, so higher-level routines must work on strings rather than individual chars.

John

--
John Meacham - ⑆repetae.net⑆john⑈
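
As an illustration of the string-level approach recommended in this message, here is a minimal sketch using withCWString and peekCWString from Foreign.C.String. The names wideLength and roundTrip are hypothetical, and the foreign import assumes that the platform's C library provides wcslen.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CWString, peekCWString, withCWString)
import Foreign.C.Types (CSize)

-- Assumption: the C library linked into the program provides wcslen.
foreign import ccall unsafe "wcslen"
  c_wcslen :: CWString -> IO CSize

-- Marshal a String into a temporary wchar_t* buffer and let C
-- count the wide characters in it.
wideLength :: String -> IO Int
wideLength s = withCWString s (fmap fromIntegral . c_wcslen)

-- Marshal a String to wchar_t* and read it straight back.
roundTrip :: String -> IO String
roundTrip s = withCWString s peekCWString
```

Because the conversion is done on whole strings, the library is free to map one Char to several wchar_t units (or the reverse) where the platform requires it.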

--- John Meacham wrote:
> Note that in general there will never be a guaranteed one-to-one mapping between chars, wchar_ts and Haskell Chars, so higher-level routines must work on strings rather than individual chars.

In other words, the case for wchar_t isn't worse than the case for char. I would still like to have the conversion functions for CWchar, in order to keep symmetry between CChar and CWchar. Of course the documentation must explain the limitations of these functions. In any case I can use the chr/ord/fromIntegral trick, but with the specialized functions the code will look better.

Cheers,
Krasimir