adding to GHC/win32 Handle operations support of Unicode filenames and files larger than 4 GB
Hello glasgow-haskell-users,

Simon, what will you say about the following plan?

GHC/win32 currently doesn't support operations on files with Unicode filenames, nor can it tell/seek to positions beyond 4 GB. This is because the Unix-compatible functions open/fstat/tell/... that Mingw32 supports work only with "char[]" for filenames and off_t (which is 32 bit) for file sizes/positions.

Half a year ago I discussed with Simon Marlow how support for Unicode names and large files could be added to GHC. Now I have implemented my own library for such files, and got an idea of how this can be incorporated into GHC with minimal effort:

GHC currently uses the CString type to represent C-land filenames and the COff type to represent C-land file sizes/positions. We need to systematically change these usages to CFilePath and CFileOffset, respectively, defined as follows:

    #ifdef mingw32_HOST_OS
    type CFilePath = LPCTSTR
    type CFileOffset = Int64
    withCFilePath = withTString
    peekCFilePath = peekTString
    #else
    type CFilePath = CString
    type CFileOffset = COff
    withCFilePath = withCString
    peekCFilePath = peekCString
    #endif

and of course change the uses of withCString/peekCString, where they are applied to filenames, to withCFilePath/peekCFilePath (this will touch the modules System.Posix.Internals, System.Directory and GHC.Handle).

The last change needed is to conditionally define all "c_*" functions in System.Posix.Internals whose types contain references to filenames or offsets:

    #ifdef mingw32_HOST_OS
    foreign import ccall unsafe "HsBase.h _wrmdir"
      c_rmdir :: CFilePath -> IO CInt
    ....
    #else
    foreign import ccall unsafe "HsBase.h rmdir"
      c_rmdir :: CFilePath -> IO CInt
    ....
    #endif

(note that the actual C function used is _wrmdir for Windows and rmdir for Unix).
Of course, all such functions defined in HsBase.h also need to be defined conditionally, like:

    #ifdef mingw32_HOST_OS
    INLINE time_t __hscore_st_mtime ( struct _stati64* st ) { return st->st_mtime; }
    #else
    INLINE time_t __hscore_st_mtime ( struct stat* st ) { return st->st_mtime; }
    #endif

That's all! Of course, this will break compatibility with current programs which directly use these c_* functions (c_open, c_lseek, c_stat and so on). This may be an issue for some libs. Does anyone really use these functions?

Of course, we can go another, fully backward-compatible way, by adding some "f_*" functions and changing the high-level modules to work with these functions.

--
Best regards,
 Bulat
 mailto:bulatz@HotPOP.com
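
To make the offset problem from the message above concrete, here is a minimal C sketch of why a 32-bit offset type cannot describe positions in large files while a 64-bit one can. The names narrow_off_t, wide_off_t, fits_in_narrow_offset and to_narrow_offset are illustrative assumptions; they are not taken from GHC, HsBase.h or MinGW. Note that a signed 32-bit offset already stops at 2^31 - 1 bytes, so the practical limit can be lower than 4 GB.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins (assumptions, not GHC or MinGW definitions):
   narrow_off_t models a 32-bit off_t,
   wide_off_t models the proposed 64-bit CFileOffset. */
typedef int32_t narrow_off_t;
typedef int64_t wide_off_t;

/* Returns 1 if pos can be stored in the 32-bit offset without loss,
   0 otherwise. */
int fits_in_narrow_offset(wide_off_t pos)
{
    return pos >= INT32_MIN && pos <= INT32_MAX;
}

/* Converts a 64-bit position to the 32-bit offset type.
   The caller must have checked the range first; the assert
   documents that precondition. */
narrow_off_t to_narrow_offset(wide_off_t pos)
{
    assert(fits_in_narrow_offset(pos));
    return (narrow_off_t)pos;
}
```

A position such as 5 * 1024 * 1024 * 1024 (5 GiB) fails the range check, which is the situation a 32-bit tell/seek interface cannot express at all.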
On Monday, 21 November 2005 13:01, Bulat Ziganshin wrote:
[...]

    #ifdef mingw32_HOST_OS
    type CFilePath = LPCTSTR
    type CFileOffset = Int64
    withCFilePath = withTString
    peekCFilePath = peekTString
    #else
    type CFilePath = CString
    type CFileOffset = COff
    withCFilePath = withCString
    peekCFilePath = peekCString
    #endif

[...]

    #ifdef mingw32_HOST_OS
    INLINE time_t __hscore_st_mtime ( struct _stati64* st ) { return st->st_mtime; }
    #else
    INLINE time_t __hscore_st_mtime ( struct stat* st ) { return st->st_mtime; }
    #endif

[...]
Whatever is done, please use *feature-based* ifdefs, not platform-dependent ones like those above, which will be proven wrong either immediately or after a short time. We already have too many of those wrong ifdefs in the code...

Cheers,
   S.
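
As a rough illustration of the feature-based style requested above, the following C sketch selects types by what a build-time probe found rather than by which platform is being compiled. The macros HAVE_WIDE_PATH_API and HAVE_64BIT_SEEK_API and the names hs_path_char, hs_file_offset and hs_path_length are hypothetical; a configure script would be expected to define such macros, and nothing here is taken from GHC's actual build system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL feature macro: defined by a configure-time probe when
   wide-character path functions (such as _wrmdir) are available. */
#ifdef HAVE_WIDE_PATH_API
#include <wchar.h>
typedef wchar_t hs_path_char;
#else
typedef char hs_path_char;
#endif

/* HYPOTHETICAL feature macro: defined when 64-bit seek/stat variants
   were found. Without it we fall back to a plain long. */
#ifdef HAVE_64BIT_SEEK_API
typedef int64_t hs_file_offset;
#else
typedef long hs_file_offset;
#endif

/* Counts the characters in a path, whichever character type was
   selected above. */
size_t hs_path_length(const hs_path_char *path)
{
    size_t n = 0;
    assert(path != NULL);
    while (path[n] != 0) {
        n++;
    }
    return n;
}
```

With neither macro defined the sketch compiles to the narrow fallbacks, so the same source builds unchanged on a platform nobody listed in advance, which is the point of testing for features instead of operating systems.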
Hello Sven, Tuesday, November 22, 2005, 8:53:55 PM, you wrote:
    #ifdef mingw32_HOST_OS
    type CFilePath = LPCTSTR
    type CFileOffset = Int64
SP> Whatever will be done, please use *feature-based* ifdefs, not those
SP> platform-dependent ones above, which will be proven wrong either immediately
SP> or after a short time. We already have too much of those wrong ifdefs in the
SP> code...

That is up to the GHC authors; for me there is no difference.

--
Best regards,
 Bulat
 mailto:bulatz@HotPOP.com
participants (2): Bulat Ziganshin, Sven Panne