
Here is what happens when a language provides only a narrow-char API for
filenames:
-------------------- Start of forwarded message --------------------
Date: Wed, 15 Sep 2004 15:18:00 +0100
From: Peter Jolly
I have a filename as a UTF-8 encoded string. I need to be able to handle unusual characters such as accented letters, Asian characters, etc.
Is there any way to create a file with that name? I only need it on Win32.
Windows uses UTF-16 for filenames, but provides a non-Unicode interface for legacy applications; the standard open() function that OCaml's open_out wraps appears to use the legacy interface. The precise codepage this uses is system-dependent, and AFAIK there's no way for a program to determine what it is without calling out to the Win32 API, but you can be pretty sure it won't be UTF-8. In other words, there is no reliable way to use a filename containing non-ASCII characters with OCaml's standard library.
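To make the codepage point concrete, here is a small sketch of what an interface sees when it is handed UTF-8 bytes but interprets them in a legacy codepage. It assumes, purely for illustration, that the legacy codepage is Windows-1252; the actual codepage is system-dependent, as noted above. The function name and the constant are hypothetical, and this only models the byte interpretation, not the real Windows call.

```python
# Sketch: what a narrow-char interface "sees" when handed UTF-8 bytes.
# Assumption: the legacy codepage is Windows-1252; real systems vary.

ASSUMED_CODEPAGE = "cp1252"  # hypothetical choice, for illustration only


def as_seen_by_narrow_api(name, codepage=ASSUMED_CODEPAGE):
    """Encode `name` as UTF-8, then decode those bytes as `codepage`.

    This mimics passing UTF-8 bytes to an interface that interprets them
    in the legacy codepage. Bytes the codepage leaves undefined are
    replaced with U+FFFD.
    """
    return name.encode("utf-8").decode(codepage, errors="replace")


if __name__ == "__main__":
    # "caf\u00e9.txt" is "cafe.txt" with an acute accent on the e.
    for name in ["report.txt", "caf\u00e9.txt"]:
        # ascii() keeps the output printable on any console.
        print(ascii(name), "->", ascii(as_seen_by_narrow_api(name)))
```

Under this assumption a pure-ASCII name passes through unchanged, while the accented name comes out as a different, longer name, which is the file that would actually be created.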
Or should I solve this problem by talking directly to the Win32-api?
This is probably the best solution. A combination of CreateFileW() and MultiByteToWideChar() should do what you want.
-------------------- End of forwarded message --------------------
--
Marcin Kowalczyk
qrczak@knm.org.pl
http://qrnik.knm.org.pl/~qrczak/
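The conversion step in the suggestion from the forwarded message (UTF-8 bytes in, wide string out) can be sketched portably. The following is a rough model of what MultiByteToWideChar() produces when asked to convert from UTF-8: a NUL-terminated sequence of UTF-16 code units. It does not call the Win32 API and does not create a file; the helper names are hypothetical, and the strict error handling is an assumption about what a caller would want.

```python
# Sketch: model the UTF-8 -> UTF-16 step that precedes a wide-char file call.
# This does not call Win32; it only shows the data such a call would receive.


def utf8_to_wide(name_utf8):
    """Return NUL-terminated UTF-16LE bytes for a UTF-8 encoded filename.

    Raises UnicodeDecodeError if the input is not valid UTF-8, and
    ValueError if the name contains an embedded NUL character.
    """
    text = name_utf8.decode("utf-8", errors="strict")
    if "\x00" in text:
        raise ValueError("embedded NUL in filename")
    return text.encode("utf-16-le") + b"\x00\x00"


def wide_units(wide):
    """Split UTF-16LE bytes into 16-bit code units, for inspection."""
    return [int.from_bytes(wide[i:i + 2], "little")
            for i in range(0, len(wide), 2)]


if __name__ == "__main__":
    name = "\u65e5.txt".encode("utf-8")  # one CJK character plus ".txt"
    print([hex(u) for u in wide_units(utf8_to_wide(name))])
```

Characters outside the Basic Multilingual Plane become two code units (a surrogate pair), so the wide buffer length is not simply the character count.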

Marcin 'Qrczak' Kowalczyk wrote:
Here is what happens when a language provides only a narrow-char API for filenames:
I have a filename as a UTF-8 encoded string. I need to be able to handle unusual characters such as accented letters, Asian characters, etc.
Is there any way to create a file with that name? I only need it on Win32.
Windows uses UTF-16 for filenames, but provides a non-Unicode interface for legacy applications; the standard open() function that OCaml's open_out wraps appears to use the legacy interface. The precise codepage this uses is system-dependent, and AFAIK there's no way for a program to determine what it is without calling out to the Win32 API, but you can be pretty sure it won't be UTF-8.
In other words, there is no reliable way to use a filename containing non-ASCII characters with OCaml's standard library.
No, this is what happens when an API imposes restrictions upon the
filenames which it can handle.
Essentially, it's due to two (or possibly three) factors:
1. The fact that Windows uses wide strings, rather than multi-byte
strings, for filenames.
2. The fact that Windows' compatibility interface is broken, i.e. it
only lets you access filenames which can be represented in the current
codepage (which, to me, is highly analogous to only supporting
filenames which are valid in the current locale).
3. Possibly that OCaml insists upon using UTF-8. [I don't know that
this is the case, but the fact that they specifically mention UTF-8
suggests that it might be.]
IOW, this incident seems to argue against, rather than for, the
filenames-as-characters viewpoint.
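The second factor can be illustrated with a short check of which names a given codepage can represent at all. This is only a sketch: the two codepages are picked as examples (cp1252 for Western European, cp932 for Japanese), the helper name is hypothetical, and Python's codec tables stand in for whatever the Windows compatibility layer actually does.

```python
# Sketch: a name is reachable through a codepage-bound interface only if
# every character in it can be encoded in that codepage.


def representable(name, codepage):
    """Return True if every character of `name` encodes in `codepage`."""
    try:
        name.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    names = [
        "report.txt",
        "caf\u00e9.txt",            # Latin letter with an acute accent
        "\u65e5\u672c\u8a9e.txt",   # three CJK characters
    ]
    for codepage in ["cp1252", "cp932"]:  # example codepages, assumed
        for name in names:
            print(codepage, ascii(name), representable(name, codepage))
```

Whichever codepage is current, some names that are perfectly valid on disk fall outside it, which is the restriction described in point 2.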
--
Glynn Clements