
Is there such a thing as memory-mapped arrays in GHC? I'm looking for something that would let me memory-map a file of floats and access it as an array.

Thanks, Joel

--
http://wagerlabs.com

joelr1:
> Is there such a thing as memory-mapped arrays in GHC?
> I'm looking for something that would let me memory-map a file of floats and access it as an array.

There's a commented-out mmapFile for ByteString in Data.ByteString's source. Use that, then extract the ForeignPtr from the resulting ByteString and cast it to a pointer to CFloat (castForeignPtr on the ForeignPtr itself, or castPtr on the Ptr you get inside withForeignPtr), and you're in business.

-- Don
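A minimal sketch of the pointer-casting step described above, using only base and bytestring. The names floatCount and readFloatAt are hypothetical, and the code assumes the ByteString holds CFloat values in host byte order. How the ByteString is obtained is left open: anything that yields a ByteString (a memory-mapped file, or B.readFile as a stand-in) goes through the same path.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Foreign.C.Types (CFloat)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peekElemOff, sizeOf)

-- | Number of whole CFloat elements stored in the buffer.
floatCount :: B.ByteString -> Int
floatCount bs = B.length bs `div` sizeOf (undefined :: CFloat)

-- | Read the i-th CFloat (host byte order) from the buffer.
-- Returns Nothing when the index is out of range.
readFloatAt :: B.ByteString -> Int -> IO (Maybe CFloat)
readFloatAt bs i
  | i < 0 || i >= floatCount bs = return Nothing
  | otherwise =
      -- Borrow the underlying bytes without copying, then reinterpret
      -- the byte pointer as a pointer to CFloat.
      unsafeUseAsCStringLen bs $ \(p, _len) ->
        Just <$> peekElemOff (castPtr p :: Ptr CFloat) i
```

The bounds check matters more than usual here: reading past the end of a mapped region is not something the runtime can catch for you.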

Joel Reymont wrote:
> Is there such a thing as memory-mapped arrays in GHC?

In principle, there could be an IArray instance for memory-mapped files. (There could also be a mutable version, but just the IArray version would be useful.)

I noticed just the other day that some 'obvious' IArray constructors are missing. It ought to be possible, for example, to build a new IArray from a subset of the elements of an old one: a dimensional slice going from an (Int,Int,Int)-indexed array to an (Int,Int)-indexed one, or a stride taking 'one element in three' along each axis, etc.

Annoyingly, it doesn't seem to be straightforward to make your own instances of IArray, since the important methods aren't exported. I think there is real scope for some expansion here.

Jules
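For what it's worth, ixmap from Data.Array.IArray seems to cover at least part of the slicing and striding wish, with the caveat that it builds a new array rather than a view onto the old one. A small sketch follows; planeAt, everyNth and sample are hypothetical names, and everyNth assumes a positive step and handles only the one-dimensional case.

```haskell
import Data.Array.IArray (bounds, ixmap, listArray, (!))
import Data.Array.Unboxed (UArray)

-- | Fix the first index of a 3-D array at k, giving a 2-D plane.
planeAt :: Int -> UArray (Int, Int, Int) Float -> UArray (Int, Int) Float
planeAt k arr = ixmap ((j0, l0), (j1, l1)) (\(j, l) -> (k, j, l)) arr
  where
    ((_, j0, l0), (_, j1, l1)) = bounds arr

-- | Keep every n-th element of a 1-D array, starting at its lower bound.
-- The result is re-indexed from 0. Assumes n > 0.
everyNth :: Int -> UArray Int Float -> UArray Int Float
everyNth n arr = ixmap (0, (hi - lo) `div` n) (\i -> lo + i * n) arr
  where
    (lo, hi) = bounds arr

-- | A small sample array holding 0.0 .. 6.0 at indices 0 .. 6.
sample :: UArray Int Float
sample = listArray (0, 6) [0 .. 6]
```

With these definitions, everyNth 3 sample picks out the elements at original indices 0, 3 and 6.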

On Wed, Nov 07, 2007 at 10:10:16PM +0000, Jules Bean wrote:
> Joel Reymont wrote:
>> Is there such a thing as memory-mapped arrays in GHC?
> In principle, there could be an IArray instance to memory-mapped files.
> (There could also be a mutable version, but just the IArray version would be useful).
> I noticed just the other day that there are some 'obvious' IArray constructors missing. It ought, for example, be possible to build a new IArray from an old one from a subset of the elements; a dimensional slice going from an (Int,Int,Int) indexed array to (Int,Int), or a stride taking 'one element in three' along each axis, etc.
> Annoyingly, it doesn't seem to be straightforward to make your own instances of IArray, since the important methods aren't exported.

They are, from the undocumented module Data.Array.Base.

Stefan
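A rough sketch of what a hand-written instance might look like with the class methods imported from Data.Array.Base. The Strided type and strideView are hypothetical, the instance is given for Float only to keep it short, and the set of methods defined here (bounds, numElements, unsafeArray, unsafeAt) is what a recent array package expects; older versions may differ.

```haskell
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
import Data.Array.Base (IArray (..), UArray, listArray, (!))
import Data.Ix (rangeSize)

-- | A read-only strided view over a flat unboxed array (hypothetical type).
-- Fields: logical bounds, start offset, step, backing array.
data Strided i e = Strided (i, i) Int Int (UArray Int e)

instance IArray Strided Float where
  bounds (Strided b _ _ _) = b
  numElements (Strided b _ _ _) = rangeSize b
  -- Building from scratch gives a dense view (offset 0, step 1).
  unsafeArray b ies = Strided b 0 1 (unsafeArray (0, rangeSize b - 1) ies)
  -- Map the logical linear index n onto the backing array.
  unsafeAt (Strided _ off step base) n = unsafeAt base (off + n * step)

-- | View every step-th element of the backing array without copying.
-- Assumes step > 0.
strideView :: Int -> UArray Int Float -> Strided Int Float
strideView step base = Strided (0, (hi - lo) `div` step) 0 step base
  where
    (lo, hi) = bounds base
```

Once the instance exists, the ordinary (!) and bounds work on the view, so strideView 3 applied to a seven-element array can be indexed at 0, 1 and 2.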

On Wed, Nov 07, 2007 at 10:10:16PM +0000, Jules Bean wrote:
> Joel Reymont wrote:
>> Is there such a thing as memory-mapped arrays in GHC?
> In principle, there could be an IArray instance to memory-mapped files.
> (There could also be a mutable version, but just the IArray version would be useful).

The IArray instance would be unsafe, however, because the contents of the file could change after you opened it, breaking referential transparency. I don't know everything that is possible with file open modes, but I don't think you can guarantee that a file won't change once you've opened it (unless you unlink it and know that no one else has an open file handle to it). It may be that by opening it in write mode you could ensure that no one else modifies it (although I don't think this would work on, e.g., NFS), but then you eliminate the very useful possibility of mmapping read-only files as IArrays (e.g. to access /usr/share/dict/words).

So it seems reasonable that the mutable version would necessarily be primary, with the IArray version accessible only through an unsafe operation.

--
David Roundy
Department of Physics
Oregon State University
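One way to picture the "mutable version is primary" design, as a sketch rather than a proposal: wrap the raw memory as a StorableArray living in IO, and obtain an immutable array only by copying with freeze. The names wrapFloats, snapshot and demo are hypothetical, and a freshly allocated buffer stands in for a real mapping, whose ForeignPtr would come from whatever mmap binding is in use.

```haskell
import Data.Array.MArray (freeze, readArray, writeArray)
import Data.Array.Storable (StorableArray)
import Data.Array.Unboxed (UArray, (!))
import Data.Array.Unsafe (unsafeForeignPtrToStorableArray)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray)

-- | Treat n floats behind a ForeignPtr as a mutable array in IO.
-- The caller must guarantee that the memory really holds n floats.
wrapFloats :: ForeignPtr Float -> Int -> IO (StorableArray Int Float)
wrapFloats fp n = unsafeForeignPtrToStorableArray fp (0, n - 1)

-- | Copy the current contents into an immutable array. Later changes to
-- the underlying memory do not affect the copy.
snapshot :: StorableArray Int Float -> IO (UArray Int Float)
snapshot = freeze

-- | Stand-in for a mapping: a malloc'd buffer of four floats.
demo :: IO (Float, Float)
demo = do
  fp <- mallocForeignPtrArray 4
  arr <- wrapFloats fp 4
  writeArray arr 0 1.5
  frozen <- snapshot arr
  writeArray arr 0 2.5 -- the underlying memory changes afterwards
  live <- readArray arr 0
  return (frozen ! 0, live) -- the snapshot still holds 1.5; live sees 2.5
```

The copy is the price of safety here. A zero-copy immutable view would have to go through an explicitly unsafe operation, which is exactly the trade-off described above.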

On 2007-11-08, David Roundy wrote:
> On Wed, Nov 07, 2007 at 10:10:16PM +0000, Jules Bean wrote:
>> Joel Reymont wrote:
>>> Is there such a thing as memory-mapped arrays in GHC?
>> In principle, there could be an IArray instance to memory-mapped files.
>> (There could also be a mutable version, but just the IArray version would be useful).
> The IArray instance would be unsafe, however, because the contents of the file could change after you opened it, breaking referential transparency.

Or even crashing, if the file shrinks below the size of the mapped area.

> I don't know what all is possible with file open modes, but I don't think you can guarantee that once you've opened a file it won't change (unless you unlink it, and know that no one else has an opened file handle to it).

File open modes won't do it, and I don't think anything else will either, using just POSIX behavior. Linux's mmap() used to support a MAP_DENYWRITE flag, but it enabled DoS attacks, so it is now ignored.

> It may be that by opening it in write mode you could ensure that no one else modifies it (although I don't think this would work e.g. on nfs),

It doesn't even work locally.

--
Aaron Denney
-><-

Aaron Denney wrote:
>> It may be that by opening it in write mode you could ensure that no one else modifies it (although I don't think this would work e.g. on nfs),
> It doesn't even work locally.

Right. But mmap is only sensible to use (even in C) when you know about all the other processes that can modify the file; bearing this in mind, it's still useful! Obviously the Haskell RTS can't insulate us from certain things if the OS-level primitives let it down, but we can document what you need to do to be safe.

Jules
Participants (6): Aaron Denney, David Roundy, Don Stewart, Joel Reymont, Jules Bean, Stefan O'Rear