accessing a ByteArray from FFI

Hello, I want to access the contents of a Data.Vector.Primitive from FFI. I came up with the following so far: {-# LANGUAGE ForeignFunctionInterface #-} module FFIExample (exampleFn) where import Control.Monad.Primitive (touch) import Control.Monad.ST (runST) import Data.Primitive.Addr (Addr(..)) import Data.Primitive.ByteArray (mutableByteArrayContents) import qualified Data.Vector.Primitive as P import qualified Data.Vector.Primitive.Mutable as PM import GHC.Ptr (Ptr(..), plusPtr) import Data.Word (Word64) foreign import ccall unsafe "my_external_fn" myExternalFn :: Ptr Word64 -> Int -> Int exampleFn :: P.Vector Word64 -> Int exampleFn v = runST $ do PM.MVector off len mba <- P.unsafeThaw v let ptr = case mutableByteArrayContents mba of Addr addr -> Ptr addr `plusPtr` off result = myExternalFn ptr len result `seq` touch mba return result This seems to work, but I have questions that I haven't been able to find the answers to on my own: 1. Is this actually OK to do? I guess that this wouldn't be OK if the foreign import were "safe", as then the GC could move the contents of the byte array while the foreign function is running. But is it safe like this? 2. If the answer to the previous question is no, then is there a way to do it properly? Or there is just no way to pass an unpinned byte array to a foreign call? What about foreign import prim? 3. If the answer to Q1 is no, then would it be OK if the underlying byte array were pinned? 4. Any other simplifications? Pointers to resources on these topics would be more than welcome! But, I haven't been able to find any. Thanks, Mihaly

Hi,
On Wed, Jun 25, 2014 at 12:58 PM, Mihaly Barasz
1. Is this actually OK to do? I guess that this wouldn't be OK if the foreign import were "safe", as then the GC could move the contents of the byte array while the foreign function is running. But is it safe like this?
Whether the foreign import is marked as safe or not doesn't matter. The GC might move the array just after you make your address computation but before the call is made. This is only safe if the Primitive array is guaranteed to be pinned. I checked the docs for that module and couldn't see any such promise.
2. If the answer to the previous question is no, then is there a way to do it properly? Or there is just no way to pass an unpinned byte array to a foreign call? What about foreign import prim?
There is a way to pass an unpinned ByteArray# (or MutableByteArray#, but the former seems right in your case) to a foreign call, using the UnliftedFFITypes language extension. The ByteArray# is guaranteed to not to be moved for the duration of the call. The code should treat the ByteArray# argument as if it was a pointer to bytes. You will need to do any address offset computations on the C side (i.e. pass any offsets you need as extra argument to your C function).
3. If the answer to Q1 is no, then would it be OK if the underlying byte array were pinned?
Yes, but I don't know if you can have the API give you any such guarantees, unless you use the Storable vector version.
4. Any other simplifications?
The unsafeThat bothers me. Doesn't vector give you any other way to get to the underlying ByteArray#? -- Johan

On Wed, Jun 25, 2014 at 2:11 PM, Johan Tibell
Hi,
On Wed, Jun 25, 2014 at 12:58 PM, Mihaly Barasz
wrote: 1. Is this actually OK to do? I guess that this wouldn't be OK if the foreign import were "safe", as then the GC could move the contents of the byte array while the foreign function is running. But is it safe like this?
Whether the foreign import is marked as safe or not doesn't matter. The GC might move the array just after you make your address computation but before the call is made. This is only safe if the Primitive array is guaranteed to be pinned. I checked the docs for that module and couldn't see any such promise.
Well, I don't know enough about how GC is specified in GHC, but _in practice_ calls to GC could happen only on entry to the exampleFn closure. Not between the address computation and the foreign call. (I simply looked at the generated code, I don't know if there is any guarantee for that.)
2. If the answer to the previous question is no, then is there a way to do it properly? Or there is just no way to pass an unpinned byte array to a foreign call? What about foreign import prim?
There is a way to pass an unpinned ByteArray# (or MutableByteArray#, but the former seems right in your case) to a foreign call, using the UnliftedFFITypes language extension. The ByteArray# is guaranteed to not to be moved for the duration of the call. The code should treat the ByteArray# argument as if it was a pointer to bytes. You will need to do any address offset computations on the C side (i.e. pass any offsets you need as extra argument to your C function).
Thanks, I'll look into that. Are there any pointers/examples?
3. If the answer to Q1 is no, then would it be OK if the underlying byte array were pinned?
Yes, but I don't know if you can have the API give you any such guarantees, unless you use the Storable vector version.
4. Any other simplifications?
The unsafeThat bothers me. Doesn't vector give you any other way to get to the underlying ByteArray#?
I don't know why, but the constructor for Vector is not exported, only for the MVector. But, this use of unsafeThaw is completely benign. (Actually, it fully disappears in the generated code. :))
-- Johan

On Wed, Jun 25, 2014 at 2:54 PM, Mihaly Barasz
Well, I don't know enough about how GC is specified in GHC, but _in practice_ calls to GC could happen only on entry to the exampleFn closure. Not between the address computation and the foreign call. (I simply looked at the generated code, I don't know if there is any guarantee for that.)
There can be additional heap checks at the start of any basic block in the generated assembly for the function. Right, in practice there's probably not an issue.
Thanks, I'll look into that. Are there any pointers/examples?
There's some code out there on the web that uses the extension. Here's an example: https://github.com/tibbe/hashable/blob/master/Data/Hashable/Class.hs#L470
I don't know why, but the constructor for Vector is not exported, only for the MVector. But, this use of unsafeThaw is completely benign. (Actually, it fully disappears in the generated code. :))
Probably because it's a ByteArray#, not an Array#. For the latter unsafe thawing results in the object being put on the GC mutable list (in practice that means that the info table ptr changes).

On Wed, Jun 25, 2014 at 15:02 +0200, Johan Tibell wrote:
On Wed, Jun 25, 2014 at 2:54 PM, Mihaly Barasz
wrote: Well, I don't know enough about how GC is specified in GHC, but _in practice_ calls to GC could happen only on entry to the exampleFn closure. Not between the address computation and the foreign call. (I simply looked at the generated code, I don't know if there is any guarantee for that.)
There can be additional heap checks at the start of any basic block in the generated assembly for the function. Right, in practice there's probably not an issue.
Thanks, I'll look into that. Are there any pointers/examples?
There's some code out there on the web that uses the extension. Here's an example: https://github.com/tibbe/hashable/blob/master/Data/Hashable/Class.hs#L470
Thanks, this works wonderfully. The code is much simplified: {-# LANGUAGE ForeignFunctionInterface #-} {-# LANGUAGE MagicHash, UnliftedFFITypes #-} module FFIExample (exampleFn) where import Control.Monad.ST (runST) import Data.Primitive.ByteArray (ByteArray(..), ByteArray#, unsafeFreezeByteArray) import qualified Data.Vector.Primitive as P import qualified Data.Vector.Primitive.Mutable as PM import Data.Word (Word64) foreign import ccall unsafe "my_external_fn" myExternalFn :: ByteArray# -> Int -> Int -> Int exampleFn :: P.Vector Word64 -> Int exampleFn v = runST $ do PM.MVector off len mba <- P.unsafeThaw v ByteArray ba <- unsafeFreezeByteArray mba return $! myExternalFn ba off len And yeah, now the unsafeThaw followed immediately by the unsafeFreezeByteArray looks even sillier. (But again, it completely disappears in the generated code.) Mihaly
I don't know why, but the constructor for Vector is not exported, only for the MVector. But, this use of unsafeThaw is completely benign. (Actually, it fully disappears in the generated code. :))
Probably because it's a ByteArray#, not an Array#. For the latter unsafe thawing results in the object being put on the GC mutable list (in practice that means that the info table ptr changes).
participants (2)
-
Johan Tibell
-
Mihaly Barasz