Boxed foreign prim

I'm currently working with a lot of very short arrays of fixed length, and as a thought experiment I thought I would try to play with fast numeric field accessors.

In particular, I'd like to use something like foreign prim to do something like

    foreign import prim "cmm_getField" unsafeField# :: a -> Int# -> b

    unsafeField :: a -> Int -> b
    unsafeField a (I# i) = a' `pseq` unsafeField# a' i -- the pseq could be moved into the prim, I suppose.
      where a' = unsafeCoerce a

    fst :: (a,b) -> a
    fst p = unsafeField p 0

    snd :: (a,b) -> b
    snd p = unsafeField p 1

This becomes more reasonable to consider when you are forced to make something like

    data V4 a = V4 a a a a

using

    unsafeIndex (V4 a _ _ _) 0 = a
    unsafeIndex (V4 _ b _ _) 1 = b
    unsafeIndex (V4 _ _ c _) 2 = c
    unsafeIndex (V4 _ _ _ d) 3 = d

rather than

    unsafeIndex :: V4 a -> Int -> a
    unsafeIndex = unsafeField

But I can only pass unboxed types to foreign prim. Is this an intrinsic limitation or just an artifact of the use cases that have presented themselves to date?
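
For reference, the pattern-matching version described above can be written as a complete program. This is only a minimal sketch of the plain-Haskell baseline, not of the proposed primitive; the catch-all error clause and the `main` driver are additions for illustration and are not part of the original message.

```haskell
module Main where

-- A fixed-length "short array" of four boxed elements.
data V4 a = V4 a a a a

-- Index by pattern matching on the constructor and a literal index.
-- The final clause is an illustrative addition so that out-of-range
-- indices fail with a readable message.
unsafeIndex :: V4 a -> Int -> a
unsafeIndex (V4 a _ _ _) 0 = a
unsafeIndex (V4 _ b _ _) 1 = b
unsafeIndex (V4 _ _ c _) 2 = c
unsafeIndex (V4 _ _ _ d) 3 = d
unsafeIndex _ i = error ("unsafeIndex: index out of range: " ++ show i)

main :: IO ()
main = print (map (unsafeIndex (V4 'a' 'b' 'c' 'd')) [0 .. 3])
```

Each call pays for a case analysis on the index, which is the cost the proposed `unsafeField` was meant to avoid.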

On Thu, Mar 8, 2012 at 8:12 PM, Edward Kmett wrote:
> I'm currently working with a lot of very short arrays of fixed length and as a thought experiment I thought I would try to play with fast numeric field accessors
> ...
> This becomes more reasonable to consider when you are forced to make something like
>
>     data V4 a = V4 a a a a
>
> using
>
>     unsafeIndex (V4 a _ _ _) 0 = a
>     unsafeIndex (V4 _ b _ _) 1 = b
>     unsafeIndex (V4 _ _ c _) 2 = c
>     unsafeIndex (V4 _ _ _ d) 3 = d
>
> rather than
>
>     unsafeIndex :: V4 a -> Int -> a
>     unsafeIndex = unsafeField

I'm dealing with exactly this problem in unordered-containers. I'm dealing with small (16-element) arrays that I need to 1) index into and 2) update a single element of. I use Array# and MutableArray# for this, but they aren't optimal, mostly because they are optimized for the case of larger arrays (e.g. they use card tables and out-of-line allocation).

-- Johan
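
The two operations mentioned above (indexing, and updating a single element by copying) can be sketched directly on top of `Array#` and `MutableArray#`. This is an illustrative sketch, not the code used in unordered-containers; the names `SmallArr`, `replicateArr`, `indexArr` and `updateArr` are hypothetical.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts
  ( Array#, Int (I#), indexArray#, newArray#, sizeofArray#
  , thawArray#, unsafeFreezeArray#, writeArray# )
import GHC.ST (ST (..), runST)

-- Hypothetical wrapper: a lifted box around an immutable Array#.
data SmallArr a = SmallArr (Array# a)

-- Build an array holding n copies of x.
replicateArr :: Int -> a -> SmallArr a
replicateArr (I# n) x = runST (ST (\s0 ->
  case newArray# n x s0 of
    (# s1, marr #) -> case unsafeFreezeArray# marr s1 of
      (# s2, arr #) -> (# s2, SmallArr arr #)))

-- Read one element; no bounds check is performed.
indexArr :: SmallArr a -> Int -> a
indexArr (SmallArr arr) (I# i) = case indexArray# arr i of
  (# x #) -> x

-- Copy the whole array, overwrite one slot, and freeze the copy.
updateArr :: SmallArr a -> Int -> a -> SmallArr a
updateArr (SmallArr arr) (I# i) x = runST (ST (\s0 ->
  case thawArray# arr 0# (sizeofArray# arr) s0 of
    (# s1, marr #) -> case writeArray# marr i x s1 of
      s2 -> case unsafeFreezeArray# marr s2 of
        (# s3, arr' #) -> (# s3, SmallArr arr' #)))

main :: IO ()
main = do
  let arr  = replicateArr 16 (0 :: Int)
      arr' = updateArr arr 3 42
  print (indexArr arr 3, indexArr arr' 3)  -- prints (0,42)
```

The update leaves the original array untouched, so every single-element change costs a full 16-element copy plus whatever per-array overhead the runtime adds.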

Hi,

On Thursday, 08.03.2012, at 23:12 -0500, Edward Kmett wrote:
> But I can only pass unboxed types to foreign prim.
> Is this an intrinsic limitation or just an artifact of the use cases that have presented themselves to date?

Funny, I just stumbled over this two days ago as well. In my case, I’m trying to investigate the actual data on the heap, so I am also trying to pass arbitrary values to a foreign prim function, which then receives the pointer to the heap object as the argument.

Since real primops (e.g. unpackClosure#) can take arbitrary values, I figured that this is an artifact, as you say. I made this change to enable this feature:

diff --git a/compiler/typecheck/TcType.lhs b/compiler/typecheck/TcType.lhs
index 669545a..83a59bd 100644
--- a/compiler/typecheck/TcType.lhs
+++ b/compiler/typecheck/TcType.lhs
@@ -1166,7 +1166,7 @@ isOverloadedTy _ = False
 \begin{code}
 isFloatTy, isDoubleTy, isIntegerTy, isIntTy, isWordTy, isBoolTy,
-    isUnitTy, isCharTy :: Type -> Bool
+    isUnitTy, isCharTy, isAnyTy :: Type -> Bool
 isFloatTy = is_tc floatTyConKey
 isDoubleTy = is_tc doubleTyConKey
 isIntegerTy = is_tc integerTyConKey
@@ -1175,6 +1175,7 @@ isWordTy = is_tc wordTyConKey
 isBoolTy = is_tc boolTyConKey
 isUnitTy = is_tc unitTyConKey
 isCharTy = is_tc charTyConKey
+isAnyTy = is_tc anyTyConKey
 isStringTy :: Type -> Bool
 isStringTy ty
@@ -1344,7 +1345,7 @@ isFFIPrimArgumentTy :: DynFlags -> Type -> Bool
 -- Checks for valid argument type for a 'foreign import prim'
 -- Currently they must all be simple unlifted types.
 isFFIPrimArgumentTy dflags ty
-  = checkRepTyCon (legalFIPrimArgTyCon dflags) ty
+  = isAnyTy ty || checkRepTyCon (legalFIPrimArgTyCon dflags) ty
 isFFIPrimResultTy :: DynFlags -> Type -> Bool
 -- Checks for valid result type for a 'foreign import prim'

Initially, I had isFFIPrimArgumentTy modified to really accept "a ->", but I found it cleaner to just accept the Any type, to make it more obvious that, morally speaking, not the value is passed to the function, but a pointer to Any thing. In the Haskell code, I’m unsafeCoerce#’ing the a to Any. Seems to work smoothly.

Greetings,
Joachim

--
Joachim "nomeata" Breitner
  mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C
  xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/
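
The Haskell side of the coercion described above can be sketched without the patched compiler. This is a minimal sketch that assumes `unsafeCoerce` from `Unsafe.Coerce` in place of the `unsafeCoerce#` mentioned in the message; `toAny` and `fromAny` are hypothetical helper names, and the foreign prim import itself is left out because it needs the compiler change and a Cmm file.

```haskell
module Main where

import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Forget the static type: only the pointer to the heap object remains.
toAny :: a -> Any
toAny = unsafeCoerce

-- Recover a value. This is only safe if the caller picks the type
-- that the value really had before it was coerced.
fromAny :: Any -> a
fromAny = unsafeCoerce

main :: IO ()
main = print (fromAny (toAny (42 :: Int)) :: Int)  -- prints 42
```

With the patch applied, a value of type `Any` would be what the foreign prim function receives as its argument.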

On 09/03/2012 04:12, Edward Kmett wrote:
> I'm currently working with a lot of very short arrays of fixed length and as a thought experiment I thought I would try to play with fast numeric field accessors
>
> In particular, I'd like to use something like foreign prim to do something like
>
>     foreign import prim "cmm_getField" unsafeField# :: a -> Int# -> b
>
>     unsafeField :: a -> Int -> b
>     unsafeField a (I# i) = a' `pseq` unsafeField# a' i -- the pseq could be moved into the prim, I suppose.
>       where a' = unsafeCoerce a
>
>     fst :: (a,b) -> a
>     fst p = unsafeField p 0
>
>     snd :: (a,b) -> b
>     snd p = unsafeField p 1
>
> This becomes more reasonable to consider when you are forced to make something like
>
>     data V4 a = V4 a a a a
>
> using
>
>     unsafeIndex (V4 a _ _ _) 0 = a
>     unsafeIndex (V4 _ b _ _) 1 = b
>     unsafeIndex (V4 _ _ c _) 2 = c
>     unsafeIndex (V4 _ _ _ d) 3 = d
>
> rather than
>
>     unsafeIndex :: V4 a -> Int -> a
>     unsafeIndex = unsafeField
>
> But I can only pass unboxed types to foreign prim.
> Is this an intrinsic limitation or just an artifact of the use cases that have presented themselves to date?

It's an intrinsic limitation - the I# box is handled entirely at the Haskell level, primitives only deal with primitive types.

But anyway, I suspect your first definition of unsafeIndex will be faster than the one using foreign import prim, because calling out-of-line to do the indexing is slow. Also pseq is slow - use seq instead.

What you really want is built-in support for unsafeField#, which is certainly do-able. It's very similar to dataToTag# in the way that the argument is required to be evaluated - this is the main fragility, unfortunately GHC doesn't have a way to talk about things that are unlifted (except for the primitive unlifted types). But it just about works if you make sure there's a seq in the right place.

Cheers,
Simon
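
The comparison with dataToTag# can be made concrete with a small program. This is a minimal sketch of the existing primitive only; `Color` and `tagOf` are hypothetical names, and the explicit `seq` mirrors the "seq in the right place" mentioned above rather than anything a proposed unsafeField# would require.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), dataToTag#)

data Color = Red | Green | Blue

-- Force the argument first, then read the constructor tag.
-- Tags are numbered from zero in declaration order.
tagOf :: Color -> Int
tagOf c = c `seq` I# (dataToTag# c)

main :: IO ()
main = print (map tagOf [Red, Green, Blue])  -- prints [0,1,2]
```

A field accessor built along the same lines would have the same requirement: the argument has to be an evaluated constructor before its fields can be read.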

On Mon, Mar 12, 2012 at 6:45 AM, Simon Marlow wrote:
> > But I can only pass unboxed types to foreign prim.
> > Is this an intrinsic limitation or just an artifact of the use cases that have presented themselves to date?
>
> It's an intrinsic limitation - the I# box is handled entirely at the Haskell level, primitives only deal with primitive types.

Ah. I was reasoning by comparison to atomicModifyMutVar#, which deals with unboxed polymorphic types, and even lies with a too general return type. Though the result there is returned in an unboxed tuple, the argument is passed unboxed. Is that implemented specially?

> But anyway, I suspect your first definition of unsafeIndex will be faster than the one using foreign import prim, because calling out-of-line to do the indexing is slow.

Sure, though I suppose that balance may shift as the size of the short vector grows (e.g. with Johan it'd probably be 16 items).

> Also pseq is slow - use seq instead.

Of course. I was being paranoid at the time and trying to get it to work at all. ;)

> what you really want is built-in support for unsafeField#, which is certainly do-able. It's very similar to dataToTag# in the way that the argument is required to be evaluated - this is the main fragility, unfortunately GHC doesn't have a way to talk about things that are unlifted (except for the primitive unlifted types). But it just about works if you make sure there's a seq in the right place.

I'd be happy even if I had to seq the argument myself before applying it, as I was trying above.

-Edward

On 12/03/2012 14:22, Edward Kmett wrote:
> On Mon, Mar 12, 2012 at 6:45 AM, Simon Marlow <marlowsd@gmail.com> wrote:
> > > But I can only pass unboxed types to foreign prim.
> > > Is this an intrinsic limitation or just an artifact of the use cases that have presented themselves to date?
> >
> > It's an intrinsic limitation - the I# box is handled entirely at the Haskell level, primitives only deal with primitive types.
>
> Ah. I was reasoning by comparison to atomicModifyMutVar#, which deals with unboxed polymorphic types, and even lies with a too general return type. Though the result there is returned in an unboxed tuple, the argument is passed unboxed.
> Is that implemented specially?

I'm a little bit confused.

    atomicModifyMutVar# :: MutVar# s a -> (a -> b) -> State# s -> (# State# s, c #)

Is the "unboxed polymorphic type" you're referring to the "MutVar# s a"? Perhaps the confusion is around the term "unboxed" - we normally say that MutVar# is "unlifted" (no _|_), but it is not "unboxed" because its representation is a pointer to a heap object.

> > But anyway, I suspect your first definition of unsafeIndex will be faster than the one using foreign import prim, because calling out-of-line to do the indexing is slow.
>
> Sure, though I suppose that balance may shift as the size of the short vector grows (e.g. with Johan it'd probably be 16 items).
>
> > Also pseq is slow - use seq instead.
>
> Of course. I was being paranoid at the time and trying to get it to work at all. ;)
>
> > what you really want is built-in support for unsafeField#, which is certainly do-able. It's very similar to dataToTag# in the way that the argument is required to be evaluated - this is the main fragility, unfortunately GHC doesn't have a way to talk about things that are unlifted (except for the primitive unlifted types). But it just about works if you make sure there's a seq in the right place.
>
> I'd be happy even if I had to seq the argument myself before applying it, as I was trying above.

The problem is, that can't be done reliably. For dataToTag# the compiler automatically inserts the seq just before code generation if it can't prove that the argument is already evaluated; I think we would want to do the same thing for unsafeField#. See CorePrep.saturateDataToTag in the GHC sources.

Cheers,
Simon
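
The distinction drawn earlier in this message between "unlifted" and "unboxed" can be illustrated with MutVar# itself. This is a minimal sketch, assuming only the `newMutVar#` and `readMutVar#` primops from `GHC.Exts`; `roundTrip` is a hypothetical name. The MutVar# is unlifted but is still a pointer to a heap object, and the polymorphic value stored in it is an ordinary lifted pointer that may refer to an unevaluated thunk.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts (newMutVar#, readMutVar#)
import GHC.ST (ST (..), runST)

-- Store a value in a fresh MutVar# and read it straight back.
-- The primops take the polymorphic 'a' as a pointer and never
-- evaluate it.
roundTrip :: a -> a
roundTrip x = runST (ST (\s0 ->
  case newMutVar# x s0 of
    (# s1, var #) -> readMutVar# var s1))

main :: IO ()
main = print (fst (roundTrip (1 :: Int, undefined :: Int)))  -- prints 1
```

The undefined second component is never forced, which shows that the primop handled the pair as a lifted, boxed value.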

On Tue, Mar 13, 2012 at 4:57 AM, Simon Marlow wrote:
> On 12/03/2012 14:22, Edward Kmett wrote:
> > On Mon, Mar 12, 2012 at 6:45 AM, Simon Marlow <marlowsd@gmail.com> wrote:
> > > > But I can only pass unboxed types to foreign prim. Is this an intrinsic limitation or just an artifact of the use cases that have presented themselves to date?
> > >
> > > It's an intrinsic limitation - the I# box is handled entirely at the Haskell level, primitives only deal with primitive types.
> >
> > Ah. I was reasoning by comparison to atomicModifyMutVar#, which deals with unboxed polymorphic types, and even lies with a too general return type. Though the result there is returned in an unboxed tuple, the argument is passed unboxed.
> > Is that implemented specially?
>
> I'm a little bit confused.
>
>     atomicModifyMutVar# :: MutVar# s a -> (a -> b) -> State# s -> (# State# s, c #)
>
> Is the "unboxed polymorphic type" you're referring to the "MutVar# s a"? Perhaps the confusion is around the term "unboxed" - we normally say that MutVar# is "unlifted" (no _|_), but it is not "unboxed" because its representation is a pointer to a heap object.

I was talking about the (a -> b). I used it because the extraction of 'c' rather than a proper pair was closest to my situation. A less confused example might be newArray# which accepts a polymorphic 'a'.

> The problem is, that can't be done reliably. For dataToTag# the compiler automatically inserts the seq just before code generation if it can't prove that the argument is already evaluated, I think we would want to do the same thing for unsafeField#.

Fair enough. Thanks again.

-Edward

participants (4): Edward Kmett, Joachim Breitner, Johan Tibell, Simon Marlow