Automatically using boxed or unboxed vectors

I am trying to create a data type that is an vector of values, but internally uses an unboxed vector when possible and a boxed array otherwise. If this already exists, let me know. For example, any easily unboxed type is stored in an unboxed array, but also any type that is a product type of only easily unboxed types like (Int, Int) is stored in multiple unboxed arrays. I don’t think there is a smart way of doing this for sum types, so those are stored in boxed arrays. I’m trying to create a typeclass class VectorElement a where data Vector a replicate :: Int -> a -> Vector a length :: Vector a -> Int to represent things that can be stored in Vectors. I can then implement it for specific types that I know can be stored in unboxed vectors: instance VectorElement Int where newtype Vector Int = VectorInt (U.Vector Int) deriving Show replicate len x = VectorInt $ U.replicate len x length (VectorInt v) = U.length v I also want to automatically derive instances of this class for other types using the Generic typeclass. Ideally these instances would be the most efficient possible, so that for example the instance for (Int, Int) used two unboxed arrays but the instance for Maybe Int uses a boxed array. To that end I created another typeclass and wrote instances for the Generic data constructors: class VectorElementG (r :: * -> *) where data VectorG r replicateG :: Int -> r a -> VectorG r lengthG :: VectorG r -> Int instance VectorElementG V1 where data VectorG V1 replicateG = undefined lengthG = undefined instance VectorElementG U1 where newtype VectorG U1 = VectorGUnit Int replicateG i U1 = VectorGUnit i lengthG (VectorGUnit i) = i instance VectorElement a => VectorElementG (K1 i a) where newtype VectorG (K1 i a) = VectorGK (Vector a) replicateG i (K1 x) = VectorGK $ replicate i x lengthG (VectorGK v) = length v instance (VectorElementG r1, VectorElementG r2) => VectorElementG (r1 :*: r2) where data VectorG (r1 :*: r2) = VectorGProd (VectorG r1) (VectorG r2) replicateG i (a :*: b) = VectorGProd (replicateG i a) (replicateG i b) lengthG (VectorGProd v _) = lengthG v instance VectorElement ((r1 :+: r2) p) where newtype Vector ((r1 :+: r2) p) = VectorSum (V.Vector ((r1 :+: r2) p)) replicate i x = VectorSum $ V.replicate i x length (VectorSum v) = V.length v instance VectorElementG f => VectorElementG (M1 i c f) where newtype VectorG (M1 i c f) = VectorGMeta (VectorG f) replicateG i (M1 f) = VectorGMeta $ replicateG i f lengthG (VectorGMeta v) = lengthG v I’m not sure if these are correct, especially the one for :+:. I want basically base cases to be any type that already has an instance of VectorElement or a sum type which is automatically boxed, and the recursive case to basically just use parallel vectors for product types. I think this sort of worked insofar as it allowed me to write an instance for tuples: instance (VectorElement a, VectorElement b) => VectorElement (a,b) where newtype Vector (a, b) = VectorTuple (VectorG (Rep (a, b))) replicate i x = VectorTuple $ replicateG i (from x) length (VectorTuple v) = lengthG v Ideally, however, the compiler would automatically derive this instance using the Generic instance. Is there a way to do that also? I would welcome any thoughts on this entire idea and approach. Thanks, Jake Waksbaum

Hello Jake,
Many libraries implement this "array of structures -> structure of arrays"
type of transformation, including for example the 'accelerate' library. I'm
however only aware of one that does it for good old vectors:
https://hackage.haskell.org/package/hybrid-vectors
I don't think it implements any Generic-deriving mechanism for making this
work on user-specified types without any effort. This could be a pretty
nice addition though :)
On Wed, Oct 25, 2017 at 12:43 AM, Jake Waksbaum
I am trying to create a data type that is an vector of values, but internally uses an unboxed vector when possible and a boxed array otherwise. If this already exists, let me know. For example, any easily unboxed type is stored in an unboxed array, but also any type that is a product type of only easily unboxed types like (Int, Int) is stored in multiple unboxed arrays. I don’t think there is a smart way of doing this for sum types, so those are stored in boxed arrays.
I’m trying to create a typeclass
class VectorElement a where data Vector a replicate :: Int -> a -> Vector a length :: Vector a -> Int
to represent things that can be stored in Vectors. I can then implement it for specific types that I know can be stored in unboxed vectors:
instance VectorElement Int where newtype Vector Int = VectorInt (U.Vector Int) deriving Show replicate len x = VectorInt $ U.replicate len x length (VectorInt v) = U.length v
I also want to automatically derive instances of this class for other types using the Generic typeclass. Ideally these instances would be the most efficient possible, so that for example the instance for (Int, Int) used two unboxed arrays but the instance for Maybe Int uses a boxed array. To that end I created another typeclass and wrote instances for the Generic data constructors:
class VectorElementG (r :: * -> *) where data VectorG r replicateG :: Int -> r a -> VectorG r lengthG :: VectorG r -> Int
instance VectorElementG V1 where data VectorG V1 replicateG = undefined lengthG = undefined
instance VectorElementG U1 where newtype VectorG U1 = VectorGUnit Int replicateG i U1 = VectorGUnit i lengthG (VectorGUnit i) = i
instance VectorElement a => VectorElementG (K1 i a) where newtype VectorG (K1 i a) = VectorGK (Vector a) replicateG i (K1 x) = VectorGK $ replicate i x lengthG (VectorGK v) = length v
instance (VectorElementG r1, VectorElementG r2) => VectorElementG (r1 :*: r2) where data VectorG (r1 :*: r2) = VectorGProd (VectorG r1) (VectorG r2) replicateG i (a :*: b) = VectorGProd (replicateG i a) (replicateG i b) lengthG (VectorGProd v _) = lengthG v
instance VectorElement ((r1 :+: r2) p) where newtype Vector ((r1 :+: r2) p) = VectorSum (V.Vector ((r1 :+: r2) p)) replicate i x = VectorSum $ V.replicate i x length (VectorSum v) = V.length v
instance VectorElementG f => VectorElementG (M1 i c f) where newtype VectorG (M1 i c f) = VectorGMeta (VectorG f) replicateG i (M1 f) = VectorGMeta $ replicateG i f lengthG (VectorGMeta v) = lengthG v
I’m not sure if these are correct, especially the one for :+:. I want basically base cases to be any type that already has an instance of VectorElement or a sum type which is automatically boxed, and the recursive case to basically just use parallel vectors for product types.
I think this sort of worked insofar as it allowed me to write an instance for tuples:
instance (VectorElement a, VectorElement b) => VectorElement (a,b) where newtype Vector (a, b) = VectorTuple (VectorG (Rep (a, b))) replicate i x = VectorTuple $ replicateG i (from x) length (VectorTuple v) = lengthG v
Ideally, however, the compiler would automatically derive this instance using the Generic instance. Is there a way to do that also?
I would welcome any thoughts on this entire idea and approach.
Thanks, Jake Waksbaum
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
-- Alp Mestanogullari
participants (2)
-
Alp Mestanogullari
-
Jake Waksbaum