Re: [Haskell-cafe] vector-simd: some code available, and some questions

8 Jul 2012


I've not been following this thread very closely, but it seems like what
you're trying to do may be related to Geoffrey Mainland's work on SIMD
support in GHC. See [1] for his "SIMD-enabled version of the vector
library". He's also written some blog posts about this [2].

Reiner

[1] https://github.com/mainland/vector
[2] http://ghc-simd.blogspot.com.au/

On 8 July 2012 05:13, Nicolas Trangez wrote:
...
All,
After my message of yesterday [1] I got down to it and implemented
something along those lines. I created a playground repository
containing the code at [2]. Initial benchmark results at [3]. More about
the benchmark at the end of this email.
First some questions and requests for help:
- I'm stuck with a typing issue related to 'sizeOf' calculation at [4].
I tried a couple of things, but wasn't able to figure out how to fix it.
- I'm using unsafePerformIO at [5], yet I'm not certain it's OK to do
so. Are there better (safer, more performant, ...) ways to get this working?
- Currently Alignment phantom types (e.g. A8 and A16) are not related to
each other: a function (like Data.Vector.SIMD.Algorithms.unsafeXorSSE42)
can have this signature:
    unsafeXorSSE42 :: Storable a => SV.Vector SV.A16 a -> SV.Vector SV.A16 a
                   -> SV.Vector SV.A16 a
Yet, imagine I had an "SV.Vector SV.A32 Word8" vector at hand: the
function should accept it as well (a 32-byte aligned vector is also
16-byte aligned). Is there any way to encode this at the type level?
That's about it :-)
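
One possible way to express "at least 16-byte aligned" at the type level is a two-parameter class relating a required alignment to an actual one. The following is only a minimal sketch, not code from the vector-simd repository: the class name AlignedToAtLeast, the function xorA16 and the list-backed Vector stand-in are all hypothetical, and the instances are enumerated by hand.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleContexts, FlexibleInstances #-}
import Data.Bits (xor)
import Data.Word (Word8)

-- Phantom alignment tags, named after the A8/A16/A32 tags mentioned above.
data A8
data A16
data A32

-- Hypothetical class: "AlignedToAtLeast required actual" holds when a
-- buffer with alignment 'actual' also satisfies alignment 'required'.
class AlignedToAtLeast required actual

instance AlignedToAtLeast A8  A8
instance AlignedToAtLeast A8  A16
instance AlignedToAtLeast A8  A32
instance AlignedToAtLeast A16 A16
instance AlignedToAtLeast A16 A32
instance AlignedToAtLeast A32 A32

-- Stand-in for an alignment-tagged vector (assumption: backed by a list
-- here purely to keep the sketch self-contained).
newtype Vector o a = Vector [a]

toList :: Vector o a -> [a]
toList (Vector xs) = xs

-- Accepts any inputs that are at least 16-byte aligned; the result is
-- only promised to be 16-byte aligned.
xorA16 :: (AlignedToAtLeast A16 o1, AlignedToAtLeast A16 o2)
       => Vector o1 Word8 -> Vector o2 Word8 -> Vector A16 Word8
xorA16 (Vector xs) (Vector ys) = Vector (zipWith xor xs ys)
```

With this shape, passing a "Vector A32 Word8" type-checks, while a "Vector A8 Word8" is rejected because no "AlignedToAtLeast A16 A8" instance exists. The cost is that the instance table grows quadratically with the number of alignment tags.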
As of now, I only implemented a couple of the vector API functions (the
ones required to execute my benchmark). Adding the others should be
trivial.
The benchmark works with Data.Vector.{Unboxed|Storable}.Vector (UV and
SV) vectors of Word8 values, as well as my custom
Data.Vector.SIMD.Vector type (MV) using 16-byte alignment (MV.Vector
MV.A16 Word8).
benchUV, benchSV and benchMV all take 2 pre-calculated Word8 vectors of
given size (1024 and 4096) and xor them pairwise into the result using
"zipWith xor". benchMVA takes 2 suitable MV vectors and xor's them into
a third using a rather simple and unoptimized C implementation using
SSE4.2 intrinsics [6]. This could be enhanced quite a bit (I guess using
the prim calling convention, FFI overhead can be reduced as well).
Currently, only vectors of a multiple of 32 bytes are supported (mostly
because of laziness on my part).
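
The "xor two buffers into a freshly allocated third" shape described above can be sketched in plain Haskell. This is an illustration under assumptions, not the repository's code: the Buf type and the fromList, toList and xorBuf helpers are hypothetical, and a byte-by-byte Haskell loop stands in for the C routine with SSE intrinsics. Wrapping the operation in unsafePerformIO is usually considered defensible only when the inputs are never mutated afterwards and the result buffer is not shared before it is returned.

```haskell
import Data.Bits (xor)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Storable (peekElemOff, pokeElemOff)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical buffer type: a ForeignPtr plus its length in bytes.
data Buf = Buf (ForeignPtr Word8) Int

fromList :: [Word8] -> Buf
fromList xs = unsafePerformIO $ do
  let n = length xs
  fp <- mallocForeignPtrBytes n
  withForeignPtr fp $ \p -> pokeArray p xs
  return (Buf fp n)

toList :: Buf -> [Word8]
toList (Buf fp n) = unsafePerformIO $ withForeignPtr fp $ \p -> peekArray n p

-- Pure interface over an effectful fill: the result is allocated here and
-- written exactly once before it escapes.
xorBuf :: Buf -> Buf -> Buf
xorBuf (Buf fa na) (Buf fb nb) = unsafePerformIO $ do
  let n = min na nb
  fr <- mallocForeignPtrBytes n
  withForeignPtr fa $ \pa ->
    withForeignPtr fb $ \pb ->
      withForeignPtr fr $ \pr ->
        -- Stand-in for the C/SSE call: xor one byte at a time.
        mapM_ (\i -> do
                 x <- peekElemOff pa i
                 y <- peekElemOff pb i
                 pokeElemOff pr i (x `xor` y))
              [0 .. n - 1]
  return (Buf fr n)
{-# NOINLINE xorBuf #-}
```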
As you can see, the zipWith Data.Vector.SIMD implementation is slightly
slower than the Data.Vector.Storable based one. I haven't done much
profiling yet, but I suspect allocation and ForeignPtr creation are to
blame: these seem to be highly optimized in
GHC.ForeignPtr.mallocPlainForeignPtrBytes, as used by
Data.Vector.Storable.
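
For comparison, current versions of base export an aligned counterpart, GHC.ForeignPtr.mallocPlainForeignPtrAlignedBytes, which takes a size and an alignment. The sketch below assumes such a base version is available; the helper names allocAligned and isAligned are hypothetical. It shows only how an aligned buffer could be allocated and checked and makes no claim about its performance.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, withForeignPtr)
import Foreign.Ptr (ptrToIntPtr)
import GHC.ForeignPtr (mallocPlainForeignPtrAlignedBytes)

-- Allocate 'size' bytes whose start address is a multiple of 'align'.
allocAligned :: Int -> Int -> IO (ForeignPtr Word8)
allocAligned size align = mallocPlainForeignPtrAlignedBytes size align

-- Check that the buffer's start address is a multiple of 'k'.
isAligned :: Int -> ForeignPtr a -> IO Bool
isAligned k fp = withForeignPtr fp $ \p ->
  return (ptrToIntPtr p `rem` fromIntegral k == 0)
```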
Thanks for any input,
Nicolas
[1] http://www.haskell.org/pipermail/haskell-cafe/2012-July/102167.html
[2] https://github.com/NicolasT/vector-simd/
[3] http://linode2.nicolast.be/files/vector-simd-xor1.html
[4] https://github.com/NicolasT/vector-simd/blob/master/src/Data/Vector/SIMD/Alg...
[5] https://github.com/NicolasT/vector-simd/blob/master/src/Data/Vector/SIMD/Alg...
[6] https://github.com/NicolasT/vector-simd/blob/master/cbits/vector-simd.c#L47

Reiner Pope