Re: [Haskell-cafe] vector-simd: some code available, and some questions

8 Jul 2012

On Sat, 2012-07-07 at 21:13 +0200, Nicolas Trangez wrote:
...
As you can see, the zipWith Data.Vector.SIMD implementation is slightly
slower than the Data.Vector.Storable based one. I haven't done much
profiling yet, but I suspect allocation and ForeignPtr creation are to
blame; this seems to be highly optimized in
GHC.ForeignPtr.mallocPlainForeignPtrBytes, as used by
Data.Vector.Storable.
I got the MV benchmark on par with SV by reworking the allocation
mechanism: no more FFI involved, but based on
GHC.Exts.newAlignedPinnedByteArray# and some other trickery, see [1].
This could still be improved a little by using PlainPtr, but
GHC.ForeignPtr does not export it.
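
For readers who want a feel for this kind of allocation, here is a
minimal sketch. It is not the code from [1]. It assumes a newer base in
which GHC.ForeignPtr does export the PlainPtr constructor (which is not
the case for the version discussed here), and the names mallocAligned
and isAligned are made up for illustration.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Foreign.ForeignPtr (withForeignPtr)
import Foreign.Ptr (ptrToWordPtr)
import GHC.Exts
  ( Int (I#), byteArrayContents#, newAlignedPinnedByteArray#, unsafeCoerce# )
-- Assumption: this base version exports the PlainPtr constructor.
import GHC.ForeignPtr (ForeignPtr (ForeignPtr), ForeignPtrContents (PlainPtr))
import GHC.IO (IO (IO))

-- | Hypothetical helper: allocate 'size' bytes of pinned memory aligned
-- to 'align' bytes, without going through the FFI. The ForeignPtr keeps
-- the underlying byte array alive and carries no finalizers.
mallocAligned :: Int -> Int -> IO (ForeignPtr a)
mallocAligned (I# size) (I# align) = IO $ \s0 ->
  case newAlignedPinnedByteArray# size align s0 of
    (# s1, mba #) ->
      -- The array is pinned, so its payload address stays valid for as
      -- long as the array itself is reachable.
      (# s1, ForeignPtr (byteArrayContents# (unsafeCoerce# mba)) (PlainPtr mba) #)

-- | Hypothetical helper: check that the pointer is a multiple of 'align'.
isAligned :: Int -> ForeignPtr a -> IO Bool
isAligned align fp = withForeignPtr fp $ \p ->
  return (ptrToWordPtr p `mod` fromIntegral align == 0)
```

Calling mallocAligned 4096 16 should give a buffer suitable for 16-byte
aligned SSE loads and stores, which isAligned 16 can confirm.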

This had a pretty big performance impact on the SIMD-based benchmark:
compare [2] to the old one [3]. I have no clue why the 4096 case now
takes only twice as long as the 1024 one, rather than the expected 4x
(which is roughly what I saw before).

Nicolas

[1] https://github.com/NicolasT/vector-simd/commit/5ec539167254435ef4e7d308706dc...
[2] http://linode2.nicolast.be/files/vector-simd-xor2.html
[3] http://linode2.nicolast.be/files/vector-simd-xor1.html