
On 21-08-2014 11:22, Felipe Lessa wrote:
> I suggest that you add a few SPECIALIZE pragmas to vector-algorithms
> and check its performance again.
Scratch that, I remembered that one may just copy & paste :). Here's
the code and criterion results:

https://gist.github.com/meteficha/1acef6dc1e1ed81b63ae

This is the relevant part:

benchmarking --nothing--/10
time                 151.5 ns   (151.2 ns .. 151.8 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 151.8 ns   (151.6 ns .. 152.1 ns)
std dev              868.8 ps   (732.4 ps .. 1.030 ns)

benchmarking --best--/10
time                 157.4 ns   (157.1 ns .. 157.7 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 157.1 ns   (156.6 ns .. 157.4 ns)
std dev              1.297 ns   (786.7 ps .. 2.366 ns)

benchmarking Optimal/10
time                 1.173 μs   (1.168 μs .. 1.180 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.194 μs   (1.187 μs .. 1.204 μs)
std dev              27.45 ns   (22.18 ns .. 36.34 ns)
variance introduced by outliers: 29% (moderately inflated)

benchmarking Optimal'/10
time                 157.7 ns   (157.5 ns .. 158.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 157.8 ns   (157.5 ns .. 158.2 ns)
std dev              1.178 ns   (872.4 ps .. 1.965 ns)

Optimal', which is a copy of Optimal pasted into the benchmark module,
takes almost the same time as --best--, which does no comparisons at
all! The library's own Optimal is about 7.4 times slower (1.173 μs
versus 157.7 ns) on identical code.

So vector-algorithms really needs either to force INLINE (leading to
code size bloat) or to sprinkle SPECIALIZE pragmas everywhere. I
suggest contacting the package's maintainer to see what their thoughts
are on this matter.

Cheers,

--
Felipe.
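
P.S. For anyone who has not used the pragma before, here is a minimal
sketch of what a SPECIALIZE pragma looks like. This is not code from
vector-algorithms: insertionSort is a hypothetical list-based stand-in
for an overloaded library function, and the element types chosen (Int
and Double) are assumptions made only for illustration. The idea is
that, without the pragma or inlining, a caller in another module may
end up passing an Ord dictionary at run time instead of using a direct
comparison.

```haskell
module Main (main) where

-- Hypothetical overloaded sort, standing in for a library function.
-- It is polymorphic in the element type, so every (<=) below is
-- resolved through the Ord dictionary unless GHC specialises or
-- inlines the function.
insertionSort :: Ord a => [a] -> [a]
insertionSort = foldr insert []
  where
    insert x [] = [x]
    insert x (y:ys)
      | x <= y    = x : y : ys
      | otherwise = y : insert x ys

-- Ask GHC to also compile monomorphic copies for these element types.
-- Calls at exactly these types can then be rewritten to the copies.
{-# SPECIALIZE insertionSort :: [Int] -> [Int] #-}
{-# SPECIALIZE insertionSort :: [Double] -> [Double] #-}

main :: IO ()
main = do
  print (insertionSort [3, 1, 2 :: Int])      -- prints [1,2,3]
  print (insertionSort [2.5, 0.5 :: Double])  -- prints [0.5,2.5]
```

The trade-off is the one mentioned above: each SPECIALIZE line adds
one more compiled copy of the function, and only the listed types
benefit, whereas INLINE covers every call site at the cost of
duplicating the body wherever it is used.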