
Ketil Malde wrote:
Blake Rain
writes: Here is the one in C:
void multiplyM4 (float m1[4][4], float m2[4][4], float m3[4][4]) { m1[0][0] = m2[0][0] * m3[0][0] + m2[0][1] * m3[1][0] + m2[0][2] * m3[2][0] + m2[0][3] * m3[3][0]; m1[0][1] = m2[0][0] * m3[0][1] + m2[0][1] * m3[1][1] + m2[0][2] * m3[2][1] + m2[0][3] * m3[3][1]; (etc) :
Some points about the C version: : 6. It's pretty fast (being written in C):
Performing 100,000,000 matrix multiples takes on average 6.117s (with gcc -O2). :
Haskell version:
[foldl (+) 0 $ zipWith (*) x y | x <- m1, y <- transpose m2] : 5. It's very fast (being written in Haskell):
Performing 100,000,000 matrix multiples takes on average 1.435s (with ghc -O2).
This is surprising: a C program doing an explicit list of arithmetic operation is beaten by a factor of 4 or so by a polymorphic, lazy, list-consuming Haskell program? Are you sure? What am I missing?
Constant folding? The numbers certainly depend on the kind of the test program.