 
            Why would lists-of-lists be faster than unboxed arrays? No indexing arithmetic? Deforestation? I'm very curious. The first advice I got from #haskell when trying to speed up my original code was "get rid of all those lists."
First, I think lists-of-lists is only faster (if at all) in your special case of having only very small matrices. Moreover, the pure implementation runs without the 'near zero' tests and the array implementation is not as smart as possible. For example instead of f j1 where f j | j>n = return () ; f j = work_on j >> f (j+1) you could simply write mapM_ work_on [j1..n] and save some comparsions. Second, clearly you have to "get rid of all those lists", as long as you are trying to implement the algorithm in the usual algebra book style, where you find formulations that are suitable to mutable array implementations. The pure implementation instead tries to exploit the recursive structure and some invariants of Gauss' algorithm in a direct way. BR, -- -- Mirko Rahn -- Tel +49-721 608 7504 -- --- http://liinwww.ira.uka.de/~rahn/ ---