I've been playing with your example to optimize it a bit. I have to run, but here's what I have so far. It's about as fast as the Python code; I'll make it faster when I have more time over the next few days.

See https://gist.github.com/etrepum/4747507 and https://gist.github.com/etrepum/4747507/revisions


On Sat, Feb 9, 2013 at 2:35 PM, Nicolas Bock <nicolasbock@gmail.com> wrote:



On Fri, Feb 8, 2013 at 1:23 PM, Aleksey Khudyakov <alexey.skladnoy@gmail.com> wrote:
On 08.02.2013 23:26, Nicolas Bock wrote:
Hi list,

I wrote a script that reads matrix elements from standard input, parses
the input using a regular expression, and then bins the matrix elements
by magnitude. I wrote the same script in Python (just to be sure :) )
and found that the Python version vastly outperforms the Haskell script.

General performance hints

1) Strings are slow. Fast alternatives are text[1] for textual data and bytestring[2] for binary data. I can't say anything about the performance of Text.Regex.Posix.

2) Appending to a list is the wrong operation to use in performance-sensitive code.
(++) traverses its first argument so it's O(n) in its length.
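
To illustrate both hints, here is a minimal sketch, not the original script. It assumes the input is whitespace-separated numbers (the real format is not shown in this thread) and that "binning by magnitude" means grouping by decimal order of magnitude. The names magnitudeBin, histogram and parseValues are hypothetical. It reads the input as Text and counts bins with a strict fold over a Data.Map, so no list is ever appended to.

```haskell
module Main (main) where

import           Data.List       (foldl')
import qualified Data.Map.Strict as M
import qualified Data.Text       as T
import qualified Data.Text.IO    as TIO
import           Data.Text.Read  (double)

-- | Hypothetical binning rule (an assumption): the decimal order of
-- magnitude of |x|. Zero has no magnitude, so it gets no bin.
-- Values that are exact powers of ten may land in the neighbouring
-- bin because of floating-point rounding in logBase.
magnitudeBin :: Double -> Maybe Int
magnitudeBin x
  | x == 0    = Nothing
  | otherwise = Just (floor (logBase 10 (abs x)))

-- | Count values per bin with a strict left fold. Each update is an
-- O(log n) map insert instead of an O(n) list append.
histogram :: [Double] -> M.Map Int Int
histogram = foldl' step M.empty
  where
    step acc x = case magnitudeBin x of
      Nothing -> acc
      Just b  -> M.insertWith (+) b 1 acc

-- | Keep every whitespace-separated token that parses completely as
-- a number; anything else is silently skipped.
parseValues :: T.Text -> [Double]
parseValues txt =
  [ v | tok <- T.words txt, Right (v, rest) <- [double tok], T.null rest ]

main :: IO ()
main = do
  input <- TIO.getContents
  mapM_ print (M.toList (histogram (parseValues input)))
```

If the per-bin element lists are really needed rather than counts, consing onto the front of a list with (:) and reversing once at the end avoids the repeated traversal that (++) causes.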


What exactly are you trying to do? Create a histogram?



The Haskell script was compiled with "ghc --make printMatrixDecay.hs".

If you want performance you absolutely should use -O2.

Another question: When I compile the code with --make and -O2, and then run it on a larger matrix, I get this error message:

$ ./createMatrixDump.py -N 512 | ./printMatrixDecay
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

When I use "runghc" instead, I don't get an error. What does this error mean, and how do I fix it?
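
For context, the message means that evaluation needed more stack than the runtime's limit allows. The cause in this particular script is not established here; one common cause in Haskell programs is a lazy accumulator that builds a long chain of unevaluated thunks, which is then forced all at once. The sketch below shows that general pattern and the usual strict alternative. The names sumLazy and sumStrict are hypothetical and not taken from the script.

```haskell
module Main (main) where

import Data.List (foldl')

-- | Lazy left fold: without optimization the accumulator can grow
-- into a nested thunk ((0 + x1) + x2) + ... that is only evaluated
-- at the end, which can need a deep stack for long inputs.
sumLazy :: [Double] -> Double
sumLazy = foldl (+) 0

-- | Strict left fold: the accumulator is forced at every step, so
-- the fold runs in constant stack space.
sumStrict :: [Double] -> Double
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumStrict (map fromIntegral [1 .. 1000000 :: Int]))
```

Raising the limit as the message suggests (for example `+RTS -K64m -RTS`, where the size is an arbitrary illustration) only postpones the problem if the underlying cause is a thunk chain like this.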

Thanks,

nick


