Haskell performance in heavy numerical computations

Is anyone using Haskell for heavy numerical computations? Could you share your experience? My app will be mostly about running computations over huge amounts of stock data (time series) so I'm thinking I might be better off with OCaml. Thanks, Joel -- http://wagerlabs.com/

I have tried. Using just normal lists, I found it difficult to avoid huge
space leaks, but once things are working, the end result is easy to reason
about.
I'm considering giving it another try using Stream Processors or some form
of Arrows instead of lists. I think this strategy might allow me to do
several transformations of the same input list in pseudo-parallel, which
would allow the GC to collect old data as soon as it's no longer needed by the
computation that decides when to buy and sell. The Stream Processor library
I was looking at uses Continuation Passing Style, which I don't know too
much about, and I'm not sure Arrows will work out.
Generalizing the problem, how does one express: "given multiple
transformations of an infinite numeric list, perform some IO at the first
instance some predicate is met"?
I feel that Arrows might somehow trivially allow me to compute on huge lists
without leaking, but don't understand them well enough to apply them.
If one could write an arrow that applies the transformations to the input
list in parallel, then I'd think you could solve the above example something
like this:
(f &&& g) >>> dropWhile (\(x,y) -> x > 0 && y > 0) >>> performSomeIO
Maybe <+> is the arrow operator I'm looking for, not &&&, but to be honest,
the documentation for Arrows blows my mind. I think a few examples would go
a long way.
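In the meantime, here is a minimal sketch of the pattern I'm after, using plain lazy lists rather than arrows (f, g and the trigger predicate below are just placeholders):

    import Data.List (find)

    -- placeholder transformations of the input stream
    f, g :: Double -> Double
    f = (* 2)
    g = subtract 10

    -- walk both transformed streams in lock-step; because zip consumes
    -- the two lists at the same rate, the already-consumed prefix of the
    -- shared input can, in principle, be dropped by the GC as we go
    -- (provided nothing else retains the head of the list)
    firstHit :: ((Double, Double) -> Bool) -> [Double] -> Maybe (Double, Double)
    firstHit p xs = find p (zip (map f xs) (map g xs))

    main :: IO ()
    main =
      case firstHit (\(x, y) -> not (x > 0 && y > 0)) prices of
        Just hit -> print hit                      -- the "perform some IO" step
        Nothing  -> putStrLn "predicate never met"
      where
        prices = cycle [1.0, 2.0, 3.0, -1.0]       -- stand-in for the real price feed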
Thanks,
Greg
On 7/6/06, Joel Reymont wrote:
Is anyone using Haskell for heavy numerical computations? Could you share your experience?
My app will be mostly about running computations over huge amounts of stock data (time series) so I'm thinking I might be better off with OCaml.
Thanks, Joel

Maybe <+> is the arrow operator I'm looking for, not &&&, but to be honest, the documentation for Arrows blows my mind. I think a few examples would go a long way.
John Hughes' original paper on arrows is full of examples. Additionally, Hughes wrote a tutorial on programming with arrows, for the 2004 AFP summer school in Tartu, which is very accessible. Both papers are available from Hughes' publication page. Also Ross Paterson wrote a piece called Arrows and Computation, available on the arrows homepage, which contains many illuminating examples.

-Jeff

Joel Reymont wrote:
Is anyone using Haskell for heavy numerical computations? Could you share your experience?
My app will be mostly about running computations over huge amounts of stock data (time series) so I'm thinking I might be better off with OCaml.
If you are really serious about numerical throughput, you probably should be looking at vector instructions, CPU cache sizes, and parallelising your algorithms, not choosing between high-level languages. Hopefully it's enough to do that sort of thing for a few important primitives, and then you can write the rest of the program in whatever language you like.

* MonetDB

It sounds like MonetDB might be related to your problem. This database is designed for ad-hoc analytical queries which touch most of the rows (but hopefully only a few of the columns) of a database. The design of the system includes a lot of interesting analysis of the performance characteristics of modern hardware.

Publications: http://monetdb.cwi.nl/Research/Articles/index.html

The first sections of the Boncz thesis provide an overview which will hopefully tell you whether this stuff applies at all. The only paper with "X100" in the title is an overview of what's been happening more recently.

* Optimizing Combinator Libraries

If you are trying to build a nice, efficient combinator library in Haskell over a few fast primitives, the ByteString library could be a good example. In particular, its use of rewrite rules for loop fusion might help in your case as well. http://www.cse.unsw.edu.au/~dons/fps.html (I liked the slides.)

Perhaps ideas from this paper could help if you are doing more complicated things:

Dynamic Optimization for Functional Reactive Programming using Generalized Algebraic Data Types
http://www.cs.nott.ac.uk/~nhn/Publications/icfp2005.pdf

Brandon
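To give a feel for the rewrite-rules trick, here is a toy illustration only (not the actual Data.ByteString rule set, just the shape of the technique):

    module Fusion where

    -- a simple map over Doubles; stands in for a real array/stream primitive
    mapD :: (Double -> Double) -> [Double] -> [Double]
    mapD f = go
      where
        go []     = []
        go (x:xs) = f x : go xs
    {-# NOINLINE mapD #-}   -- keep GHC from inlining mapD before the rule can fire

    -- GHC rewrite rule: collapse two traversals into one, so
    -- mapD f (mapD g xs) never builds the intermediate list
    {-# RULES
    "mapD/mapD" forall f g xs. mapD f (mapD g xs) = mapD (f . g) xs
      #-}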

Hello Joel, Friday, July 7, 2006, 2:03:11 AM, you wrote:
Is anyone using Haskell for heavy numerical computations? Could you share your experience?
My app will be mostly about running computations over huge amounts of stock data (time series) so I'm thinking I might be better off with OCaml.
are you sure that numerical speed will be critical for your app? maybe, for example, the speed of thread switching and passing data, or the ability to easily spread tasks between several cpu cores, will be more important?

numerical speed is poor in ghc 6.4, according to my tests - it's 10-20 times worse than gcc's. afair, the mandelbrot benchmark of the Great Language Shootout shows this: despite all optimization attempts, the GHC entry is still 5-10 times slower than the gcc/ocaml/clean ones. i heard rumors that Roman Leshchinskiy is working on improving this for 6.6, though - try asking the GHC team about his work.

it's also possible that your computations can be run via existing libraries, or written by you in C, with the rest of the program written in Haskell. my own program uses an architecture like this: it consists of a time-critical part (a compression library written in C) and the remaining part written in Haskell

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
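for illustration, a minimal sketch of that kind of C/Haskell split via the FFI - the C routine "dot" here is hypothetical, and you would compile and link it yourself:

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Foreign.C.Types (CDouble, CInt)
    import Foreign.Marshal.Array (withArray)
    import Foreign.Ptr (Ptr)

    -- hypothetical time-critical kernel written in C:
    --   double dot(const double *xs, const double *ys, int n);
    foreign import ccall unsafe "dot"
      c_dot :: Ptr CDouble -> Ptr CDouble -> CInt -> IO CDouble

    -- marshal two Haskell lists to C arrays and call the C kernel
    dotProduct :: [Double] -> [Double] -> IO Double
    dotProduct xs ys =
      withArray (map realToFrac xs) $ \px ->
        withArray (map realToFrac ys) $ \py ->
          fmap realToFrac (c_dot px py n)
      where
        n = fromIntegral (min (length xs) (length ys))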

Bulat Ziganshin wrote:
numerical speed is poor in ghc 6.4, according to my tests. it's 10-20 times worse than of gcc. afair, the mandelbrot benchmark of Great Language Shootout proves this - despite all optimization attempts, GHC entry is still 5-10 times slower than gcc/ocaml/clean ones
We average 1.3x slower than C in the shootout. Thanks to Don Stewart for the following stats...

The numerical floating-point-intensive benchmarks:

  mandelbrot  2.7x C  (Clean 1.7x, OCaml 2.4x, C++ 2.6x)
  n-body      2.1x C  (Clean 0.9x, OCaml 1.3x, C++ 1.4x)

http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nbody&lang=all

So we're still only 2-3 times slower than C on these benchmarks, and we know how to improve them. In other cases, for example some integer/floating math, we beat GCC:

partial-sums: http://shootout.alioth.debian.org/gp4/benchmark.php?test=partialsums&lang=all

The full list, Haskell vs. C:

   0.1x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
   0.1x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=message&lang=all
   0.1x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=regexdna&lang=all
   0.5x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=binarytrees&lang=all
   0.7x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsieve&lang=all
   0.9x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebits&lang=all
   0.9x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=partialsums&lang=all
   1.3x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=recursive&lang=all
   1.6x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=pidigits&lang=all
   1.8x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=fasta&lang=all
   1.8x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=nbody&lang=all
   2.3x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=spectralnorm&lang=all
 * 2.5x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all
   2.7x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all
 * 3.2x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=fannkuch&lang=all
 *11.0x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=revcomp&lang=all
 *22.0x C  http://shootout.alioth.debian.org/gp4/benchmark.php?test=knucleotide&lang=all

* Known, or expected, to improve under Data.ByteString

Cheers, Simon

Hello Simon, Monday, July 10, 2006, 1:12:13 PM, you wrote:
numerical speed is poor in ghc 6.4, according to my tests. it's 10-20 times worse than of gcc. afair, the mandelbrot benchmark of Great Language Shootout proves this - despite all optimization attempts, GHC entry is still 5-10 times slower than gcc/ocaml/clean ones
We average 1.3x slower than C in the shootout. Thanks to Don Stewart for the following stats...
as i once wrote, most of the shootout benchmarks depend on the libs. for example, the multi-threading benchmark is much faster on GHC than on GCC because the former has user-level thread support pre-included, while for the latter external libs would have to be used (the testing policy prohibits the use of additional libs)
The numerical floating-point-intensive benchmarks:
mandelbrot 2.7x C (Clean 1.7x, OCaml 2.4x, C++ 2.6x) n-body 2.1x C (Clean 0.9x, OCaml 1.3x, C++ 1.4x)
those are the benchmarks that i had in mind
http://shootout.alioth.debian.org/gp4/benchmark.php?test=mandelbrot&lang=all
the same benchmark with N=600 and a Sempron processor: http://shootout.alioth.debian.org/debian/benchmark.php?test=mandelbrot&lang=all

here GHC is 10x slower than GCC and more than 5x slower than Clean and Ocaml. i think that this is the real computing speed difference, while with N=2000 the C/Ocaml/Clean programs were really limited by memory speed. at least that was the situation for the trivial "a[i]=b[i]+c[i]" benchmark discussed half a year ago: when the arrays were small enough to fit in cpu cache, the speed difference was 20 times; on large arrays the C program was limited by memory speed and the difference was only 3 times
* 2.5x C * 3.2x C * 11.0x C * 22.0x C
* Known, or expected, to improve under Data.ByteString
one more sign that this is more a test of libraries than of compilers. for example, the revcomp test performs only about 1000 operations in the main language but calls regex functions with 100kb strings. it's not strange that in this test TCL is now the fastest entry

about numerical apps - afair, Manuel Chakravarty is working on a new version of his parallel arrays engine

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

bulat.ziganshin:
the same benchmark with N=600 and Sempron processor:
http://shootout.alioth.debian.org/debian/benchmark.php?test=mandelbrot&lang=all
here GHC is 10x slower than GCC and more than 5 times compared to Clean and Ocaml. i think that this is the real computing speed difference while with N=2000 C/Ocaml/Clean programs was really limited by memory speed.
Ah! In this case, on the debian box, the benchmark has been compiled _without_ -fexcess-precision; that's what's causing the big slowdown. We had to turn it on for the gp4, but it seems the flag wasn't propagated to the debian/sempron builds for some reason. Looks like the ghc/mandelbrot benchmark just needs to be rerun with -fexcess-precision in this case. -- Don

On Jul 10, 2006, at 4:23 AM, Donald Bruce Stewart wrote:
Ah! In this case, on the debian, the benchmark has been compiled _without_ -fexcess-precision, that's what's causing the big slow down. We had to turn it on, on the gp4, but it seems the flag wasn't propagated to the debian/sempron builds for some reason.
Looks like the ghc/mandelbrot benchmark just needs to be rerun with -fexcess-precision in this case.
Ah! Thanks for noticing this -- I did not catch it. I've rerun, and the new value is more like 0.428 secs for 600. Thanks, -Brent
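For reference, the flag is just added to the GHC compile line, e.g. (the file name here is only an example):

    ghc -O2 -fexcess-precision -o mandelbrot mandelbrot.hs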

Alberto Ruiz has developed a linear algebra library which could be seen as an alternative to Matlab/Octave, using the GSL, ATLAS, LAPACK, etc. IIRC:

http://dis.um.es/~alberto/GSLHaskell/

I've optimized it in some places, and added an interface which guarantees operand conformability through the type system at compile time (a rough sketch of the idea is below):

http://ofb.net/~frederik/stla/

It is almost twice as fast as Octave on an example program, and probably comparable to Matlab. However, it is currently very difficult to debug some of the type errors which are emitted by GHC when using it, and there are some outstanding heap-related bugs. Also, we don't have as wide a variety of built-in functions as Matlab and Octave do.

Frederik

On Thu, Jul 06, 2006 at 11:03:11PM +0100, Joel Reymont wrote:
Is anyone using Haskell for heavy numerical computations? Could you share your experience?
My app will be mostly about running computations over huge amounts of stock data (time series) so I'm thinking I might be better off with OCaml.
Thanks, Joel
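To give a flavour of the compile-time conformability idea mentioned above, here is a rough sketch using dimension-indexed types (this uses modern GHC type-level naturals and is not the actual GSLHaskell/stla interface):

    {-# LANGUAGE DataKinds, KindSignatures #-}
    import Data.List (transpose)
    import GHC.TypeLits (Nat)

    -- row and column counts live in the type, so an ill-sized
    -- multiplication is rejected by the type checker
    newtype Matrix (r :: Nat) (c :: Nat) = Matrix [[Double]]

    -- (r x k) times (k x c) gives (r x c); mismatched inner
    -- dimensions simply fail to unify at compile time
    mmul :: Matrix r k -> Matrix k c -> Matrix r c
    mmul (Matrix a) (Matrix b) =
      Matrix [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

With this, a Matrix 2 3 can be multiplied by a Matrix 3 1 but not by a Matrix 2 1; the mistake shows up as a type error rather than a runtime dimension check.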
participants (9)

- Brandon Moore
- Brent Fulgham
- Bulat Ziganshin
- dons@cse.unsw.edu.au
- Frederik Eaton
- Greg Fitzgerald
- Jeff Polakow
- Joel Reymont
- Simon Marlow