RE: [Haskell-cafe] Numerical methods in Haskell

| Between google searching and looking through the activity
| report, I take it that no one has really developed serious
| libraries for matrix manipulations, diff eqs, etc.
|
| Are there any practical reasons for this or is it just a
| matter of the haskell community being small and there not
| being many people interested in something so specialized?

The latter I think, but it's just the sort of thing that a functional language should be good at. Two other difficulties

(a) It's hard to compete with existing libraries. The obvious thing is not to compete; instead, just call them. But somehow that doesn't seem to be as motivating. Perhaps some bindings exist though?

(b) A concern about efficiency, because numerical computation is typically an area where people really care about how many instructions you take. It's a legitimate concern, but I don't think that it'll turn out to be justified. With unboxed arrays, and/or calling external libraries for the inner loops -- and the potential for aggressive fusion and/or parallelism, there is plenty of upward potential. I also want to work on nested data parallelism (a la NESL, and NEPAL) which fits right in here.

I'd love to see a little community of matrix manipulators spinning up.

Simon
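As a concrete illustration of the "just call them" route, here is a minimal sketch (not from the thread) of binding a single CBLAS routine, cblas_ddot, through the Haskell FFI. It assumes a CBLAS implementation is installed and linked (e.g. with -lcblas); the module and function names are made up for the example, error handling is omitted, and plain lists stand in for a real vector type.

{-# LANGUAGE ForeignFunctionInterface #-}
-- Sketch: calling an existing numerical library instead of competing with it.
module DotProduct (dot) where

import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Array (withArray)
import Foreign.Ptr (Ptr)

-- double cblas_ddot(int n, const double *x, int incx, const double *y, int incy)
foreign import ccall unsafe "cblas_ddot"
  c_ddot :: CInt -> Ptr CDouble -> CInt -> Ptr CDouble -> CInt -> IO CDouble

-- | Dot product of two equal-length lists, computed by CBLAS.
dot :: [Double] -> [Double] -> IO Double
dot xs ys =
  withArray (map realToFrac xs) $ \px ->
  withArray (map realToFrac ys) $ \py -> do
    r <- c_ddot (fromIntegral (length xs)) px 1 py 1
    return (realToFrac r)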

On Fri, 10 Feb 2006, Simon Peyton-Jones wrote:
| Between google searching and looking through the activity
| report, I take it that no one has really developed serious
| libraries for matrix manipulations, diff eqs, etc.
|
| Are there any practical reasons for this or is it just a
| matter of the haskell community being small and there not
| being many people interested in something so specialized?
The latter I think, but it's just the sort of thing that a functional language should be good at.
Yes, I think lazy functional languages are great for this job! What other kind of language can represent power series and (infinite) sequences so easily?
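For instance, a power series can be represented simply as the lazy list of its coefficients. The small sketch below is my illustration of the point (in the spirit of Karczmarczuk's and McIlroy's work), not part of the original post; the series are assumed infinite.

-- A power series a0 + a1*x + a2*x^2 + ... as the lazy list [a0, a1, a2, ...]

-- Series addition is element-wise.
addS :: Num a => [a] -> [a] -> [a]
addS = zipWith (+)

-- Cauchy product, defined corecursively; works on infinite coefficient lists.
mulS :: Num a => [a] -> [a] -> [a]
mulS (a:as) bbs@(b:bs) = a * b : addS (map (a *) bs) (mulS as bbs)
mulS _      _          = []

-- The geometric series 1/(1-x) = 1 + x + x^2 + ... is just an infinite list.
geometric :: [Integer]
geometric = repeat 1

-- Squaring it gives 1/(1-x)^2 = 1 + 2x + 3x^2 + ...:
-- take 10 (mulS geometric geometric) == [1,2,3,4,5,6,7,8,9,10]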
(a) It's hard to compete with existing libraries. The obvious thing is not to compete; instead, just call them. But somehow that doesn't seem to be as motivating. Perhaps some bindings exist though?
I collected the efforts of matrix libraries seen so far: http://www.haskell.org/hawiki/LinearAlgebra
I'd love to see a little community of matrix manipulators spinning up.
Hear, hear! :-) I wrote some little, really very basic modules for myself for numerical integration, solving differential equations, approximation with Newton's method and the like, power series manipulation, and wrapping GNUPlot for display of numerical data. Due to increasing interest I could make the repository public, but I can't guarantee a stable interface.

Jerzy Karczmarczuk gives a fine overview of how lazy functional languages assist in solving mathematical problems: http://www.haskell.org/pipermail/haskell-cafe/2004-May/006197.html
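As one example of the style such modules tend to use (my sketch, not Henning's actual code), Newton's method fits naturally into a lazy language as an infinite list of successive approximations:

-- Newton's method as a lazily produced list of approximations;
-- the caller decides how many to take or when to stop.
newton :: (Double -> Double)  -- function f
       -> (Double -> Double)  -- its derivative f'
       -> Double              -- starting guess
       -> [Double]            -- successive approximations to a root of f
newton f f' = iterate (\x -> x - f x / f' x)

-- Example: sqrt 2 as a root of x^2 - 2, starting from 1.
approxSqrt2 :: Double
approxSqrt2 = newton (\x -> x * x - 2) (\x -> 2 * x) 1 !! 10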

Actually I was starting to develop a matrix library, but then I found someone beat me to it... which is nice of course because you can move straight on to using it...

http://dis.um.es/~alberto/hmatrix/matrix.html

It uses the GSL, which can use an optimized cblas library for even faster computation.

Regards,
Keean.

Simon Peyton-Jones wrote:
| Between google searching and looking through the activity
| report, I take it that no one has really developed serious
| libraries for matrix manipulations, diff eqs, etc.
|
| Are there any practical reasons for this or is it just a
| matter of the haskell community being small and there not
| being many people interested in something so specialized?
The latter I think, but it's just the sort of thing that a functional language should be good at. Two other difficulties
(a) It's hard to compete with existing libraries. The obvious thing is not to compete; instead, just call them. But somehow that doesn't seem to be as motivating. Perhaps some bindings exist though?
(b) A concern about efficiency, because numerical computation is typically an area where people really care about how many instructions you take. It's a legitimate concern, but I don't think that it'll turn out to be justified. With unboxed arrays, and/or calling external libraries for the inner loops -- and the potential for aggressive fusion and/or parallelism, there is plenty of upward potential. I also want to work on nested data parallelism (a la NESL, and NEPAL) which fits right in here.
I'd love to see a little community of matrix manipulators spinning up.
Simon

I finally got some time to answer Simon's posting.

Simon P-J:
| Between google searching and looking through the activity
| report, I take it that no one has really developed serious
| libraries for matrix manipulations, diff eqs, etc.
|
| Are there any practical reasons for this or is it just a
| matter of the haskell community being small and there not
| being many people interested in something so specialized?
The latter I think, but it's just the sort of thing that a functional language should be good at. Two other difficulties
(a) It's hard to compete with existing libraries. The obvious thing is not to compete; instead, just call them. But somehow that doesn't seem to be as motivating. Perhaps some bindings exist though?
Hard to compete, yes. But on the other hand, the rewards can be high. Numerical library code (especially matrix libraries) tends to be highly optimized for the hardware architecture at hand. As a consequence, a small change in, say, the cache architecture might require a thorough rewrite of the library code to regain high utilisation of the hardware. This is because languages like Fortran and C force you to code so close to the hardware. A high-level language, with good optimizing compilation, would also make library code more portable across hardware architectures. N.B., these optimizations are non-trivial, to say the least.
(b) A concern about efficiency, because numerical computation is typically an area where people really care about how many instructions you take. It's a legitimate concern, but I don't think that it'll turn out to be justified. With unboxed arrays, and/or calling external libraries for the inner loops -- and the potential for aggressive fusion and/or parallelism, there is plenty of upward potential. I also want to work on nested data parallelism (a la NESL, and NEPAL) which fits right in here.
The number of instructions is only one side of the coin. For high-performance computing, memory issues are at least as important: both the amount of memory used (e.g., will the computation fit into memory at all) and how the memory hierarchy is utilized (caches, TLBs, virtual memory, ...). This is not really a sweet spot of functional languages, and laziness adds to the problem.

On the other hand, the increased abstraction of functional languages gives an optimizing compiler larger freedom to reorder computations and choose memory layouts of data structures like matrices. This is potentially very useful, since optimizing for memory-hierarchy utilization typically involves both data layout and the order of memory accesses. However, to achieve a good result, the compiler must be able to predict a great deal of the computation and the memory usage. For instance, dynamic memory handling of numeric data structures will surely kill any serious attempt to predict the cache behavior. To achieve good optimizing compilation, we need either very good program analyses, or a library of recursion patterns or templates for which the compiler knows how to allocate memory statically and order the computations well, or possibly both.

Some encouraging examples: Sven-Bodo Scholz has achieved very good performance for the restricted functional matrix language SAC, using optimizations for the cache. My former student Peter Drakenberg invented a restricted functional matrix language, with analyses to infer matrix sizes statically, and sharing analysis, to find opportunities to allocate memory statically and update in place. He also got some good experimental figures. This leads me to believe that compilers for more general languages could do something similar, by recognizing certain patterns or through advanced program analyses. However, both these languages are strict, and I am not at all sure how to do this in a lazy language.

In any case, this is nontrivial compiler work, and considerable research effort would be needed. Unfortunately, I don't see how to fund such research, since the high-performance computing community largely seems to have given up on functional languages since the demise of the data-flow languages.
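For what it is worth, the "allocate once, update in place" style can already be expressed by hand in Haskell today; whether a compiler can discover it automatically is the open question above. A rough sketch (my example, not from the post) using an unboxed mutable array in the ST monad:

import Control.Monad (forM_)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, getElems, newListArray, readArray, writeArray)

-- Running prefix sums computed in place in a single unboxed array:
-- one allocation up front, destructive updates inside, a pure result outside.
prefixSums :: [Double] -> [Double]
prefixSums xs = runST $ do
  let n = length xs
  arr <- newListArray (0, n - 1) xs :: ST s (STUArray s Int Double)
  forM_ [1 .. n - 1] $ \i -> do
    prev <- readArray arr (i - 1)
    cur  <- readArray arr i
    writeArray arr i (prev + cur)
  getElems arr

-- prefixSums [1,2,3,4] == [1.0,3.0,6.0,10.0]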
I'd love to see a little community of matrix manipulators spinning up.
Yes. There might be a niche for high-level numerical coding, somewhere around where MATLAB is today. MATLAB isn't exactly blazingly fast, yet it is still very widespread. On the other hand, MATLAB is already in that niche. The question to answer is what advantages a functional language like Haskell could offer there. We need to come up with these answers, and then convince enough people outside our own community.

Björn Lisper

On Mon, Feb 20, 2006 at 11:47:49AM +0100, Bjorn Lisper wrote:
(a) It's hard to compete with existing libraries. The obvious thing is not to compete; instead, just call them. But somehow that doesn't seem to be as motivating. Perhaps some bindings exist though?
Hard to compete, yes. But on the other hand, the rewards can be high. Numerical library code (especially matrix libraries) tends to be highly optimized for the hardware architecture at hand. As a consequence, a small change in, say, the cache architecture might require a thorough rewrite of the library code to regain high utilisation of the hardware. This is because languages like Fortran and C force you to code so close to the hardware. A high-level language, with good optimizing compilation, would also make library code more portable across hardware architectures. N.B., these optimizations are non-trivial, to say the least.
The only particularly relevant numerical libraries today (atlas and fftw) already do far better optimization than any compiler is going to achieve, and in the case of fftw can respond to changes in memory configuration at runtime. In both cases they're written in ANSI C (although the C code for fftw is generated by an OCaml program... or at least some dialect of ML). In order to take advantage of cache behavior properly, it's necessary to allow adjustments in the actual algorithm used, which isn't something that a clever compiler is likely to accomplish.

Which is to say that while I'd love native Haskell code to run incredibly fast, no one can compete with atlas or fftw when writing in any other language either, unless they're willing to do either runtime or compile-time (or coding-time) timings on the actual machine on which the code will run. There really is no reason to try to compete with these libraries. Unless, I suppose, your interest is in writing libraries...

fwiw, atlas takes close to a day to compile on my machine, since it spends so long performing timings of various algorithms with various parameters. I don't want my Haskell compiler to take that long to compile *anything*.

--
David Roundy
http://www.darcs.net

David Roundy:
On Mon, Feb 20, 2006 at 11:47:49AM +0100, Bjorn Lisper wrote:
(a) It's hard to compete with existing libraries. The obvious thing is not to compete; instead, just call them. But somehow that doesn't seem to be as motivating. Perhaps some bindings exist though?
Hard to compete, yes. But on the other hand, the rewards can be high. Numerical library code (especially matrix libraries) tends to be highly optimized for the hardware architecture at hand. As a consequence, a small change in, say, the cache architecture might require a thorough rewrite of the library code to regain high utilisation of the hardware. This is because languages like Fortran and C force you to code so close to the hardware. A high-level language, with good optimizing compilation, would also make library code more portable across hardware architectures. N.B., these optimizations are non-trivial, to say the least.
The only particularly relevant numerical libraries today (atlas and fftw) already do far better optimization than any compiler is going to achieve, and in the case of fftw can respond to changes in memory configuration at runtime. In both cases they're written in ANSI C (although the C code for fftw is generated by an OCaml program... or at least some dialect of ML). In order to take advantage of cache behavior properly, it's necessary to allow adjustments in the actual algorithm used, which isn't something that a clever compiler is likely to accomplish.
That's a valid point. You may want to change, e.g., the size of blocks in a block-oriented matrix algorithm to match the cache. This will, in general, require the use of algebraic laws like associativity and commutativity, which are not valid for floating-point arithmetic and can thus change the numerical behaviour, so a compiler shouldn't fiddle around with them unless under strict control of the programmer.

Interestingly, the language invented by my aforementioned former PhD student contains a nondeterministic matrix decomposition primitive, which allows the partitioning of a matrix into a fixed number of blocks, but where block sizes can vary. This is exactly to let the programmer give an optimizing compiler this degree of freedom when deemed safe. Alas, he never got around to any serious experiments with this feature.

Björn Lisper
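A two-line check (my addition, not part of the original message) makes the non-associativity point concrete for Double:

-- Floating-point addition is not associative, so a compiler must not
-- reassociate sums behind the programmer's back.
notAssociative :: Bool
notAssociative = (0.1 + 0.2) + 0.3 /= 0.1 + (0.2 + 0.3)
-- evaluates to True: the left grouping gives 0.6000000000000001,
-- the right grouping gives 0.6.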
participants (5):

- Bjorn Lisper
- David Roundy
- Henning Thielemann
- Keean Schupke
- Simon Peyton-Jones