
Hello all! This is my first message to the list.

In my work I often need linear algebra algorithms and other numeric computations. An option is using scientific computing systems like Matlab, Mathematica, Maple, etc. In Haskell there are several modules and bindings to matrix libraries; many of them are excellent. But I think that Haskell would be even more attractive if the basic matrix computations (what we find, for example, in the GNU Octave Quick Reference) were easily available in a standard library.

We have all we need: an interactive environment, excellent graphics (HOpenGL), FFI and ForeignPtr with automatic garbage collection, QuickCheck, etc. Computations can be implemented by the GSL (GNU Scientific Library), a well designed numerical library written in C.

Well... I know that I am trying to reinvent the wheel: I am working on a user friendly functional interface to matrix computations based on the GSL. I think that many numeric problems that can be solved using Octave or a similar system can also be solved more elegantly using Haskell.

The library is very preliminary and incomplete but some simple problems can already be solved. The sources, haddock documentation and the draft of a tutorial can be found at:

http://dis.um.es/~alberto/hmatrix/matrix.html

I am not a Haskell expert, I only use the most basic programming techniques, so any suggestion will be greatly appreciated...

Alberto

That's cool, I needed a matrix library and had started to roll my own in Haskell, but a frontend for the GSL library seems much more efficient. I will give the library a go - and let you know if there are any problems... Keean. Alberto Ruiz wrote:
Hello all! This is my first message to the list.
In my work I often need linear algebra algorithms and other numeric
computations. An option is using scientific computing systems like Matlab,
Mathematica, Maple, etc. In Haskell there are several modules and bindings to
matrix libraries; many of them are excellent. But I think that Haskell would
be even more attractive if the basic matrix computations (what we find, for
example, in the GNU Octave Quick Reference) were easily available in a
standard library.
We have all we need: an interactive environment, excellent graphics
(HOpenGL), FFI and ForeignPtr with automatic garbage collection, QuickCheck,
etc. Computations can be implemented by the GSL (GNU Scientific Library), a
well designed numerical library written in C.
Well... I know that I am trying to reinvent the wheel: I am working on a user
friendly functional interface to matrix computations based on the GSL. I
think that many numeric problems that can be solved using Octave or a
similar system can also be solved more elegantly using Haskell.
The library is very preliminary and incomplete but some simple problems can
already be solved. The sources, haddock documentation and the draft of a
tutorial can be found at:
http://dis.um.es/~alberto/hmatrix/matrix.html
I am not a Haskell expert, I only use the most basic programming techniques,
so any suggestion will be greatly appreciated...
Alberto

On Wed, 29 Jun 2005, Alberto Ruiz wrote:
In my work I often need linear algebra algorithms and other numeric computations.
Nice coincidence: http://www.haskell.org//pipermail/libraries/2005-June/003936.html
An option is using scientific computing systems like Matlab, Mathematica, Maple, etc. In Haskell there are several modules and bindings to matrix libraries; many of them are excellent.
Many bindings to (foreign?) matrix libraries? Which ones?

On Wednesday 29 June 2005 12:31, Henning Thielemann wrote:
On Wed, 29 Jun 2005, Alberto Ruiz wrote:
In my work I often need linear algebra algorithms and other numeric computations.
Nice coincidence: http://www.haskell.org//pipermail/libraries/2005-June/003936.html
Wow! It is exactly the same idea! I did not find the above message by Keean in my Google searches when I decided to work on this; it is very recent! After a quick look at the thread I wish I had followed the discussions... A much more serious work was in progress. At least it was fun and I have learnt a lot of Haskell... I'm going to subscribe immediately to the libraries Haskell list. I stupidly thought that it was only for discussion about existing libraries :( And next I will study Keean's library. Thank you for your message, Alberto

On Wed, 29 Jun 2005, Alberto Ruiz wrote:
On Wednesday 29 June 2005 12:31, Henning Thielemann wrote:
On Wed, 29 Jun 2005, Alberto Ruiz wrote:
In my work I often need linear algebra algorithms and other numeric computations.
Nice coincidence: http://www.haskell.org//pipermail/libraries/2005-June/003936.html
Wow! It is exactly the same idea! I did not find the above message by Keean in my Google searches when I decided to work on this; it is very recent! After a quick look at the thread I wish I had followed the discussions... A much more serious work was in progress. At least it was fun and I have learnt a lot of Haskell...
I'm going to subscribe immediately to the libraries Haskell list. I stupidly thought that it was only for discussion about existing libraries :(
From a first look at your approach I like it even more because you explicitly distinguish between Matrix and Vector.

On Wed, 29 Jun 2005, Henning Thielemann wrote:
On Wed, 29 Jun 2005, Alberto Ruiz wrote:
On Wednesday 29 June 2005 12:31, Henning Thielemann wrote:
On Wed, 29 Jun 2005, Alberto Ruiz wrote:
In my work I often need linear algebra algorithms and other numeric computations.
Nice coincidence: http://www.haskell.org//pipermail/libraries/2005-June/003936.html
Wow! It is exactly the same idea! I did not find the above message by Keean in my Google searches when I decided to work on this; it is very recent! After a quick look at the thread I wish I had followed the discussions... A much more serious work was in progress. At least it was fun and I have learnt a lot of Haskell...
I'm going to subscribe immediately to the libraries Haskell list. I stupidly thought that it was only for discussion about existing libraries :(
From a first look at your approach I like it even more because you explicitly distinguish between Matrix and Vector.
I was wrong, the different names are synonyms for the same type. :-(

On Wed, Jun 29, 2005 at 01:38:51PM +0200, Alberto Ruiz wrote:
Wow! It is exactly the same idea! I did not find the above message by Keean in my Google searches when I decided to work on this; it is very recent! After a quick look at the thread I wish I had followed the discussions... A much more serious work was in progress. At least it was fun and I have learnt a lot of Haskell...
Your library is far more complete than Keean's is so far, and looks very nice. The one thing I'd like to see (and here I agree with Henning) is a distinction between matrices and tensors. Your library is really very tensor-based, but tensors and matrices are very different beasts. I imagine one could take your Block, which is really a sort of generalized tensor, and implement a Matrix type such as

  data Matrix = M (Block Double)

(or perhaps for arbitrary element type) with the constructor not exported, so that Matrices will always be guaranteed to be two-dimensional. Then for matrices one could define * to be matrix multiplication, sqrt, exp, cos, sin etc to be matrix functions (expm etc in octave-speak), and then define .* and ./ to be as defined in octave. This definition would allow considerably more type safety than your current implementation (which seems scarily dynamically typed to me).

Alas, we'd still not have the truly strong typing I'd dream about, where one could define

  matMul :: Int n, m, l => Matrix n m -> Matrix m l -> Matrix n l

which, as I understand things, isn't possible in Haskell without some sort of weird type trickery. Of course, if you had this kind of type trickery, you might not need to declare a separate Matrix type, since you'd be able to represent the dimensionality of the Block in its type.
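A minimal sketch of that opaque-wrapper idea (the Block type, its field layout and the helper names below are placeholders of mine, not Alberto's actual API):

  module Matrix ( Matrix, fromBlock, toBlock ) where

  -- Stand-in for the library's generalized tensor type
  -- (assumed shape: a list of dimensions plus flat storage).
  data Block a = Block [Int] [a]

  -- The constructor M is not exported, so every Matrix is
  -- guaranteed to wrap a two-dimensional Block.
  newtype Matrix = M (Block Double)

  fromBlock :: Block Double -> Maybe Matrix
  fromBlock b@(Block dims _)
    | length dims == 2 = Just (M b)
    | otherwise        = Nothing

  toBlock :: Matrix -> Block Double
  toBlock (M b) = b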
And next I will study Keean's library.
And hopefully you and he can work together to create a great library (and I'll be able to mooch off of whatever you create...). :) -- David Roundy http://www.darcs.net

I would recommend that you look very closely at the design of the LinearAlgebra package, the Matrix and Vector constructors, and the underlying implementation data-structure rtable() for Maple's implementation of all these ideas. About 250 person-days were spent on just the high-level design, as we knew that the underlying implementation would take a lot longer (it ended up taking about 6 person-years). The main design points were:

1. complete separation of the user-visible types (Matrix, Vector, Array) from the implementation type (rtable)
2. complete separation of the usual functions for user use (matrix addition, multiplication, LUDecomposition, etc) from the actual underlying implementations
3. for efficiency purposes (only), allow access to the low-level details of both data-structures and implementations, but spend no effort on making these easy to use. Making them efficient was the only thing that mattered. Thus the calling sequences for the low-level routines are quite tedious.
4. the high-level commands should do what one expects, but they are expected to pay a (small) efficiency cost for this convenience
5. incorporate all of LAPACK (through ATLAS; raw LAPACK is now too slow), as well as *all* of NAG's LinearAlgebra routines, in a completely seamless manner
6. do #5 for double precision as well as arbitrary precision arithmetic, again seamlessly
7. use as much structure information to do algorithm dispatch as possible
8. Matrices and Arrays are different high-level types, with different defaults. A.B for Matrices is matrix multiplication, A.B for Arrays is component-wise. This serves the purposes of both those who want to do linear algebra and those who want to do signal processing.
9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly.

Things that follow from that are numerous. For example, for the Matrix constructor there are:

1. 15 different abstract 'shapes' a matrix can have, with the band[n1,n2] 'shape' having two integer parameters
2. 14 different storage formats (element *layouts*)
3. each storage format can come in C or Fortran order
4. a datatype which can be complex[8], float[8], integer[1], integer[2], integer[4], integer[8] or Maple
5. a variety of attributes (like positive_definite) which can be used by various routines for algorithm selection

This implies that, in Maple (unlike in Matlab), the eigenvalues of a real symmetric matrix are guaranteed to be real, i.e. not contain spurious small imaginary parts. Solving Ax=b for A positive definite will use a Cholesky decomposition (automatically), etc. Computations on arbitrary length integers will also use different algorithms than for matrices of polynomials, for example. Where appropriate, fraction-free algorithms are used, etc. Some routines are even designed to be a bit 'lazy': one can use an LUDecomposition of a matrix A and return essentially only the information necessary to then do a large number of Ax=b solves for different b.

Matrix access is also very convenient, at the element level, but also at the sub-matrix level, rather like Matlab's notation. If A is a 5x5 Matrix, then

  A[1..2,3..4] := A[2..3,-2..-1];

will assign the submatrix of A consisting of the elements from rows 2,3 and columns 4,5 to the submatrix of A in rows 1,2 and columns 3,4. Notice the indexing from the right (positive numbers) and the left (negative). One can also say

  A[[2,1],3..4] := A[2..3,[-1,-2]];

As this is somewhat difficult to explain, let me show what this does:
A := Matrix(5,5,(i,j)->i*j);
        [ 1   2   3   4   5]
        [                  ]
        [ 2   4   6   8  10]
        [                  ]
   A := [ 3   6   9  12  15]
        [                  ]
        [ 4   8  12  16  20]
        [                  ]
        [ 5  10  15  20  25]
A[[2,1],3..4] := A[2..3,[-1,-2]]: A;
        [ 1   2  15  12   5]
        [                  ]
        [ 2   4  10   8  10]
        [                  ]
        [ 3   6   9  12  15]
        [                  ]
        [ 4   8  12  16  20]
        [                  ]
        [ 5  10  15  20  25]

(hopefully the above will come out in a fixed-width font!). This makes writing block-algorithms extremely easy.

Overall, this design lets one easily add new algorithms without having to worry about breaking user code.

Jacques

On Wed, 29 Jun 2005, Jacques Carette wrote:
9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly.
What do you mean with "row vectors and column vectors are different
types"? Do you mean that in a well designed library they should be
distinguished? I disagree with this point of view, which is also represented
by MatLab. Distinction of row and column vectors is a misconception and it
seems to have its roots in mixing up vectors with matrices. In linear
functional analysis you have functions (as vectors), operators (as
matrices) and scalar products. You would never try to transpose a
function. A vector of R^n can be considered as a function from {1,...,n} ->
R. So what is the operation of transposition? A type conversion?
If we instead distinguish row and column vectors because we treat them as
matrices, then the quadratic form
x^T * A * x
denotes a 1x1 matrix, not a real. The MatLab hack for this inconvenience
is to consider a 1x1 matrix as a real. But there is no need to do so if we
keep vectors and matrices apart and use a proper scalar product for such
expressions.

On Wed, 29 Jun 2005, Jacques Carette wrote:
Distinction of row and column vectors is a misconception
Row and column vectors are sometimes worth distinguishing because they can represent entirely different types of object. For example, if a column vector represents an element of a vector space V over a field F, then row vectors can be used to represent elements of the dual space, V* = {f:V->F, f linear}. Quite different objects and in some applications it makes sense to distinguish them.
You would never try to transpose a function.
Funny you say that, that's exactly what I've been writing Haskell code to do, at least for linear functions. Given any linear function f:V->W, the dual function, written f*, maps W*->V*. In effect this is the transpose and it makes perfect sense.
So what is the operation of transposition? A type conversion?
It maps objects of type V to objects of type V* and objects of type V->W to objects of type W*->V*. -- Dan
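A small sketch of this duality in Haskell (my own encoding, just to make the point concrete; nothing here is Dan's actual code):

  -- A dual vector is a linear functional into the scalar field.
  type Dual v = v -> Double

  -- The "transpose" of a linear map f : v -> w is precomposition:
  -- it takes functionals on w to functionals on v.
  transposeMap :: (v -> w) -> (Dual w -> Dual v)
  transposeMap f g = g . f

  -- Example: if f swaps the coordinates of a pair, transposeMap f
  -- turns the functional "first coordinate" into "second coordinate".
  swap :: (Double, Double) -> (Double, Double)
  swap (x, y) = (y, x)

  firstCoord :: Dual (Double, Double)
  firstCoord = fst

  secondCoord :: Dual (Double, Double)
  secondCoord = transposeMap swap firstCoord   -- behaves like snd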

On Wed, 29 Jun 2005, Dan Piponi wrote:
On Wed, 29 Jun 2005, Jacques Carette wrote:
Distinction of row and column vectors is a misconception
Row and column vectors are sometimes worth distinguishing because they can represent entirely different types of object. For example, if a column vector represents an element of a vector space V over a field F, then row vectors can be used to represent elements of the dual space, V* = {f:V->F, f linear}. Quite different objects and in some applications it makes sense to distinguish them.
If f is a function f :: a -> a then, of course, the dual space is again a vector space. But it contains functionals, and they have a very different type, namely f' :: (a -> a) -> a. If dual spaces represented the concept of transposition, then the dual space of the dual space would have to be the original space. It is not; in many cases it cannot even be identified with the original space, only if the space is reflexive. f'' :: ((a -> a) -> a) -> a certainly has a type very different from (a -> a)!

On Wednesday 29 June 2005 22:54, Henning Thielemann wrote:
On Wed, 29 Jun 2005, Dan Piponi wrote:
On Wed, 29 Jun 2005, Jacques Carette wrote:
Distinction of row and column vectors is a misconception
Row and column vectors are sometimes worth distinguishing because they can represent entirely different types of object. For example, if a column vector represents an element of a vector space V over a field F, then row vectors can be used to represent elements of the dual space, V* = {f:V->F, f linear}. Quite different objects and in some applications it makes sense to distinguish them.
If f is a function f :: a -> a then, of course, the dual space is again a vector space. But it contains functionals, and they have a very different type, namely f' :: (a -> a) -> a. If dual spaces represented the concept of transposition, then the dual space of the dual space would have to be the original space. It is not; in many cases it cannot even be identified with the original space, only if the space is reflexive. f'' :: ((a -> a) -> a) -> a certainly has a type very different from (a -> a)!
IIRC, finite-dimensional spaces are always reflexive, as witnessed by
the identification of an element f of the dual space with the vector y in
f x = <y, x>.

Henning Thielemann wrote:
On Wed, 29 Jun 2005, Jacques Carette wrote:
9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly.
What do you mean with "row vectors and column vectors are different types"? Do you mean that in a well designed library they should be distinguished? I disagree with this point of view, which is also represented by MatLab.
This was hotly debated during our design sessions too. It does appear to be a rather odd decision, I agree! It was arrived at after trying many alternatives, and finding all of them wanting.

Mathematically, vectors are elements of R^n, which does not have any inherent 'orientation'. They are just as simply abstracted as functions from {1,...,n} -> R, as you point out. But matrices are then just linear operators on R^n, and if you go there, then you have to first specify a basis for R^n before you can write down a representation for any matrix. Not useful.

However, when you step away from the pure mathematics point of view and instead look at actual mathematical usage, things change significantly. Mathematics is full of abuse of notation, and to make usage of matrices and vectors both convenient yet 'safe' required us to re-examine the many de facto conventions that mathematicians routinely use, and try to adapt our design to find the middle road between safety and usefulness.

One of the hacks in Matlab, because it only has matrices, is that vectors are also 1xN and Nx1 matrices. In Maple, these are different types. This helps a lot, because then a dot product is a function from R^n x R^n -> R. 1x1 matrices are not reals in Maple. Nor are Nx1 matrices Vectors.
If we instead distinguish row and column vectors because we treat them as matrices, then the quadratic form x^T * A * x denotes a 1x1 matrix, not a real.
But if you consider x to be a vector without orientation, writing down x^T is *completely meaningless*! If x is oriented, then x^T makes sense. Also, if x is oriented, then x^T * (A * x) = (x^T * A) * x. What is the meaning of (x * A) for a 'vector' x ? It gets much worse when the middle A is of the form B.C. To ensure that everything is as associative as can be, but no more, is very difficult. I don't have the time to go into all the details of why this design 'works' (and it does!), giving the 'expected' result to all meaningful linear algebra operations, but let me just say that this was the result of long and intense debate, and was the *only* design that actually allowed us to translate all of linear algebra idioms into convenient notation. Please believe me that this design was not arrived at lightly! Jacques

On row & column vectors, do you really want to think of them as {1,...,n}->R? They often represent linear maps from R^n to R or R to R^n, which are very different types. Similarly, instead of working with matrices, how about linear maps from R^n to R^m? In this view, column and row vectors, matrices, and often scalars are useful as representations of linear maps.

I've played around some with this idea of working with linear maps instead of the common representations, especially in the context of derivatives (including higher-dimensional and higher-order), where it is the view of calculus on manifolds. It's a lovely, unifying approach and combines all of the various chain rules into a single one. I'd love to explore it more thoroughly with one or more collaborators. Cheers, - Conal

-----Original Message----- Henning Thielemann wrote:
On Wed, 29 Jun 2005, Jacques Carette wrote:
9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly.
What do you mean with "row vectors and column vectors are different types"? Do you mean that in a well designed library they should be distinguished? I disagree with this point of view, which is also represented by MatLab.
This was hotly debated during our design sessions too. It does appear to be a rather odd decision, I agree! It was arrived at after trying many alternatives, and finding all of them wanting. Mathematically, vectors are elements of R^n which does not have any inherent 'orientation'. They are just as simply abstracted as functions from {1,...,n) -> R, as you point out. But matrices are then just linear operators on R^n, and if you go there, then you have to first specify a basis for R^n before you can write down a representation for any matrix. Not useful. However, when you step away from the pure mathematics point of view and instead look at actual mathematical usage, things change significantly. Mathematics is full of abuse of notation, and to make usage of matrices and vectors both convenient yet 'safe' required us to re-examine these many de facto conventions that mathematicians routinely use. And try to adapt our design to find the middle road between safety and usefulness. One of the hacks in Matlab, because it only has matrices, is that vectors are also 1xN and Nx1 matrices. In Maple, these are different types. This helps a lot, because then a dot product is a function from R^n x R^n -> R. 1x1 matrices are not reals in Maple. Nor are Nx1 matrices Vectors.
If we instead distinguish row and column vectors because we treat them as matrices, then the quadratic form x^T * A * x denotes a 1x1 matrix, not a real.
But if you consider x to be a vector without orientation, writing down x^T is *completely meaningless*! If x is oriented, then x^T makes sense. Also, if x is oriented, then x^T * (A * x) = (x^T * A) * x. What is the meaning of (x * A) for a 'vector' x ? It gets much worse when the middle A is of the form B.C. To ensure that everything is as associative as can be, but no more, is very difficult. I don't have the time to go into all the details of why this design 'works' (and it does!), giving the 'expected' result to all meaningful linear algebra operations, but let me just say that this was the result of long and intense debate, and was the *only* design that actually allowed us to translate all of linear algebra idioms into convenient notation. Please believe me that this design was not arrived at lightly! Jacques

On Wed, 29 Jun 2005, Conal Elliott wrote:
On row & column vectors, do you really want to think of them as {1,...,n)->R? They often represent linear maps from R^n to R or R to R^n, which are very different types. Similarly, instead of working with matrices, how about linear maps from R^n to R^m? In this view, column and row vectors, matrices, and often scalars are useful as representations of linear maps.
We should not identify things which can be mapped bijectively. "1" and 1
are very different, although they may mean the same in a given context.
Yes it is possible to describe linear maps with vectors but vectors are
not linear maps. Namely, a vector y only describes the linear map x |-> <y, x>; the vector and the map are objects of different types.
I've played around some with this idea of working with linear maps instead of the common representations, especially in the context of derivatives (including higher-dimensional and higher-order), where it is the view of calculus on manifolds. It's a lovely, unifying approach and combines all of the various chain rules into a single one. I'd love to explore it more thoroughly with one or more collaborators.
I think matrices and derivatives are very different issues. I have often seen that the first derivative is considered as a vector, and the second derivative is considered as a matrix. In this spirit it is used like x^T * (D2 f)(x) * x, but this is only abuse of the common multiplication definitions. A good interpretation and notation should extend seamlessly to higher derivatives. But the interpretation above does not work in higher dimensions.

I like the following type for differentiation:

  derive :: ((i -> a) -> b) -> ((i -> a) -> (i -> b))

Here i is the index type, (i -> a) is the vector type, b is the type the vector function maps to. Its derivative has the same type of argument, but the result is a vector with indices of type i. You see that it is easy to repeat the application of 'derive': just replace b by, say, i -> b. The second derivative yields vectors of type (i -> i -> b). This can be interpreted as a matrix because it has two indices. But this is certainly not a matrix which represents a linear mapping as usual; it is a matrix representing a bilinear form. The only thing we need is a multiplication to reduce one level of indices:

  mul :: (i -> c) -> (i -> b) -> b

Though, what we still need is a general (overloaded?) definition of the scaling of b by c and a sum of b.
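As a rough illustration of these signatures (a numerical sketch of my own, not part of Henning's proposal; the step size h and the extra class constraints are assumptions of this sketch):

  -- Forward-difference approximation of the proposed 'derive'.
  derive :: Eq i => ((i -> Double) -> Double) -> (i -> Double) -> (i -> Double)
  derive f x i = (f x' - f x) / h
    where
      h    = 1e-6
      x' j = x j + (if j == i then h else 0)

  -- The index-reducing multiplication, here for a finite index type.
  mul :: (Enum i, Bounded i, Num b) => (i -> b) -> (i -> b) -> b
  mul c v = sum [ c i * v i | i <- [minBound .. maxBound] ]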

Let me elaborate a bit on what I wrote yesterday. On Wed, 29 Jun 2005, Henning Thielemann wrote:
I think matrices and derivatives are very different issues. I have often seen that the first derivative is considered as vector, and the second derivative is considered as matrix. In this spirit it is used like x^T * (D2 f)(x) * x
It should be h^T * (D2 f)(x) * h (if the transposition notation is used at all) if the curvature of f at x is meant.
but this is only abuse of the common multiplication definitions. A good interpretation and notation should extend seamlessly to higher derivatives. But the interpretation above does not work in higher dimensions.
I like the following type for derivation. derive :: ((i -> a) -> b) -> ((i -> a) -> (i -> b)) Here i is the index type, (i -> a) is the vector type, b is the type the vector function maps to. Its derivative has the same type of argument, but the result is a vector with indices of type i. You see that it is easy to repeat the application of 'derive', just replace b by say i->b. The second derivative yields vectors of type (i -> i -> b). This can be interpreted as matrix because it has two indices. But this is certainly not a matrix which represents a linear mapping as usual, but it is a matrix representing a bilinear form. The only thing we need is a multiplication to reduce one level of indices. mul :: (i -> c) -> (i -> b) -> b Though, what we still need is a general (overloaded?) definition of the scaling of b by c and a sum of b.
That is, b must be a vector space with respect to c. 'derive' in this form is a bundled partial derivative, but it can be identified with the total derivative. That is, derive f x i is the partial derivative of f at x with respect to the i-th component, and derive f x is the total derivative of f at x.

Sorry for the delayed response, which has now led to two separate threads. See below.
From: Henning Thielemann
On Wed, 29 Jun 2005, Conal Elliott wrote:
On row & column vectors, do you really want to think of them as {1,...,n)->R? They often represent linear maps from R^n to R or R to R^n, which are very different types. Similarly, instead of working with matrices, how about linear maps from R^n to R^m? In this view, column and row vectors, matrices, and often scalars are useful as representations of linear maps.
We should not identify things which can be mapped bijectively. "1" and 1 are very different [...]
I think matrices and derivatives are very different issues. [...]
Of course. My suggestion is to use linear maps instead of vectors or matrices when the vectors or matrices serve as representations of linear maps.
I have often seen that the first derivative is considered as vector, and the second derivative is considered as matrix.
I'm guessing you mean for derivatives of functions in R^n->R.
In this spirit it is used like x^T * (D2 f)(x) * x but this is only abuse of the common multiplication definitions. A good interpretation and notation should extend seamlessly to higher derivatives. But the interpretation above does not work in higher dimensions.
What does work I think, for all degrees of derivatives and all dimensions of vector spaces (and well beyond R^n), is keeping a clear distinction between linear maps and representations of linear maps. Linear maps get composed and applied, but certainly not multiplied.
I like the following type for derivation. derive :: ((i -> a) -> b) -> ((i -> a) -> (i -> b)) Here i is the index type, (i -> a) is the vector type, b is the type the vector function maps to.
This formulation is not much more general than R^n (i.e., {1,..,n}->R). The vectors are still restricted to "tuples" (indexed by i) of elements of the same type, right?
Its derivative has the same type of argument, but the result is a vector with indices of type i. You see that it is easy to repeat the application of 'derive', just replace b by say i->b. The second derivative yields vectors of type (i -> i -> b). This can be interpreted as matrix because it has two indices. But this is certainly not a matrix which represents a linear mapping as usual, but it is a matrix representing a bilinear form. The only thing we need is a multiplication to reduce one level of indices. mul :: (i -> c) -> (i -> b) -> b Though, what we still need is a general (overloaded?) definition of
the
scaling of b by c and a sum of b.
I prefer something like the following instead, where "VS s u" means that u is a vector space over the scalar field s.
derive :: (VS s u, VS s v) => (u -> v) -> (u -> LMap u v)
Since VS s (LMap u v), the result of derive may be given back to derive. Second derivatives then have type u -> LMap u (LMap u v), where as we'd expect, LMap u (LMap u v) is isomorphic to the type of bilinear maps from u to v. By using LMap instead of Matrix, we're not tempted to confuse (for instance) linear and bilinear maps, just as you pointed out. - Conal
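A bare-bones sketch of what such an LMap might look like (my own guess at the shape; the names, and leaving the VS class out, are assumptions of this sketch, not Conal's actual definitions):

  -- The wrapped function is assumed, but not enforced, to be linear.
  newtype LMap u v = LMap { applyLMap :: u -> v }

  -- Linear maps compose and apply; they are not "multiplied".
  composeLMap :: LMap v w -> LMap u v -> LMap u w
  composeLMap (LMap g) (LMap f) = LMap (g . f)

  -- Since linear maps into a vector space again form a vector space,
  --   derive :: (VS s u, VS s v) => (u -> v) -> (u -> LMap u v)
  -- can be iterated:
  --   derive (derive f) :: u -> LMap u (LMap u v)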

On Wed, 29 Jun 2005, Jacques Carette wrote:
If we instead distinguish row and column vectors because we treat them as matrices, then the quadratic form x^T * A * x denotes a 1x1 matrix, not a real.
But if you consider x to be a vector without orientation, writing down x^T is *completely meaningless*!
That's what I stated.
If x is oriented, then x^T makes sense. Also, if x is oriented, then x^T * (A * x) = (x^T * A) * x. What is the meaning of (x * A) for a 'vector' x ?
It has of course no meaning. Mathematical notation has the problem that it doesn't distinguish between things that are different but in turn discriminates between things which are essentially the same. If your design goal is to keep as close as possible to common notational habits, you have already lost! As I already pointed out in an earlier discussion, I see it the other way round: computer languages are the touchstones for mathematical notation, because you can't tell a computer about an imprecise expression: "Don't be stupid, you _know_ what I mean!"

More specifically: You give two different things the same name. You write A*B and you mean a matrix multiplication. Matrix multiplication means finding a representation of the composition of the operators represented by A and B. But you also write A*x and you mean the matrix-vector multiplication. This corresponds to the application of the operator represented by A to the vector x. You see: two different things, but the same sign (*). Why? You like this ambiguity because of its conciseness. You are used to it. What else? But you have to introduce an orientation of vectors, thus you discriminate between two things (row and column vectors) which are essentially the same! What is the overall benefit?

It reminds me of the effort of most Haskell XML libraries to have as few combinator functions as possible (namely one: o), which forces you not to discriminate between the types of the functions to be combined (the three essential types are unified to a -> [a]), and even more it forces you to put conversions from the natural type (like a -> Bool, a -> a) into every atomic function!
It gets much worse when the middle A is of the form B.C. To ensure that everything is as associative as can be, but no more, is very difficult.
I don't see the problem. There are three very different kinds of multiplication, so they should also have their own signs: scalar product, matrix-vector multiplication, matrix-matrix multiplication.
I don't have the time to go into all the details of why this design 'works' (and it does!),
I have worked with Maple and I have finally dropped it because of its design. I dropped MatLab, too, because of the distinction of row and column vectors: it makes no sense to distinguish between e.g. convolving row vectors and convolving column vectors. Many routines have to be aware of this difference although it is irrelevant to them, many routines work only with one of the two kinds, and you are often busy transposing between them.
giving the 'expected' result to all meaningful linear algebra operations, but let me just say that this was the result of long and intense debate, and was the *only* design that actually allowed us to translate all of linear algebra idioms into convenient notation.
If translating all of existing idioms is your goal, then this is certainly the only design. But adapting the sloppy (not really convenient) mathematical notation is not a good design guideline. I advise everyone who likes this kind of convenience to use Maple, MatLab, and friends instead!

Henning Thielemann wrote:
Mathematical notation has the problem that it doesn't distinguish between things that are different but in turn discriminates things which are essentially the same.
I used to think that too. And while that is sometimes true, it is actually quite rare! When common mathematical usage makes 2 things the same, it is usually because either they really are the same, or context is always sufficient to tell them apart. When it keeps things separate, usually it is because there is something subtle going on.
If your design goal is to keep as close as possible to common notational habits you have already lost! As I already pointed out in an earlier discussion I see it the other way round: Computer languages are the touchstones for mathematical notation because you can't tell a computer about an imprecise expression: "Don't be stupid, you _know_ what I mean!"
No, you misplace the problem: you seem to want all your mathematical expressions to be 100% meaningful in a context-free environment. Computers really do love that. Human mathematicians are much more sophisticated than that; they can deal with a lot of context-sensitivity without much difficulty. The goal here should be to allow computers to deal with context sensitivity with as much ease. In fact, type classes in Haskell are a *great* way to do just that!
More specific: You give two different things the same name. You write A*B and you mean a matrix multiplication. Matrix multiplication means finding a representation of the composition of the operators represented by A and B. But you also write A*x and you mean the matrix-vector multiplication. This corresponds to the application of the operator represented by A to the vector x. You see: Two different things, but the same sign (*). Why? You like this ambiguity because of its conciseness. You are used to it. What else?
This is called operator overloading. It is completely harmless because you can tell the two * apart from their type signatures. It is a complete and total waste of time to use two different symbols to mean the same thing. <sarcasm>Next thing you know, you'll want a different 'application' symbol for every arity of function, because they are ``different''. </sarcasm> Seriously, the unification of the concept of application, achieved through currying and first-class functions, is wonderful. Operator application is no different!
But you have to introduce an orientation of vectors, thus you discriminate two things (row and column vectors) which are essentially the same! What is the overall benefit?
Someone else already answered this much better than I could.
It seems to me like the effort of most Haskell XML libraries in trying to have as few as possible combinator functions (namely one: o) which forces you to not discriminate the types for the functions to be combined (the three essential types are unified to a -> [a]) and even more it forces you to put conversions from the natural type (like a->Bool, a->a) in every atomic function!
Here we agree. Too much polymorphism can hurt too - you end up essentially pushing everything into dynamic typing, which is rather anti-Haskell. The problem that the authors faced was that Haskell doesn't yet have enough facilities to eliminate *all* boilerplate code, and I guess they chose dynamic typing over boilerplate.
I don't see the problem. There are three very different kinds of multiplication, they should also have their own signs: Scalar product, matrix-vector multiplication, matrix-matrix multiplication.
You see 3 concepts, I see one: multiplication. Abstract Algebra is the branch of mathematics where you want to abstract out the *unimportant* details. Much progress was made in the late 1800's when this was discovered by mathematicians ;-).
I have worked with Maple and I have finally dropped it because of its design. I dropped MatLab, too, because the distinction of row and column vectors, because it makes no sense to distinguish between e.g. convolving row vectors or column vectors. Many routines have to be aware of this difference for which it is irrelevant and many routines work only with one of these kinds and you are often busy with transposing them.
The core design of much of Maple and Matlab were done in the early 80s. That core design hasn't changed. That it turns out to be sub-optimal is to be expected!!! Note that the one giant try at a statically typed mathematics system (Axiom w/ programming language Aldor) never caught on, because they were way way too hard to use by mere mortals. And Aldor has a type system that makes Haskell's look simple and pedestrian by comparison. Aldor has full dependent types, recursive modules, first-class types, just to name a few. There is a 50-long (recursive) dependency chain in Aldor's algebra library!
If translating all of existing idioms is your goal, then this is certainly the only design. But adapting the sloppy (not really convenient) mathematical notation is not a good design guideline.
I should know better than to get into a discussion like this with someone who believes they singlehandedly know better than tens of thousands of mathematicians... Rather reminds me of my days as a PhD student... Jacques

On Wed, 29 Jun 2005, Jacques Carette wrote:
In fact, type classes in Haskell is a *great* way to do just that!
I agree. I'm also aware that I mean different objects when I uniformly write '1'. But I know that they are somehow different. I'm also OK with not writing a conversion from, say, the integer 1 to the fraction 1/1, but I know that there has to be one. We should be aware of such abbreviations before adding them to a programming language.
This is called operator overloading. It is completely harmless because you can tell the two * apart from their type signatures. It is a complete and total waste of time to use two different symbols to mean the same thing.
But the multiplication in A*x already needs multi-parameter type classes. :-) Btw., a function in prefix notation for matrix-vector multiplication would also stress the role of a matrix as a linear mapping, because this function would just map the matrix to the linear mapping it represents:

  Matrix.mulVec :: Matrix -> (Vector -> Vector)
Abstract Algebra is the branch of mathematics where you want to abstract out the *unimportant* details.
But typically they don't say 'identical' if they mean 'isomorphic'.
Note that the one giant try at a statically typed mathematics system (Axiom w/ programming language Aldor) never caught on,
You mean I should have a closer look at it?
If translating all of existing idioms is your goal, then this is certainly the only design. But adapting the sloppy (not really convenient) mathematical notation is not a good design guideline.
I should know better than to get into a discussion like this with someone who believes they singlehandedly know better than tens of thousands of mathematicians...
Millions of people believe that it makes no sense to work with infinite dimensional spaces ... The next argument, please. :-) I learned from "A History of Mathematical Notations" by Florian Cajori that defending _the_ mathematical notation is useless, because there is not the one notation. There are many more notations than we know of. Notations were chosen because of typographical issues, because there was no notation for a specific operation before, and so on. But there was never a mathematician who designed a consistent notation from scratch, there was no committee to set up a standard; mathematical notation was always a patchwork from very different sources. It looks like too many cooks spoil the broth.

Henning Thielemann wrote:
I'm also aware that I mean different objects when I uniformly write '1'. But I know that they are somehow different.
Since '1' can safely be used to denote the unit of any monoid, it does indeed have a lot of applications. And of course the syntactic artifact should not be confused with its denotation.
I'm also OK with not writing a conversion from, say, the integer 1 to the fraction 1/1, but I know that there has to be one. We should be aware of such abbreviations before adding them to a programming language.
But there is a difference here that is worth exploring. There is a canonical embedding of Z into Q. This embedding preserves the properties of Z as much as possible. It is even possible to directly ``see'' Z in Q. There is also an embedding of the representable integers into the floating point domain - but it does not have nice properties. It most definitely does not allow one to see the integers in amongst the floats (you just have to consider any representable integer with more significant bits than your float representation to see that). It seems safe to ignore canonical embeddings that preserve properties, but not the others.
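A concrete illustration of the floating-point half of this point (the numbers are my own example, not Jacques'): a Double carries 53 bits of significand, so integers just past 2^53 are no longer preserved by the embedding. In GHCi:

  Prelude> fromIntegral (2^53     :: Integer) :: Double
  9.007199254740992e15
  Prelude> fromIntegral (2^53 + 1 :: Integer) :: Double
  9.007199254740992e15

The second value rounds back down to 2^53, so the round trip loses the integer: the embedding into floats is not injective, unlike the canonical embedding of Z into Q.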
But the multiplication in A*x already needs multi-parameter type classes. :-)
The matrix A already needs dependent types :-) Even simpler things do - see the work on Automath done in the late 60s.
Note that the one giant try at a statically typed mathematics system (Axiom w/ programming language Aldor) never caught on,
You mean I should have a closer look on it?
By all means! Axiom (http://page.axiom-developer.org/zope/mathaction/FrontPage) and Aldor (http://www.aldor.org) are extremely elegant, and contain many interesting ideas that are still ahead of other so-called ``research'' languages. They can both slice and dice through algebraic problems like few other systems can, at least on the elegance front. Magma (http://magma.maths.usyd.edu.au/magma/) comes close, with the added advantage that it is *fast*. Of course, don't try to symbolically solve a differential equation, do a Laplace transform, compute a closed form for a definite sum or a definite integral in any of those systems -- those facilities don't even exist. But that might have something to do with the problem that ``symbolic'' mathematics (especially in analysis) works on intensional representations directly, while most type theories are purely extensional. But see Oleg's message http://www.mail-archive.com/haskell@haskell.org/msg15686.html to see how Haskell can support intensional representations (indirectly). It is worth comparing that solution with similar functionality in Scheme to see the distance that still needs to be covered. Jacques

On Wed, 29 Jun 2005, Henning Thielemann wrote:
On Wed, 29 Jun 2005, Jacques Carette wrote:
This is called operator overloading. It is completely harmless because you can tell the two * apart from their type signatures. It is a complete and total waste of time to use two different symbols to mean the same thing.
But the multiplication in A*x already needs multi-parameter type classes. :-)
I'm uncertain about how you want to put the different kinds of
multiplication into one method, even with multi-parameter type classes.
You need instances
(*) :: Matrix -> Matrix -> Matrix
(*) :: RowVector -> Matrix -> RowVector
(*) :: Matrix -> ColumnVector -> ColumnVector
(*) :: RowVector -> ColumnVector -> Scalar
(*) :: ColumnVector -> RowVector -> Matrix
(*) :: Scalar -> RowVector -> RowVector
(*) :: RowVector -> Scalar -> RowVector
(*) :: Scalar -> ColumnVector -> ColumnVector
(*) :: ColumnVector -> Scalar -> ColumnVector
but you have to make sure that it is not possible to write an expression
which needs
(*) :: Matrix -> RowVector -> RowVector
Further you need
transpose :: RowVector -> ColumnVector
transpose :: ColumnVector -> RowVector
transpose :: Matrix -> Matrix
and you must forbid, say
transpose :: RowVector -> RowVector
Not to mention many special (linear) operators which need to be prepared
for both column and row vectors, like Fourier transform, component
permutation, convolution, which essentially do the same on both types.
I assume that by "common linear algebra idioms" you mean transforms like
the following one (x and y are column vectors):
(x * y^T)^2
= x * y^T * x * y^T
= x * (y^T * x) * y^T | because y^T * x is "scalar"
= (y^T * x) * (x * y^T)
(Analogously this can be done with bra-ket notation, i.e. with |x><y| for the outer product and <y|x> for the scalar product.)
I should know better than to get into a discussion like this with someone who believes they singlehandedly know better than tens of thousands of mathematicians...
Is the design discussion related to linear algebra for Maple documented somewhere? For Modula-3 such a discussion was recorded on video tape (then got lost :-( ) and was later edited in "Systems programming with Modula-3" by Greg Nelson. It's fun to read and very informative and I haven't seen such a discussion for other languages.

Henning Thielemann wrote:
I'm uncertain about how you want to put the different kinds of multiplication into one method, even with multi-parameter type classes. You need instances
(*) :: Matrix -> Matrix -> Matrix (*) :: RowVector -> Matrix -> RowVector
[many other instances removed.]

Definitely not. You could do:

  data Orientation = Row | Column
  data Vector a = Vector Orientation [a]

(well, the size should be in there too, but I'll skip that). That easily takes care of most of the complexity. And you can easily define operations that can choose to ignore Orientation if they wish (like inner product of vectors). I said that this design works. Of course, implementing it along the lines you suggested in this email would have been idiotic - there is a better way to do it. If you go back to my original email, the sheer complexity of all the combinations possible (several thousand) should have given you a clue that your proposal above was a complete and utter non-starter.
Is the design discussion related to linear algebra for Maple documented somewhere?
Unfortunately not. The design is documented, with some of the discussion captured, but those documents are still in internal papers to Maplesoft Inc. There is no profit to be made in publishing it (in their estimation), so that information remains internal. Jacques

On Thu, 30 Jun 2005, Jacques Carette wrote:
Henning Thielemann wrote:
I'm uncertain about how you want to put the different kinds of multiplication into one method, even with multi-parameter type classes. You need instances
(*) :: Matrix -> Matrix -> Matrix (*) :: RowVector -> Matrix -> RowVector
[many other instances removed.]
Definitely not. You could do:

  data Orientation = Row | Column
  data Vector a = Vector Orientation [a]
In the first mail you wrote "9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly." I interpreted that you want to encode the information Row or Column into the type. This sounds reasonable to me because multiplying e.g. a column vector by a matrix is so obviously wrong that it should be detected statically.

Henning Thielemann wrote:
data Orientation = Row | Column
data Vector a = Vector Orientation [a]
In the first mail you wrote "9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly."
I interpreted that you want to encode the information Row or Column into the type. This sounds reasonable to me because multiplying e.g. a column vector by a matrix is so obviously wrong that it should be detected statically.
Sorry, I should have been more precise, I used too straightforward an encoding from Maple (which is dynamically typed) into Haskell. I should have used the usual type-trickery to encode the Orientation as 2 different types, to have ``types reflect values''. Jacques

On Thu, 30 Jun 2005, Jacques Carette wrote:
Henning Thielemann wrote:
data Orientation = Row | Column
data Vector a = Vector Orientation [a]
In the first mail you wrote "9. There are row vectors and column vectors, and these are different types. You get type errors if you mix them incorrectly."
I interpreted that you want to encode the information Row or Column into the type. This sounds reasonable to me because multiplying e.g. a column vector by a matrix is so obviously wrong that it should be detected statically.
Sorry, I should have been more precise, I used too straightforward an encoding from Maple (which is dynamically typed) into Haskell. I should have used the usual type-trickery to encode the Orientation as 2 different types, to have ``types reflect values''.
You mean phantom types?

  data Vector orient a = Vector [a]

I'm excited to add lots of new orientations then! :-]

Henning Thielemann wrote:
I'm uncertain about how you want to put the different kinds of multiplication into one method, even with multi-parameter type classes. You need instances
(*) :: Matrix -> Matrix -> Matrix
(*) :: RowVector -> Matrix -> RowVector
(*) :: Matrix -> ColumnVector -> ColumnVector
(*) :: RowVector -> ColumnVector -> Scalar
(*) :: ColumnVector -> RowVector -> Matrix
(*) :: Scalar -> RowVector -> RowVector
(*) :: RowVector -> Scalar -> RowVector
(*) :: Scalar -> ColumnVector -> ColumnVector
(*) :: ColumnVector -> Scalar -> ColumnVector
but you have to make sure that it is not possible to write an expression which needs (*) :: Matrix -> RowVector -> RowVector
Further you need
transpose :: RowVector -> ColumnVector
transpose :: ColumnVector -> RowVector
transpose :: Matrix -> Matrix
and you must forbid, say
transpose :: RowVector -> RowVector
Of course if they are all of type Matrix this problem disappears. What is the difference between a 1xN matrix and a vector? Please explain... Keean.

On Tue, 5 Jul 2005, Keean Schupke wrote:
Henning Thielemann wrote:
I'm uncertain about how you want to put the different kinds of multiplication into one method, even with multi-parameter type classes. You need instances
(*) :: Matrix -> Matrix -> Matrix
(*) :: RowVector -> Matrix -> RowVector
(*) :: Matrix -> ColumnVector -> ColumnVector
(*) :: RowVector -> ColumnVector -> Scalar
(*) :: ColumnVector -> RowVector -> Matrix
(*) :: Scalar -> RowVector -> RowVector
(*) :: RowVector -> Scalar -> RowVector
(*) :: Scalar -> ColumnVector -> ColumnVector
(*) :: ColumnVector -> Scalar -> ColumnVector
but you have to make sure that it is not possible to write an expression which needs (*) :: Matrix -> RowVector -> RowVector
This was my reply to Jacques Carette who suggested distinguishing row and column vectors by their _type_. His revised suggestion was to use phantom types. Then the types change to

  (*) :: Matrix -> Matrix -> Matrix
  (*) :: Vector Row -> Matrix -> Vector Row
  (*) :: Matrix -> Vector Column -> Vector Column
  (*) :: Vector Row -> Vector Column -> Scalar
  ...

but the problem remains ...
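For concreteness, here is a rough sketch of how that phantom-type encoding might look with multi-parameter type classes (my own rendering of the idea under discussion, not code from either library; the names and extensions are assumptions of this sketch):

  {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
               FlexibleInstances, EmptyDataDecls #-}

  data Row
  data Column

  newtype Vector orient a = Vector [a]
  newtype Matrix a        = Matrix [[a]]

  class Mul a b c | a b -> c where
    (.*.) :: a -> b -> c

  -- row vector times column vector yields a scalar ...
  instance Num a => Mul (Vector Row a) (Vector Column a) a where
    Vector r .*. Vector c = sum (zipWith (*) r c)

  -- ... matrix times column vector yields a column vector ...
  instance Num a => Mul (Matrix a) (Vector Column a) (Vector Column a) where
    Matrix m .*. Vector c = Vector [ sum (zipWith (*) row c) | row <- m ]

  -- ... and since there is no  Mul (Matrix a) (Vector Row a) ...  instance,
  -- "matrix times row vector" is rejected by the type checker.  This is
  -- exactly the point under debate: every wanted combination needs its own
  -- instance, and every unwanted one must be left out.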
Further you need
transpose :: RowVector -> ColumnVector
transpose :: ColumnVector -> RowVector
transpose :: Matrix -> Matrix
and you must forbid, say
transpose :: RowVector -> RowVector
Of course if they are all of type Matrix this problem disappears.
All type problems disappear when we switch to exclusive String representation. 8-]
What is the difference between a 1xN matrix and a vector? Please explain...
My objections to making everything a matrix were the objections I sketched
for MatLab.
The example, again: If you write some common expression like
transpose x * a * x
then both the human reader and the compiler don't know whether x is a
"true" matrix or if it simulates a column or a row vector. It may be that
'x' is a row vector and 'a' a 1x1 matrix, then the expression denotes a
square matrix of the size of the vector simulated by 'x'. It may be that
'x' is a column vector and 'a' a square matrix. Certainly in most cases I
want the latter one and I want to have a scalar as a result. But if
everything is a matrix then I have to check at run-time if the result is a
1x1 matrix and then I have to extract the only element from this matrix.
If I omit the 1x1 test mistakes can remain undiscovered. I blame the
common notation x^T * A * x for this trouble since the alternative
notation can be immediately transcribed into computer language code while
retaining a strong distinction between a vector and a matrix type.
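To make the contrast concrete, a small sketch (mine, not from any of the libraries discussed) of the scalar-valued quadratic form being argued for, where the result type is a plain number rather than a 1x1 matrix that has to be unwrapped at run time:

  -- x^T * A * x as a function returning a scalar directly.
  quadForm :: Num a => [[a]] -> [a] -> a
  quadForm a x = sum (zipWith (*) x ax)
    where
      ax = [ sum (zipWith (*) row x) | row <- a ]   -- A * x

  -- Example: quadForm [[2,0],[0,3]] [1,2] == 2*1*1 + 3*2*2 == 14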

On 7/5/05, Henning Thielemann wrote:
The example, again: If you write some common expression like
transpose x * a * x
then both the human reader and the compiler don't know whether x is a "true" matrix or if it simulates a column or a row vector.
But since a, by definition (question), is a 1xN matrix, what's the ambiguity exactly? Michael

On Tue, 5 Jul 2005, Michael Walter wrote:
On 7/5/05, Henning Thielemann wrote:
The example, again: If you write some common expression like
transpose x * a * x
then both the human reader and the compiler don't know whether x is a "true" matrix or if it simulates a column or a row vector.
But since a, by definition (question), is a 1xN matrix, what's the ambiguity exactly?
If you want this definition then you must also interpret any 1x1 matrix as a real. That's what I wanted to show with my example; that's the way MatLab works and why it sucks. Multiplication of reals is commutative, reals are naturally totally ordered and so on; matrices (including 1x1 matrices) don't have these properties. Since it is sensible to work with one dimensional vectors it is also sensible to work with 1x1 matrices. But 1x1 matrices of this kind are certainly different from 1x1 matrices produced by transpose x * a * x. A 1x1 matrix in MatLab can thus mean a scalar, a row 1-vector, a column 1-vector or a 1x1-matrix. (If you accept the differences between these terms.) Alternatively you could convert the expected 1x1-matrix into a real, which must be checked at run time.

Theoretically, vectors are objects which can be scaled and added, and matrices represent linear operators on vectors. (Operators, including linear operators, may build a vector space themselves, but that's a different issue.) Why should we convert each vector into the representation of some linear operator before doing linear algebra, and why should we convert this representation of a linear operator back to the vector after the linear algebra has happened? Would you load an audio signal into a 1xN-matrix or into an N-vector? Loading it into a 1xN-matrix forces you to check dynamically and repeatedly that the matrix really has only one row. Btw., I would also load an image into a vector, but a vector with a two-dimensional index set.

On Tue, Jul 05, 2005 at 08:17:58PM +0200, Henning Thielemann wrote:
The example, again: If you write some common expression like
transpose x * a * x
then both the human reader and the compiler don't know whether x is a "true" matrix or if it simulates a column or a row vector. It may be that 'x' is a row vector and 'a' a 1x1 matrix, then the expression denotes a square matrix of the size of the vector simulated by 'x'. It may be that 'x' is a column vector and 'a' a square matrix. Certainly in most cases I want the latter one and I want to have a scalar as a result. But if everything is a matrix then I have to check at run-time if the result is a 1x1 matrix and then I have to extract the only element from this matrix. If I omit the 1x1 test mistakes can remain undiscovered. I blame the common notation x^T * A * x for this trouble since the alternative notation
can be immediately transcribed into computer language code while retaining strong distinction between a vector and a matrix type.
The issue is that Haskell (as far as I understand, and no one has suggested anything to the contrary) doesn't have a sufficiently powerful type system to represent matrices or vectors in a statically typed way. It would be wonderful if we could represent matrix multiplication as

  matrix_mul :: Matrix n m -> Matrix m o -> Matrix n o

(where n, m and o are dimensions) but we can't do that. So we're stuck with dynamic size checking for all matrix and vector operations that involve two or more operands. This is just something we have to live with for now. If anyone knows differently, please speak up! Perhaps one could do this with template haskell, but that sounds scary and a bit ugly (although I'd love to be proven wrong).

You'd like to have certain special cases where a dimension 1 is treated differently from other size dimensions. I'd say that it'd be simpler to start out with everything done in a general manner, and then you can always add special cases later if you'd like. Given that we're stuck with a dynamically typed system, I'd prefer to have a simple dynamically typed system. You think of vectors as fundamental and matrices as a sort of derived quantity. I'd prefer to think the other way around, with vectors being special cases of matrices, since it's simpler to implement and work with. Since matrices form a complete self-contained algebra (and I'm sure I'm massacring the terminology--sorry), it seems simplest to implement that first, especially since you'd want to implement it anyways.

Also note that if you have several vectors x for which you want to compute the dot product with metric A, and if you want to do this efficiently, you'll have to convert your list of vectors into a matrix anyways. Writing functions of vectors instead as functions of 1xN matrices allows them to be efficiently applied to multiple vectors simultaneously, provided they are written carefully. Alas, if we had static dimension checking, we wouldn't have to worry about writing carefully, but alas that doesn't seem to be an option. -- David Roundy http://www.darcs.net

David Roundy wrote:
The issue is that Haskell (as far as I understand, and noone has suggested anything to the contrary) doesn't have a sufficiently powerful type system to represent matrices or vectors in a statically typed way. It would be wonderful if we could represent matrix multiplication as
matrix_mul :: Matrix n m -> Matrix m o -> Matrix n o
Actually, Haskell does allow you to do that. But the syntax of the types gets pretty horrendous. I'm sure Oleg will show you how any moment now. :) -- Lennart
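A minimal sketch of the kind of encoding Lennart alludes to, using phantom Peano naturals for the dimensions. The names are hypothetical and the smart constructors that would actually enforce the sizes at construction are omitted, so this is only an illustration, not the library's API:

import Data.List (transpose)

data Zero   = Zero
data Succ n = Succ n

type Three = Succ (Succ (Succ Zero))   -- an example dimension

-- The payload is still a plain list of rows; only the phantom parameters
-- n (rows) and m (columns) carry the dimensions.
newtype Matrix n m = Matrix [[Double]]

identity3 :: Matrix Three Three
identity3 = Matrix [[1,0,0],[0,1,0],[0,0,1]]

-- Multiplication only type-checks when the inner dimensions agree.
matrixMul :: Matrix n m -> Matrix m o -> Matrix n o
matrixMul (Matrix a) (Matrix b) =
    Matrix [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]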

Maybe we could solve this problem in a simple and general way by working with a more abstract notion of linear maps, rather than the matrices commonly used to represent linear maps. Instead of "Matrix n m", where n and m are either integers (requiring something like dependent types) or type encodings of integers, use "LMap u v", where u and v are vector spaces, e.g., R^n and R^m. This change would eliminate the need for dependent types or integer encodings. From another perspective, the integers are encodings of the vector space types, but a very restricted subset of vector space types. If we take the encoding out of the interface, we might get a much more general library, allowing for linear maps between *any* vector spaces, not just ones of the form R^n. For instance, linear maps are vector spaces also, so we can handle "LMap u (LMap u v)" (the range type of the second derivative of a u->v function). Also, it's clear that "LMap R R^n" and "LMap R^n R" are different types (for n/=1), so we can distinguish between "row" and "column" vectors without creating more types. Are there other reasons for orienting vectors? I can't call these comments a concrete proposal, as it's not clear to me how to define LMap and what operations it supports. It may be worth exploring, though. Cheers, - Conal -----Original Message----- From: Lennart Augustsson Sent: Thursday, July 07, 2005 5:44 AM To: David Roundy Cc: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] matrix computations based on the GSL David Roundy wrote:
The issue is that Haskell (as far as I understand, and noone has suggested anything to the contrary) doesn't have a sufficiently powerful type system to represent matrices or vectors in a statically typed way. It would be wonderful if we could represent matrix multiplication as
matrix_mul :: Matrix n m -> Matrix m o -> Matrix n o
Actually, Haskell does allow you to do that. But the syntax of the types gets pretty horrendous. I'm sure Oleg will show you how any moment now. :) -- Lennart
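A rough sketch of the LMap idea Conal describes above, just to make the shape of the interface concrete; the names are made up and the linearity of the wrapped function is assumed rather than checked:

class VectorSpace v where
    zeroV  :: v
    addV   :: v -> v -> v
    scaleV :: Double -> v -> v

-- A linear map is represented by a plain function, assumed to be linear.
newtype LMap u v = LMap { applyL :: u -> v }

composeL :: LMap v w -> LMap u v -> LMap u w
composeL (LMap g) (LMap f) = LMap (g . f)

-- Linear maps form a vector space themselves, so "LMap u (LMap u v)"
-- (the range of a second derivative) needs no extra machinery.
instance VectorSpace v => VectorSpace (LMap u v) where
    zeroV                  = LMap (const zeroV)
    addV (LMap f) (LMap g) = LMap (\u -> addV (f u) (g u))
    scaleV k (LMap f)      = LMap (scaleV k . f)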

On Thu, 7 Jul 2005, Conal Elliott wrote:
Maybe we could solve this problem in a simple and general way by working with a more abstract notion of linear maps, rather than the matrices commonly used to represent linear maps. Instead of "Matrix n m", where n and m are either integers (requiring something like dependent types) or type encodings of integers, use "LMap u v", where u and v are vector spaces, e.g., R^n and R^m. This change would eliminate the need for dependent types or integer encodings.
I would also dislike integer encodings in the matrix type because it is very sensible to have other types of indices. I would just do it the way it is solved for Arrays: two index types for a matrix, but the actual matrix size is not coded in the type. E.g. I would like to multiply a matrix of type Matrix (Bool, Char) Double with a vector of type Vector Char Double obtaining a result of type Vector Bool Double.
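A small sketch of what Henning describes, built on Data.Array; the type names are hypothetical, and the code assumes the bounds of the operands agree (that part remains a dynamic check):

import Data.Array (Array, Ix, array, bounds, range, (!))

newtype Vector i a   = Vector (Array i a)
newtype Matrix i j a = Matrix (Array (i, j) a)

-- Multiplying a Matrix (Bool, Char) Double by a Vector Char Double
-- yields a Vector Bool Double; mismatched index types are a type error.
mulVec :: (Ix i, Ix j, Num a) => Matrix i j a -> Vector j a -> Vector i a
mulVec (Matrix m) (Vector v) =
    Vector (array (ilo, ihi)
              [ (i, sum [ m ! (i, j) * v ! j | j <- range (jlo, jhi) ])
              | i <- range (ilo, ihi) ])
  where ((ilo, jlo), (ihi, jhi)) = bounds m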
Also, it's clear that "LMap R R^n" and "LMap R^n R" are different types (for n/=1), so we can distinguish between "row" and "column" vectors without creating more types. Are there other reasons for orienting vectors?
My point was that vectors naturally do _not_ represent linear maps at all; they are the objects linear maps act on. If I process an audio signal or an image I can well consider it as a vector, but why should I consider it as a linear map?
I can't call these comments a concrete proposal, as it's not clear to me how to define LMap and what operations it supports.
The problem will be that there is no canonical representation for linear maps in general vector spaces. I consider a function of type (Double -> Double) as an approximation to a real function, but we don't have a general representation for linear operators on real functions. The remaining problem is that Vector and Linear Map are relative terms. That is, as you point out, a Linear Map can be a Vector with respect to a higher order Linear Map. I think in contrast to real mathematics we have to make a distinction due to computational restrictions. On the one hand we can work with a vector space type class which only specifies what basic operations (namely add and scale) shall be supported by types that want to be called vectors. This type class has the full flexibility but no canonical representation for linear maps. On the other hand we define finite dimensional vectors (array like) and matrices as representations for linear maps. For the case of higher order linear maps there should be a conversion between Matrix and Vector.

On 2005-07-07, Henning Thielemann wrote:
My point was that vectors naturally do _not_ represent linear maps at all, but they are the objects linear maps act on. If I process an audio signal or an image I can consider it well as vector but why should I consider it as linear map?
Yes, they do. Given a vector "a" in R^n, there are natural, invertible maps co(x): R^n -> R = (a dot x) and scale(b): R -> R^n = (b * a). These contain all the information in the vector, and do come up naturally in some cases. (Consider a similarity metric on images -- co is a natural one.) -- Aaron Denney -><-
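In code, Aaron's two maps for a fixed vector a might look like this (plain lists stand in for a real vector type; the names are only illustrative):

type Vec = [Double]

-- co a :: Vec -> Double is the linear map x |-> a `dot` x
co :: Vec -> (Vec -> Double)
co a x = sum (zipWith (*) a x)

-- scale a :: Double -> Vec is the linear map b |-> b * a
scale :: Vec -> (Double -> Vec)
scale a b = map (b *) a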

On Thu, 7 Jul 2005, David Roundy wrote:
On Tue, Jul 05, 2005 at 08:17:58PM +0200, Henning Thielemann wrote:
The example, again: If you write some common expression like
transpose x * a * x
then both the human reader and the compiler don't know whether x is a "true" matrix or if it simulates a column or a row vector. It may be that 'x' is a row vector and 'a' a 1x1 matrix, then the expression denotes a square matrix of the size of the vector simulated by 'x'. It may be that 'x' is a column vector and 'a' a square matrix. Certainly in most cases I want the latter one and I want to have a scalar as a result. But if everything is a matrix then I have to check at run-time if the result is a 1x1 matrix and then I have to extract the only element from this matrix. If I omit the 1x1 test mistakes can remain undiscovered. I blame the common notation x^T * A * x for this trouble since the alternative notation
can be immediately transcribed into computer language code while retaining strong distinction between a vector and a matrix type. The issue is that Haskell (as far as I understand, and noone has suggested anything to the contrary) doesn't have a sufficiently powerful type system to represent matrices or vectors in a statically typed way. It would be wonderful if we could represent matrix multiplication as
matrix_mul :: Matrix n m -> Matrix m o -> Matrix n o
Even if we had that I would vote for a distinction between Vector and Matrix! I wanted to show with my example that vectors and matrices are different enough that it makes sense to give them different types. In my opinion multiplying a matrix with a so-called column vector is a more fundamental bug than multiplying a matrix with a vector of incompatible size. That is, I like a static check of matrix-vector combinations and a dynamic check of their sizes. The latter also because in many applications you don't know the vector size at compile time.
You think of vectors as fundamental and matrices as a sort of derived quantity. I'd prefer to think the other way around, with vectors being special cases of matrices, since it's simpler to implement and work with.
It's less type-safe and thus cumbersome to work with. That's my experience with MatLab.
Also note that if you have several vectors x for which you want to compute the dot product with metric A, and if you want to do this efficiently, you'll have to convert your list of vectors into a matrix anyways.
If you bundle some vectors as columns in matrix B and want to compute norms with respect to matrix A writing B^T * A * B you will not only get the norms of the vectors in B but also many mixed scalar products. This example shows to me that matrices are not simply collections of vectors. Btw. we should not mess up the design with decisions on efficiency concerns. If you want efficiency then you can abuse matrices for that but I consider a 'map' over a list of vectors as the cleaner way (and more type safe, because you can't make transposition errors).
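The 'map' alternative Henning mentions could look roughly like this, with naive list-based vectors and matrices standing in for the real types:

type Vec = [Double]
type Mat = [[Double]]          -- list of rows

dot :: Vec -> Vec -> Double
dot x y = sum (zipWith (*) x y)

mulVec :: Mat -> Vec -> Vec
mulVec a x = map (`dot` x) a

-- The norm of each vector with respect to the metric a, without the
-- mixed scalar products that B^T * A * B would also compute.
normsWith :: Mat -> [Vec] -> [Double]
normsWith a = map (\x -> x `dot` (a `mulVec` x))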

On Thu, 7 Jul 2005, Henning Thielemann wrote:
Also note that if you have several vectors x for which you want to compute the dot product with metric A, and if you want to do this efficiently, you'll have to convert your list of vectors into a matrix anyways.
If you bundle some vectors as columns in matrix B and want to compute norms with respect to matrix A writing B^T * A * B you will not only get the norms of the vectors in B but also many mixed scalar products. This example shows to me that matrices are not simply collections of vectors.
Let me elaborate on that: In some cases, putting vectors as columns into a matrix and then applying a matrix operation to this matrix gives the same result as 'map'ping a matrix-vector operation over a list of vectors. But in other cases (like the one above) this is not what you want. I consider it a coincidence, not a general principle, when this kind of extension works. Let's consider another example: The basic definition of the Fourier transform is for vectors. MatLab wants to make the effect of vector operations consistent for row and column vectors, thus
fft([1,0])
ans = 1 1
fft([1;0])
ans = 1 1 What is the most natural extension to matrices? I would say it is the composition of a row-wise Fourier transform and a column-wise transform. This interpretation would have the special cases shown above. It would yield fft([1,0;0,0]) = [1,1;1,1] But MatLab says
fft([1,0;0,0])
ans = 1 0 1 0 That is, by default it considers the Fourier transform as a column-wise transform - but in the special case of a 1-row matrix it behaves completely differently.
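The row-wise/column-wise extension Henning calls natural can be written down directly; here with a naive O(n^2) DFT over lists of complex numbers, only to illustrate the composition (this is not the library's FFT):

import Data.Complex (Complex, cis)
import Data.List (transpose)

-- naive DFT, enough to show the point
dft :: [Complex Double] -> [Complex Double]
dft xs = [ sum [ x * cis (-2 * pi * fromIntegral (j * k) / n)
               | (j, x) <- zip [0 ..] xs ]
         | k <- [0 .. length xs - 1] ]
  where n = fromIntegral (length xs)

-- row-wise transform followed by a column-wise transform;
-- dft2 [[1,0],[0,0]] gives [[1,1],[1,1]], as argued above.
dft2 :: [[Complex Double]] -> [[Complex Double]]
dft2 = transpose . map dft . transpose . map dft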

Henning Thielemann wrote:
Let me elaborate on that: In some cases putting vectors as columns into a matrix then applying a matrix operation on this matrix leads to the same like to 'map' a matrix-vector operation to a list of vectors. But in other cases (as the one above) this is not what you want. I consider it as an incidence not as a general principle if this kind of extension works.
Let's consider another example: The basic definition of the Fourier transform is for vectors. MatLab wants to make the effect of vector operations consistent for row and column vectors, thus
Okay, this approach is starting to make sense to me... I can see now that Vectors are a different type of object to Matrices. Vectors represent points in N-Space and matrices represent operations on those points (say rotations or translations)... But it seems we can represent translations as adding vectors or matrix operations (although we need to introduce the 'extra' dimension W... and have an extra field in vectors that contains the value '1'). (3D translation) [x,y,z,1] * [[0,0,0,0],[0,0,0,0],[0,0,0,0],[dx,dy,dz,dw]] = [x+dx,y+dy,z+dz,1+dw] but how is this different from adding vectors? If we allow vector addition then we no longer have the nice separation between values and linear operators, as a value can also be a linear operator (a translation)? Keean.

On Fri, 8 Jul 2005, Keean Schupke wrote:
Okay, this approach is starting to make sense to me... I can see now that Vectors are a different type of object to Matrices. Vectors represent points in N-Space and matrices represent operations on those points
That's what I wanted to express.
(say rotations or translations)... But it seems we can represent translations as adding vectors or matrix operations (although we need to introduce the 'extra' dimension W... and have an extra field in vectors that contains the value '1').
(3D translation)
[x,y,z,1] * [[0,0,0,0],[0,0,0,0],[0,0,0,0],[dx,dy,dz,dw]] = [x+dx,y+dy,z+dz,1+dw]
Do you mean [x,y,z,1] * [[1,0,0,0],[0,1,0,0],[0,0,1,0],[dx,dy,dz,dw+1]] ?
but how is this different from adding vectors? If we allow vector addition then we no longer have the nice separation between values and linear operators, as a value can also be a linear operator (a translation)?
???

Henning Thielemann wrote:
Do you mean [x,y,z,1] * [[1,0,0,0],[0,1,0,0],[0,0,1,0],[dx,dy,dz,dw+1]] ?
Erm, yes, that's what I meant ... but you obviously got the point.
but how is this different from adding vectors? If we allow vector addition then we no longer have the nice separation between values and linear operators, as a value can also be a linear operator (a translation)?
???
Well if a vector can be a linear-operator, then surely it _is_ a matrix! Keean.

On Fri, 8 Jul 2005, Keean Schupke wrote:
Henning Thielemann wrote:
Do you mean [x,y,z,1] * [[1,0,0,0],[0,1,0,0],[0,0,1,0],[dx,dy,dz,dw+1]] ?
Erm, yes thats what I meant ... but you obviously got the point.
but how is this different from adding vectors? If we allow vector addition then we no longer have the nice separation between values and linear operators, as a value can also be a linear operator (a translation)?
???
Well if a vector can be a linear-operator, then surely it _is_ a matrix!
In general a vector need not be a linear operator. You talked about vector translation; translation is not a linear operator. You gave some process to map the problem somewhere where it becomes a linear operator. Other people said that the scalar product with a fixed vector is a linear operator. That's true. Now what is the natural interpretation of a vector as a linear operator? The scalar product or the translation? Vectors can be used and abused for many things, but an object which can be called a vector (because of its ability to be added and scaled) is not a linear operator itself and does not naturally represent one.

Henning Thielemann wrote:
In general a vector need not to be a linear operator. You talked about vector translation, translation is not a linear operator. You gave some process to map the problem to somewhere, where it becomes a linear operator. Other people said that the scalar product with a fixed vector is a linear operator. That's true. Now what is a natural interpretation of a vector as linear operator? The scalar product or the translation? Vectors can be used and abused for many things but an object which can be called a vector (because of its ability of to be added and to be scaled) is not a linear operator itself and does not naturally represent one.
So the linear operator is translation (ie: + v)... effectively 'plus' could be viewed as a function which takes a vector and returns a matrix (operator):

(+) :: Vector -> Matrix

which could also be called 'translate'. So 'translate' takes a vector and returns a linear operator which can be applied to a vector:

mapply (translate vector1) vector2

So I guess I could now ask, why allow vector addition at all? Keean.

On Fri, 8 Jul 2005, Keean Schupke wrote:
So the linear operator is translation (ie: + v)... effectively 'plus' could be viewed as a function which takes a vector and returns a matrix (operator)
(+) :: Vector -> Matrix
Since a matrix _is_ not a linear map but only its representation, this would not make sense. As I said, (v+) is not a linear map, thus there is no matrix which represents it. A linear map f must fulfill f 0 == 0. But since v+0 == v, the function (v+) is only a linear map if 'v' is zero. I can't see how your extension of vectors by the extra 1-component fits in here.

Henning Thielemann wrote:
On Fri, 8 Jul 2005, Keean Schupke wrote:
So the linear operator is translation (ie: + v)... effectively 'plus' could be viewed as a function which takes a vector and returns a matrix (operator)
(+) :: Vector -> Matrix
Since a matrix _is_ not a linear map but only its representation, this would not make sense. As I said (v+) is not a linear map thus there is no matrix which represents it. A linear map f must fulfill f 0 == 0
But since v+0 == v the function (v+) is only a linear map if 'v' is zero.
I can't see how to fit in your vector extension by the 1-component.
Eh? Translation is a linear operation, no? Adding vectors translates the first by the second (or the second by the first - the two are isomorphic)... A translation can be represented by the matrix:

1  0  0  0
0  1  0  0
0  0  1  0
dx dy dz 1

So the result of "v+" is this matrix. In other words this matrix is the 'vector addition operator'... providing you pad the vectors with an additional '1' at the end. So if:

translate :: Vector -> Matrix
translate [x,y,z,1] = [[1,0,0,0],[0,1,0,0],[0,0,1,0],[x,y,z,1]]

we can create a matrix representing translation from: translate [3,4,5,1] and can apply this translation to another vector:

mapply (translate [3,4,5,1]) [2,3,4,1] = [5,7,9,1]

All I was saying is that, following this, partial application of vector addition: [3,4,5] + [2,3,4] = [5,7,9], but partially applying ([3,4,5] +) would be the matrix defined above, i.e. (translate [3,4,5,1]) ... Of course this has the drawback that you need an extra dimension in your vectors and matrices to cope with translation. Anyway I have more or less convinced myself that separating vectors and matrices is the right thing to do... I was just trying to define vector addition in terms of a matrix operation for neatness. Keean.

On Fri, 8 Jul 2005, Keean Schupke wrote:
Henning Thielemann wrote:
On Fri, 8 Jul 2005, Keean Schupke wrote:
So the linear operator is translation (ie: + v)... effectively 'plus' could be viewed as a function which takes a vector and returns a matrix (operator)
(+) :: Vector -> Matrix
Since a matrix _is_ not a linear map but only its representation, this would not make sense. As I said (v+) is not a linear map thus there is no matrix which represents it. A linear map f must fulfill f 0 == 0
But since v+0 == v the function (v+) is only a linear map if 'v' is zero.
I can't see how to fit in your vector extension by the 1-component.
Eh?
Today:

+----------------+
| The Linear Map |
+----------------+

If F is a field, V a set of vectors, and (F,V,+,*) a vector space, then a function f of V -> V is linear if:

  for all x and y from V:         f (x+y) == f x + f y
  for all k from F and x from V:  f (k*x) == k * f x

Lemma: If f is a linear map, then f 0 == 0.
Proof: For any v from V it holds that v+(-1)*v == 0, so
  f 0 = f (v+(-1)*v) = f v + f ((-1)*v) = f v + (-1) * f v = 0.

Theorem: If (v+) is a linear map then v == 0.
Proof: (v+0) == 0 (conclusion from the lemma) => v == 0.

8-]
(or the second by the first - the two are isomorphic)... A translation can be represented by the matrix:
1  0  0  0
0  1  0  0
0  0  1  0
dx dy dz 1
So the result of "v+" is this matrix.
But the vectors I can add to v have one component less than necessary for multiplication with this matrix. Indeed you can map all v's with three components to an affine sub-space of the 4-dimensional space, namely to those vectors with a 1 in the last component. But then you are no longer working with three-dimensional vectors but with four-dimensional ones. Again: isomorphism is not identity.

Keean Schupke wrote:
So the linear operator is translation (ie: + v)... effectively 'plus' could be viewed as a function which takes a vector and returns a matrix (operator)
(+) :: Vector -> Matrix
Since a matrix _is_ not a linear map but only its representation, this would not make sense. As I said (v+) is not a linear map thus there is no matrix which represents it. A linear map f must fulfill f 0 == 0
But since v+0 == v the function (v+) is only a linear map if 'v' is zero.
I can't see how to fit in your vector extension by the 1-component.
Eh?
Translation is a linear operation no?
No. It's affine, but not linear. As Henning said, to be linear, it must map zero to zero.
Adding vectors translates the first by the second (or the second by the first - the two are isomorphic)... A translation can be represented by the matrix:
1  0  0  0
0  1  0  0
0  0  1  0
dx dy dz 1
So the result of "v+" is this matrix.
No. If the above matrix is M, then:
[x y z w].M = [x+w.dx y+w.dy z+w.dz w]
which isn't a translation.
In the specific case of homogeneous coordinates, where:
h [x y z] = [x y z 1]
h' [x y z w] = [x/w y/w z/w]
then \v -> h'(h(v).M) is a translation, but M isn't itself a
translation.
--
Glynn Clements
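Glynn's h and h' can be spelled out with plain lists to see exactly where the translation appears; the helper names are made up for the example:

import Data.List (transpose)

type Vec = [Double]
type Mat = [[Double]]          -- list of rows

vecMat :: Vec -> Mat -> Vec    -- row vector times matrix
vecMat v m = [ sum (zipWith (*) v col) | col <- transpose m ]

h :: Vec -> Vec                -- [x,y,z]   -> [x,y,z,1]
h v = v ++ [1]

h' :: Vec -> Vec               -- [x,y,z,w] -> [x/w, y/w, z/w]
h' v = map (/ last v) (init v)

-- \v -> h' (h v `vecMat` m) is the translation by [dx,dy,dz];
-- the matrix m on its own is not.
translateBy :: Vec -> Vec -> Vec
translateBy [dx, dy, dz] v = h' (h v `vecMat` m)
  where m = [ [1, 0, 0, 0]
            , [0, 1, 0, 0]
            , [0, 0, 1, 0]
            , [dx, dy, dz, 1] ]
translateBy _ _ = error "translateBy: expects a 3-component offset"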

On Friday 08 July 2005 17:46, Henning Thielemann wrote:
Vectors can be used and abused for many things but an object which can be called a vector (because of its ability of to be added and to be scaled) is not a linear operator itself and does not naturally represent one.
At least for finite dimensional spaces (and these are the only ones under consideration here, right?) the scalar product is a very nice and natural way to view a vector as a linear operation (into the scalar field). I know, in linear algebra they told us all that this correspondence is not a 'natural' or 'canonic' one because it depends on a chosen basis. Well, well. For practical purposes of programming, we always use the 'canonic base', right? So if the base is canonic, then so is the correspondence between a vector space and its dual. On a different note, one could argue that a 1xn matrix M is indeed a vector of dimension n, and then M' = M^T is an nx1 matrix, which is also a vector of dimension n, but this is the same vector as the non-transposed version. Now, the two things (M and M') are the same if viewed as a vector, but not the same if viewed as a matrix. Can we express this in Haskell? Ben

On Thu, Jul 07, 2005 at 03:08:50PM +0200, Henning Thielemann wrote:
On Thu, 7 Jul 2005, David Roundy wrote:
On Tue, Jul 05, 2005 at 08:17:58PM +0200, Henning Thielemann wrote:
The example, again: If you write some common expression like
transpose x * a * x
then both the human reader and the compiler don't know whether x is a "true" matrix or if it simulates a column or a row vector. It may be that 'x' is a row vector and 'a' a 1x1 matrix, then the expression denotes a square matrix of the size of the vector simulated by 'x'. It may be that 'x' is a column vector and 'a' a square matrix. Certainly in most cases I want the latter one and I want to have a scalar as a result. But if everything is a matrix then I have to check at run-time if the result is a 1x1 matrix and then I have to extract the only element from this matrix. If I omit the 1x1 test mistakes can remain undiscovered. I blame the common notation x^T * A * x for this trouble since the alternative notation
can be immediately transcribed into computer language code while retaining strong distinction between a vector and a matrix type. The issue is that Haskell (as far as I understand, and noone has suggested anything to the contrary) doesn't have a sufficiently powerful type system to represent matrices or vectors in a statically typed way. It would be wonderful if we could represent matrix multiplication as
matrix_mul :: Matrix n m -> Matrix m o -> Matrix n o
Even if we had that I would vote for distinction of Vector and Matrix! I wanted to show with my example that vectors and matrices are different enough that it makes sense to give them different types. In my opinion multiplying a matrix with a so-called column vector is a more fundamental bug than multiplying a matrix with a vector of non-compatible size. That is I like static check of matrix-vector combinations and a dynamic check of their size. The latter also because in many application you don't know the vector size at compile time.
You wouldn't need to know the vector size at compile time in order to get static type checking of vector sizes--if the type system is sufficiently powerful.
Also note that if you have several vectors x for which you want to compute the dot product with metric A, and if you want to do this efficiently, you'll have to convert your list of vectors into a matrix anyways.
If you bundle some vectors as columns in matrix B and want to compute norms with respect to matrix A writing B^T * A * B you will not only get the norms of the vectors in B but also many mixed scalar products. This example shows to me that matrices are not simply collections of vectors.
Good point, and that does illustrate the pain of doing this (treating sets of vectors as matrices).
Btw. we should not mess up the design with decisions on efficiency concerns. If you want efficiency then you can abuse matrices for that but I consider a 'map' over a list of vectors as the cleaner way (and more type safe, because you can't make transposition errors).
Mapping is cleaner, but something like 10 times slower... not for my example above, but for a simple y = A*x computation--you then could do the vector-vector inner product better than transpose y * x. Efficiency concerns shouldn't be separated from design decisions. An API shouldn't force an implementation to be inefficient. On the other hand, this is sort of a silly debate, since the API *I* want is a subset of the API *you* want. (Except that I'd like matrices to be an instance of Floating, but that's easy enough to do once we've got all the matrix operations...) -- David Roundy

On Thu, 7 Jul 2005, David Roundy wrote:
On the other hand, this is sort of a silly debate, since the API *I* want is a subset of the API *you* want.
The API you want is certainly not just a subset of what I want. E.g. the singular value decomposition in your favorite API will probably return a 1xN matrix or a MxN diagonal matrix containing the singular values, while with my favorite API it will return a vector or a list. Right?

On Fri, Jul 08, 2005 at 03:33:16PM +0200, Henning Thielemann wrote:
On Thu, 7 Jul 2005, David Roundy wrote:
On the other hand, this is sort of a silly debate, since the API *I* want is a subset of the API *you* want.
The API you want is certainly not just a subset of what I want. E.g. the singular value decomposition in your favorite API will probably return a 1xN matrix or a MxN diagonal matrix containing the singular values, while with my favorite API it will return a vector or a list. Right?
I don't particularly care what API you use for svd, since it's trivial to convert from one API to the other. It's matrix arithmetic I care about, since that's the complicated part of the API. On the other hand, the most natural return value for svd would be a diagonal matrix, since that is what the objects are, right? svd returns three matrices, which when multiplied together give the original matrix... or at least that's how I think of it. But I'll grant that a diagonal matrix isn't the most convenient representation, and certainly is far from the most efficient, unless we introduce a diagonal matrix constructor (which would certainly be nice). I guess you'd prefer that svd returns a list of doubles and two lists of vectors? Or a list of triplets of a double and two vectors? (Answering another email...) I certainly agree that fft is not a function of a matrix. -- David Roundy

On Fri, 8 Jul 2005, David Roundy wrote:
I don't particularly care what API you use for svd, since it's trivial to convert from one API to the other. It's matrix arithmetic I care about, since that's the complicated part of the API.
Of course I want to use the results of more complicated routines with basic matrix arithmetic and I like to reduce the number of conversions. The other reason for the debate is that if you don't like the extra vector type then you will not use it. If I want to apply a routine of yours to, say, my audio data, I will have to decide whether I shall store it as a column or as a row vector/matrix, although this decision seems to me rather irrelevant and annoying.
On the other hand, the most natural return value for svd would be a diagonal matrix, since that is what the objects are, right?
Hm, since SVD means Singular Value Decomposition I like to have the singular values as they are. I don't want to search for them in a sparse matrix.
svd returns three matrices, which when multiplied together give the original matrix ...
This would be a nice property, though. I could do it by converting the singular value list to a diagonal matrix. So I need a conversion, hm.
or at least that's how I think of it. But I'll grant that a diagonal matrix isn't the most convenient representation, and certain is far from the most efficient, unless we introduce a diagonal matrix constructor (which would certainly be nice).
I would not like to obtain a value of the general Matrix type, since it is statically guaranteed that the result is always diagonal.
I guess you'd prefer that svd returns a list of doubles and two lists of vectors? Or a list of triplets of a double and two vectors?
Either there is a function to scale the columns of a matrix separately, then two matrices and a list/vector of doubles are ok. Or there is a function to multiply two vectors to obtain a matrix of rank 1 (outer product or tensor product might be the right name. Btw. the (<>) operator has the problem that it is always interpreted as the scalar product. But it could also mean this kind of multiplication.) Then adding such products of left and right singular vectors, scaled by the corresponding singular value, lets me reconstruct the matrix. The triplet approach has the advantage that it is statically ensured that the number of singular values and singular vectors match. The matrix approach has the advantage that it is statically guaranteed that the dimensions of all singular vectors match. With the (orthogonal) singular vector matrices we map vectors from one basis to the representation in another basis. So these matrices naturally represent linear maps and it makes sense to use them.
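The reconstruction Henning describes, from a list of (singular value, left vector, right vector) triplets, might look like this with list-based stand-ins for the real types:

type Vec = [Double]
type Mat = [[Double]]

outer :: Vec -> Vec -> Mat     -- rank-1 matrix u * v^T
outer u v = [ [ ui * vj | vj <- v ] | ui <- u ]

addMat :: Mat -> Mat -> Mat
addMat = zipWith (zipWith (+))

scaleMat :: Double -> Mat -> Mat
scaleMat s = map (map (s *))

-- Summing the scaled rank-1 products recovers the original matrix
-- (assumes a non-empty list of triplets).
reconstruct :: [(Double, Vec, Vec)] -> Mat
reconstruct ts = foldr1 addMat [ scaleMat s (outer u v) | (s, u, v) <- ts ]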

On Thu, 7 Jul 2005, David Roundy wrote:
Also note that if you have several vectors x for which you want to compute the dot product with metric A, and if you want to do this efficiently, you'll have to convert your list of vectors into a matrix anyways. Writing functions of vectors instead as functions of 1xN matrices allows them to be efficiently applied to multiple vectors simultaneously, provided they are written carefully.
Btw. it is interesting that "source code efficiency" differs from runtime efficiency for matrix operations: a*b*c*v where a, b, c are square matrices and v is a column vector/matrix. This expression is interpreted as ((a*b)*c)*v and it needs cubic time to multiply the matrices. In contrast to that, a*(b*(c*v)) needs only quadratic time. If we used matrices as what they were invented for, namely as representations of linear operators, this efficiency leak would not occur: mulVec a $ mulVec b $ mulVec c v (Read mulVec as the mapping from a matrix to the linear operator it represents.)
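The point about association order in code, with hypothetical list-based types; each mulVec application is O(n^2), so the whole chain stays quadratic, whereas forming the matrix products first costs O(n^3):

type Vec = [Double]
type Mat = [[Double]]          -- list of rows

mulVec :: Mat -> Vec -> Vec
mulVec m v = [ sum (zipWith (*) row v) | row <- m ]

-- a*(b*(c*v)) in the matrix-as-operator reading:
applyChain :: Mat -> Mat -> Mat -> Vec -> Vec
applyChain a b c v = a `mulVec` (b `mulVec` (c `mulVec` v))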

Hello! Thank you very much for all your suggestions. A new version of the library can be found at: http://dis.um.es/~alberto/hmatrix/matrix.html Here are the main changes:

- Vector and Matrix are now different types with different functions operating on them. They cannot be interchanged and we must use explicit conversion functions. Things like (v::Vector) + (m::Matrix), or svd (v::Vector) are statically rejected. You must use functions like matrix_x_matrix, matrix_x_vector, dot :: Vector->Vector->Double, etc. On the other hand, there are functions to build a matrix from a list of vectors, to concatenate the columns of a matrix as a vector, etc. If I am not mistaken, now the ambiguities explained by Henning do not occur.

- Only Eq, Show, and Read instances are now defined for Vector and Matrix. No numeric instances are defined, although it is very easy to do it using the functions provided. I have included a tentative "Interface" module including some hopefully user friendly operators. But it can be replaced or deleted without any problem... :)

If this approach makes sense, we can proceed to include more standard linear algebra functions based on GSL, LAPACK, or on any other available resource. We can use a darcs repository to allow contributions from all interested people. And a few comments: David Roundy wrote:
It would be wonderful if we could represent matrix multiplication as
matrix_mul :: Matrix n m -> Matrix m o -> Matrix n o
(where n, m and o are dimensions) but we can't do that. So we're stuck with dynamic size checking for all matrix and vector operations that involve two or more operands.
Lennart Augustsson wrote:
Actually, Haskell does allow you to do that. But the syntax of the types gets pretty horrendous.
We must also "add" types. To build a matrix from blocks we need something like: matrix_block_row :: Matrix n m -> Matrix n o -> Matrix n (o+m) I understand from some papers and messages to the list that it can be done, but then we may also want lists of matrices with different dimensions: matrix_block :: [[Matrix ? ?]] -> Matrix The compile-time consistency check is not trivial. On the other hand, many useful functions have a type which depends on a value: vector :: [Double] -> Vector n We should statically reject this: vector [1,2,3] `vector_add` vector [1,2] (or perhaps we can imitate the zip behavior? :) The sizes of some vectors and matrices depend on run time computations, even without taking into account IO: nullspace :: Matrix m n -> Matrix k n k depends on the particular matrix. We can improve this to: nullspace :: Matrix m n -> [Vector n] but if we need the nullspace in another matrix operation we must build a matrix from a list whose size is not known at compile time... Static size checking is an advanced topic! Alberto

On Thu, 7 Jul 2005, Alberto Ruiz wrote:
Hello! Thank you very much for all your suggestions. A new version of the library can be found at:
I get a time-out :-(
- Vector and Matrix are now different types with different functions operating on them. They cannot be interchanged and we must use explicit conversion functions. Things like (v::Vector) + (m::Matrix), or svd (v::Vector) are statically rejected.
Great!
You must use functions like matrix_x_matrix, matrix_x_vector, dot :: Vector->Vector->Double, etc. On the other hand, there are functions to build a matrix from a list of vectors, to concatenate the columns of a matrix as a vector, etc. If I am not mistaken, now the ambiguities explained by Henning do not occur.
Very great!
- Only Eq, Show, and Read instances are now defined for Vector and Matrix. No numeric instances are defined, although it is very easy to do it using the functions provided. I have included a tentative "Interface" module including some hopefully user friendly operators.
This should be an excellent basis for experiments. Maybe I can contribute an interface for the numericprelude type classes.
We can use a darcs repository to allow contributions from all interested people.
Yes please!
On the other hand, many useful functions have a type which depends on a value:
vector :: [Double] -> Vector n
We should statically reject this:
vector [1,2,3] `vector_add` vector [1,2]
(or perhaps we can imitate the zip behavior? :)
Better not :-) A static error is a must but obviously impossible.
The sizes of some vectors and matrices depend on run time computations, even without taking into account IO:
nullspace :: Matrix m n -> Matrix k n
k depends on the particular matrix. We can improve this to:
nullspace :: Matrix m n -> [Vector n]
but if we need the nullspace in another matrix operation we must build a matrix from a list whose size is not known at compile time...
Yes, these are even more arguments against expressing matrix sizes as integers in the types.
Static size checking is an advanced topic!
Static size checking seems to me possible only if the size is static. Though clever people may have methods to statically check dynamic sizes.

I'm sorry, our web server is temporarily down :-( On Thursday 07 July 2005 20:37, Alberto Ruiz wrote:
Hello! Thank you very much for all your suggestions. A new version of the library can be found at:

The server is working again. On Thursday 07 July 2005 20:58, Alberto Ruiz wrote:
I' sorry, our web server is temporarily down :-(

On Fri, 8 Jul 2005, Alberto Ruiz wrote:
The server is working again.
On Thursday 07 July 2005 20:58, Alberto Ruiz wrote:
I' sorry, our web server is temporarily down :-(
I would remove the 'matrix' portions of the function names, since this information is already contained in the module name. E.g. after importing LinearAlgebra.Matrix as Matrix, Matrix.matrix looks strange, but Matrix.fromList says everything. I suggest:

matrix          -> fromList
matrixElements  -> toList
displayMatrix   -> display
readMatrix      -> fromFile :: FilePath -> IO Matrix
writeMatrix     -> toFile :: FilePath -> Matrix -> IO ()
blockMatrix     -> fromBlocks or fromMatrices
subMatrix       -> clip or cut or take
matrixScale     -> scale
matrixOffset    -> offset
matrixAdd       -> add
matrixSub       -> sub
matrixMul       -> mul (element-wise multiplication? then better zipMul or elementwiseMul, elMul, what know I :-)
matrixDiv       -> div
matrixSigns     -> signs
matrixAbs       -> abs
matrix_x_matrix ... hm, difficult
matrix_x_vector
vector_x_matrix

msin   :: Matrix -> Matrix
mcos   :: Matrix -> Matrix
mtan   :: Matrix -> Matrix
masin  :: Matrix -> Matrix
macos  :: Matrix -> Matrix
matan  :: Matrix -> Matrix
matan2 :: Matrix -> Matrix -> Matrix
msinh  :: Matrix -> Matrix
mcosh  :: Matrix -> Matrix
mtanh  :: Matrix -> Matrix
masinh :: Matrix -> Matrix
macosh :: Matrix -> Matrix
matanh :: Matrix -> Matrix
mexp   :: Matrix -> Matrix
mlog   :: Matrix -> Matrix

If there are no efficiency concerns, I would drop the element-wise operations and prefer a matrix-map and a matrix-zipWith. If these operations shall remain, I would somehow point to their element-wise nature in the name.
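The matrix-map / matrix-zipWith style suggested here could look as follows over a naive list-of-rows matrix; the real library keeps the data on the C side, so these names and types are only illustrative:

newtype Matrix a = Matrix [[a]]

mmap :: (a -> b) -> Matrix a -> Matrix b
mmap f (Matrix rows) = Matrix (map (map f) rows)

mzipWith :: (a -> b -> c) -> Matrix a -> Matrix b -> Matrix c
mzipWith f (Matrix xs) (Matrix ys) = Matrix (zipWith (zipWith f) xs ys)

-- The element-wise operations then fall out as special cases:
elementwiseMul :: Num a => Matrix a -> Matrix a -> Matrix a
elementwiseMul = mzipWith (*)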

On Fri, 8 Jul 2005, Alberto Ruiz wrote:
The server is working again.
On Thursday 07 July 2005 20:58, Alberto Ruiz wrote:
I' sorry, our web server is temporarily down :-(
I would also prefer a vector of complex numbers for the FFT function instead of an nx2 matrix.

Hello! I have a few doubts concerning the LinearAlgebra library... On Friday 08 July 2005 11:29, Henning Thielemann wrote:
I would remove the 'matrix' portions of the function names, since this information is already contained in the module name. E.g. after importing LinearAlgebra.Matrix as Matrix, Matrix.matrix look strange, but Matrix.fromList says everything.
I have changed the function names as suggested. This new style is clearly better, allowing Vector.add, Matrix.add, Vector.Complex.add, Matrix.Complex.add, etc.
If the Matrix type would be parametrised (...)
Now we can have Vector.T a and Matrix.T a for any storable a (although at this point most functions are only defined for Double). For example:

eig :: Matrix.T Double -> ([Double], Matrix.T Double)
I would also prefer a vector of complex numbers for the FFT function instead of a nx2 matrix.
This comes from the old version in which vectors and matrices were the same type :-) Now we have:

fft1D  :: Vector.T (Complex Double) -> Vector.T (Complex Double)
ifft1D :: Vector.T (Complex Double) -> Vector.T (Complex Double)
fft2D  :: Matrix (Complex Double) -> Matrix (Complex Double)

etc. But now I have two problems:

1) If I define

foo :: Vector.T a -> Matrix.T a

Haddock (version 0.6) shows just this:

foo :: T a -> T a

I don't know how to tell Haddock that I want the full names in the signatures.

2) When setting up the "main" module which reexports all the modules exposed to the user I cannot write this:

---------------------------------
module LinearAlgebra (
    module Vector,
    module Matrix,
    module Matrix.Complex,
    module LinearAlgebra.Algorithms
    --etc.
) where

import LinearAlgebra.Vector as Vector
import LinearAlgebra.Matrix as Matrix
import LinearAlgebra.Matrix.Complex as Matrix.Complex
import LinearAlgebra.Algorithms
--etc.
-------------------------------------

since we get lots of conflicting exports for T, add, scale, etc. If we use qualified imports:

-----------------------------------------------------------
module LinearAlgebra (
    module Vector,
    module Matrix,
    module Matrix.Complex,
    module LinearAlgebra.Algorithms
    --etc.
) where

import qualified LinearAlgebra.Vector as Vector
import qualified LinearAlgebra.Matrix as Matrix
import qualified LinearAlgebra.Matrix.Complex as Matrix.Complex
import LinearAlgebra.Algorithms
-----------------------------------------------------------

then all the names in Vector, Matrix and Matrix.Complex must always be qualified, which is not very practical. I am afraid that I am doing something wrong... Another possibility is using type classes, since there are a lot of types admitting add, sub, (ew)mul, scale, etc. (e.g., Vector Int, Vector Double, Matrix (Complex Float), etc.). We could also define type classes for higher-level linear algebra algorithms working with different base types in the matrices. Any suggestion? And some final comments.
If there are no efficiency concerns, I would drop element-wise operations and prefer a matrix-map and a matrix-zipWith. If these operations shall remain I would somehow point to their element-wise operation in the name.
There is about 5x speed gain if we map in the C side. The "optimized" floating map functions could be moved to a separate module.
If the Matrix type would be parametrised then Matrix.fromBlocks could use a more natural indexing.
Matrix.fromBlocks :: [[Matrix a b]] -> Matrix (Int,a) (Int,b)
The Int of the index pair would store the block number and the type a index would store the index within the block. Hm, but it would not be possible to store the sizes of the sub-matrices. It would be possible only if the sub-matrices have the same size. Maybe it is better to allow matrices of matrices and to be able to apply a matrix-matrix multiplication which in turn employs a matrix-matrix multiplication of its elements. But the matrix decompositions will fail on this structure - but possibly those which fail aren't appropriate for block matrices.
We can have a Matrix (Matrix Double) if we define the instance Storable (Matrix a). Although a "block matrix" is perhaps better represented at a higher level (storing only one copy of repeated blocks, etc.).
It would be nice to have functions which don't simply write the formatted vectors and matrices to stdout but which return them as strings. E.g.
Matrix.format :: Int -> Matrix.T -> String
Vector.format :: Int -> Vector.T -> String
(or Matrix.toString)
The results can be reused in other contexts and the 'display' routines can use their results.
Done. --Alberto
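A rough sketch of the type-class idea Alberto asks about above; all names are hypothetical and list-based types stand in for the GSL-backed ones:

class Linear c where
    add   :: Num a => c a -> c a -> c a
    scale :: Num a => a -> c a -> c a

newtype Vector a = Vector [a]
newtype Matrix a = Matrix [[a]]

instance Linear Vector where
    add (Vector x) (Vector y) = Vector (zipWith (+) x y)
    scale k (Vector x)        = Vector (map (k *) x)

instance Linear Matrix where
    add (Matrix x) (Matrix y) = Matrix (zipWith (zipWith (+)) x y)
    scale k (Matrix x)        = Matrix (map (map (k *)) x)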

On Wed, 13 Jul 2005, Alberto Ruiz wrote:
But now I have two problems:
1) If I define
foo :: Vector.T a -> Matrix.T a
Haddock (version 0.6) shows just this:
foo :: T a -> T a
I don't know how to tell haddock that I want the full names in the signatures.
I don't know either. I'm afraid this is not possible with Haddock yet.
2) When setting up the "main" module which reexports all the modules exposed to the user I cannot write this:
I guess a main module would be for re-exporting unqualified names, which is certainly no longer necessary if all names are used qualified. Every program using the library should just import the modules from the library it needs, e.g.

import qualified LinearAlgebra.Vector as Vector
import qualified LinearAlgebra.Matrix as Matrix

On Wed, Jul 13, 2005 at 06:13:48PM +0200, Alberto Ruiz wrote:
I have changed the function names as suggested. This new style is clearly better, allowing Vector.add, Matrix.add, Vector.Complex.add, Matrix.Complex.add, etc. ... Now we can have Vector.T a and Matrix.T a for any storable a (although at this point most functions are only defined for Double). For example:
I would like to bristle mildly against the style of using Vector.T to represent the vector type. The reasons are 1) it is cryptic to those not used to the convention; 2) enshrining one-type-per-module in the naming convention is not IMO justified, and may prove limiting; 3) it doesn't work out well when you import a module that reexports Vector. I would say leave it Vector.Vector. Then the user may import the module unqualified, or qualified with an abbreviation like V, define his own synonym ("type Vector = Vector.Vector"), or just put up with the light annoyance of writing "Vector.Vector".
1) If I define
foo :: Vector.T a -> Matrix.T a
Haddock (version 0.6) shows just this:
foo :: T a -> T a
I think this is evidence that trying to treat the "primary type" as a special case in naming creates more annoyances than it eliminates. Andrew

On Thu, 14 Jul 2005, Andrew Pimlott wrote:
I would like to bristle mildly against the style of using Vector.T to represent the vector type. The reasons are 1) it is cryptic to those not used to the convention;
What does this tell about the quality of the concept?
2) enshrining one-type-per-module in the naming convention is not IMO justified, and may prove limiting;
Other languages like Modula-3 and Oberon do it with great success. The limit in Haskell is that most compilers don't conform to the Haskell 98 report which allows mutually recursive modules. But I think the compilers should allow them instead of forcing users to put many type and class definitions into one module.
3) it doesn't work out well when you import a module that reexports Vector.
Reexporting is only necessary for the unqualified naming style. Vector.T is just the consequence of removing type-related prefixes from all identifiers. If you stick to qualified naming you only need to import the module once. Otherwise you must import a module twice:

import LinearAlgebra.Vector (Vector)
import qualified LinearAlgebra.Vector as V

With this style importing as Vector is not even possible and you have to manage two abbreviations of the module. Setting up one module per data type (there are certainly auxiliary types which don't need it) allows a very clean and consistent naming style:

DataType.T      for the main type
DataType.from*  for conversions to that type
DataType.to*    for conversions into other types
DataType.*      for other functions related to that type

It also answers the question whether the module name should be in singular or in plural form.
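For illustration, the convention described above applied to a tiny Vector module; the contents are made up, only the naming pattern matters:

module LinearAlgebra.Vector
    ( T, fromList, toList, add, scale
    ) where

newtype T = Cons [Double]

fromList :: [Double] -> T
fromList = Cons

toList :: T -> [Double]
toList (Cons xs) = xs

add :: T -> T -> T
add (Cons x) (Cons y) = Cons (zipWith (+) x y)

scale :: Double -> T -> T
scale k (Cons x) = Cons (map (k *) x)

-- usage: import qualified LinearAlgebra.Vector as Vector
--        then write Vector.fromList, Vector.add, Vector.T, ...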
I would say leave it Vector.Vector. Then the user may import the module unqualified, or qualified with an abbreviation like V, define his own synonym ("type Vector = Vector.Vector"),
What about

import qualified LinearAlgebra.Vector as V
type Vector = V.T

?
1) If I define
foo :: Vector.T a -> Matrix.T a
Haddock (version 0.6) shows just this:
foo :: T a -> T a
I think this is evidence that trying to treat the "primary type" as a special case in naming creates more annoyances than it eliminates.
Haddock could adopt the type synonyms that are used in the signatures, or if it shall make the type names more uniform it could look at how the modules are imported. That is, if a module is imported qualified the type should be shown qualified with the proposed abbreviation. Following the hyperlink of the type the user can check what the abbreviated module name means.

On Fri, Jul 15, 2005 at 10:48:04AM +0200, Henning Thielemann wrote:
On Thu, 14 Jul 2005, Andrew Pimlott wrote:
2) enshrining one-type-per-module in the naming convention is not IMO justified, and may prove limiting;
Other languages like Modula-3 and Oberon do it with great success. The limit in Haskell is that most compilers don't conform to the Haskell 98 report which allows mutually recursive modules. But I think the compilers should allow them instead of forcing users to put many type and class definitions into one module.
Even given an ideal implementation (I would add that it should allow multiple modules in one file), I don't find one type or class per module preferable. I think it's usually a false division. For one, there will be functions that operate on multiple types, so where do you put those? For another, the user will probably have to import several modules, and remember which module contains a function that operates on multiple types. That said, my point is not that this style of choosing module boundaries is always wrong, or that you can't answer the above objections in many cases. It's that you shouldn't build a general naming convention around it, especially one that some will find confusing or awkward, and when it has only small practical advantages (and some disadvantages). But I'm not going to push this any further: If people still like it, it's not that big a deal to me. :-)
Vector.T is just the consequence of removing type-related prefixes from all identifiers.
I interpreted the style as removing anything that is redundant in the context of the module. In math, when you're talking about vector spaces, you say "add" and "multiply", but you don't say "thing" or "one"--you still say "vector".
If you stick to qualified naming you only need to import the module once. Otherwise you must import a module twice.
import LinearAlgebra.Vector (Vector) import qualified LinearAlgebra.Vector as V
I don't think I understand what you're saying with this example. If you export the type as "Vector", the user could "import LinearAlgebra.Vector as Vector" and then refer to both "Vector" and "Vector.add". Or "import LinearAlgebra.Vector as V" and then refer to "Vector" (or "V.Vector") and "V.add". Andrew

I interpreted the style as removing anything that is redundant in the context of the module. In math, when you're talking about vector spaces, you say "add" and "multiply", but you don't say "thing" or "one"--you still say "vector".
you mean if I have a Vector "thing" and a Matrix "thing", I cannot just Operation.Multiplication.doit them things? what a Pity.T! claus

On Fri, 15 Jul 2005, Andrew Pimlott wrote:
Even given an ideal implementation (I would add that it should allow multiple modules in one file),
Why?
I don't find one type or class per module preferable. I think it's usually a false division.
It helped me to decide for divisions early. What do you use as division criterion?
For one, there will be functions that operate on multiple types, so where do you put those?
I try to find out which is the more complex one, which one depends on the other. In this example I consider a matrix as a representation of a linear map (see the long preceding discussion about how matrices can be interpreted :-), where the linear maps operate on vectors. Thus matrices depend on vectors but not vice versa, so I put (basic) matrix-vector operations to the matrix module. More complicated functions can be moved in a separate module.
For another, the user will probably have to import several modules, and remember which which module contains a function that operates on multiple types.
Yes, that's the way modularization works. That's not a problem with the T name but a general property of modularization, isn't it?
When it has only small practical advantages (and some disadvantages).
The big advantage is that you can keep the interfaces of all modules consistent. Ideally you can switch an 'import qualified ... as' statement to a different module which serves the same interface with different implementation. For instance you could switch between matrix implementations based on different back-ends. Though, this is probably better solved with type classes.

On Mon, Jul 18, 2005 at 07:01:08PM +0200, Henning Thielemann wrote:
On Fri, 15 Jul 2005, Andrew Pimlott wrote:
Even given an ideal implementation (I would add that it should allow multiple modules in one file),
Why?
Mere comfort, given my tools and habits. I sometimes want a new module (say for namespace management), but not a new file (it would be inconvenient or divide logically related pieces). But this leads into a tools war, which I don't want.
I don't find one type or class per module preferable. I think it's usually a false division.
It helped me to decide for divisions early. What do you use as division criterion?
It wasn't my intent to propose one. :-) But probably based on some fuzzy measure of user convenience.
For another, the user will probably have to import several modules, and remember which which module contains a function that operates on multiple types.
Yes, that's the way modularization work. That's not a problem with the T name but general to modularization, isn't it?
When you modularize, you can choose to expose bigger or smaller chunks. Indeed, many libraries (eg, Parsec, System.Posix, WASH CGI) do both, reexporting (parts of) several modules in a single module. Your style tends to imply many small modules, which may not be convenient for the user. As a user, I generally prefer a library exposed as a single module. Especially when I am just learning the library, hunting through several modules is a headache--and the beauty of careful module divisions is lost on me!
When it has only small practical advantages (and some disadvantages).
The big advantage is that you can keep the interfaces of all modules consistent. Ideally you can switch an 'import qualified ... as' statement to a different module which serves the same interface with different implementation. For instance you could switch between matrix implementations based on different back-ends. Though, this is probably better solved with type classes.
I don't see how your style makes it easier to write multiple modules with the same interface. Agreeing on the type name seems no harder than agreeing on the function names. And to the extent that it is "consistent", it seems only a minimal and superficial consistency. I really doubt it is a "big advantage" in practice. Andrew

I have mixed feelings about Module.T, but consider it useful. Basically, I use it only when the module naturally only exports a single type which is the purpose of the module (modules which incidentally only export a single type but have a different focus shouldn't use the type synonym) and the module's functions are named in such a way that it is intended to be used in a qualified way (like the new Data.Map and Data.Set). However, I don't try to force any modules or types into this model if they don't naturally fall into it. I also make sure that the T is a type synonym for the actual name, as in

module Vector where
data Vector = ...
type T = Vector

This ensures that tools that strip the module name still give reasonable names to things, since most expand type synonyms away. So, in conclusion, I think Vector and Matrix do naturally fall into these conditions, so they should have T, but as type synonyms and not the base name. John -- John Meacham - ⑆repetae.net⑆john⑈

On Fri, 15 Jul 2005, John Meacham wrote:
I also make sure that the T is a type synonym for the actual name. as in
module Vector where
data Vector = ... type T = Vector
I had to use type synonyms sometimes to avoid mutually recursive modules. It has the disadvantage that a type synonym can't be made an instance of a type class. I also like to use names like TypeClass.C for type classes, where it is even more difficult because there are no type class synonyms. (Only class A a => B a.)

Hello Alberto, Wednesday, July 13, 2005, 8:13:48 PM, you wrote:
If there are no efficiency concerns, I would drop element-wise operations and prefer a matrix-map and a matrix-zipWith. If these operations shall remain I would somehow point to their element-wise operation in the name.
AR> There is about 5x speed gain if we map in the C side. The "optimized" floating
AR> map functions could be moved to a separate module.

GHC also has a RULES pragma which can be used to automatically convert, for example, "mmap (*)" to "multipleElementWise". Below are examples of using this pragma in the standard GHC modules:

{-# RULES
"foldr/id"     foldr (:) [] = \x->x
"foldr/single" forall k z x. foldr k z [x] = k x z
"foldr/nil"    forall k z.   foldr k z []  = z
 #-}

-- Best regards, Bulat mailto:bulatz@HotPOP.com

On Mon, 18 Jul 2005, Bulat Ziganshin wrote:
GHC also have a RULES pragma which can be used to automatically convert, for example, "mmap (*)" to "multipleElementWise".
Nice idea! But how can GHC decide which optimization is better? M.map sin . M.map cos can be optimized to M.map (sin . cos) or M.sin . M.cos Which is better?

Hello Bulat, thanks a lot for your message, the RULES pragma is just what we need! However, in some initial experiments I have observed some strange behavior. For instance, in the following program:

------------------------------------------
{-# OPTIONS_GHC -fglasgow-exts #-}

apply :: (Int -> Int) -> Int -> Int
apply f n = f n

sqr :: Int -> Int
sqr n = n * n

optimized_sqr :: Int -> Int
optimized_sqr n = n*n+1 -- to check that the rule works :-)

{-# RULES
"apply/sqr" apply sqr = optimized_sqr
 #-}

main = do
    --print $ apply sqr 3
    print $ apply sqr 5
-----------------------------------------

The rule is not applied:

1 RuleFired
    1 *#

If we uncomment the first line in the main function

main = do
    print $ apply sqr 3
    print $ apply sqr 5

then the rule is correctly applied:

6 RuleFired
    2 *#
    2 +#
    2 apply/sqr

Solution: include at the beginning of the file

module Main where

and then the rule works in both cases. I have a similar problem in the LinearAlgebra library, but there, curiously, the rule only works if it is applied once:

module Main where
import (...)
(...)
main = do
    (...)
    print $ Vector.map cos v
    --print $ Vector.map cos v

==================== Grand total simplifier statistics ====================
Total ticks: 2584

 461 PreInlineUnconditionally
 230 PostInlineUnconditionally
 387 UnfoldingDone
  91 RuleFired
       2 *#
      16 *##
       5 +##
       8 ++
       5 -##
       2 SPEC $fLinearArray1
       2 SPEC $fLinearArray2
       1 SPEC $fNumComplex
       1 SPEC $fShowComplex
       1 Vector.map/cos    <--------------- OK
      20 int2Double#
       4 map
       4 mapList
       2 plusDouble 0.0 x
       4 plusDouble x 0.0
       2 timesDouble x 0.0
       2 timesDouble x 1.0
       3 unpack
       3 unpack-list
       2 zipWith
       2 zipWithList
  47 LetFloatFromLet
   9 EtaReduction
1136 BetaReduction
   6 CaseOfCase
 217 KnownBranch
  14 SimplifierDone

But:

(...)
main = do
    (...)
    print $ Vector.map cos v
    print $ Vector.map cos v

==================== Grand total simplifier statistics ====================
Total ticks: 2664

 470 PreInlineUnconditionally
 240 PostInlineUnconditionally
 402 UnfoldingDone
  90 RuleFired
       2 *#
      16 *##
       5 +##
       8 ++
       5 -##
       2 SPEC $fLinearArray1
       2 SPEC $fLinearArray2
       1 SPEC $fNumComplex
       1 SPEC $fShowComplex
      20 int2Double#
       4 map
       4 mapList
       2 plusDouble 0.0 x
       4 plusDouble x 0.0
       2 timesDouble x 0.0
       2 timesDouble x 1.0
       3 unpack
       3 unpack-list
       2 zipWith
       2 zipWithList
  49 LetFloatFromLet
   9 EtaReduction
1181 BetaReduction
   5 CaseOfCase
 218 KnownBranch
  17 SimplifierDone

I have tried several ideas, without any luck. Alberto

On Monday 18 July 2005 10:14, Bulat Ziganshin wrote:
Hello Alberto,
Wednesday, July 13, 2005, 8:13:48 PM, you wrote:
If there are no efficiency concerns, I would drop element-wise operations and prefer a matrix-map and a matrix-zipWith. If these operations shall remain I would somehow point to their element-wise operation in the name.
AR> There is about 5x speed gain if we map in the C side. The "optimized" floating
AR> map functions could be moved to a separate module.

GHC also has a RULES pragma which can be used to automatically convert, for example, "mmap (*)" to "multipleElementWise". Below are examples of using this pragma in the standard GHC modules:

{-# RULES
"foldr/id"      foldr (:) [] = \x -> x
"foldr/single"  forall k z x. foldr k z [x] = k x z
"foldr/nil"     forall k z.   foldr k z []  = z
  #-}

Hello Alberto,

Tuesday, July 19, 2005, 5:11:27 PM, you wrote:

AR> Hello Bulat, thanks a lot for your message, the RULES pragma is just what we
AR> need!
AR> However, in some initial experiments I have observed some strange behavior.
AR> For instance, in the following program:

1) there are no guarantees of any kind, explicit or implied, that your RULE will ever be used :)

2) you can report this as a misfeature to the GHC team

-- Best regards,
Bulat                          mailto:bulatz@HotPOP.com

Hi Alberto,

I'm sorry if this has been discussed before...

I'm reading your paper, at one point it says (re. the PCA example):

"Octave achieves the same result, slightly faster. (In this experiment we have not used optimized BLAS libraries which can improve efficiency of the GSL)"

That seems to imply that there is a way to use optimized BLAS libraries? How can I do that?

Also, in my experiments (with matrix inversion) it seems, subjectively, that Octave is about 5 or so times faster for operations on large matrices. Presumably you've tested this as well, do you have any comparison results?

Thanks,

Frederik

--
http://ofb.net/~frederik/

Hi Frederik,

I refer to the ATLAS library: http://math-atlas.sourceforge.net/

Some versions of Octave use it. I have not yet linked the GSL library with it; you must compile it yourself to take advantage of the optimizations for your architecture, but I think that it must be easy. It is on my TO DO list... :)

Curiously, I am just now working on a new (and hopefully improved) version of the GSL wrappers. A first and preliminary approximation to the documentation can be found at:

http://dis.um.es/~alberto/GSLHaskell/doc/

The code will be available in two or three days in a darcs repository. I have simplified the wrapper infrastructure and the user interface. Now we can freely work both with real and complex vectors or matrices, still with static type checking. There are also some bug corrections (the eigensystem wrapper destroyed its argument!).

I made some run time comparisons, precisely in the PCA example with the big matrix. I don't remember the exact difference, but I think that 5 times faster is too much... I will check it as soon as possible. Of course our goal is to have something like a functional "octave" with the same performance (and a much nicer language :)

Many thanks for your message!

Alberto

On Thursday 16 March 2006 18:13, Frederik Eaton wrote:
Hi Alberto,
I'm sorry if this has been discussed before...
I'm reading your paper, at one point it says (re. the PCA example): "Octave achieves the same result, slightly faster. (In this experiment we have not used optimized BLAS libraries which can improve efficiency of the GSL)"
That seems to imply that there is a way to use optimized BLAS libraries? How can I do that?
Also, in my experiments (with matrix inversion) it seems, subjectively, that Octave is about 5 or so times faster for operations on large matrices. Presumably you've tested this as well, do you have any comparison results?
Thanks,
Frederik

On Thursday 16 March 2006 18:13, Frederik Eaton wrote:
Also, in my experiments (with matrix inversion) it seems, subjectively, that Octave is about 5 or so times faster for operations on large matrices. Presumably you've tested this as well, do you have any comparison results?
Frederik, you are right: on my machine Octave is at least 3x faster than GSL (v1.5). Too much, especially in a simple matrix multiplication, and I don't know why. See below the times measured on the C side for the PCA example.

I will look into this...

Alberto

gsl 1.5

GHC> main
Loading package haskell98-1.0 ... linking ... done.
GSL Wrapper submatrix: 0 s
GSL Wrapper constant: 0 s
GSL Wrapper multiplyR (gsl_blas_dgemm): 0 s
GSL Wrapper vector_scale: 0 s
GSL Wrapper vector_scale: 1 s
GSL Wrapper gsl_vector_add: 0 s
GSL Wrapper trans: 0 s
GSL Wrapper multiplyR (gsl_blas_dgemm): 36 s   <---- (784x5000) x (5000x784)
GSL Wrapper vector_scale: 0 s
GSL Wrapper trans: 0 s
GSL Wrapper vector_scale: 0 s
GSL Wrapper gsl_vector_add: 0 s
GSL Wrapper toScalar: 0 s
GSL Wrapper eigensystem: 11 s                  <---------- (784x784)
[337829.65625394206,253504.7867502207, etc.
(97.95 secs, 13096315340 bytes)
(includes file loading, to be optimized)

GNU Octave, version 2.1.57 (i686-pc-linux-gnu).

octave:6> testmnist
x - repmat(mean(x),rows(x),1)
 0.46144
(xc'*xc)/rows(x)
 11.623        <--------------
eig
 4.3873        <--------------
  3.3776e+05
  2.5345e+05
  2.0975e+05
  etc.

----------octave------------------
load mnist.txt
x = mnist(:,1:784);
d = mnist(:,785);

t0=time();
xc = x - repmat(mean(x),rows(x),1);
disp("x - repmat(mean(x),rows(x),1)"); disp(time()-t0)

t0=time();
mc = (xc'*xc)/rows(x);
disp("(xc'*xc)/rows(x)"); disp(time()-t0)

t0=time();
[v,l]=eig(mc);
disp("eig"); disp(time()-t0)

disp(flipud(diag(l))(1:10));

------------- GSL + Haskell ---------
module Main where

import GSL
import GSL.Util

mean x = sumCols x ./. fromIntegral (rows x)

center x = x |-| fromRows (replicate (rows x) (mean x))

cov x = (trans xc <> xc) ./. n
    where n  = fromIntegral (rows x - 1)
          xc = center x

m ./. r = m <> (1/r :: Double)

test :: M -> [Double]
test x = take 10 (toList lambdas)
    where mc = cov x
          (lambdas, _) = eig mc

main = do
    m <- fromFile "mnist.txt"
    let x = subMatrix 0 (rows m - 1) 0 (cols m - 2) m
    print (test x)

The Atlas library (Linux_P4SSE2) seems to be 9x faster (!) than gslcblas in matrix multiplication:

ghc-6.4.1 --make examples/pca.hs GSL/debuggslaux.o \
    -L$(LIBATLAS) -lcblas /usr/lib/libgsl.a -latlas

$ time ./a.out
GSL Wrapper gsl_matrix_fscanf: 2 s
GSL Wrapper submatrix: 0 s
GSL Wrapper constant: 0 s
GSL Wrapper constant: 0 s
GSL Wrapper multiplyR (gsl_blas_dgemm): 1 s
GSL Wrapper vector_scale: 0 s
GSL Wrapper multiplyR (gsl_blas_dgemm): 0 s
GSL Wrapper vector_scale: 0 s
GSL Wrapper gsl_vector_add: 0 s
GSL Wrapper trans: 0 s
GSL Wrapper multiplyR (gsl_blas_dgemm): 4 s   <--- (the atlas version)
GSL Wrapper vector_scale: 0 s
GSL Wrapper trans: 0 s
GSL Wrapper vector_scale: 0 s
GSL Wrapper gsl_vector_add: 0 s
GSL Wrapper toScalar: 0 s
GSL Wrapper eigensystem: 11 s
[337829.6562539419,253504.78675022084,209790.38118221355, etc....

real    0m17.630s
user    0m17.255s
sys     0m0.179s

So gsl+atlas+haskell is not that bad...

Also, the 5000x785 matrix can now be loaded in just 2 seconds, using a wrapper for gsl_matrix_fscanf.

We could also try to include some lapack or atlas-lapack routines. Precompiled atlas for different processors can be downloaded from

https://sourceforge.net/project/showfiles.php?group_id=23725

However, I have not yet been able to link them in interactive mode.

The latest version of the library (extremely provisional, I am currently working actively on it) can be obtained from:

darcs get http://dis.um.es/~alberto/GSLHaskell

Of course, any contribution or suggestion will be greatly appreciated, including code samples that "should work" with this kind of library, to be used as examples or tests.

Best regards,
Alberto

On Friday 17 March 2006 13:30, Alberto Ruiz wrote:
On Thursday 16 March 2006 18:13, Frederik Eaton wrote:
Also, in my experiments (with matrix inversion) it seems, subjectively, that Octave is about 5 or so times faster for operations on large matrices. Presumably you've tested this as well, do you have any comparison results?
Frederik, you are right: on my machine Octave is at least 3x faster than GSL (v1.5). Too much, especially in a simple matrix multiplication, and I don't know why. See below the times measured on the C side for the PCA example.
I will look into this...
Alberto

It appears possible to define matrix operations like LU-decomposition, SVD, inverse, etc. in a recursive fashion (see transpose in the Prelude). Is this possible? Does anybody have Haskell code which does this (not asking for high performance here, more for instructive value)?

Thank you!

Andrew

Andrew U. Frank
it appears possible to define matrix operations like LU-decomposition, SVD, inverse etc. in a recursive fashion (see transpose in the prelude).
is this possible? has anybody code in haskell which does this (not asking for high performance here, more instructive value).
I would like to see this, too. Myself, I did some experiments trying to implement the transformation to upper triangular form by rational Givens transforms. The following code fragment does this recursively:

-- rationalGivens :: Fractional a => a -> [a] -> [[a]] -> [[a]]
-- marginal cases first
rationalGivens qq [x]   ((pp:[]):[])  = ( pp + qq * x * x : [] ) : []
rationalGivens _  []    [[]]          = [[]]
rationalGivens qq xy    pmat          | qq == 0 = pmat
rationalGivens qq (x:y) (ppr:pmat)    | x  == 0 = ppr : rationalGivens qq y pmat
rationalGivens qq (x:y) [[]]          = ( qq * x * x : (1/x) .* y ) : []
-- main case
rationalGivens qq (x:y) ((pp:r):pmat) =
    let pp' = pp + qq * x * x
        qq' = pp * qq / pp'
        s   = x * qq / pp'
        y'  = y - x .* r
        r'  = r + s .* y'
    in  ( pp' : r' ) : rationalGivens qq' y' pmat

-- rationalGivens 1 [2,1] [[1,0],[1]]  ==  [[5.0,0.4],[1.2]]

Matrices are represented as lists of lists in this code; qq is a scale factor (starting with 1) and xy a row vector to be added to the upper triangular matrix pmat. The diagonal elements of pmat contain the square of the scale factor of the row, i.e. the matrix [[5.0,0.4],[1.2]] is to be interpreted as the product

( sqrt(5)   0         )   ( 1  0.4 )
( 0         sqrt(1.2) )   ( 0  1   )

--
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257
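The fragment above assumes a scalar-times-list operator (.*) and elementwise (+)/(-) on lists, which are not shown. The following definitions are only a guess at the kind of helpers Wilhelm may have had in mind, added so the fragment can be tried out:

infixl 7 .*
(.*) :: Num a => a -> [a] -> [a]
s .* xs = map (s *) xs

-- quick-and-dirty elementwise arithmetic on lists, so that expressions
-- like "y - x .* r" and "r + s .* y'" type-check; fromInteger yields an
-- infinite constant list, which is harmless here because it is only ever
-- zipped against finite lists
instance Num a => Num [a] where
    (+)         = zipWith (+)
    (-)         = zipWith (-)
    (*)         = zipWith (*)
    abs         = map abs
    signum      = map signum
    fromInteger = repeat . fromInteger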

On Wed, 19 Apr 2006, Andrew U. Frank wrote:
it appears possible to define matrix operations like LU-decomposition, SVD, inverse etc. in a recursive fashion (see transpose in the prelude).
is this possible? has anybody code in haskell which does this (not asking for high performance here, more instructive value).
There is some code by Jan Skibinski called 'indexless linear algebra': http://www.haskell.org/haskellwiki/Libraries_and_tools/Mathematics

On 19/04/2006, at 10:32 PM, Andrew U. Frank wrote:
it appears possible to define matrix operations like LU- decomposition, SVD, inverse etc. in a recursive fashion (see transpose in the prelude).
is this possible? has anybody code in haskell which does this (not asking for high performance here, more instructive value).
thank you!
I recall Paul Hudak coming up with a version of LU-decomposition in Haskell using Doolittle's method. This was in response to a challenge by (I think) Arvind, at a meeting in Mystic CT, in 1989.

--brian

Brian Boutel
Wellington
New Zealand

On Thu, 7 Jul 2005, Alberto Ruiz wrote:
Hello! Thank you very much for all your suggestions. A new version of the library can be found at:
If the Matrix type were parametrised, then Matrix.fromBlocks could use a more natural indexing:

Matrix.fromBlocks :: [[Matrix a b]] -> Matrix (Int,a) (Int,b)

The Int of the index pair would store the block number and the type-a index would store the index within the block. Hm, but it would not be possible to store the sizes of the sub-matrices; it would be possible only if the sub-matrices have the same size. Maybe it is better to allow matrices of matrices and to be able to apply a matrix-matrix multiplication which in turn employs a matrix-matrix multiplication of its elements. But the matrix decompositions will fail on this structure - though possibly those which fail aren't appropriate for block matrices anyway.

Henning Thielemann wrote:
My objections to making everything a matrix were the objections I sketched for MatLab.
The example, again: If you write some common expression like
transpose x * a * x
Which just goes to show why Haskell limits the '*' operator to multiplying the same types. Keep this to Matrix-times-Matrix operators. Surely a vector is a 1xN matrix? And in terms of matrices a 1x1 matrix and a scalar are the same? I don't see the point of having separate matrix and vector types... just have 'matrix' constructors:

matrix :: [[a]] -> Matrix a
vector :: [a] -> Matrix a
scalar :: a -> Matrix a

I don't see any problems with this, and it lets you define the matrix exponential:

instance Floating a => Floating (Matrix a) where
    exp = undefined

The type of (**) in Floating is (a -> a -> a), so substituting the class parameter gives (**) :: Matrix a -> Matrix a -> Matrix a; however, you can only compute the power x ** y for "scalar x, matrix y" or "matrix x, scalar y"...

I feel using separate types for vectors and scalars just overcomplicates things...
then both the human reader and the compiler don't know whether x is a "true" matrix or if it simulates a column or a row vector. It may be that 'x' is a row vector and 'a' a 1x1 matrix, then the expression denotes a square matrix of the size of the vector simulated by 'x'. It may be that 'x' is a column vector and 'a' a square matrix. Certainly in most cases I want the latter one and I want to have a scalar as a result. But if everything is a matrix then I have to check at run-time if the result is a 1x1 matrix and then I have to extract the only element from this matrix. If I omit the 1x1 test mistakes can remain undiscovered. I blame the common notation x^T * A * x for this trouble since the alternative notation
can be immediately transcribed into computer language code while retaining strong distinction between a vector and a matrix type.
If x is a matrix and y is a matrix then x * y can only be interpreted in one simple way - no special cases. Sometimes you may need to transpose a 1xN into an Nx1 but at least you will get an 'error' if you try to use the wrong type...
More examples? More theory?
More complexity? Keean.
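A minimal sketch of the single-type design Keean describes above, with an illustrative list-of-lists representation (these names and definitions are not from any existing library); as he notes, row/column orientation is then the user's responsibility:

import Data.List (transpose)

newtype Matrix a = Matrix [[a]] deriving Show

matrix :: [[a]] -> Matrix a
matrix = Matrix

vector :: [a] -> Matrix a      -- a 1xN matrix
vector xs = Matrix [xs]

scalar :: a -> Matrix a        -- a 1x1 matrix
scalar x = Matrix [[x]]

-- a single product covers matrix*matrix, matrix*column-vector, 1x1*1xN, ...
-- nothing is checked at compile time; a real version would verify the
-- dimensions at run time and report an error on a mismatch
mul :: Num a => Matrix a -> Matrix a -> Matrix a
mul (Matrix xs) (Matrix ys) =
    Matrix [ [ sum (zipWith (*) r c) | c <- transpose ys ] | r <- xs ]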

On Fri, 8 Jul 2005, Keean Schupke wrote:
Henning Thielemann wrote:
My objections to making everything a matrix were the objections I sketched for MatLab.
The example, again: If you write some common expression like
transpose x * a * x
Which just goes to show why haskell limits the '*' operator to multiplying the same types. Keep this to Matrix-times-Matrix operators.
Btw. the interface file by Alberto gives you uniform usage of an operator symbol (<>) while retaining full type safety by functional dependencies. http://dis.um.es/~alberto/hmatrix/doc/LinearAlgebra.Interface.html This is a good solution, I think.
I feel using separate types for vectors and scalars just overcomplicates things...
I'm excited if your code gets swamped by conversions between Double and Matrix then. I really plead for representing everything with strings, this is the most simple and most flexible solution! :-]
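For readers who do not want to chase the link, here is a small self-contained sketch of the multi-parameter-class/functional-dependency approach Henning points to. The types, instances and the (<.>) operator below are simplified stand-ins, not Alberto's actual interface (his operator is (<>); a different symbol is used here only so the example also loads under modern Preludes, where (<>) is the Semigroup operator):

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import Data.List (transpose)

newtype Vector a = Vector [a]   deriving Show
newtype Matrix a = Matrix [[a]] deriving Show

-- the result type c is determined by the argument types a and b
class Mul a b c | a b -> c where
    (<.>) :: a -> b -> c

instance Num a => Mul (Matrix a) (Matrix a) (Matrix a) where
    Matrix xs <.> Matrix ys =
        Matrix [ [ sum (zipWith (*) r c) | c <- transpose ys ] | r <- xs ]

instance Num a => Mul (Matrix a) (Vector a) (Vector a) where
    Matrix xs <.> Vector v = Vector [ sum (zipWith (*) r v) | r <- xs ]

instance Num a => Mul (Vector a) (Vector a) a where
    Vector u <.> Vector v = sum (zipWith (*) u v)    -- scalar product

main :: IO ()
main = print (Vector [1,2,3] <.> (Vector [4,5,6] :: Vector Double))   -- 32.0

Each use of (<.>) resolves to exactly one instance, so matrix-matrix, matrix-vector and vector-vector products share one symbol without giving up the distinction between the types.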

Henning Thielemann wrote:
I'm excited if your code gets swamped by conversions between Double and Matrix then. I really plead for representing everything with strings, this is the most simple and most flexible solution! :-]
Surely it's a case of balancing the advantage of type safety against the added complexity. The point, after all, is to reduce bugs. Type-sigs reduce one type of bug, at the cost of an increase in complexity. My argument is that at some point the probability of introducing errors due to the increased complexity outweighs the reduction in the probability of error due to type-safety. I think this is a pragmatic point that cannot be argued with; all you can argue over is where in the continuum this point lies.

As such, we were discussing the complexity/type-safety trade-off with respect to matrices - to go from this to such a sweeping statement... is like saying "if I can't have it my way I don't want anything at all"...

Keean.

On Wed, 29 Jun 2005, Jacques Carette wrote:
<sarcasm>Next thing you know, you'll want a different 'application' symbol for every arity of function, because they are ``different''. </sarcasm>
Btw. there is less sarcasm in it than you may think. There was already a proposal to extend function application:

http://www.haskell.org/pipermail/haskell/2002-October/010629.html

and guess what - I would be very unhappy about such an extension because it mixes the representation of a map with the map itself.

Henning Thielemann wrote:
On Wed, 29 Jun 2005, Jacques Carette wrote:
<sarcasm>Next thing you know, you'll want a different 'application' symbol for every arity of function, because they are ``different''. </sarcasm>
Btw. there is less sarcasm in it than you may think. There was already a proposal to extend function application: http://www.haskell.org/pipermail/haskell/2002-October/010629.html and guess what - I would be very unhappy about such an extension because it mixes the representation of a map with the map itself.
There are no maps in Haskell (or in any syntactic lambda calculus), only representations of maps. It just turns out that things of type -> are builtin representations of maps, where other representations are not first class ``maps''. This is a bias of all lambda-calculus based languages. In ZFC, there are no maps, just sets, yet you can do lots of mathematics and CS in ZFC ;-).

I happen to believe that a more structure-centric view of the world (a la category theory) reveals more than an object-centric view a la ZFC. But even arrows in a category sometimes turn out to be too restrictive. They are too ``functional'' instead of being more ``relational''.

Once you realize that \x.x is *not* a function, but a denotation of a function, then a proposal like the one you point to starts to make a lot of sense. I rather like it.

Jacques

Jacques Carette writes:
Henning Thielemann wrote:
I don't see the problem. There are three very different kinds of multiplication, they should also have their own signs: Scalar product, matrix-vector multiplication, matrix-matrix multiplication.
You see 3 concepts, I see one: multiplication. Abstract Algebra is the branch of mathematics where you want to abstract out the *unimportant* details. Much progress was made in the late 1800's when this was discovered by mathematicians ;-).
One of the things I appreciate and hate simultaneously in your postings is that you are so categorical. This time you will *not* convince me that there is "one concept: multiplication", moreover "abstracted over unimportant details". If matrices represent operators, their multiplication is a *group* operation, the composition of operators. The action of a matrix on a vector is not. "Multiplication" of two vectors giving a scalar (their contraction) is yet another beast.

I believe that some progress has been made in math when people discovered that mixing up things is not necessarily a good thing, and that different entities should be treated differently.

Jerzy Karczmarczuk

karczma@info.unicaen.fr wrote:
One of the things I appreciate and I hate simultaneously in your postings is that you are so categorical.
'tis indeed simultaneously one of my strengths and one of my weaknesses ;-) I also like to play Devil's Advocate, to draw out the interesting arguments. Luckily for me (but not my readers), I readily admit when I'm wrong...
This time you will *not* convince me that there is "one concept: multiplication", moreover "abstracted over unimportant details". If matrices represent operators, their multiplication is a *group* operation, the composition of operators. The action of a matrix on a vector is not. "Multiplication" of two vectors giving a scalar (their contraction) is yet another beast.
They are all operations that have signatures a -> b -> c where either a=b or b=c; is that enough structure? ;-)

More seriously, what about

forall A. 0.A = 0   [0 matrix, 0 matrix]
forall x. 0.x = 0   [0 matrix, 0 vector]
forall x. 0.x = 0   [0 vector, 0 scalar]

where x is a vector and A a matrix. Also there is something strangely similar in

forall x y.
I believe that some progress has been made in math when people discovered that mixing up things is not necessarily a good thing, and that different entities should be treated differently.
I agree. I certainly like going back to Newton to see that he made a difference between derivatives and fluxions (ie static vs dynamic derivatives) and Grassmann to make the difference between different kinds of vectors (and linear operators and ...). Cauchy also knew some things about solutions to real linear ODEs that are quite fascinating [if you ``line up'' the singularities of the coefficients of an ODE with the ODE's own singularities, you can get more solutions than the order of the ODE -- see his 1821 Ecole Polytechnique lecture notes] that are not included in most theorems about ODEs, because few appreciate the real qualitative ``difference'' it makes to allow functions with singularities in the coefficients of an ODE.

But just as much progress has been made when ``different'' things were found to have a lot of similar structure. Or at least that is the main lesson I draw from category theory. I draw similar lessons from Euler's total disregard for convergence (with Tauberian theorems and the work of Ecalle justifying him). I like to find whatever scraps of underlying structure are present between disparate looking concepts, just as much as I like seeing subtle differences between concepts that had not been noticed before.

Jacques

On Wed, Jun 29, 2005 at 10:23:23PM +0200, Henning Thielemann wrote:
More specific: You give two different things the same name. You write A*B and you mean a matrix multiplication. Matrix multiplication means finding a representation of the composition of the operators represented by A and B. But you also write A*x and you mean the matrix-vector multiplication. This corresponds to the application of the operator represented by A to the vector x. You see: Two different things, but the same sign (*). Why? You like this ambiguity because of its conciseness. You are used to it. What else? But you have to introduce an orientation of vectors, thus you discriminate two things (row and column vectors) which are essentially the same! What is the overall benefit?
I just wanted to insert a few thoughts on the subject of row and column vectors. (And I'm not going to address most of the content of this discussion.)

If we support matrix-matrix multiplication, we already automatically support matrix-column-vector and row-vector-matrix multiplication, whether or not we actually intend to, unless you want to forbid the use of 1xn or nx1 matrices. So (provided we *do* want to support matrix-matrix multiplication, and *I* certainly would like that) there is no question whether we'll have octavish/matlabish functionality in terms of row and column vectors--we already have this behavior with a single multiplication representing a single operation. There is indeed a special case where one or more of the dimensions is 1, but it's just that, a special case, not a separate function.

If you want to introduce a more general set of tensor datatypes, you could do that, but matrix arithmetic is all *I* really need. I agree that when you add other data types you'll need other operators (or will need to do something more tricky), but starting with the simple case seems like a good idea, especially since the simple case is the one for which there exist fast algorithms (i.e. level 3 BLAS). There are good reasons why, if I've got a set of vectors that I'd like to multiply by a matrix, I should group those vectors into a separate matrix--it's far faster (I'm not sure how much, but usually around 10 times faster, sometimes more). In fact, I'd think it's more likely that a general tensor library will be implemented using matrix operations than the other way around, since it's the matrix-matrix operations that have optimized implementations.

In short, especially since the folks doing the work (not me) seem to want plain old octave-style matrix operations, it makes sense to actually do that. *Then* someone can implement an ultra-uber-tensor library on top of that, if they like. And I would be interested in a nice tensor library... it's just that matrices need to be the starting point, since they're where most of the interesting algorithms are (that is, the ones that are interesting to me, such as diagonalization, svd, matrix multiplication, etc).

-- David Roundy
http://www.darcs.net

On Thu, 30 Jun 2005, David Roundy wrote:
If we support matrix-matrix multiplication, we already automatically support matrix-column-vector and row-vector-matrix multiplication, whether or not we actually intend to, unless you want to forbid the use of 1xn or nx1 matrices. So (provided we *do* want to support matrix-matrix multiplication, and *I* certainly would like that) there is no question whether we'll have octavish/matlabish functionality in terms of row and column vectors--we already have this behavior with a single multiplication representing a single operation.
Of course, I only wanted a separate vector type (which also means a separate matrix-vector multiplication) and I argued against a further discrimination of row and column vectors.
If you want to introduce a more general set of tensor datatypes,
Did someone request tensor support? At least not me. I used the tensor example to show that MatLab throws them all together with matrices and vectors, and I wanted to give an idea of how to make it better by separating them.
In short, especially since the folks doing the work (not me) seem to want plain old octave-style matrix operations, it makes sense to actually do that. *Then* someone can implement an ultra-uber-tensor library on top of that, if they like. And I would be interested in a nice tensor library... it's just that matrices need to be the starting point,
Matrices _and_ vectors! Because matrices represent operators on vectors, and it is certainly not sensible to support only the operators but not the objects they act on ... Adding a vector type by a library that is built on top of a matrix library seems to me like making the first step after the second one.

On Thu, Jun 30, 2005 at 02:20:16PM +0200, Henning Thielemann wrote:
On Thu, 30 Jun 2005, David Roundy wrote:
If we support matrix-matrix multiplication, we already automatically support matrix-column-vector and row-vector-matrix multiplication, whether or not we actually intend to, unless you want to forbid the use of 1xn or nx1 matrices. So (provided we *do* want to support matrix-matrix multiplication, and *I* certainly would like that) there is no question whether we'll have octavish/matlabish functionality in terms of row and column vectors--we already have this behavior with a single multiplication representing a single operation.
Of course, I only wanted a separate vector type (which also means a separate matrix-vector multiplication) and I argued against a further discrimination of row and column vectors.
If you want to introduce a more general set of tensor datatypes,
Did someone request tensor support? At least not me. I used the tensor example to show that MatLab throws them all together with matrices and vectors, and I wanted to give an idea of how to make it better by separating them.
I disagree. Vectors *are* tensors... and once you've added rank-1 tensors, you'll also be wanting to add support for rank-0 tensors. These are nice objects (and would be nice to support eventually), but they add considerable complexity. There's only one matrix-matrix multiplication, but there are two vector-vector multiplications and two vector-matrix multiplications. All four of these multiplications can be expressed in terms of the single matrix-matrix multiplication.

I agree that if we introduce a vector type there's no reason to introduce separate row and column vectors, but I think we can get by quite well for many purposes without introducing the vector type, which adds so much complexity.

I guess I've left out the ".*" pointwise multiply, probably because I'm a physicist and am tensor-biased, and it's not a tensor operation in the physics sense (it doesn't behave right under unitary transformations of its arguments).
In short, especially since the folks doing the work (not me) seem to want plain old octave-style matrix operations, it makes sense to actually do that. *Then* someone can implement an ultra-uber-tensor library on top of that, if they like. And I would be interested in a nice tensor library... it's just that matrices need to be the starting point,
Matrices _and_ vectors! Because matrices represent operators on vectors, and it is certainly not sensible to support only the operators but not the objects they act on ... Adding a vector type by a library that is built on top of a matrix library seems to me like making the first step after the second one.
No, matrices operate on matrices and return matrices. This is the wonderful thing about matrix arithmetic, why it's unique, and why I'd like to have a library that supports matrix arithmetic. Tensors are also nice, but are much more complicated, since most tensor multiplications change the rank of the tensor--and that complication is precisely what you want. You're taking an interpretation of what you mean by the math and wanting to encode it in the type system. Matrix arithmetic is entirely self-contained, and I'd like to have *that* reflected in the type system.

-- David Roundy
http://www.darcs.net

David Roundy and Henning Thielemann have been arguing about the nature of vectors and matrices, in particular saying:
On Thu, Jun 30, 2005 at 02:20:16PM +0200, Henning Thielemann wrote:
On Thu, 30 Jun 2005, David Roundy wrote:
Matrices _and_ vectors! Because matrices represent operators on vectors, and it is certainly not sensible to support only the operators but not the objects they act on ... Adding a vector type by a library that is built on top of a matrix library seems to me like making the first step after the second one.
No, matrices operate on matrices and return matrices. This is the wonderful thing about matrix arithmetic, why it's unique, and why I'd like to have a library that supports matrix arithmetic.
The really funny thing about that exchange is that you are *both right*! You're just using different interpretations of the same objects.

1) Matrices represent linear operators which naturally act (via application) on vectors.

2) Matrices of compatible sizes almost form a non-commutative graded ring. It does not matter what a matrix represents here; this is true purely algebraically. [To be a proper ring, the ``compatible sizes'' condition would need to be dropped.]

There is a problem that the _types_ associated with both interpretations are quite different. But if you want 2 different products on vectors (inner and outer) represented as matrices, you have no choice -- the ``inner product'' will return a 1x1 matrix, not a real.

The same thing is true for differential operators, BTW. They can either represent actions on spaces of smooth-enough functions, or represent elements of a Weyl Algebra (or Ore Algebra if you really want to be algebraic). You end up having the dichotomy of algebraic D-modules versus analytic D-modules, where they share a number of theorems, but the ``corner'' cases behave quite differently.

Jacques

David Roundy wrote:
In short, especially since the folks doing the work (not me) seem to want plain old octave-style matrix operations, it makes sense to actually do that. *Then* someone can implement an ultra-uber-tensor library on top of that, if they like. And I would be interested in a nice tensor library... it's just that matrices need to be the starting point, since they're where most of the interesting algorithms are (that is, the ones that are interesting to me, such as diagonalization, svd, matrix multiplication, etc).
This is a really good idea. I would like a Matrix library soon, not in 6 years' time. Slice it up into manageable pieces and keep it simple!

Keean.

On Tue, 5 Jul 2005, Keean Schupke wrote:
David Roundy wrote:
In short, especially since the folks doing the work (not me) seem to want plain old octave-style matrix operations, it makes sense to actually do that. *Then* someone can implement an ultra-uber-tensor library on top of that, if they like. And I would be interested in a nice tensor library... it's just that matrices need to be the starting point, since they're where most of the interesting algorithms are (that is, the ones that are interesting to me, such as diagonalization, svd, matrix multiplication, etc).
This is a really good idea. I would like a Matrix library soon, not in 6 years' time. Slice it up into manageable pieces and keep it simple!
As I said, _that_ already exists: MatLab, HaskellDSP (with some simple new definitions for infix operators) ...

On Tue, Jul 05, 2005 at 07:49:56PM +0200, Henning Thielemann wrote:
On Tue, 5 Jul 2005, Keean Schupke wrote:
David Roundy wrote:
In short, especially since the folks doing the work (not me) seem to want plain old octave-style matrix operations, it makes sense to actually do that. *Then* someone can implement an ultra-uber-tensor library on top of that, if they like. And I would be interested in a nice tensor library... it's just that matrices need to be the starting point, since they're where most of the interesting algorithms are (that is, the ones that are interesting to me, such as diagonalization, svd, matrix multiplication, etc).
This is a really good idea. I would like a Matrix library soon, not in 6 years' time. Slice it up into manageable pieces and keep it simple!
As I said, _that_ already exists: MatLab, HaskellDSP (with some simple new definitions for infix operators) ...
Except that matlab isn't a Matrix library--it's a horrible language--and HaskellDSP doesn't seem to implement linear algebra, and its interface is such that it is probably very difficult to implement efficiently (since efficient implementation means calling lapack). In my opinion a decent matrix API needs to have an opaque data type. In other words, using HaskellDSP to create a decent Haskell Matrix library would be about as much work as using lapack to do so.

-- David Roundy
http://www.darcs.net

Henning:
I was wrong, the different names are synonyms for the same type. :-(
I agree that we must statically distinguish Vector and Matrix (see below).
Some notes: I would not call it a matrix library but a linear algebra library. Then set up modules like LinearAlgebra.Matrix, LinearAlgebra.Vector and so on, and move the functions related to each type into these modules. Vector is more basic than Matrix, thus Matrix should import Vector, not vice versa, and it should contain the Matrix-Vector operations. The Vector module could contain e.g. the scalar product. I also find it good style not to repeat the module's name in function and type names. E.g. Matrix.mulVector is a good function name, but the type name should also not be Matrix.Matrix but, say, Matrix.T. Unfortunately this style is not very widespread currently; I wish it were more so.
You are right, module structure and names are provisional.
Then you may consider not to restrict the library to Double. Most operations can be defined in terms of any number type (including even, say, matrices of matrices). This can be handled by type classes. The instances for Double can invoke the GSL; others may use Haskell routines.
Ok. Currently we can have "blocks" of Storable types. Block Double works with the GSL and Block Int is only in Num, using Haskell. Clearly we should admit any other number type as the base type.
If you are interested I have here some algorithm for computing the determinant of an n by n matrix in n^4 steps which does not need division and thus can be used for integers and polynomials.
Of course I would like to see it, and we can include it in the library... David:
The one thing I'd like to see (and here I agree with Henning) is a distinction between matrices and tensors. Your library is really very tensor-based, but tensors and matrices are very different beasts.
I imagine one could take your Block, which is really a sort of generalized tensor, and implement a Matrix type such as
data Matrix = M (Block Double)
(or perhaps for arbitrary element type) with the constructor not exported, so that Matrices will always be guaranteed to be two-dimensional.
Then for matrices one could define * to be matrix multiplication, sqrt, exp, cos, sin etc to be matrix functions (expm etc in octave-speak), and then define .* and ./ to be as defined in octave.
This definition would allow considerably more type-safeness than your current implementation (which seems scarily dynamically typed to me).
To me too! Due to the type trickery required to statically check matrix dimensions, in this first version I used a common type for blocks with any number of indices, which ruins many of the Haskell advantages. I will work on your idea. We can have different types for scalar, vector and matrix, with elementwise numeric operations, and using multiparameter classes we can statically check the typical multiplications and other functions. We can also define a constructor from vectors or matrices to arbitrary blocks, to be used (dynamically) in more general but less frequent situations.
Alas, we'd still not have the truly strong typing I'd dream about where one could define
matMul :: Int n, m, l => Matrix n m -> Matrix m l -> Matrix n l
which as I understand things, isn't possible in Haskell without some sort of weird type trickery. Of course, if you had this kind of type trickery, you might not need to declare a separate Matrix type, since you'd be able to represent the dimensionality of the Block in its type.
And hopefully you and he can work together to create a great library (and I'll be able to mooch off of whatever you create...). :)
Of course! I am very happy to know that other people are also interested in a simple "scientific computing" library for Haskell.

Alberto
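As an aside, here is a tiny sketch of the kind of "type trickery" David alludes to: dimensions carried as phantom, Peano-style type parameters. Nothing below comes from Alberto's library; it only shows that matMul's shape constraint can be written down directly, although the phantom dimensions are not checked against the actual list sizes (that is exactly the part that needs heavier machinery, such as the reflection technique in the next message):

import Data.List (transpose)

-- Peano-style naturals, used only as phantom dimension tags
data Zero   = Zero
data Succ n = Succ n

newtype Mat rows cols a = Mat [[a]] deriving Show

-- shapes compose statically: (n x m) times (m x l) gives (n x l)
matMul :: Num a => Mat n m a -> Mat m l a -> Mat n l a
matMul (Mat xs) (Mat ys) =
    Mat [ [ sum (zipWith (*) row col) | col <- transpose ys ] | row <- xs ]

type Two   = Succ (Succ Zero)
type Three = Succ Two

a :: Mat Two Three Double
a = Mat [[1,2,3],[4,5,6]]

b :: Mat Three Two Double
b = Mat [[1,0],[0,1],[1,1]]

c :: Mat Two Two Double      -- matMul b a would also be accepted (a 3x3 result),
c = matMul a b               -- but matMul a a is rejected at compile time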

Hi,

I've just looked through this discussion; since I'm working on my own library, I wanted to see what people are doing. It's something like this, using the Prepose (Implicit Configurations) paper:

data L n = L Int deriving (Show, Eq, Ord)

-- singleton domain
type S = L Zero

class (Bounded a, Ix a) => IxB a

instance ReflectNum n => Bounded (L n) where
    minBound = L 0
    maxBound = L $ reflectNum (__ :: n) - 1

mul :: (Num k, IxB a, IxB b, IxB c) => Matrix k a b -> Matrix k b c -> Matrix k a c
add :: (Num k, IxB a, IxB b) => Matrix k a b -> Matrix k a b -> Matrix k a b
vec :: (Num k, IxB a, IxB b) => Matrix k a b -> Matrix k (a,b) S

This way one can form a type "L n" which represents integers between 0 inclusive and n (rather, 'reflectNum (__ :: n)') exclusive, which can serve to index the matrices and vectors...

Of course, other index types are allowed, such as "Either a b" - if we want to, say, take the sum of two vector spaces, one indexed by "a" and the other by "b", then the result should be indexed by "Either a b", etc. - we never have to do anything with the type-level numerals other than assert equality, i.e. we don't have to be able to add or multiply them in our type signatures, since structural operations will suffice.

I think having the ability to guarantee through the type system that column and row dimensions are correct is of paramount importance to those who would use a matrix library, but so far in this thread I haven't seen any suggestions which would accomplish that. I'm sorry if I didn't read carefully. Does my approach not work? I haven't filled in the implementation yet, but it type-checks.

Frederik
participants (22)

- Aaron Denney
- Alberto Ruiz
- Andrew Pimlott
- Andrew U. Frank
- Benjamin Franksen
- Brian Boutel
- Bulat Ziganshin
- Claus Reinke
- Conal Elliott
- Dan Piponi
- David Roundy
- David Roundy
- Frederik Eaton
- Glynn Clements
- Henning Thielemann
- Jacques Carette
- John Meacham
- karczma@info.unicaen.fr
- Keean Schupke
- Lennart Augustsson
- Michael Walter
- Wilhelm B. Kloke