
Sorry for the delayed response, which has now led to two separate threads. See below.
From: Henning Thielemann
On Wed, 29 Jun 2005, Conal Elliott wrote:
On row & column vectors, do you really want to think of them as {1,...,n} -> R? They often represent linear maps from R^n to R or from R to R^n, which are very different types. Similarly, instead of working with matrices, how about linear maps from R^n to R^m? In this view, column and row vectors, matrices, and often scalars are useful as representations of linear maps.
We should not identify things which can be mapped bijectively. 1" and 1 are very different [...]
I think matrices and derivatives are very different issues. [...]
Of course. My suggestion is to use linear maps instead of vectors or matrices when the vectors or matrices serve as representations of linear maps.
I have often seen that the first derivative is considered as a vector, and the second derivative is considered as a matrix.
I'm guessing you mean for derivatives of functions in R^n->R.
In this spirit it is used like x^T * (D2 f)(x) * x, but this is only an abuse of the common multiplication definitions. A good interpretation and notation should seamlessly extend to higher derivatives. But the interpretation above does not work in higher dimensions.
What does work, I think, for all degrees of derivatives and all dimensions of vector spaces (and well beyond R^n), is keeping a clear distinction between linear maps and representations of linear maps. Linear maps get composed and applied, but certainly not multiplied.
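For concreteness, here is a minimal sketch of that distinction in Haskell (the names LMap, applyL, composeL, and fromMatrix are illustrative, not from any particular library):

  -- Illustrative names; not from any particular library.
  -- A linear map kept abstract: it can be applied and composed,
  -- but there is no general "multiply" on it.
  newtype LMap u v = LMap { applyL :: u -> v }

  composeL :: LMap v w -> LMap u v -> LMap u w
  composeL (LMap g) (LMap f) = LMap (g . f)

  -- A matrix is only one possible *representation* of an LMap
  -- between finite-dimensional spaces:
  type Matrix = [[Double]]

  fromMatrix :: Matrix -> LMap [Double] [Double]
  fromMatrix rows = LMap (\x -> map (sum . zipWith (*) x) rows)

Applying fromMatrix m to a vector via applyL recovers the usual matrix-vector product, but the only way to combine two LMaps is to compose them; nothing tempts us to "multiply" them.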
I like the following type for derivation:

  derive :: ((i -> a) -> b) -> ((i -> a) -> (i -> b))

Here i is the index type, (i -> a) is the vector type, and b is the type the vector function maps to.
This formulation is not much more general than R^n (i.e., {1,...,n} -> R). The vectors are still restricted to "tuples" (indexed by i) of elements of the same type, right?
Its derivative has the same type of argument, but the result is a vector with indices of type i. You see that it is easy to repeat the application of 'derive': just replace b by, say, i -> b. The second derivative yields vectors of type (i -> i -> b). This can be interpreted as a matrix because it has two indices, but it is certainly not a matrix which represents a linear mapping as usual; it is a matrix representing a bilinear form. The only thing we need is a multiplication to reduce one level of indices:

  mul :: (i -> c) -> (i -> b) -> b

What we still need, though, is a general (overloaded?) definition of the scaling of b by c and of a sum of b.
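To make the index-based proposal concrete, here is a minimal numerical sketch, specialised to Double components and a forward-difference approximation (the name deriveNum, the step size h, and the explicit index list passed to mul are assumptions for illustration; the fully general version needs exactly the overloaded scaling and sum mentioned above):

  -- Forward-difference approximation of the proposed 'derive', specialised
  -- to Double components (the general version needs overloaded scaling/sum).
  deriveNum :: Eq i => ((i -> Double) -> Double) -> ((i -> Double) -> (i -> Double))
  deriveNum f x j = (f (bump j) - f x) / h
    where
      h         = 1e-6                                -- assumed step size
      bump j' k = if k == j' then x k + h else x k    -- perturb one component

  -- 'mul' contracts one level of indices; here c and b coincide in a Num
  -- instance and the index range is passed explicitly:
  mul :: Num b => [i] -> (i -> b) -> (i -> b) -> b
  mul ixs c v = sum [c i * v i | i <- ixs]

For example, with i = Bool as a two-element index type and f x = x True * x False, the value deriveNum f x True approximates x False.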
I prefer something like the following instead, where "VS s u" means that u is a vector space over the scalar field s.
derive :: (VS s u, VS s v) => (u -> v) -> (u -> LMap u v)
Since VS s (LMap u v) holds, the result of derive may be fed back to derive. Second derivatives then have type u -> LMap u (LMap u v), where, as we'd expect, LMap u (LMap u v) is isomorphic to the type of bilinear maps from u to v. By using LMap instead of Matrix, we're not tempted to confuse (for instance) linear and bilinear maps, just as you pointed out.

- Conal
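A skeletal rendering of this formulation, for concreteness (the class methods, the LANGUAGE extensions, and the function-wrapper LMap are assumptions of this sketch; only the derive signature itself is taken from the discussion above):

  {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
               FlexibleInstances, UndecidableInstances #-}

  -- Sketch only: class methods and extensions are assumptions, not a
  -- particular library's API.
  class VS s v | v -> s where
    zeroV  :: v
    addV   :: v -> v -> v
    scaleV :: s -> v -> v

  -- Invariant (by convention in this sketch): the wrapped function is linear.
  newtype LMap u v = LMap (u -> v)

  -- Linear maps between vector spaces over s again form a vector space over s ...
  instance (VS s u, VS s v) => VS s (LMap u v) where
    zeroV                  = LMap (const zeroV)
    addV (LMap f) (LMap g) = LMap (\u -> addV (f u) (g u))
    scaleV s (LMap f)      = LMap (scaleV s . f)

  -- ... so 'derive' can be iterated: the second derivative is a linear map
  -- into linear maps, i.e. the curried form of a bilinear map.
  derive :: (VS s u, VS s v) => (u -> v) -> (u -> LMap u v)
  derive = undefined  -- depends on how u -> v is represented; omitted here

  derive2 :: (VS s u, VS s v) => (u -> v) -> (u -> LMap u (LMap u v))
  derive2 = derive . derive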