Re: [Haskell] Typing in haskell and mathematics

Hello,
On Fri, 28 Jan 2005 10:01:33 -0500, Jacques Carette
The previous post on record syntax reminded me of some 'problems' I had noticed where Haskell and mathematics have a (deep) usage mismatch. It is interesting that my sentiment is exactly the opposite. Mathematicians (the applied kind, not the pure) tend to be quite sloppy about their notation, relying _a lot_ on the context and the fact that people know what they are talking about. I find that this makes their explanations harder to understand rather than easier.
First, consider a syntax for other component-wise function application? For example, it would be convenient to have (f,g) @ (x,y) be (f x, g y). In some languages [with dynamic typing], one can even do (f,g) (x,y) :-) I am not sure what you mean, but you can define exactly that (except for the @ sign which is special syntax already): (f,g) `ap2` (x,y) = (f x, g y)
Yes, I am aware that this is untypeable in Haskell, because polymorphism is straight-jacketed by structural rules. What do you mean by this: ap2 :: (a -> b, c -> d) -> (a,c) -> (b,d)
But
in mathematics, it is convenient and extremely common to: 1) look at the type a -> b as being a *subtype* of b (though I have never seen it phrased that way in print) I don't think subtyping is what you are thinking of. Perhaps you are referring to the correspondence between b-expressions with a free a-variable, and functions a->b. In Haskell we don't have expressions with free variables as first class values. There might be some interesting things to work out for such beasts.
2) to consider every 'natural transformation' available and use it (tuples of functions <-> obvious function on tuples) Are you suggesting that natural transformations should be applied automatically? There are many problems with this: which operations should happen automatically? Type inference (if it works at all) will be hard to do, but perhaps these are technical details. Fundamentally the "sloppy" math notation assumes that what the author wrote is correct and should simply be given "the obvious" meaning. This is pretty much the idea with dynamic typing. I find the restrictions of static typing very convenient (well most of the time :-) as they capture "stupid' errors.
Seeing a -> b as a subtype of b allows one to take any function f : b -> b -> c (like + ) and 'lift' it to ff : (a -> b) -> (a -> b) -> (a -> c) via the obvious definition ff g h = \x -> f (g x) (h x) This is done routinely in mathematics. The meaning of the expression (sin + cos) should be immediately obvious. Such 'lifting' is used even in first-year calculus courses, but not in Haskell.
Apart from the fact that this does not seem like a good idea, you can already do things like this in Haskell: liftOp op f g x = f x `op` g x This is all you need, especially if you are prepared to stick to the "usual" mathematical convention of not using curried functions, but instead using tuples to pass multiple arguments.
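For concreteness, here is a minimal sketch of such point-wise lifting; the operator names (.+.) and (.*.) are made up for this illustration and are not from the original post:

  -- point-wise lifting of arithmetic to functions, so that
  -- (sin .+. cos) denotes \x -> sin x + cos x
  (.+.), (.*.) :: Num b => (a -> b) -> (a -> b) -> (a -> b)
  f .+. g = \x -> f x + g x
  f .*. g = \x -> f x * g x

  -- e.g.  (sin .+. cos) 0  ==  sin 0 + cos 0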
The "same" phenomenon is what allows one to 'see' that there is an obvious function apply_tuple:: (a->b,c->d) -> (a,c) -> (b,d) so that (f,g) (a,b) only has one possible meaning, if the built-in apply function were to be overloaded.
There are many interesting things one could do if application was to be overloaded but this hardly seems like a motivating example. You can already use the 'ap2' combinator from above. I can see that it is annoying to have ap2, ap3, etc. You can define a generic "ap_n" using dependent types, or using the encoding tomasz posted, but that hardly seems to be a "deep" problem.
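As an illustration of collapsing the ap2, ap3, ... family into one name, here is a speculative sketch using multi-parameter type classes and functional dependencies; the class name ApplyTuple is made up here, and this is not the encoding Tomasz posted:

  -- needs GHC extensions (-fglasgow-exts): multi-parameter classes,
  -- functional dependencies, flexible instances
  class ApplyTuple fs xs ys | fs -> xs ys where
    apN :: fs -> xs -> ys

  instance ApplyTuple (a -> b, c -> d) (a, c) (b, d) where
    apN (f, g) (x, y) = (f x, g y)

  instance ApplyTuple (a -> b, c -> d, e -> h) (a, c, e) (b, d, h) where
    apN (f, g, h) (x, y, z) = (f x, g y, h z)

  -- apN (negate, show) (3 :: Int, True)  ==  (-3, "True")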
Similarly, there are "obvious" maps apply_T :: T (a->b) -> a -> T b for any datatype constructor T (as has been noticed by all who do 'generic' programming).
Is this really true? Consider: data T a = T (a -> Int) What should apply_T do? applyT :: (essentially) ((a -> b) -> Int) -> a -> (b -> Int) perhaps: applyT (T f) a = T (\b -> f (const b)) but that probably does not behave as you would like it. Of course you could restrict the datatypes for which apply_T can be derived. This means that
[f, g, h] x == [f x, g x, h x] but also (in mathematics at least) {f, g, h} x == {f x, g x, h x} where {} denotes a set, and the equality above is extensional equality, and so on. I agree that it would be nice to automatically derive functions such as apply_T, etc. This (as you point out) is the idea of generic programming. This however does not suggest that we should replace application...
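For the covariant case this is a one-liner; a minimal sketch (applyF is a name chosen here to avoid clashing with the applyT above):

  -- for any Functor t, the "obvious" map  t (a -> b) -> a -> t b
  applyF :: Functor t => t (a -> b) -> a -> t b
  applyF fs x = fmap ($ x) fs

  -- applyF [sin, cos, tan] 0  ==  [sin 0, cos 0, tan 0]
  -- (Data.Set, whose map needs an Ord constraint, has no Functor
  --  instance, which is one place the "obvious" map already fails.)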
Note that some computer algebra systems use #1 and #2 above all the time to define the operational meaning of a lot of syntactic constructs [where I personally extended the use of such rules in the implementation & operational semantics of Maple]. What hope is there to be able to do something 'similar' in a Haskell-like language? I think it is already possible to do similar things. I am sure there is room for improvement, but it is important that the consequences of changing a language are worked out in detail. It is time for me to go to school :-) bye Iavor

On Fri, 28 Jan 2005, Iavor Diatchki wrote:
Hello,
On Fri, 28 Jan 2005 10:01:33 -0500, Jacques Carette
wrote: The previous post on record syntax reminded me of some 'problems' I had noticed where Haskell and mathematics have a (deep) usage mismatch. It is interesting that my sentiment is exactly the opposite. Mathematicians (the applied kind, not the pure) tend to be quite sloppy about their notation, relying _a lot_ on the context and the fact that people know what they are talking about. I find that this makes their explanations harder to understand rather than easier.
This seems to be related to what I wrote yesterday http://www.haskell.org/pipermail/haskell-cafe/2005-January/008893.html I've collected some examples of abuse and bad mathematical notation: http://www.math.uni-bremen.de/~thielema/Research/notation.pdf

[I have now subscribed to haskell-cafe]
Henning Thielemann
This seems to be related to what I wrote yesterday http://www.haskell.org/pipermail/haskell-cafe/2005-January/008893.html
Yes, very much. Except that rather than trying to tell mathematicians that they are *wrong*, I am trying to see which of their notations I can explain (in a typed way). There will be some 'leftovers', where the notation is simply bad.
I've collected some examples of abuse and bad mathematical notation: http://www.math.uni-bremen.de/~thielema/Research/notation.pdf
Some of what you point out there is bad notation. Other bits point to some misunderstanding of the issues. Starting from your original post:
f(x) \in L(\R) where f \in L(\R) is meant
F(x) = \int f(x) \dif x where x shouldn't be visible outside the integral
First, mathematicians like to write f(x) to indicate clearly that they are denoting a function. This is equivalent to writing down f \in (_ -> _), with domain and range omitted. Second, every mathematician knows that \int f(x) \dif x == \int f(y) \dif y (ie alpha conversion), so that, combined with the previous convention, there is no confusion in writing F(x) = \int f(x) \dif x. It is just as well-defined as (\x.x x) (\x.x x), which requires alpha-conversion too for proper understanding. You also write
O(n) which should be O(\n -> n) (a remark by Simon Thompson in The Craft of Functional Programming) but the only reason for this is that computer scientists don't like open terms. Since the argument to O must be a univariate function with range the Reals, whatever is written there must *denote* such a function. The term ``n'' can denote only one such function, \n -> n. So the mathematician's notation is in fact much more pleasing.
However, you have to remember one crucial fact: a set of typographical symbols is meant to *denote* a value; the symbols are not values in themselves. So there is always a function around, the ``denotes'' function, that is implicitly applied. Russell's work in the early 1900s on "sense and denotation" is worth reading if you want to learn more about this. What is definitely confusing is the use of = with O notation. The denotation there is much more complex - and borders on incorrect.
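For reference, here is the set-of-functions reading of O that this point presupposes (a standard definition, restated here for comparison):

  O(g) = { f | \exists C > 0, n_0 . \forall n \ge n_0 . |f(n)| \le C * |g(n)| }

so f \in O(\n -> n) is an ordinary membership statement, while the customary f(n) = O(g(n)) silently overloads = as \in, which is exactly the confusing use of = mentioned above.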
a < b < c which is a short-cut of a < b \land b < c
That is plain confusion between the concepts of ``denotation'' and ``value''. Where < is a denotation of a binary function a x a -> bool, _ < _ < _ is a mixfix denotation of a constraint, which could be denoted in a long-winded fashion by p a b c = b \in (a,c), where I am using mathematical notation for the body above.

On your ``notation.pdf'' (link above), you have some other mis-interpretations. On p.10 you seem to think that Mathematica is a lazy language, when it is in fact an eager language, so your interpretation does not ``make sense''. That is not to say that your observation is incorrect.

In Maple, there are two functions, eval(expr, x=pt) and subs(x=pt, expr), which do ``similar'' things. But subs is pure textual substitution (ie the CS thing), whereas 2-argument eval means "evaluate the function that \x -> expr denotes at the point pt" (ie the math thing). The interesting thing is that "the function that \x -> expr denotes" is allowed to remove (removable) singularities in its ``denotation'' map. However,
subs(x=2,'diff'(ln(x),x)) ; diff(ln(2),2) where the '' quotes mean to delay evaluation of the underlying function. On the other hand eval('diff'(ln(x),x),x=2) ; eval('diff'(ln(x),x),x=2) because it makes no sense to evaluate an open term which introduces a (temporary) binding for one of its variables.
Note that without access to terms, it is not possible to write a function like diff (or derive as you phrase it, or D as Mathematica calls it). Mathematician's diff looks like it has the signature diff: function -> function, but it is in fact treated more often as having the signature diff: function_denotation -> function_denotation (a small sketch of such a term-level diff is appended after this message). But you can see from a post by Oleg in December on the haskell list how it is possible (with type classes) in Haskell to automatically pair up function and function_denotation.

You also seem to assume that set theory is the only axiomatization of mathematics that counts (on p.31). I do not see functions A -> B as somehow being subsets of powerset(A x B). That to me is one possible 'implementation' of functions. This identification is just as faulty as the one you point out on p.14 of the naturals not ``really'' being a subset of the rationals. In both cases, there is a natural embedding taking place, but it is not the identity.

You also have the signature of a number of functions not-quite-right. ``Indefinite'' integration does not map functions to functions, but functions to equivalence classes of functions. Fourier transforms (and other integral transforms) map into functionals, not functions.

I hope your audience (for things like slide 35) was made of computer scientists - it is so condescending to thousands of mathematicians that it is amazing you would not get immediately booted out!

On p.37, you have polynomials backwards. Polynomials are formal objects, but they (uniquely) denote a function. So polynomials CANNOT be evaluated, but they denote a unique function which can.

On p.51, where you speak of ``hidden'' quantifiers, you yourself omit the thereexists quantifiers that are implicit on the last 2 lines -- why?

The above was meant to be constructive -- I apologize if it has come across otherwise. This is an under-studied area, and you should continue to look at it. But you should try a little harder not to assume that thousands of mathematicians over a couple of hundred years (at least) are that easily ``wrong''. Redundant, sub-optimal, sure. Jacques
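The term-level diff sketch referred to above: a made-up Expr type (not Oleg's construction) on which diff manifestly has type function_denotation -> function_denotation, together with the eval that recovers the denoted function.

  data Expr = X | Const Double
            | Add Expr Expr | Mul Expr Expr
            | Sin Expr | Cos Expr

  -- differentiation works on terms (denotations) ...
  diff :: Expr -> Expr
  diff X         = Const 1
  diff (Const _) = Const 0
  diff (Add u v) = Add (diff u) (diff v)
  diff (Mul u v) = Add (Mul (diff u) v) (Mul u (diff v))
  diff (Sin u)   = Mul (Cos u) (diff u)
  diff (Cos u)   = Mul (Mul (Const (-1)) (Sin u)) (diff u)

  -- ... while eval maps a term to the function it denotes
  eval :: Expr -> Double -> Double
  eval X         x = x
  eval (Const c) _ = c
  eval (Add u v) x = eval u x + eval v x
  eval (Mul u v) x = eval u x * eval v x
  eval (Sin u)   x = sin (eval u x)
  eval (Cos u)   x = cos (eval u x)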

On Fri, 28 Jan 2005, Jacques Carette wrote:
Henning Thielemann
wrote: This seems to be related to what I wrote yesterday http://www.haskell.org/pipermail/haskell-cafe/2005-January/008893.html
Yes, very much. Except that rather than trying to tell mathematicians that they are *wrong*, I am trying to see which of their notations I can explain (in a typed way). There will be some 'leftovers', where the notation is simply bad.
I know it's problematic to say that a notation is "wrong", since no one is authorized to say what's "right". Though, if a notation is contradictory, I don't know what's wrong with calling it wrong.

Let me motivate a bit more why I'm interested in mathematical notation. I (and certainly every mathematician) learnt that, for instance, equations are a powerful tool to solve problems. You can model many problems with a mathematical equation, you can transform the equation using some accepted rules and eventually you get some simplified equation like 'x=2'. The problem is abstracted, you do the transformations without thinking about an interpretation in the real world. You know that you can trust the rules. If someone shows you that the simplified equation is wrong, you know that you have applied some rule in the wrong way or you invented an improper rule.

There are many tools which have been introduced in the past to make mathematics more convenient, more safe, somehow better. Arabic numerals almost completely replaced Roman numerals because they are more convenient for computations. Variables and formulas were introduced as a more precise alternative to phrases in natural languages. Axioms replaced intuitive definitions. But there are still issues which are unsatisfying. E.g. on the one hand calculus and functional analysis "degrade" functions to ordinary objects of mathematics, thus reaching a new level of complexity, but on the other hand the notation is not as properly developed as the theory itself, in my opinion.

Let me give an example: The teacher says that solving a simple differential equation is rather easy by using separation of variables. Then he transforms a differential equation formally by some rules he doesn't explain in detail, and I'm curious whether there _is_ a good explanation:

  y' = y
  y' / y = 1        | integrate
  C + ln y = x
  ln y = x - C
  y = exp (x - C)

The student thinks: That's nice, let's try it with some problem I couldn't solve recently:

  cos x = x^2       | differentiate
  -sin x = 2*x

So, x=0 is an obvious solution to the second equation but not to the first one. The student concludes: If the teacher applies the method, it's right; if I apply it, it's wrong. :-( Then he asks the teacher why the method fails for cos x = x^2. The teacher says, of course what you did cannot work, because y' = y is a differential equation and cos x = x^2 is not. Does this answer help the student? Why doesn't the formula show this difference? What, essentially, is the difference between a 'differential equation' and a 'normal equation'?

The technique of differentiation has led to lots of notations. I think this is so because many of the existing notations don't model the idea of differentiation very well, thus new notations are introduced when the known notations fail: d x, \partial x, ', \dot{x}, f_x, D. I wonder if these notations can be unified, thus simplified. I wonder what formalism I can trust. E.g. I'm sceptical about notations using variables. What is the value of d x for certain values of x? What is the difference between d x and d y? Are f_x(x,y) and f_x(y,x) the same?

My proposal: Consider derivation as a higher-order function which maps a function to a function. So in Haskell notation it would be

  derive :: (a -> a) -> (a -> a)

More generally, if we have functions (b -> a) as vectors and want to derive a function mapping from vector arguments to scalar values ((b -> a) -> a) (also called functionals), it would have the signature

  deriveVector :: ((b -> a) -> a) -> ((b -> a) -> (b -> a))

E.g. deriveVector f x 1 evaluates the partial derivative of f with respect to the first component at the point x, and deriveVector f x is the total derivative or gradient at point x. I know that after introducing higher-dimensional differentiation using \partial x, \partial y, d x and d y, students are uncertain about what the total derivative actually is.

Let's turn back to the example and view it through the functional glasses. The differential equation y' = y now means that y is a function and (') is the differentiation operator. With liftF2 I lift (/) to function values (a plausible definition of liftF2 follows right after this paragraph), and I divide both sides with the lifted (/) by y, noting that this is only possible if y is nowhere zero:

  derive y = y
  liftF2 (/) (derive y) y = liftF2 (/) y y
  liftF2 (/) (derive y) y = const 1          | integrate c
    (integrate fulfills: fint = integrate c f  =>  fint 0 = c,
     though it is problematic if the result has a singularity at 0)
  (c1+) . ln . y = (c+) . id                 (with c1 such that c1 + ln (y 0) = c)
  ln . y = ((c-c1)+)                         | (exp .)
  exp . ln . y = exp . ((c-c1)+)
  y = exp . ((c-c1)+)

This is certainly not intuitive if you are familiar with the common notation, but it is a method where I can explain every step; it is a formalism I can trust. And it is important to have such a formalism at hand if you encounter a problem with the common abbreviated mathematical style. Now it is clear why cos x = x^2 can't be solved using differentiation: this is an equation of scalar values, whereas differential equations are function equations.
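The definition of liftF2 left implicit above; a plausible spelling-out (the original post does not give it):

  liftF2 :: (b -> c -> d) -> (a -> b) -> (a -> c) -> (a -> d)
  liftF2 op f g = \x -> f x `op` g x

  -- so  liftF2 (/) (derive y) y  is the function  \x -> derive y x / y x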
I've collected some examples of abuse and bad mathematical notation: http://www.math.uni-bremen.de/~thielema/Research/notation.pdf
Some of what you point out there is bad notation. Other bits point to some misunderstanding of the issues.
Starting from your original post:
f(x) \in L(\R) where f \in L(\R) is meant
F(x) = \int f(x) \dif x where x shouldn't be visible outside the integral
First, mathematicians like to write f(x) to indicate clearly that they are denoting a function.
f(x) \in L(\R) clearly denotes that f(x) is a function, that is, f(x)(y) is a scalar value. There are situations where it is sensible to claim f(x) \in L(\R), so why make things more complicated by sticking to that notation in the wrong cases? f \in L(\R) is one of the examples where the _right_ notation is shorter, thus easier to read, than the _wrong_ one. Yes, I call f(x) \in L(\R) _wrong_, because it is commonly agreed that L(\R) is a set of functions and that each element of such a set is a function, thus if you write f(x) \in L(\R) then you cannot treat f(x) as a scalar value later on.
You also write
O(n) which should be O(\n -> n) (a remark by Simon Thompson in The Craft of Functional Programming)
but the only reason for this is that computer scientists don't like open terms. Since the argument to O must be a univariate function with range the Reals, whatever is written there must *denote* such a function. The term ``n'' can denote only one such function, \n -> n.
What about (\m -> n) ? What does O(n^m) mean? O(\n -> n^m) or O(\m -> n^m) ?
Russell's work in the early 1900s on "sense and denotation" is worth reading if you want to learn more about this.
Nice to know.
On your ``notation.pdf'' (link above), you have some other mis-interpretations. On p.10 you seem to think that Mathematica is a lazy language, when it is in fact an eager language.
I know that Mathematica is both eager and a mix of functional and imperative programming.
However,
subs(x=2,'diff'(ln(x),x)) ;
diff(ln(2),2) where the '' quotes mean to delay evaluation of the underlying function. On the other hand
eval('diff'(ln(x),x),x=2) ;
eval('diff'(ln(x),x),x=2) because it makes no sense to evaluate an open term which introduces a (temporary) binding for one of its variables.
That's what I dislike: To overcome particular problems, patches like quotes and eval are introduced. They are rather hard to combine, hard to nest. I spent hours explaining to Mathematica expressions that I had clearly in my mind. We can have a very clean solution without quotes and subs and eval, by just considering functions as regular mathematical objects.
Note that without access to terms, it is not possible to write a function like diff (or derive as you phrase it, or D as Mathematica calls it). Mathematician's diff looks like it is has signature diff: function -> function, but it in fact is treated more often as having signature diff: function_denotation -> function_denotation.
Fine, let's consider this example in detail: If I write + in Mathematica, I know that the value of the resulting expression will be the sum of the values of the expressions of the operands. This is true independently of the particular expressions of the operands. This constraint makes working with algebraic formulas very handy. Whether I write f(4) or f(2+2) in mathematics doesn't matter; both denote the same value independent of f, because 4 and 2+2 denote the same value. If x=4 then f(x) denotes the same value as f(4). In Haskell this constraint is fulfilled, and it is essential for the simplification rules that GHC applies.

I like to have this behaviour for derivation, too. So of what type must the parameter of 'derive' be? A scalar expression with a free variable, or a function expression? The designers of Mathematica and Maple decided for the first alternative. But the derivation function D breaks the constraint of free combinability. If x=4 then D[x,x] is 0 or an error (I can't check it currently); if x is not bound to a value it is 1.

What I suggest is that derivation should be limited to functions. If two expressions denote the same function, then D will derive two expressions that may not be equal but denote the same function. Indeed, the implementation of D must know the expression of the function to give the result as a formula, but if it only knew the set of argument-value pairs it would work, too. This is the view on functions which Fourier brought to us. Before Fourier, only functions given as analytical terms were considered functions. The definition of the differential quotient doesn't need the particular expression for a function; it doesn't even require that there is an explicit expression (see the small sketch below). This means the derived function will depend only on the function to be derived! Following this paradigm, Simplify must implement the identity function (identity with respect to the value, not identity with respect to the expression) and ReplaceAll should be replaced by function application.
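A small illustration of that last point, that a derivative needs no expression for the function at all; this is only a crude difference quotient, not the exact derive discussed above:

  -- central difference quotient with step h; works for any f,
  -- whatever expression (if any) the function came from
  deriveApprox :: Double -> (Double -> Double) -> (Double -> Double)
  deriveApprox h f x = (f (x + h) - f (x - h)) / (2 * h)

  -- deriveApprox 1e-6 (\x -> x*x) 3  is approximately 6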
You also seem to assume that set theory is the only axiomatization of mathematics that counts (on p.31).
I'm also aware of Russell's type theory, but my impression is that set theory is the most popular one.
I do not see functions A -> B as somehow being subsets of powerset(A x B). That to me is one possible 'implementation' of functions. This identification is just as faulty as the one you point out on p.14 of the naturals not ``really'' being a subset of the rationals. In both cases, there is a natural embedding taking place, but it is not the identity.
I agree, though I'm not aware of some symbols to express that. :-(
You also have the signature of a number of functions not-quite-right. ``indefinite'' integration does not map functions to functions, but functions to equivalence classes of functions.
Yes, I should add a parameter for the integration constant. An explicit integration constant seems to be the better choice, rather than a set-valued result, since then it is clear that for every real number there is one function.
Fourier transforms (and other integral transforms) map into functionals, not functions.
The Fourier transform I had in mind is defined as F(w) = \int_{\R} e^{-i w x} f(x) d x So F is clearly a function, though I agree that the Fourier transform defined this way is not a total function.
I hope your audience (for things like slide 35) was made of computer scientists - it is so amazingly condescending to thousands of mathematicians, it is amazing you would not get immediately booted out!
The audience were exclusively mathematicians - my colleagues. Being booted out by people who never thought intensively about what they write, but who only defend their habits, wouldn't be a problem for me. The same colleagues have run into trouble before by trusting formalisms which are odd, though I believe not often enough, so far. But I can wait ... :-)
On p.37, you have polynomials backwards. Polynomials are formal objects, but they (uniquely) denote a function. So polynomials CANNOT be evaluated, but they denote a unique function which can.
Yes, I should better stick to the application homomorphism \varphi which maps a polynomial to a function. That is p is a polynomial, \varphi(p) is a function and \varphi(p)(x) is a scalar value.
On p.51 where you speak of ``hidden'' quantifiers, you yourself omit the thereexists quantifiers that are implicit on the last 2 lines -- why?
What do you mean exactly? In the meantime I found that it is better not to speak about equations but about the set of solutions. That is, writing, or at least thinking, {x : 0 = x^2 + 2*x + 1} and {y : forall x . y'(x) = x + sin x} makes clear what we are searching for, and it explains what care has to be taken for every equation transformation.
The above was meant to be constructive
Thanks for the comments!

A comment on a long text... Henning Thielemann writes: ... some examples of transparency of notation based on 2+2=4 ...
I like to have this behaviour for derivation, too. So of what type must the parameter of 'derive' be? A scalar expression with a free variable, or a function expression? The designers of Mathematica and Maple decided for the first alternative. But the derivation function D breaks the constraint of free combinability. If x=4 then D[x,x] is 0 or an error (I can't check it currently); if x is not bound to a value it is 1. What I suggest is that derivation should be limited to functions. If two expressions denote the same function, then D will derive two expressions that may not be equal but denote the same function. Indeed, the implementation of D must know the expression of the function to give the result as a formula, but if it only knew the set of argument-value pairs it would work, too. This is the view on functions which Fourier brought to us.
Now: Please don't abuse the examples based on differentiation in order to point out the difference between 'expressions with variables' and 'functions'. It is simply NOT TRUE that only functions can be differentiated. Differentiation is an ALGEBRAIC procedure; people acquainted with differential algebra know that. The differentiation (derivation) operator is a linear operator which fulfils the Leibniz rule when acting on products. If the underlying algebra (a ring, a field, etc.) fulfils some other constraints, if it has some topology - this is ALL.

People who do the so-called 'automatic differentiation' know that thanks to this it is possible to formally differentiate numerical programs WITHOUT any symbolic manipulation of the program text; a "variable" 'x' is a GENERATOR of an appropriate differential algebra. In such a way you can write in Haskell - what I did some years ago - a program which differentiates purely numerical infinite lists, representing the value of an expression followed by a tower of all its relatives wrt. a specified generator (a rough sketch of such towers is appended after this message). Math is not a question of notation, but of structure.

And - please - stop (this is not addressed to Henning) calling this or that "stupid". Sometimes things you don't like inspire people to find some nice implementable solutions, even if outside a classical typed framework. The double ordering a < b < c is a very well known contraption in Icon. The operator "<" may confirm the inequality or FAIL. The failure is something which propagates across all embedded expressions until it is caught. If it succeeds, then it returns the SECOND argument. In such a way, going from left to right, "<" gives the correct result. This is not 'functional', but I like it. Jerzy Karczmarczuk
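A rough sketch in the spirit of the lazy derivative towers described above (a reconstruction for illustration, not Jerzy's actual code): a number is represented by its value followed by the infinite tower of its derivatives with respect to the chosen generator.

  data D = D Double D   -- value, then the tower of all derivatives

  dConst :: Double -> D          -- constants: c, 0, 0, 0, ...
  dConst c = D c (dConst 0)

  dVar :: Double -> D            -- the generator x: x, 1, 0, 0, ...
  dVar x = D x (dConst 1)

  dAdd, dMul :: D -> D -> D
  dAdd (D a a') (D b b') = D (a + b) (dAdd a' b')
  dMul u@(D a a') v@(D b b') =   -- Leibniz rule, applied lazily
    D (a * b) (dAdd (dMul a' v) (dMul u b'))

  derivatives :: D -> [Double]
  derivatives (D v d) = v : derivatives d

  -- take 3 (derivatives (dMul (dVar 3) (dVar 3)))  ==  [9.0, 6.0, 2.0]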

On Mon, 31 Jan 2005 karczma@info.unicaen.fr wrote:
Now:
Please don't abuse the examples based on differentiation in order to point out the difference between 'expressions with variables' and 'functions'. It is simply NOT TRUE that only functions can be differentiated. Differentiation is an ALGEBRAIC procedure; people acquainted with differential algebra know that.
Ok, then generalize 'function' to other mathematical objects you like. E.g. polynomials are a good example. For a polynomial p (a tuple) I can define D point-wise as D p = (p_1, 2*p_2, 3*p_3, ..., n*p_n). Then D fulfills the Leibniz rule D (p0*p1) = D p0 * p1 + p0 * D p1, which is probably what you mean. But I still doubt that it is necessary to accept 'expressions' as mathematical objects.
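Spelled out as a small sketch (representation assumed here: a polynomial as its coefficient list [p_0, p_1, ..., p_n]):

  type Poly = [Double]

  polyD :: Poly -> Poly                -- D p = (p_1, 2*p_2, ..., n*p_n)
  polyD []     = []
  polyD (_:ps) = zipWith (*) [1..] ps

  addPoly :: Poly -> Poly -> Poly
  addPoly xs     []     = xs
  addPoly []     ys     = ys
  addPoly (x:xs) (y:ys) = (x + y) : addPoly xs ys

  polyMul :: Poly -> Poly -> Poly
  polyMul []     _  = []
  polyMul (p:ps) qs = addPoly (map (p *) qs) (0 : polyMul ps qs)

  -- Leibniz rule, checkable in GHCi:
  --   polyD (polyMul p q) == addPoly (polyMul (polyD p) q) (polyMul p (polyD q))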
The differentiation (derivation) operator is a linear operator which fulfils the Leibniz rule when acting on products.
Linearity of D means for me: D (k*p) = k * D p and D (p0+p1) = D p0 + D p1. But what is the linearity of an operator acting on an _expression_? For instance the Simplify function, which is rather an identity function, is not linear with respect to expressions:

  Simplify[3*4] = 12

but if it were linear with respect to expressions it would have to be

  Simplify[3*4] = 3 * Simplify[4]
  Simplify[3*4] = 3 * 4
Math is not a question of notation, but of structure.
Notation _is_ written structure. As my examples should show there is a benefit if notation matches the structure of the considered problem.
The double ordering: a < b < c is a very well known contraption in Icon. The operator "<" may confirm the inegality or FAIL. The failure is something which propagates across all embedded expressions until it is caught. If it succeeds, then it returns the SECOND argument. In such a way, going from left to right, "<" gives the correct result. This is not 'functional' but I like it.
I also use this short-cut, but I don't expect a programming language to make it work, because you need a lot of explanation to make it work, thus making the compiler and source code processors more complicated. All that just for sticking to a habit! By the way, there is a nice functional implementation of a < b < c:

  zapWith :: [a -> a -> b] -> [a] -> [b]
  zapWith fs xs = zipWith3 id fs xs (tail xs)

usage:

  and zapWith [ (<), (<) ] [a, b, c]

On Mon, 31 Jan 2005, Henning Thielemann wrote:
On Mon, 31 Jan 2005 karczma@info.unicaen.fr wrote:
The double ordering: a < b < c
By the way there is a nice functional implementation of a < b < c:
zapWith :: [a -> a -> b] -> [a] -> [b] zapWith fs xs = zipWith3 id fs xs (tail xs)
usage:
and zapWith [ (<), (<) ] [a, b, c]
sorry, must be and $ zapWith [ (<), (<) ] [a, b, c]
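(For reference: zipWith3 id fs xs (tail xs) applies each comparison to consecutive elements of the list; id (<) a b is just a < b, so the expression above reduces to and [a < b, b < c].)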

[My apologies for not seeing the related discussion yesterday on haskell-cafe, I was not subscribed until just now]
Iavor Diatchki
On Fri, 28 Jan 2005 10:01:33 -0500, Jacques Carette
wrote: The previous post on record syntax reminded me of some 'problems' I had noticed where Haskell and mathematics have a (deep) usage mismatch. It is interesting that my sentiment is exactly the opposite. Mathematicians (the applied kind, not the pure) tend to be quite sloppy about their notation, relying _a lot_ on the context and the fact that people know what they are talking about. I find that this makes their explanations harder to understand rather than easier.
But every program you write in Haskell depends on a lot of context too! It depends on the (complete) semantics of the programming language, which includes quite a lot of type theory. And more to the point, it includes the context of all the builtin operators as well as the (large) set of functions in Prelude. That is a lot of context. The 'only' difference with mathematics is that Haskell forces you to use 'import' statements to bring in more context.
First, consider a syntax for other component-wise function application? For example, it would be convenient to have (f,g) @ (x,y) be (f x, g y). In some languages [with dynamic typing], one can even do (f,g) (x,y) :-) I am not sure what you mean, but you can define exactly that (except for the @ sign which is special syntax already): (f,g) `ap2` (x,y) = (f x, g y)
[I should have looked harder for better syntax] I meant to a) overload application, b) do it for all tuples at once. Building all of Nat into the names of my functions does not appeal to me...
Yes, I am aware that this is untypeable in Haskell, because polymorphism is straight-jacketed by structural rules. What do you mean by this: ap2 :: (a -> b, c -> d) -> (a,c) -> (b,d)
I mean that you cannot have ap2, ap3, ap4, etc. *and* apply all sharing the 'same' type. But shouldn't they?
1) look at the type a -> b as being a *subtype* of b (though I have never seen it phrased that way in print) I don't think subtyping is what you are thinking of. Perhaps you are referring to the correspondance between b-expressions with a free a-variable, and functions a->b. In Haskell we don't have expressions with free variables as first class values. There might be some interesting things to work out for such beasts.
Expressions with free variables as first-class values are indeed an obvious 'next' thing to want, if one is to parallel mathematics. But I really mean to look at any b-producing function as being of type b, and that includes a -> b.
2) to consider every 'natural transformation' available and use it (tuples of functions <-> obvious function on tuples) Are you suggesting that natural transformations should be applied automatically? There are many problems with this: which operations should happen automatically?
At least the ones that you get 'for free' for structural reasons.
Type inference (if it works at all) will be hard to do, but perhaps these are technical details. Fundamentally the "sloppy" math notation assumes that what the author wrote is correct and should simply be given "the obvious" meaning.
Haskell's type inference engine does the same thing -- it assumes that the code you wrote is meaningful, and thus gives it the *only* type that makes sense. Care has been taken so that in Haskell "the obvious" == "the only", which is certainly a good property to have. Mathematicians who are a bit careful with their notation, do the same. They make sure that "obvious" and "only" match when they drop sub/superscripts, some environment dependence, etc.
This is pretty much the idea with dynamic typing. I find the restrictions of static typing very convenient (well most of the time :-) as they capture "stupid' errors.
I disagree. I wrote tens of thousands of lines of code in a (decent) dynamically typed language. Now I am writing thousands of lines of code in Haskell. I also appreciate all my "stupid" errors being caught - this is a welcome advantage, and I rather like it. But I really really hate having to duplicate code for really dumb reasons. So what I want is not dynamic typing, just more powerful static typing.
Seeing a -> b as a subtype of b allows one to take any function f : b -> b -> c (like + ) and 'lift' it to ff : (a -> b) -> (a -> b) -> (a -> c) via the obvious definition ff g h = \x -> f (g x) (h x) This is done routinely in mathematics. The meaning of the expression (sin + cos) should be immediately obvious. Such 'lifting' is used even in first-year calculus courses, but not in Haskell. Apart from the fact that this does not seem like a good idea, you can already do things like this in Haskell: liftOp op f g x = f x `op` g x
Of course. My point is that you need to explicitly use liftOp. I want it to be implicit. Note that this does NOT break type inference, if type inference is done via constraint solving and a->b is a subtype of b, though it does make it harder.
This is all you need, especially if you are prepared to stick to the "usual" mathematical convention of not using curried functions, but instead using tuples to pass multiple arguments.
Oh, but curried functions are way too nice to give up!
Similarly, there are "obvious" maps apply_T :: T (a->b) -> a -> T b for any datatype constructor T (as has been noticed by all who do 'generic' programming). Is this really true? Consider: data T a = T (a -> Int) What should apply_T do? applyT :: (essentially) ((a -> b) -> Int) -> a -> (b -> Int) perhaps: applyT (T f) a = T (\b -> f (const b)) but that probably does not behave as you would like it. Of course you could restrict the datatypes for which apply_T can be derived.
Good point - since my intuition was built using covariant types, it does break down on contravariant types. I meant that it would be ok for applyT :: T (a -> (b -> Int)) -> a -> T (b -> Int) But I did talk about 'polynomial functors' (or was that a different email?), and your example is most definitely not.
I agree that it would be nice to automatically derive functions such as apply_T, etc. This (as you point out) is the idea of generic programming. This however does not suggest that we should replace application...
Not replace, augment/generalize. Jacques

On Fri, 28 Jan 2005, Jacques Carette wrote:
First, consider a syntax for other component-wise function application? For example, it would be convenient to have (f,g) @ (x,y) be (f x, g y). In some languages [with dynamic typing], one can even do (f,g) (x,y) :-) I am not sure what you mean, but you can define exactly that (except for the @ sign which is special syntax already): (f,g) `ap2` (x,y) = (f x, g y)
[I should have looked harder for better syntax] I meant to a) overload application, b) do it for all tuples at once. Building all of Nat into the names of my functions does not appeal to me...
There was some similar discussion about allowing function application notation for other objects than functions, e.g. FiniteMaps. This would be rather like treating polynomials as functions, a problem you pointed out in my slides. I don't feel good about such implicit liftings. Haskell already allows one, in many cases, to factor out common work, but letting it do that automatically creates a lot of new problems that you can't currently estimate. I've worked with MatLab, which does not distinguish between scalar values, single element vectors and 1x1 matrices - it's just awful, because sooner or later you start writing functions to work around MatLab's automatisms just to get a consistent behaviour for 1-dimensional vectors and 1x1 matrices.
Expressions with free variables as first-class values is indeed an obvious 'next' thing to want, if one is to parallel mathematics.
Like the expressions which D and Lim and Plot in Mathematica require as arguments? I think they are responsible for all the mess in Mathematica and Maple.
This is pretty much the idea with dynamic typing. I find the restrictions of static typing very convenient (well most of the time :-) as they capture "stupid' errors.
I disagree. I wrote tens of thousands of lines of code in a (decent) dynamically typed language. Now I am writing thousands of lines of code in Haskell. I also appreciate all my "stupid" errors being caught - this is a welcome advantage, and I rather like it. But I really really hate having to duplicate code for really dumb reasons. So what I want is not dynamic typing, just more powerful static typing.
At this point I wonder why it is necessary to have these features in Haskell. Aren't there other languages which are closer to what you want?