
Jerry wrote
However, my problems are:
* I still don't understand most of the code I found, like the various Haskell libraries
Practice is the only answer to this problem, as Keith Wansbrough says.
* I still have no clue about most (OK, almost all) of what is being discussed on this mailing list
Though I am a CS student, highly interested in Haskell and practising a lot, I didn't understand most of the discussions on this mailing list. I started understanding them only after I got involved in implementing a Haskell compiler. Therefore, don't worry about this point. A Haskell user doesn't need to know the ins and outs of a Haskell compiler.
Rijk-Jan van Haaften

Unfortunately I think I have to take issue with this statement :).
Therefore, don't worry about this point. A Haskell user doesn't need to know the ins and outs of a Haskell compiler.
I think that at the beginning, a Haskell user doesn't need to know the ins and outs, but from personal experience, the more complicated your programs become, and the more you begin worrying about speed, the more you need to know about the compiler. Personally, I find this quite aggravating, and I think it is one of the least often expressed disadvantages of Haskell. I can see arguments against me immediately, so I'll save you time.

For instance, you could argue that it is easier to get "up and running" (well, more like walking, in Haskell) in Haskell than in, say, C. I would probably agree. But let's say it comes time to fine-tune my program. In C I have (an admittedly made-up example, but see my recent post to the mailing list on this subject):

    void sumProdTo(int n, int *s, int *p)
    {
        int sum = 0;
        int prod = 1;
        int i;
        for (i = 1; i <= n; i++)
            sum += i;
        *s = sum;
        for (i = 2; i <= n; i++)
            prod *= i;
        *p = prod;
    }

and in Haskell:

    sumProdTo n = (sumTo n, prodTo n)
        where sumTo 1 = 1
              sumTo n = n + sumTo (n-1)
              prodTo ...

etc. Now, I want to optimize. In C, I can fuse the loops so I'm only traversing once, which would give me a moderate speed improvement. In Haskell, I could also "fuse" the loops and say:

    sumProdTo 1 = (1,1)
    sumProdTo n = (n+s, n*p)
        where (s,p) = sumProdTo (n-1)

but this is actually *slower*, and there's no way for me to know this without actually running it. In fact, in the discussion about this on the list, no one was really able to give a definitive answer as to why. There was a lot of handwaving about pointers to the heap to pointers to other things, and obviously it has to do with boxing and unboxing and things like that, but I think this is a problem.

I'm not sure what the moral is. Obviously, each of these could, in theory, have equal speed given a good enough optimizing compiler (and, perhaps, no issues of undecidability). However, since in C it's much clearer exactly what is happening, you know when you write your code that one thing is going to be faster than another (except when you get to very complex things like worrying about cache sizes, etc., but that's out of the scope of this discussion -- you could never even try to worry about something like that in Haskell).

I think the only real solution would be to put up a web page or something that contains things like "this looks like it would speed up your program but it will actually slow it down," or at least some better tips on getting your programs to run quickly.

Okay, I'm done. That's not to discourage the original poster from learning Haskell -- I use it almost exclusively; it's just that I don't think it's fair to say you don't have to understand what the compiler is doing to write code.

 - Hal

--
Hal Daume III                                    | hdaume@isi.edu
"Computer science is no more about computers     | www.isi.edu/~hdaume
 than astronomy is about telescopes." - Dijkstra |
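A minimal sketch of how one might actually time such a comparison, assuming GHC and the standard getCPUTime function (the module was called CPUTime in Haskell 98 and System.CPUTime in the later hierarchical libraries):

    import System.CPUTime (getCPUTime)

    -- the "fused" tupled version from above
    sumProdTo :: Int -> (Int, Int)
    sumProdTo 1 = (1, 1)
    sumProdTo n = (n + s, n * p)
        where (s, p) = sumProdTo (n - 1)

    main :: IO ()
    main = do
        t0 <- getCPUTime
        print (sumProdTo 50000)   -- printing forces both components;
                                  -- the product overflows Int, which is
                                  -- harmless for timing purposes
        t1 <- getCPUTime
        putStrLn ("CPU time in picoseconds: " ++ show (t1 - t0))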

Hal Daume III about Haskell optimization problems: ...
and in Haskell:
    sumProdTo n = (sumTo n, prodTo n)
        where sumTo 1 = 1
              sumTo n = n + sumTo (n-1)
              prodTo ...
etc.
Now, I want to optimize. In C, I can fuse the loops so I'm only traversing once, which would give me a moderate speed improvement. In Haskell, I could also "fuse" the loops and say:
    sumProdTo 1 = (1,1)
    sumProdTo n = (n+s, n*p)
        where (s,p) = sumProdTo (n-1)
but this is actually *slower*, and there's no way for me to know this without actually running it. In fact, in the discussion about this on the list, no one was really able to give a definitive answer as to why. There was a lot of handwaving about pointers to the heap to pointers to other things, and obviously it has to do with boxing and unboxing and things like that, but I think this is a problem.
I didn't follow that discussion, but let's be serious. Really. Your second version constructs and destroys plenty of tuples, ephemeral data structures which live for one step only, and this is obviously costly. "No way to know this?" Are you sure?

WHAT is a problem? I see only one: a Haskell user should master the essentials of a reasonably professional functional style.

    sumProdTo n = spt n 1 1
        where spt 1 s p = (s,p)
              spt n s p = spt (n-1) (n+s) (n*p)

The only touchy point in such circumstances may be laziness, which can be dealt with. The above is closer to the optimized C code than your version is.
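One standard way to deal with that laziness (a sketch using Haskell 98's seq; strictness annotations in later compilers achieve the same thing) is to force both accumulators at each step, so they do not pile up as unevaluated thunks:

    sumProdTo :: Int -> (Int, Int)
    sumProdTo n = spt n 1 1
        where spt 1 s p = (s, p)
              spt k s p = let s' = k + s
                              p' = k * p
                          in s' `seq` p' `seq` spt (k-1) s' p'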
I'm not sure what the moral is. ...
I think the only real solution would be to put up a web page or something that contains things like "this looks like it would speed up your program but it will actually slow it down," or at least some better tips on getting your programs to run quickly.
For me the moral is: I disagree with
... it's just that I don't think it's fair to say you don't have to understand what the compiler is doing to write code.
This is not a question of this or that *compiler*, but of understanding the basics of data processing, independent of the language. I am horrified by the idea of putting down "this looks like it would speed up your program..." in cases where it is rather clear that it might not. Please try the same experiment in C with dynamically allocated tuples.

Jerzy Karczmarczuk

On Thu, 28 Feb 2002, Jerzy Karczmarczuk wrote:
I didn't follow that discussion, but let's be serious. Really. Your second version constructs and destroys plenty of tuples, ephemeral data structures which live for one step only, and this is obviously costly. "No way to know this?" Are you sure?
And yet there's no reason I shouldn't get update-in-place on those tuples. What's more, if I create my own data type for tuples which is strict in both of its arguments and use a function which is strict in the tuple (yes, even if I use proper tail recursion), it's still slower.
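For concreteness, the kind of strict pair Hal describes might look like this (a hypothetical sketch; his actual code does not appear in the thread):

    data SP = SP !Int !Int   -- strict in both components

    sumProdTo :: Int -> SP
    sumProdTo n = spt n (SP 1 1)
        where spt 1 acc      = acc
              spt k (SP s p) = spt (k-1) (SP (k+s) (k*p))

    -- Matching on SP forces the pair, and the strict fields force the
    -- sum and product as each new SP is built, so this is properly
    -- tail-recursive with no thunk build-up.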
WHAT is a problem? I see only one: a Haskell user should master the essentials of a reasonably professional functional style.
    sumProdTo n = spt n 1 1
        where spt 1 s p = (s,p)
              spt n s p = spt (n-1) (n+s) (n*p)
This is obviously the preferred solution, but it seems there should be no performance difference between this and:

    sumProdTo n = spt n (1,1)
        where spt 1 acc   = acc
              spt n (s,p) = spt (n-1) (n+s, n*p)

but there is a huge difference, even if I put seqs in to force evaluation of the sum and product before creating the tuple. But this is obviously a made-up situation. (More further down.)
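The seq'd variant Hal mentions would look something like this (again a sketch, not his actual code):

    sumProdTo :: Int -> (Int, Int)
    sumProdTo n = spt n (1, 1)
        where spt 1 acc    = acc
              spt k (s, p) = let s' = k + s
                                 p' = k * p
                             in s' `seq` p' `seq` spt (k-1) (s', p')

    -- Both components are forced before the new pair is built, yet a
    -- fresh pair is still allocated on every step, which is presumably
    -- where the cost he measured comes from.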
... it's just that I don't think it's fair to say you don't have to understand what the compiler is doing to write code.
This is not a question of this or that *compiler*, but of understanding the basics of data processing, independent of the language. I am horrified by the idea of putting down "this looks like it would speed up your program..." in cases where it is rather clear that it might not. Please try the same experiment in C with dynamically allocated tuples.
So constructing and tearing apart tuples, you would say, displays a lack of understanding of the basics of data processing in Haskell. Let's look at the definition of the standard library function mapAccumL:
    mapAccumL :: (a -> b -> (a, c)) -> a -> [b] -> (a, [c])
    mapAccumL f s []     = (s, [])
    mapAccumL f s (x:xs) = (s'', y:ys)
        where (s' , y ) = f s x
              (s'', ys) = mapAccumL f s' xs
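As an illustration of how mapAccumL threads a (state, value) pair through a list (an example added here, not from the thread; the function lives in the List module in Haskell 98 and in Data.List in later libraries):

    import Data.List (mapAccumL)

    runningSums :: (Int, [Int])
    runningSums = mapAccumL (\acc x -> (acc + x, acc)) 0 [1, 2, 3, 4]
    -- == (10, [0, 1, 3, 6]); one pair is built and torn apart per element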
This is clearly the same problem: it's constantly creating and tearing apart tuples. Or unfoldr:
    unfoldr f b = case f b of
                      Nothing     -> []
                      Just (a, b) -> a : unfoldr f b
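And an illustrative use of unfoldr (again an example added here; each step allocates a Just and a pair):

    countdown :: Int -> [Int]
    countdown = unfoldr (\k -> if k == 0 then Nothing else Just (k, k - 1))
    -- countdown 5 == [5, 4, 3, 2, 1]
    -- (uses the unfoldr quoted above, or import Data.List (unfoldr))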
The list goes on. Clearly, creating small intermediate structures (like tuples) is central to Haskell. In fact, for speed reasons I have frequently written my own versions of the above functions which remove the tuple creation, because it simply makes things too slow. This is *not* at all what Haskell is about. It's about writing functions which are small and modular and have good reuse. That's why these functions are in the standard libraries. You can also observe the frequent use of functions which take a state and return a (state, value) pair; using functions like these pushes the creation and destruction of tuples very far.

Given the importance tuples and other small data structures have in Haskell, I found it hard to believe that using them would cause me to suffer such a severe performance penalty. I *assumed* that since they were so widely used and so integral to the language, the compilers would do a much better job than they do of being intelligent about using them. I found this assumption to be incorrect, and that's the primary reason I think you need to know more about what's going on in the compiler to write fast programs.

 - Hal
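One shape such a hand-specialised, tuple-free rewrite could take is continuation-passing style, where the state and the result list are handed to a continuation instead of being packed into a pair at each step (a hypothetical sketch, not Hal's actual code; whether it is really faster depends on how the compiler handles the closures):

    -- tuple-free variant of mapAccumL, specialised to a running sum
    mapAccumSum :: Int -> [Int] -> (Int -> [Int] -> r) -> r
    mapAccumSum acc []     k = k acc []
    mapAccumSum acc (x:xs) k =
        mapAccumSum (acc + x) xs (\acc' ys -> k acc' (acc : ys))

    -- mapAccumSum 0 [1,2,3,4] (\total ys -> (total, ys)) == (10, [0,1,3,6])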

First off, let me say that I'm nowhere near the expert that Simon Peyton Jones, Simon Marlow, John Hughes (related to Sarah, the Olympic gold medalist? No, probably not) and many, many others on these Haskell lists are. I think many of those experts have Master's or PhD degrees relating to the subject. (I wouldn't mind earning one myself!) I just came upon this language (through some help from another person on this list), found it to be extremely expressive and orthogonal in design, and, frankly, fell in love with it. Now then:

On Thu, 28 Feb 2002, Hal Daume III wrote:
Unfortunately I think I have to take issue with this statement :).
Therefore, don't worry about this point. A Haskell user doesn't need to know the ins and outs of a Haskell compiler.
I think that at the beginning, a Haskell user doesn't need to know the ins and outs, but from personal experience, the more complicated your programs become, and the more you begin worrying about speed, the more you need to know about the compiler. Personally, I find this quite aggravating, and I think it is one of the least often expressed disadvantages of Haskell. I can see arguments against me immediately, so I'll save you time.
Well, your arguments can be applied to any high-level language. (That's kind of the point.) I do wish there were more to help the user learn about the details of compilers and optimization techniques and such. (What about these rewrite rules, for instance? How would one (sanely) write them? Does Haskell implement any kind of deforestation? How about this strictness analysis business?) But part of the problem, I think, is that we don't have a concise, well-developed set of rules and guidance on when to apply them. That's part of the research that's going on.

<snip>
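To make the rewrite-rules question concrete, here is the classic map/map fusion rule in GHC's RULES pragma syntax (a standard example from GHC's documentation, not something worked out in this thread; rules only fire when compiling with optimization, e.g. -O):

    {-# RULES
    "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
      #-}

    -- With this rule in scope, GHC may rewrite two list traversals
    -- into one: a small instance of deforestation.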
Now, I want to optimize. In C, I can fuse the loops so I'm only traversing once, which would give me a moderate speed improvement. In Haskell, I could also "fuse" the loops and say:
    sumProdTo 1 = (1,1)
    sumProdTo n = (n+s, n*p)
        where (s,p) = sumProdTo (n-1)
but this is actually *slower*, and there's no way for me to know this without actually running it. In fact, in the discussion about this on the list, no one was really able to give a definitive answer as to why. There was a lot of handwaving about pointers to the heap to pointers to other things, and obviously it has to do with boxing and unboxing and things like that, but I think this is a problem.
Yes, but then again, you could probably make the same kinds of statements about C: perhaps for some code A the compiler is smart enough to allocate a register for a variable, but for other code B, the rule of thumb that fired for the first piece of code doesn't fire. (No, I can't give examples, since I am not a C compiler expert either.)
I'm not sure what the moral is. Obviously, each of these could, in theory, have equal speed given a good enough optimizing compiler (and, perhaps, no issues of undecidability). However, since in C it's much clearer exactly what is happening, you know when you write your code that one thing is going to be faster than another (except when you get to very complex things like worrying about cache sizes, etc., but that's out of the scope of this discussion -- you could never even try to worry about something like that in Haskell).
I've heard of cases where some C (or C++) compiler might inline and partially execute code when optimization flags are used. (The point being: you may not know what's going on with that compiler either.)
I think the only real solution would be to put up a web page or something that contains things like "this looks like it would speed up your program but it will actually slow it down," or at least some better tips on getting your programs to run quickly.
I guess that's what the Haskell Wiki is attempting to do. We do need some central repository of Haskell information, tips, and the like, I agree. Is the Haskell Wiki what everybody agrees upon?

Jay Cox
participants (4)
- Hal Daume III
- Jay Cox
- Jerzy Karczmarczuk
- Rijk J. C. van Haaften