Re: Perspectives on learning and using Haskell

On Tue, 23 Dec 2003 17:26:20 +0000, Graham Klyne wrote:
I've spent part of the past few months learning Haskell and developing a moderately sized application. I came to this from a long background (20 years or so) of "conventional" programming in a variety of languages (from Fortran and Algol W to Java and Python). For me, learning Haskell has been one of the steepest learning curves of any new language that I have ever learned. Before this project, I was aware of some aspects of functional programming, but had never previously done any "in anger" (i.e. for real).
Well, the obvious question is: after climbing partway up that curve, what do you think of the view? Was it worth learning? Is it worth continuing to use? Does the code seem 'better' than what you might produce in other languages you've used?
Throughout this period, I've been accumulating some notes about some things that I found challenging along the way. The notes are not organized in any way, and they're certainly not complete. I've published them on my web site [1] in case the perspective might be useful to any "old hands" here.
[1] http://www.ninebynine.org/Software/Learning-Haskell-Notes.html
When I saw this page earlier, I was thinking of suggesting that you add it, and any new thoughts, to the wiki so that it's more readily accessible to other Haskell newbies and can be annotated with pointers to resources and comments; kind of like an annotated unidirectional version of "A Newbie's on-going tutorial" on the Squeak wiki (http://minnow.cc.gatech.edu/squeak/1928).
Also on the topic of perspectives:
In recent conversation with a colleague, he mentioned that the term "functional programming" has an image problem. He suggested that the term conveys an impression of an approach that is staid, non-progressive or lacking novelty, and is prone to elicit a response of "been there, done that" from programmers who don't realize the full significance of the term "functional". I've also noticed that when I talk about "functional programming", some people tend to think I'm talking about using techniques like functions in C or Pascal (which of course is very desirable, but old hat and not worthy of great excitement).
Yes. People often use the term 'procedural' to describe the C-style breakdown of a problem. Of course, some people use 'procedural' to refer to functional and others 'functional' to refer to procedural. Then, of course, there are also dynamic typers who equate static typing with C/Pascal. On the OO side, there are the people who equate OO with C++/Java, much to the chagrin of Smalltalkers and CLOS users. At any rate, functional programming is a pretty well established term, so there isn't much Haskell can do about it. Perhaps throwing in 'higher-order' will help, and certainly laziness/purity can't be said to be lacking novelty.

At 17:44 23/12/03 -0500, Derek Elkins wrote:
On Tue, 23 Dec 2003 17:26:20 +0000, Graham Klyne wrote: (moved to Haskell-Cafe as this reply might generate several more)
I've spent part of the past few months learning Haskell and developing a moderately sized application. I came to this from a long background (20 years or so) of "conventional" programming in a variety of languages (from Fortran and Algol W to Java and Python). For me, learning Haskell has been one of the steepest learning curves of any new language that I have ever learned. Before this project, I was aware of some aspects of functional programming, but had never previously done any "in anger" (i.e. for real).
Well, the obvious question is: after climbing partway up that curve, what do you think of the view? Was it worth learning? Is it worth continuing to use? Does the code seem 'better' than what you might produce in other languages you've used?
I think the view is good :-) Others have written about the advantages of FP, and I won't try to repeat that. My personal observations include:

(1) I notice that it's very easy to pick up and use third-party functions; there doesn't seem to be the square peg/round hole effect that one gets with third-party libraries in conventional programming languages. I don't know why this is, but two possibly contributing factors may be: (a) many Haskell expressions are just values, so in some respects they're closer to data than to code -- there isn't a procedural aspect to get in the way (e.g. no need to coordinate passage through the "von Neumann bottleneck"?); (b) the type system permits, even encourages, typing details that are not relevant to some function to be left unspecified.

(2) I find that I spend a far greater portion of my time *thinking* about what I'm trying to do than actually coding it. There used to be an adage in the software industry that programmer productivity in lines-of-code per day was independent of programming language: with Haskell, I suspect that old rule breaks down, because it seems that actually coding has become a relatively small part of the overall exercise.

(3) It has often been noted that despite the influx of "new" programming languages, little is truly new -- the act of programming is pretty much unchanged since, er, the 60's? I know the basic ideas of FP have been around for several decades, but I do think there is some truly new thinking being applied in the area of FP, compared with conventional languages. I think I do see some truly different approaches being explored in the FP community, and some of those do seem to have some real value (e.g. the way that Monads turn out to be such a flexible idea; I haven't yet started to look at arrows). Related to this, it seems to me that Haskell's lazy functional model is capable of subsuming many other programming models. One that I discovered quite early was mimicking the test-and-backtrack style of Prolog. More recently, looking at the likes of the "scrap your boilerplate" ideas, it seems that FP is capable of tackling the separation of "cross-cutting concerns" that seem to motivate the more recent Aspect Oriented Programming ideas, without really adding any new language features.

(4) I have noticed that it's often very easy to write some inefficient code (sometimes *very* inefficient) to do some task in Haskell, and then to optimize it. My current project is riddled with horrendous inefficiencies, but I don't care, because I am confident that I can optimize as and when I need to. (The GHC profiling system is one reason for this confidence, as is the availability of off-the-shelf libraries to improve many of my very inefficient data structures.)

(5) Working as I do in the area of Internet and Web protocols, I like the idea that I can (often) write code that is closely and "obviously" related to a specification that I'm trying to implement. (This is an area that I'd like to explore further.)

(6) I have found that Haskell seems to be a particularly effective platform for pursuing some ideas of extreme programming, and in particular test-led development. (I do this very imperfectly, but I'm learning ;-) For example, Dean Herington's unit testing framework seems to fit more comfortably into the host language than, say, the JUnit framework from which it takes its inspiration.
Adding new styles of test-case has been a breeze compared with my experience of using JUnit or PyUnit (and I have found those frameworks to be very satisfactory).

(7) For the first time since learning Pascal, I'm impressed at how reliable my Haskell code (messy as much of it may be by Haskell standards) has turned out to be.

If I sit here longer, I will probably think of more. But, in summary, I think a combination of:
- improved language implementations
- improved hardware
- new ideas about how to use/deploy FP
- the nature of some Internet/Web applications
means that FP in general and Haskell in particular can be used to good effect in real-world "industrial strength" projects.

There are some more of my comments about using Haskell here: http://www.ninebynine.org/RDFNotes/Swish/Intro.html#WhyHaskell (these are directed more toward an audience not familiar with Haskell or FP.)
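Returning to point (3) for a moment, here is a small sketch of the kind of Prolog-style test-and-backtrack code I mean (invented here purely for illustration, solving the classic SEND + MORE = MONEY puzzle), with the list monad playing the role of Prolog's nondeterministic search:

module Backtrack where

import Control.Monad (guard)
import Data.List ((\\))

-- Each generator is a choice point; 'guard' prunes failed branches,
-- just as failure triggers backtracking in Prolog.
sendMoreMoney :: [(Int, Int, Int)]
sendMoreMoney = do
  s <- [1..9]
  e <- [0..9] \\ [s]
  n <- [0..9] \\ [s,e]
  d <- [0..9] \\ [s,e,n]
  m <- [1..9] \\ [s,e,n,d]
  o <- [0..9] \\ [s,e,n,d,m]
  r <- [0..9] \\ [s,e,n,d,m,o]
  y <- [0..9] \\ [s,e,n,d,m,o,r]
  let send  = 1000*s + 100*e + 10*n + d
      more  = 1000*m + 100*o + 10*r + e
      money = 10000*m + 1000*o + 100*n + 10*e + y
  guard (send + more == money)
  return (send, more, money)

Laziness means that evaluating, say, take 1 sendMoreMoney explores no more of the search space than the demanded result requires.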
Throughout this period, I've been accumulating some notes about some things that I found challenging along the way. The notes are not organized in any way, and they're certainly not complete. I've published them on my web site [1] in case the perspective might be useful to any "old hands" here.
[1] http://www.ninebynine.org/Software/Learning-Haskell-Notes.html
When I saw this page earlier, I was thinking of suggesting that you add it, and any new thoughts, to the wiki so that it's more readily accessible to other Haskell newbies and can be annotated with pointers to resources and comments; kind of like an annotated unidirectional version of "A Newbie's on-going tutorial" on the Squeak wiki (http://minnow.cc.gatech.edu/squeak/1928).
I wondered about that... I'm more than happy for any or all of this material to be served by the Wiki. As a relative outsider/newcomer to this community, I wasn't sure if my offerings would be felt to be useful or welcome there. Your suggestion sounds good... if it's felt to be helpful, I'll ponder it and think about how I might present it. The page you cite seems to have the right kind of tone.

#g
------------
Graham Klyne
For email: http://www.ninebynine.org/#Contact

G'day all.
Quoting Graham Klyne
(2) I find that I spend a far greater portion of my time *thinking* about what I'm trying to do than actually coding it. There used to be an adage in the software industry that programmer productivity in lines-of-code per day was independent of programming language: with Haskell, I suspect that old rule breaks down, because it seems that actually coding has become a relatively small part of the overall exercise.
In theory, that's true of other languages too. Programmers are *supposed* to spend more time doing non-coding than coding (where I don't, for example, count writing unit tests as "coding"). I suspect that this is actually an artifact of your level of experience with Haskell. I know that I used to spend more time thinking than coding. When you get familiar enough, you get to the stage where you can think on your feet.
(4) I have noticed that it's often very easy to write some inefficient code (sometimes *very* inefficient) to do some task in Haskell, and then to optimize it.
This, I believe, is one of the most unappreciated (by non-declarative programmers, anyway) aspects of declarative programming. It's easy to write could-be-slow code that works first time, once you've fixed all the bugs caught by the compiler. Then, it's straightforward to swap out implementations and replace them with more efficient ones as the need arises.

This is possible to do in non-declarative languages. It's even encouraged in "agile" OO methodologies, like XP. However, it's so much less painful in a declarative language, because you're far less tempted to introduce hidden dependencies to begin with. In fact, I find that my "working but slow" code has almost no weird dependencies. I find myself having to _introduce_ them during the "get it faster" phase. From an engineering point of view, this is almost exactly the right thing, because that way you (in theory) end up with as few dependencies as possible.
(6) I have found that Haskell seems to be a particularly effective platform for pursuing some ideas of extreme programming,
There you go. :-)

There is only one problem I've found with test-driven development in Haskell. In C++, it's possible to break the "module" abstraction (yes, I know, C++ doesn't have modules; it has classes, which are really instantiable modules) by using "friend". In Haskell, I find myself occasionally having to expose parts of a module which I would prefer not to, in order for the unit test suite to do its job effectively.

I wonder if there might be a way to fix this, say, by allowing modules to selectively expose parts of their interface depending on who wants to use it.

Cheers,
Andrew Bromage

On Thu, 1 Jan 2004 21:07:00 -0500 ajb@spamcop.net wrote:
(6) I have found that Haskell seems to be a particularly effective platform for pursuing some ideas of extreme programming,
There you go. :-)
There is only one problem I've found with test-driven development in Haskell. In C++, it's possible to break the "module" abstraction (yes, I know, C++ doesn't have modules; it has classes, which are really instantiable modules) by using "friend". In Haskell, I find myself occasionally having to expose parts of a module which I would prefer not to, in order for the unit test suite to do its job effectively.
I wonder if there might be a way to fix this, say, by allowing modules to selectively expose parts of their interface depending on who wants to use it.
Well, the quick and dirty solution (at least with GHC) is to use GHCi and interpret the modules, which should keep some internals readily accessible. For example, I use the new -e option of GHC to run my unit tests.

The "nicer" way*, though it could use less typing and "extraneous" files, is simply to use multiple modules:

module FooInternals where

publicFoo :: Foo -> Bar
publicFoo x = privateFrob x

privateFrob :: Foo -> Bar
privateFrob x = ...

debugFoo :: (Foo -> Bar) -> Foo -> Bar
debugFoo f x = ...

module Foo ( publicFoo ) where
import FooInternals

module FooDebug ( publicFoo, debugFoo ) where
import FooInternals

* Okay, so it doesn't really solve the "problem"; it's just a way of structuring that avoids it.

G'day all.
Quoting Derek Elkins
The "nicer" way*, though it could use less typing and "extraneous" files, is simply use multiple modules.
Yes, this was my solution, too. Nested modules might make this even nicer. Of course, Haskell's module system could do with a redesign, but that's another topic. :-)

Cheers,
Andrew Bromage

G'day all.
One small note on style while I think of it.
Quoting Derek Elkins
module FooInternals where

publicFoo :: Foo -> Bar
publicFoo x = privateFrob x

privateFrob :: Foo -> Bar
privateFrob x = ...

debugFoo :: (Foo -> Bar) -> Foo -> Bar
debugFoo f x = ...

module Foo ( publicFoo ) where
import FooInternals

module FooDebug ( publicFoo, debugFoo ) where
import FooInternals
I would put debugFoo in FooDebug, and not export publicFoo; see the sketch below. The former is an advantage because unit test code doesn't end up in the executable, and the latter because it turns a triple maintenance problem into a double maintenance problem. (Should you add a new public function to FooInternals, you only have to mention it in Foo, and not in FooDebug as well.)

The reason why this is inelegant is that in Haskell, the unit of abstraction (i.e. the module) is the same as the unit of compilation (i.e. the file). If there are some budding researchers who are looking for a topic, I'd love a way to split a module across files without the situation degenerating into C-style textual inclusion.
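Concretely, the layout I mean looks something like this (a sketch reusing Derek's invented names, elided bodies and all):

module FooInternals where

publicFoo :: Foo -> Bar
publicFoo x = privateFrob x

privateFrob :: Foo -> Bar
privateFrob x = ...

module Foo ( publicFoo ) where
import FooInternals

-- Test-only code lives here, so it stays out of the executable;
-- clients of Foo never see it.
module FooDebug ( debugFoo ) where
import FooInternals

debugFoo :: (Foo -> Bar) -> Foo -> Bar
debugFoo f x = ...

Cheers,
Andrew Bromage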

On 2004-01-01 at 21:07EST ajb@spamcop.net wrote:
There is only one problem I've found with test-driven development in Haskell. In C++, it's possible to break the "module" abstraction (yes, I know, C++ doesn't have modules; it has classes, which are really instantiable modules) by using "friend". In Haskell, I find myself occasionally having to expose parts of a module which I would prefer not to, in order for the unit test suite to do its job effectively.
I wonder if there might be a way to fix this, say, by allowing modules to selectively expose parts of their interface depending on who wants to use it.
One of my unexplored ideas is to make tests part of the code of a module (probably best done with some sort of typed include mechanism for test data), linked in some way with the type of an entity. So one might write something like:

f :: Integer -> Integer
  |? f 0 == 1 && f 3 == 6

The compiler would then (optionally?) run the tests as part of the compilation. This would bind the tests more tightly to the programme than is now possible. As I say, I haven't explored this, so perhaps some of those agile minds out there could run with it?

--
Jón Fairbairn
Jon.Fairbairn@cl.cam.ac.uk

At 12:23 02/01/04 +0000, Jon Fairbairn wrote:
The compiler would then (optionally?) run the tests as part of the compilation. This would bind the tests more tightly to the programme than is now possible.
Ooh! There's an interesting idea. I guess it's like a kind of 'assert' that gets optimized out at compile-time?

#g
------------
Graham Klyne
For email: http://www.ninebynine.org/#Contact

At 21:07 01/01/04 -0500, ajb@spamcop.net wrote:
In Haskell, I find myself occasionally having to expose parts of a module which I would prefer not to, in order for the unit test suite to do its job effectively.
Yes, I've found that too. But I also wonder if this is a sign that the XP approach to test-led development isn't being followed faithfully. Theoretically (as I understand XP), the tests *are* the specification. And things that aren't exposed can't be part of the specification, can they?

In practice, I think that there's a slight tension here: tests may embody the specification, but they also embody some knowledge of the way the code works, and (I occasionally find) some may be created to provide a finer granularity of information about how the code is *mis*functioning. I'm finding my development strategy is evolving to make more use of separate "spike" modules to test code fragments, so there's less need for this kind of white-box influence in the test code.

Which leads to a question: I've been thinking that the "white box" tests may be better served by test expressions coded *within* the module concerned. In many cases, I create these, then comment them out when the code is working. I would expect that when using GHC to compile a stand-alone Haskell program, any expressions that are not referenced are not included in the final object program, so leaving these test cases uncommented would be harmless: is this so?
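For example (a toy module, all names invented), the kind of thing I have in mind:

module Stack ( Stack, empty, push, pop ) where

data Stack a = Stack [a]

empty :: Stack a
empty = Stack []

push :: a -> Stack a -> Stack a
push x (Stack xs) = Stack (x:xs)

pop :: Stack a -> Maybe (a, Stack a)
pop (Stack [])     = Nothing
pop (Stack (x:xs)) = Just (x, Stack xs)

-- White-box test expressions: not exported and not referenced from
-- the public interface, so (I hope) dead code to the compiler.
test1, test2 :: Bool
test1 = case pop (push 'a' empty) of
          Just (c, _) -> c == 'a'
          Nothing     -> False
test2 = case pop (empty :: Stack Char) of
          Nothing -> True
          Just _  -> False

#g
------------
Graham Klyne
For email: http://www.ninebynine.org/#Contact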

On Sun, 2004-01-04 at 10:20, Graham Klyne wrote:
Which leads to a question: I've been thinking that the "white box" tests may be better served by test expressions coded *within* the module concerned. In many cases, I create these, then comment them out when the code is working. I would expect that when using GHC to compile a stand-alone Haskell program, any expressions that are not referenced are not included in the final object program, so leaving these test cases uncommented would be harmless: is this so?
If your test functions are not exported, I would expect that this is the case.

I normally include test code in the same module. That way, I'm more likely to notice when I change/break the interface, as the test code then fails to compile. Then I can load each module up in ghci/hugs and run the tests. If the tests are sufficiently automated, I stick in a main function which runs all the tests.

Jon Fairbairn mentioned that it would be nice to automatically run the tests every time the module was compiled. You could do that with Template Haskell if the staging restrictions were relaxed a bit; at the moment you are not allowed to call functions defined in the same module.

On a side note, I've found QuickCheck to be great for these kinds of unit tests. As an example, I was able to turn someone else's code for a trie data structure into a multi-trie (like a bag is to a set) without fully understanding the code and still be fairly confident that the code was correct! I caught a couple of subtle bugs that I would never have found otherwise. I guess it worked well in that case because the properties of the data structure were few and easy to describe.
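To give a flavour of the style of property I mean (the multi-trie itself is too long to include, so this uses a toy list-based bag as a stand-in; all names are invented):

import Test.QuickCheck
import Data.List (delete, sort)

insertBag :: Ord a => a -> [a] -> [a]
insertBag = (:)

deleteBag :: Ord a => a -> [a] -> [a]
deleteBag = delete

countBag :: Ord a => a -> [a] -> Int
countBag x = length . filter (== x)

-- Inserting then deleting an element leaves the bag unchanged
-- (up to ordering).
prop_insertDelete :: Int -> [Int] -> Bool
prop_insertDelete x xs =
    sort (deleteBag x (insertBag x xs)) == sort xs

-- Insertion bumps the element's multiplicity by exactly one.
prop_insertCount :: Int -> [Int] -> Bool
prop_insertCount x xs =
    countBag x (insertBag x xs) == countBag x xs + 1

Evaluating quickCheck prop_insertDelete in GHCi then exercises the property on a hundred random cases.

Duncan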

G'day all.
Quoting Duncan Coutts
On a side note, I've found QuickCheck to be great for these kinds of unit tests. As an example, I was able to turn someone else's code for a trie data structure into a multi-trie (like a bag is to a set) without fully understanding the code and still be fairly confident that the code was correct!
I agree that QuickCheck is great for this kind of task, but I still find myself exporting unnecessary functions.

For example, suppose you are writing a balanced binary search tree-based FiniteMap data structure, and want to test it with QuickCheck. As well as testing the interface, you need to test at least three internal invariants after each operation:

- that the tree is balanced with respect to the balance condition,
- that the meta-data used to maintain the balance condition is correct, and
- that the tree is a binary search tree (i.e. that the keys are in sorted order if you do an inorder traversal).

For this simple example, one extra exported function should do the trick:

structuralInvariantFM :: (Ord k) => FiniteMap k v -> Bool

However, a more complex data structure with multiple invariants would require even more exported functions, because it would become more important to know which invariant was broken.
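To make that concrete, here is a sketch for a size-annotated tree (a simplified stand-in; a real FiniteMap would be more involved, and all names here are invented):

data FM k v = Leaf
            | Node Int k v (FM k v) (FM k v)  -- the Int caches the subtree size

structuralInvariantFM :: Ord k => FM k v -> Bool
structuralInvariantFM t = balanced t && sizesCorrect t && ordered t
  where
    count Leaf             = 0
    count (Node _ _ _ l r) = count l + count r + 1
    -- (1) balance: neither subtree is more than ~3x the other
    balanced Leaf = True
    balanced (Node _ _ _ l r) =
        (count l + count r <= 1
          || (count l <= 3 * count r && count r <= 3 * count l))
        && balanced l && balanced r
    -- (2) the cached meta-data agrees with an actual recount
    sizesCorrect Leaf = True
    sizesCorrect (Node n _ _ l r) =
        n == count l + count r + 1 && sizesCorrect l && sizesCorrect r
    -- (3) binary search tree: inorder keys are strictly ascending
    ordered t' = ascending (keys t')
    keys Leaf             = []
    keys (Node _ k _ l r) = keys l ++ [k] ++ keys r
    ascending ks = and (zipWith (<) ks (drop 1 ks))

Cheers,
Andrew Bromage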

Duncan Coutts wrote:
On Sun, 2004-01-04 at 10:20, Graham Klyne wrote:
[...] I would expect that when using GHC to compile a stand-alone Haskell program, any expressions that are not referenced are not included in the final object program, so leaving these test cases uncommented would be harmless: is this so?
If your test functions are not exported, I would expect that this is the case. [...]
Yes, unused functions which are not exported are nuked during compilation, even without using the -O flag. But don't guess, just ask GHC itself via its -ddump-occur-anal flag. (DISCLAIMER: I'm not responsible for the, well, slightly obscure name of this flag! :-)

There are a lot more flags of this kind; see: http://haskell.org/ghc/docs/latest/html/users_guide/options-debugging.html#D... When you are *really* curious, use -v5.

Simon^2: The -ddump-all and -ddump-most flags mentioned on the page above are not working anymore; -v5 / -v4 seem to do their job now. Should the documentation be fixed, or GHC?
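For example (module name invented), something like

ghc -c -ddump-occur-anal Foo.hs

and then look for the test bindings in the resulting dump.

Cheers,
S.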

G'day all.
Quoting Graham Klyne
But I also wonder if this is a sign that the XP approach to test-led development isn't being followed faithfully. Theoretically (as I understand XP), the tests *are* the specification. And things that aren't exposed can't be part of the specification, can they?
I don't understand XP this way. The tests are, at best, part of the specification of a module or component. However, the specification of the system as a whole is partially embodied in use cases and, ultimately, the on-site customer.
In practice, I think that there's a slight tension here: tests may embody the specification, but they also embody some knowledge of the way the code works, and (I occasionally find) some may be created to provide a finer granularity of information about how the code is *mis*functioning.
Don't you sometimes wish you had a Haskell coverage analysis tool? :-)
Which leads to a question: I've been thinking that the "white box" tests may be better served by test expressions coded *within* the module concerned.
That's not a bad idea, but there is a problem with this, IMO. When programs get big, the cost of compiling becomes very significant.

Assume for a moment that the size of a program grows linearly over its lifetime, that the cost of compiling a program is at least linear in the size of the source code, that recompilations happen regularly, and that hardware is not upgraded. (The last assumption is true in many real-world cases; often, build machines are kept "frozen" to minimise the risk of OS/compiler/etc changes introducing changes in the produced software.) A little thought shows that the cost of compiling this program over its lifetime is quadratic in its final size.

It's a subtle cost, because it's amortised over the lifetime of the software, but reducing the cost of a single compilation pass really does pay off. It follows that introducing code which is immediately thrown away in most cases is probably the wrong thing for many software development tasks.
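To spell out that little thought (just the arithmetic, under the stated assumptions): over n recompilations of a program whose size at recompilation t is proportional to t, with per-compile cost at least linear in size, the total cost is

\sum_{t=1}^{n} c \, t = c \, \frac{n(n+1)}{2} = \Theta(n^2)

which is quadratic in the final size, since the final size is itself proportional to n.

Cheers,
Andrew Bromage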

ajb@spamcop.net writes:
There is only one problem I've found with test-driven development in Haskell. In C++, it's possible to break the "module" abstraction (yes, I know, C++ doesn't have modules; it has classes, which are really instantiable modules) by using "friend". In Haskell, I find myself occasionally having to expose parts of a module which I would prefer not to, in order for the unit test suite to do its job effectively.
My one problem with test-driven Haskell is how to do it with QuickCheck tests. It's easy enough with HUnit, but I'd like to try it with QuickCheck; any suggestions?
I wonder if there might be a way to fix this, say, by allowing modules to selectively expose parts of their interface depending on who wants to use it.
What about GHC's new -main-is flag to specify a test main function? Then you may be able to write test code without exporting internal functions.

As for tighter integration of tests with code: I wrote an example of one-button unit testing in Emacs on the HaskellMode page on the HaWiki, and the Programatica editor, as demonstrated at Haskell Workshop 2003, has the ability to embed 'certificates' that can be proofs, unit tests, etc. Check out the Evidence Management section here: http://www.cse.ogi.edu/~hallgren/Programatica/HW2003/demoabstract.html

There's also the darcs_test parts of darcs: you can assign a script to run tests after a variety of darcs commands.

None of these run the tests at compile time, but it's better than manually running the tests.
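For the -main-is idea, a sketch (file and function names invented) might look like:

module Foo ( publicFoo, testMain ) where

publicFoo :: Int -> Int
publicFoo = privateFrob

privateFrob :: Int -> Int
privateFrob = (+ 1)

-- The tests can see privateFrob because they live in the same
-- module; only testMain itself needs to be exported.
testMain :: IO ()
testMain = print (privateFrob 1 == 2 && publicFoo 41 == 42)

built with something like: ghc --make -main-is Foo.testMain Foo.hs -o foo-tests

--
Shae Matijs Erisson - 2 days older than RFC0226
#haskell on irc.freenode.net - We Put the Funk in Funktion
10 PRINT "HELLO" 20 GOTO 10 ; putStr $ fix ("HELLO\n"++)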