Type System vs Test Driven Development

Cafe,

In every language I program in, I try to be as disciplined as possible and use Test-Driven Development. That is, every language except Haskell. There are a few great benefits that come from having a comprehensive test suite with your application:

1. Refactoring is safer/easier
2. You have higher confidence in your code
3. You have a sort of 'beacon' to show where code breakage occurs

Admittedly, I don't believe there is any magical benefit that comes from writing your tests before your code. But I find that when I don't write tests first, it is incredibly hard to go back and write them for 'completed' code.

But as mentioned, I don't write unit tests in Haskell. Here's why not. When I write Haskell code, I write functions (and monadic actions) that are either a) so trivial that writing any kind of unit/property test seems silly, or b) composed of other trivial functions using equally-trivial combinators.

So, am I missing the benefits of TDD in my Haskell code? Is the refactoring I do in Haskell less safe? I don't think so. I would assert that there is no such thing as refactoring with the style of Haskell I described: the code is already super-factored, so any code reorganization would be better described as "recomposition." When "recomposing" a program, it's incredibly rare for the type system to miss an introduced error, in my experience.

Am I less confident in my Haskell code? On the contrary. In general, I feel more confident in Haskell code WITHOUT unit tests than in code in other languages WITH unit tests!

Finally, am I missing the "error beacon" when things break? Again, I feel like the type system has me covered here. One of the things that immediately appealed to me about Haskell is that the strong type system gives the feeling of writing code against a solid test base. The irony is that the type system (specifically the IO monad) forces you to structure code that would be very easy to test, because logic code is generally separated from IO code.
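The last point, that IO forces a testable shape, can be sketched with a toy example (the word-counting task and all names here are made up for illustration, not from the thread): the logic is a pure function you can test directly, and the IO shell is too thin to need testing.

```haskell
import Data.Char (toLower)
import Data.List (group, sort)

-- Pure logic: trivially testable without any IO.
wordCounts :: String -> [(String, Int)]
wordCounts = map (\ws -> (head ws, length ws)) . group . sort . words . map toLower

-- Thin IO shell: reads a file and prints the counts. Nothing here to unit test.
reportCounts :: FilePath -> IO ()
reportCounts path = readFile path >>= mapM_ print . wordCounts

main :: IO ()
main = print (wordCounts "the cat saw the dog")
```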
I explained these thoughts to a fellow programmer who is not familiar with Haskell, and his response was essentially that any language that discourages you from writing unit tests is a very poor language. He (mis)quoted: "compilation [is] the weakest form of unit testing" [0]. I vehemently disagreed, stating that invariants embedded in the type system are stronger than any other form of assuring correctness I know of.

I know that much of my code could benefit from a property test or two on the more complex parts, but beyond that I can't see that unit testing will improve my Haskell code/programming practice. Am I putting too much faith in the type system?

[0] http://blog.jayfields.com/2008/02/static-typing-considered-harmful.html

Jonathan Geddes wrote: <snip>
So, am I missing the benefits of TDD in my Haskell code?
Probably. I work on a project which has 40,000+ lines of Haskell code (a compiler written in Haskell) and has a huge test suite that is vital to continued development. I've also written relatively small functions (e.g. a function to find if a graph has cycles) that were wrong the first time I wrote them. During debugging I wrote a test that I'm keeping as part of the unit tests. Furthermore, tests are also useful for preventing regressions (something the programmer is doing today breaks something that was working 6 months ago). Without tests, that breakage may go unnoticed.
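As a sketch of the kind of small-but-easy-to-get-wrong function Erik mentions (this particular implementation is mine, not his): a naive cycle check over a directed adjacency list, with the debugging cases kept as regression tests.

```haskell
-- A directed graph as an adjacency list: (node, successors).
type Graph = [(Int, [Int])]

neighbors :: Graph -> Int -> [Int]
neighbors g n = concat [ns | (m, ns) <- g, m == n]

-- True if the graph contains a cycle (naive DFS; fine for small graphs).
hasCycle :: Graph -> Bool
hasCycle g = any (go []) (map fst g)
  where
    go path n
      | n `elem` path = True                        -- revisited a node on the current path
      | otherwise     = any (go (n : path)) (neighbors g n)

main :: IO ()
main = print ( hasCycle [(1, [2]), (2, [3]), (3, [1])]   -- a 3-cycle
             , hasCycle [(1, [2]), (2, [3]), (3, [])] )  -- a chain, no back edge
```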
I explained these thoughts to a fellow programmer who is not familiar with Haskell and his response was essentially that any language that discourages you from writing unit tests is a very poor language.
Haskell most certainly does not discourage anyone from writing tests. One simply needs to look at the testing category of hackage: http://hackage.haskell.org/package/#cat:testing to find 36 packages for doing testing.
Am I putting too much faith in the type system?
Probably.
[0] http://blog.jayfields.com/2008/02/static-typing-considered-harmful.html
Complete bollocks! Good type systems combined with good testing lead to better code than either good type systems or good testing alone.

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Erik de Castro Lopo wrote:
Jonathan Geddes wrote:
<snip>
So, am I missing the benefits of TDD in my Haskell code?
Probably. I work on a project which has 40,000+ lines of Haskell code (a compiler written in Haskell) and has a huge test suite that is vital to continued development.
<snip>

If I may, I would like to agree with you both: a test suite should ideally cover all the aspects of the tested program that are not checked statically by the compiler. So in Python, I end up writing test cases that check for runtime type errors; in Haskell, I don't. In both languages, it's good advice to write a test suite that checks the correctness of calculated values.

Haskell's static type system feels to me like an automatically generated, somewhat dumb test suite. It does not replace a full-fledged hand-written one, but it does replace a big part of it (that is, of what you would have to write in a dynamic language). And it runs much faster.

I also tend to write test suites when I feel the code exceeds a certain level of complexity. This level is language-dependent, and in Haskell's case it's pretty high. (I should probably lower that level and write more test cases, but that seems to be true for all languages.) And yes, Haskell has great support for writing test suites.

On 01/05/2011 03:44 AM, Jonathan Geddes wrote:
When I write Haskell code, I write functions (and monadic actions) that are either a) so trivial that writing any kind of unit/property test seems silly, or are b) composed of other trivial functions using equally-trivial combinators.
"There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies." -- C.A.R. Hoare If you actually manage to do the former, I'd say you don't need to test those parts in isolation. That said, I disagree with you overall. The Haskell type system is simply not rich enough to guarantee everything you might need. Even if it was, it would take a lot of work to encode all your invariants, probably more work than writing tests would have been (although there are obvious advantages to the former as far as having a high level of assurance that your code is correct). Haskell has some awesome testing tool, and I highly recommend getting acquainted with them. In particular, you should definitely learn how to use QuickCheck, which allows you to easily check high-level properties about your code; this is beyond what most traditional unit tests could hope to achieve. I tend to use QuickCheck, SmallCheck, *and* LazySmallCheck in my test suites, as I feel that they complement each other well. HUnit is probably the main one for traditional unit tests. I admit I have never used it, and I'm not sure whether I'm missing out on anything. There are also some pretty nice test frameworks out there to help you manage all your tests, although they could probably use a little more work overall. - Jake

The Haskell type system is simply not rich enough to guarantee everything you might need.
That's true, and after giving this a bit more thought, I realized it's not JUST the type system that I'm talking about here. There are a few other features that make it hard for me to want to use unit/property tests.

For example, say (for the sake of simplicity and familiarity) that I'm writing the foldl function. If I were writing this function in any other language, this would be my process: first I'd write a test to check that foldl returns the original accumulator when the list is empty. Then I would write code until the test passed. Then I would move on to the next property of foldl and write a test for it. Rinse, repeat. But in Haskell, I would just write the code:
foldl _ acc [] = acc
The function is obviously correct for my (missing) test. So I move on to the next parts of the function:
foldl _ acc [] = acc
foldl f acc (x:xs) = foldl f (f acc x) xs
and this is equally obviously correct. I can't think of a test that would increase my confidence in this code. I might drop into the GHCi REPL to manually test it once, but not write a full unit test.

I said that writing Haskell code feels like "writing code against a solid test base." But I think there's more to it than that. Writing Haskell code feels like writing unit tests and letting the machine generate the actual code from those tests. Declarative code for the win.

Despite all this, I suspect that since Haskell is at a higher level of abstraction than other languages, the tests in Haskell must be at a correspondingly higher level than the tests in other languages. I can see that such tests would give great benefits to the development process. I am convinced that I should try to write such tests. But I still think that Haskell makes a huge class of tests unnecessary.
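The "correspondingly higher level" tests imagined here might, for foldl, look like this: not re-stating the two equations, but relating foldl to other functions. A base-only sketch of mine; QuickCheck would generate the lists that the subsequences call enumerates here.

```haskell
import Data.List (subsequences)

-- Two higher-level laws of foldl, stated as properties rather than equations.
prop_foldlSum :: [Int] -> Bool
prop_foldlSum xs = foldl (+) 0 xs == sum xs

prop_foldlReverse :: [Int] -> Bool
prop_foldlReverse xs = foldl (flip (:)) [] xs == reverse xs

main :: IO ()
main = print (all (\xs -> prop_foldlSum xs && prop_foldlReverse xs)
                  (subsequences [1 .. 6 :: Int]))
```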
Haskell has some awesome testing tools, and I highly recommend getting acquainted with them.
I will certainly take your advice here. Like I said, I use TDD in other languages but mysteriously don't feel its absence in Haskell. I probably need to get into better habits. --Jonathan

On Wed, Jan 5, 2011 at 3:02 PM, Jonathan Geddes
The Haskell type system is simply not rich enough to guarantee everything you might need. Despite all this, I suspect that since Haskell is at a higher level of abstraction than other languages, the tests in Haskell must be at a correspondingly higher level than the tests in other languages. I can see that such tests would give great benefits to the development process. I am convinced that I should try to write such tests. But I still think that Haskell makes a huge class of tests unnecessary.
The way I think about this is that you want to write tests for things that cannot be usefully represented in the type system. If you have a parametrically typed function, then the type system is doing a lot of useful "testing" for you. If you want to make sure that you properly parse documents in a given format, then having a bunch of examples that feed into unit tests is a smart move.

Anthony
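Anthony's parsing example, sketched with a hypothetical "key=value" format of my own invention: the type system cannot say which strings are valid, so example-based unit tests carry that weight.

```haskell
-- Hypothetical tiny format: "key=value", where the key must be non-empty.
parsePair :: String -> Maybe (String, String)
parsePair s = case break (== '=') s of
  (k, '=' : v) | not (null k) -> Just (k, v)
  _                           -> Nothing

-- Example-based unit tests: known inputs, expected outputs.
tests :: [Bool]
tests =
  [ parsePair "host=local" == Just ("host", "local")
  , parsePair "noequals"   == Nothing
  , parsePair "=value"     == Nothing   -- empty key rejected
  ]

main :: IO ()
main = print (and tests)
```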

On Wed, Jan 5, 2011 at 9:02 PM, Jonathan Geddes
Despite all this, I suspect that since Haskell is at a higher level of abstraction than other languages, the tests in Haskell must be at a correspondingly higher level than the tests in other languages. I can see that such tests would give great benefits to the development process. I am convinced that I should try to write such tests. But I still think that Haskell makes a huge class of tests unnecessary.
The testing stuff available in Haskell is top-notch, as others have
pointed out. One of the biggest PITAs with testing in other languages
is having to come up with a set of test cases to fully exercise your
code. If you don't keep code coverage at 100% or close to it, it is
quite easy to test only the inputs you are *expecting* to see (because
programmers are lazy) and end up with something which is quite broken
or even insecure w.r.t. buffer overruns, etc. (Of course we don't
usually have those in Haskell either.)
QuickCheck especially is great because it automates this tedious work:
it fuzzes out the input for you and you get to think in terms of
higher-level invariants when testing your code. Since about six months
ago with the introduction of JUnit XML support in test-framework, we
also have plug-in instrumentation support with continuous integration
tools like Hudson:
http://buildbot.snapframework.com/job/snap-core/
http://buildbot.snapframework.com/job/snap-server/
It's also not difficult to set up automated code coverage reports:
http://buildbot.snapframework.com/job/snap-core/HPC_Test_Coverage_Report/
http://buildbot.snapframework.com/job/snap-server/HPC_Test_Coverage_Report/
Once I had written the test harness, I spent literally less than a
half-hour setting this up. Highly recommended, even if it is a (blech)
Java program. Testing is one of the few areas where I think our
"software engineering" tooling is on par with or exceeds that which is
available in other languages.
G
--
Gregory Collins

On Wed, Jan 05, 2011 at 10:27:29PM +0100, Gregory Collins wrote:
Once I had written the test harness, I spent literally less than a half-hour setting this up. Highly recommended, even if it is a (blech) Java program. Testing is one of the few areas where I think our "software engineering" tooling is on par with or exceeds that which is available in other languages.
Indeed, I have found this to be true as well, and have been trying to explain it to non-Haskellers. Though I would also rank the memory/space profiler very high compared to what is available for some other languages. And note, it's also easy to integrate with the Python-based buildbot, if one doesn't want to run Java :)

regards,
iustin

These are some heuristics & memories I have for myself, and you can feel free to take whatever usefulness you can get out of them.

1. Don't confuse TDD with writing tests, in general.

2. Studies show that if you do TDD, you can write more tests than if you write tests after you write the code. Therefore, TDD is the most productive way to test your code.

3. TDD has nothing to do with what language you are using; if you want a great book on TDD, I'd recommend Nat Pryce and Steve Freeman's Growing Object-Oriented Software; it has nothing to do with Haskell but everything to do with attitude towards software process. A language is not enough to dramatically improve quality; you need a sane process. Picking a good language is just as important as picking a sane process, and the two go hand-in-hand in creating great results.

4. Haskell's type system gives you confidence, not certainty, that you are correct. Understand the difference. The value of TDD is that the tests force you to think through your functional and non-functional requirements, before you write the code.

5. I have a hard time understanding statements like "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property." Difficulty in unit testing OO code is best documented in Robert Binder's tome [1], which is easily the best text on testing I've ever read and never gets cited by bloggers and other Internet programmarazzi (after all, who has time to read 1,500 pages on testing when you have to maintain a blog). Moreover, you should not be mocking objects. That will lead to a combinatorial explosion in tests and likely reveal that your object model leaks encapsulation details (think about it). Mock the role the object plays in the system instead; this is kind of a silly way to say "use abstraction", but I've found most people need to hear a good idea 3 different ways in 3 different contexts before they can apply it beyond one playground trick.

6. If you care about individual objects, use design by contract and try to write your code as statelessly as possible; design by contract is significantly different from TDD.

7. Difficulty in testing objects depends on how you describe object behavior, and has nothing to do with any properties of objects as compared with abstract data types! For example, if object actions are governed by an event system, then to test an interaction, you simply mock the event queue manager. This is because you've isolated your test to three variants: (A) the state prior to an atomic action, (B) the state after that action, and (C) any events the action generates. This is really not any more complicated than using QuickCheck, but unfortunately most programmers are only familiar with using xUnit libraries for unit testing, and they have subdued the concept of "unit testing" to a common API that is not particularly powerful. Also, note earlier my dislike of the argument that "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."; under this testing methodology, there is no "particular property" to test, since the state of the application is defined in terms of all the object's attributes *after* the action has been processed. It doesn't make much sense to just test one property. It's called unit testing, not property testing.

8. If you've got a complicated problem, TDD will force you to decompose it before trying to solve it. This is sort of a silly point, again, since naturally good programmers don't waste their time writing random code first and then trying to debug it. Most naturally good programmers will think through the requirements and write it correctly the first time. TDD is in some sense just a bondage & discipline slogan for the rest of us mere mortals; you get no safety word, however. You just have to keep at it.

9. Watch John Hughes' Functional Programming Secret Weapon talk [2]. I'd recommend watching it if you haven't already.

10. Watch and learn. Google "QuickCheck TDD" [3] and see what comes up. Maybe you can be inspired by real world examples?

[1] http://www.amazon.com/dp/0201809389
[2] http://video.google.com/videoplay?docid=4655369445141008672
[3] http://www.google.com/search?q=QuickCheck+TDD

On Wed, Jan 5, 2011 at 6:41 PM, John Zabroski
5. I have a hard time understanding statements like "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property." Difficulty in unit testing OO code is best documented in Robert Binder's tome [1], which is easily the best text on testing I've ever read and never gets cited by bloggers and other Internet programmarazzi (after all, who has time to read 1,500 pages on testing when you have to maintain a blog). Moreover, you should not be mocking objects. That will lead to a combinatorial explosion in tests and likely reveal that your object model leaks encapsulation details (think about it). Mock the role the object plays in the system instead; this is kind of a silly way to say "use abstraction" but I've found most people need to hear a good idea 3 different ways in 3 different contexts before they can apply it beyond one playground trick.
7. Difficulty in testing objects depends on how you describe object behavior, and has nothing to do with any properties of objects as compared with abstract data types! For example, if object actions are governed by an event system, then to test an interaction, you simply mock the event queue manager. This is because you've isolated your test to three variants: (A) the state prior to an atomic action, (B) the state after that action, and (C) any events the action generates. This is really not any more complicated than using QuickCheck, but unfortunately most programmers are only familiar with using xUnit libraries for unit testing and they have subdued the concept of "unit testing" to a common API that is not particularly powerful. Also, note earlier my dislike of the argument that "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."; under this testing methodology, there is no "particular property" to test, since the state of the application is defined in terms of all the object's attributes *after* the action has been processed. It doesn't make much sense to just test one property. It's called unit testing, not property testing.
One small update for pedagogic purposes: Testing properties is really just a form of testing called negative testing; testing that something doesn't do something it shouldn't do. The testing I covered above describes positive testing. Negative testing is always going to be difficult, regardless of how you abstract your system and what language you use. Think about it!

John Zabroski wrote:
5. I have a hard time understanding statements like "The difficulties in unit testing OO code is coaxing objects into the correct state to test a particular property."
This is my direct experience of inheriting code written by others without any tests and trying to add tests before doing more serious work on extending and enhancing the code base.
Difficulty in unit testing OO code is best documented in Robert Binder's tome [1],
I'm sure that's a fine book for testing OO code. I'm trying to avoid OO code as much as possible :-). My main point was that testing pure functions is easy and obvious in comparison to testing objects with internal state.

Cheers,
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

I would supplement this excellent list of advice with an emphasis on the first item: Test-Driven Development is *not* testing; TDD is a *design* process. Like you said, it is a discipline of thought that forces you first to express your intent with a test, second to write the simplest thing that can possibly succeed, and third to remove duplication and refactor your code.

It happens that this process is somewhat different in Haskell than in, say, Java, and actually much more fun and interesting thanks to Haskell's high signal-to-noise-ratio syntax (once you get acquainted with it, of course) and its excellent support for abstraction, duplication removal, generalization, and more generally refactoring (tool support may be better elsewhere, though...). For example, if I were to develop map in TDD (which I did, actually...), I could start with the following unit test:
map id [] ~?= []
which I would make pass very simply by copying and pasting, changing only one symbol:
map id [] = []
Then I would add a failing test case:
TestList [ map id [] ~?= [] , map id [1] ~?= [1] ]
which I would make pass with, once again simple copy-pasting:
map id [] = []
map id [1] = [1]
Next test could be :
TestList [ map id [] ~?= [] , map id [1] ~?= [1] , map id [2] ~?= [2] ]
Which of course would pass with:
map id [] = []
map id [1] = [1]
map id [2] = [2]
then I would notice an opportunity for refactoring:
map id [] = []
map id [x] = [x]
etc., etc... Sound silly? Sure it does at first sight, and any self-respecting Haskeller would write such a piece, just like you said, without feeling the need to write the tests, simply by stating the equations for map. The nice thing with Haskell is that it has a few features that help in making those bigger steps in TDD, whereas less gifted languages and platforms require a lot more experience and self-confidence to "start running":
- writing types takes some of the design burden off the tests, while keeping their intent (being executable instead of lying in dead trees),
- quickcheck and friends help you express a whole class of those unit tests in one invariant expression, while keeping the spirit of TDD, as one can use the counter-examples produced to drive the code-writing process.

<plug> Some people might be interested in http://www.oqube.net/projects/beyond-tdd/ (a session I co-presented at SPA2009), which was an experiment to try bringing the benefits of TDD with QuickCheck in Haskell to the Java world. </plug>

Regards,
Arnaud

PS: On
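Where Arnaud's session would end up, sketched here by me (the final generalization step he elides, plus the single invariant that would subsume the whole TestList):

```haskell
-- The fully generalized result of the TDD session: the standard map equations.
map' :: (a -> b) -> [a] -> [b]
map' _ []       = []
map' f (x : xs) = f x : map' f xs

-- The one invariant replacing all the "map id" unit tests above.
-- (QuickCheck would generate the lists; here the property is just stated.)
prop_mapId :: [Int] -> Bool
prop_mapId xs = map' id xs == xs

main :: IO ()
main = print (map' (* 2) [1, 2, 3], prop_mapId [0, 1, 2])
```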

2011/1/6 Arnaud Bailly
I would supplement this excellent list of advices with an emphasis on the first one: Test-Driven Development is *not* testing, TDD is a *design* process. Like you said, it is a discipline of thought that forces you first to express your intent with a test, second to write the simplest thing that can possibly succeed, third to remove duplication and refactor your code.
Change the T in TDD from Test to Type and you still get a valid description: "It is a discipline of thought that forces you first to express your intent with a type, second to write the simplest thing that can possibly succeed, third to remove duplication and refactor your code."

As for me, I prefer testing at the largest scale possible. I write functions, experiment with them in the REPL, combine them and check the combined result in the REPL, and when I cannot specify an experiment in one line of ghci, I write a test.

On Wed, Jan 5, 2011 at 1:27 PM, Gregory Collins
On Wed, Jan 5, 2011 at 9:02 PM, Jonathan Geddes wrote:
Despite all this, I suspect that since Haskell is at a higher level of abstraction than other languages, the tests in Haskell must be at a correspondingly higher level than the tests in other languages. I can see that such tests would give great benefits to the development process. I am convinced that I should try to write such tests. But I still think that Haskell makes a huge class of tests unnecessary.
I write plenty of tests. Where static typing helps is that of course I don't write tests for type errors, and more things are type errors than might be in other languages (such as incomplete cases). But I write plenty of tests to verify high-level relations: with this input, I expect this kind of output.

A cheap analogue to "test driven" that I often do is "type driven": I write down the types and functions with the hard bits filled in with 'undefined'. Then I :reload the module until it typechecks. Then I write tests against the hard bits, and run the tests in ghci until they pass. However:
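Evan's "type driven" loop, sketched with made-up names of mine: the signatures and wiring typecheck before any hard bit is written, so :reload catches plumbing mistakes long before the stubs are filled in.

```haskell
-- Hypothetical module skeleton: types first, hard bits stubbed with undefined.
data Doc = Doc { docTitle :: String, docBody :: String }

parseDoc :: String -> Either String Doc
parseDoc = undefined   -- hard bit: written later, test-first in ghci

render :: Doc -> String
render = undefined     -- hard bit: written later

-- The wiring already typechecks; calling it would crash, but the shape is fixed.
pipeline :: String -> Either String String
pipeline = fmap render . parseDoc

main :: IO ()
main = print (pipeline `seq` "module wiring typechecks and is defined")
```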
QuickCheck especially is great because it automates this tedious work: it fuzzes out the input for you and you get to think in terms of higher-level invariants when testing your code. Since about six months ago with the introduction of JUnit XML support in test-framework, we also have plug-in instrumentation support with continuous integration tools like Hudson:
Incidentally, I've never been able to figure out how to use QuickCheck. Maybe it has more to do with my particular app, but QuickCheck seems to expect simple input data and simple properties that should hold relating the input and output, and in my experience that's almost never true.

For instance, I want to ascertain that a function is true for "compatible" signals and false for "incompatible" ones, where the definition of compatible is quirky and complex. I can make QuickCheck generate lots of random signals, but to make sure "compatible" is right means reimplementing the "compatible" function. Or I just pick a few example inputs and expected outputs. To get abstract enough that I'm not simply reimplementing the function under test, I have to move to a higher level, and say that notes that have incompatible signals should be distributed among synthesizers so they don't make each other sound funny. But now it's too high level: I need a definition of "sound funny" and a model of a synthesizer... way too much work, and it's fuzzy anyway. And at this level the input data is complex enough that I'd have to spend a lot of time writing and tweaking (and testing!) the data generator to verify it's covering the part of the state space I want to verify. I keep trying to think of ways to use QuickCheck, and keep failing.

In my experience, the main work of testing devolves to a library of functions to create the input data, occasionally very complex, and a library of functions to extract the interesting bits from the output data, which is often also very complex. Then it's just a matter of 'equal (extract (function (generate input data))) "abstract representation of output data"'. This is how I do testing in python too, so I don't think it's particularly haskell-specific.
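Evan's generate/extract pattern, sketched with toy, hypothetical types of my own (his real input and output types would be far more complex): build the input with helpers, run the function, project out the interesting bits, and compare against an abstract expectation.

```haskell
-- Toy input type standing in for a complex score (hypothetical).
newtype Score = Score [(Double, String)]   -- (start time, event name)

-- Input-building helper: events at successive integral times.
mkScore :: [String] -> Score
mkScore names = Score (zip [0 ..] names)

-- Stand-in for the function under test (here it simply passes events through).
perform :: Score -> [(Double, String)]
perform (Score events) = events

-- Output-extraction helper: keep only the bits the test cares about.
extractNames :: [(Double, String)] -> [String]
extractNames = map snd

-- equal (extract (function (generate input))) "abstract representation"
test_passthrough :: Bool
test_passthrough = extractNames (perform (mkScore ["a", "b", "c"])) == ["a", "b", "c"]

main :: IO ()
main = print test_passthrough
```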
I initially tried to use the test-framework stuff and HUnit, but for some reason it was really complicated and confusing to me, so I gave up and wrote my own that just runs all functions starting with 'test_'. It means I don't get to use the fancy tools, but I'm not sure I need them. A standard profile output to go into a tool to draw some nice graphs of performance after each commit would be nice though, surely there is such a thing out there?

Evan Laforge
Incidentally, I've never been able to figure out how to use QuickCheck. Maybe it has more to do with my particular app, but QuickCheck seems to expect simple input data and simple properties that should hold relating the input and output, and in my experience that's almost never true. For instance, I want to ascertain that a function is true for "compatible" signals and false for "incompatible" ones, where the definition of compatible is quirky and complex. I can make quickcheck generate lots of random signals, but to make sure the "compatible" is right means reimplementing the "compatible" function. Or I just pick a few example inputs and expected outputs.
Besides those example inputs and expected outputs, what about:
- If two signals are (in)compatible, then after applying some simple transformations to both they remain (in)compatible?
- A certain family of signals is always compatible with another family of signals?
- Silence is compatible with every signal?
- Every non-silent signal is (in)compatible with itself (perhaps after applying a transformation)?

--
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
<INSERT PARTISAN STATEMENT HERE>

On Wed, Jan 5, 2011 at 7:31 PM, Chung-chieh Shan
Besides those example inputs and expected outputs, what about: If two signals are (in)compatible then after applying some simple transformations to both they remain (in)compatible? A certain family of signals is always compatible with another family of signals? Silence is compatible with every signal? Every non-silent signal is (in)compatible with itself (perhaps after applying a transformation)?
Well, signals are never transformed. Silence is, in fact, not specially compatible. The most I can say is that signals that don't overlap are always compatible. So you're correct in that it's possible to extract properties. However, this particular property, being simple, is also expressed in a simple way directly in the code, so past a couple of tests to make sure I didn't reverse any (>)s, I don't feel like it needs the exhaustive testing that QuickCheck brings to bear. And basically it's just reimplementing a part of the original function, in this case the first guard... I suppose you could say if I typoed the (>)s in the original definition, maybe I won't in the test version. But this is too low level; what I care about is whether the whole thing has the conceptually simple but computationally complex result that I expect.

The interesting bug is when the first guard shadows an exception later on, so it turns out it's *not* totally true that non-overlapping signals must be compatible, or maybe my definition of "overlapping" is not sufficiently defined, or defined different ways in different places, or needs to be adjusted, or.... I suppose input fuzzing should be able to flush out things like fuzzy definitions of overlapping.

I can also say weak things about complex outputs: that they will be returned in sorted order, that they won't overlap, etc. But those are rarely the interesting complicated things that I really want to test. Even my "signal compatibility" example is relatively amenable to extracting properties; picking some other examples:

- Having a certain kind of syntax error will result in a certain kind of error msg, and surrounding expressions will continue to be included in the output. The error msg will include the proper location. So I'd need an Arbitrary to generate the right structure with an error or two, and then have code to figure out the reported location from the location in the data structure, and debug all that.
- There's actually a fair amount of stuff that wants to look for a log msg, like "hit a cache, the only sign of which is a log msg of a certain format". Certainly caches can be tested by asserting that you get the same results with the cache turned off; that's an easy property.

- Hitting a certain key sequence results in certain data being entered in the UI. There's nothing particularly "property"-like about this, it's too ad-hoc... this seems to apply to all UI-level tests.

- There's also a large class of "integration" type tests: I've tested the signal compatibility function separately, but the final proof is that high-level user input of this shape results in this output, due to signal compatibility. These are the ones whose failure is the most valuable because they test the emergent behaviour of a set of interacting systems, and that's ultimately the user-visible behaviour and also the place where the results are the most subtle. But those are also the ones that have huge state spaces and, similar to the UI tests, basically ad-hoc relationships between input and output.

- Testing for laziness of course doesn't work either. Or timed things. As far as performance goes, some of it can be tested with tests ("taking the first output doesn't force the entire input" or "a new key cancels the old threads and starts new ones"), but some must be tested with profiling and eyeballing the results.

QuickCheck seems to fit well when you have small input and output spaces but complicated stuff in the middle, and still simple relations between the input and output. I think that's why data structures are so easy to QuickCheck. I suppose I should look around for more use of QuickCheck for non-data structures... the examples I've seen have been trivial stuff like 'reverse . reverse = id'.
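The one property Evan concedes ("non-overlapping signals are always compatible") can at least be written down. Everything below is a hypothetical toy model of mine, not his actual code; only the overlap guard of the real quirky predicate is modeled.

```haskell
-- Toy model: a signal occupies a half-open time interval [start, end).
data Signal = Signal { sigStart :: Double, sigEnd :: Double }

overlaps :: Signal -> Signal -> Bool
overlaps a b = sigStart a < sigEnd b && sigStart b < sigEnd a

-- Stand-in for the real compatibility predicate (just its first guard).
compatible :: Signal -> Signal -> Bool
compatible a b = not (overlaps a b)

-- Property: signals that don't overlap are always compatible.
prop_disjointCompatible :: Signal -> Signal -> Bool
prop_disjointCompatible a b = overlaps a b || compatible a b

main :: IO ()
main = print (prop_disjointCompatible (Signal 0 1) (Signal 2 3))
```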

2011/1/6 Evan Laforge
QuickCheck especially is great because it automates this tedious work: it fuzzes out the input for you and you get to think in terms of higher-level invariants when testing your code. Since about six months ago with the introduction of JUnit XML support in test-framework, we also have plug-in instrumentation support with continuous integration tools like Hudson:

Incidentally, I've never been able to figure out how to use QuickCheck. Maybe it has more to do with my particular app, but QuickCheck seems to expect simple input data and simple properties that should hold relating the input and output, and in my experience that's almost never true. For instance, I want to ascertain that a function is true for "compatible" signals and false for "incompatible" ones, where the definition of compatible is quirky and complex. I can make QuickCheck generate lots of random signals, but to make sure the "compatible" is right means reimplementing the "compatible" function.
I should say that this reimplementation would be good. If you can compare two implementations (one in plain Haskell and a second in declarative QuickCheck rules) you will be better off than with only one. We did that when testing implementations of commands in a CPU model. Our model was built to a specification and we had to be sure we implemented it right. One problem was in CPU flag setup: the specification was defined in terms of bit manipulation, so we wrote tests that did the same but with ordinary arithmetic. Something like carry = (a+b) `shiftR` 8 was compared with carry = bit operandA 7 && bit operandB 7 && not (bit result 7). We found errors in our implementation, we fixed them, and there were almost no errors found after that. Doing two implementations for testing purposes can be boldly likened to code review with only one person.
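A sketch of that cross-checking style in QuickCheck terms, for the carry flag of an 8-bit add. The names and the exact bit-level formula here are illustrative, not taken from the CPU model described above:

```haskell
import Data.Bits (shiftR, testBit, (.&.))
import Data.Word (Word8)
import Test.QuickCheck (quickCheck)

-- "Ordinary arithmetic" version: carry is bit 8 of the widened sum.
carryArith :: Word8 -> Word8 -> Bool
carryArith a b =
  ((fromIntegral a + fromIntegral b :: Int) `shiftR` 8) .&. 1 == 1

-- "Bit manipulation" version, in the style of a CPU specification:
-- a carry out of bit 7 occurs iff both top bits are set, or one top
-- bit is set and the (wrapped) result's top bit is clear.
carryBits :: Word8 -> Word8 -> Bool
carryBits a b =
  let r  = a + b          -- Word8 addition wraps modulo 256
      a7 = testBit a 7
      b7 = testBit b 7
      r7 = testBit r 7
  in (a7 && b7) || ((a7 || b7) && not r7)

-- The two independent definitions must agree on every pair of operands.
prop_carryAgree :: Word8 -> Word8 -> Bool
prop_carryAgree a b = carryArith a b == carryBits a b

main :: IO ()
main = quickCheck prop_carryAgree
```

The value is exactly as described: a typo in one formulation is unlikely to be reproduced in the other, so disagreements flush out real bugs.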

Seeing all the good discussion on this thread, I think we are missing a TDD page on our Haskell.org wiki. =) Cheers, -- Felipe.

I should say that this reimplementation would be good. If you can compare two implementations (one in plain Haskell and second in declarative QuickCheck rules) you will be better that with only one.
This presumes I know how to write a simple but slow version. Clearly, that's an excellent situation, since you can trust your simple but slow version more than the complex but fast one. Unfortunately, I'm usually hard enough pressed to write just the slow version. If I could think of a simpler way to write it I'd be really set, but I'm already writing things in the simplest possible way I know how.
Doing two implementation for testing purposes can be boldly likened to code review with only one person.
Indeed, but unfortunately it still all comes from the same brain. So if it's too low level, I'll make the same wrong assumptions about the input. If it's too high level, then writing a whole new program is too much work. I think you make a good point, but one that's only applicable in certain situations.

Haskell's type system makes large classes of traditional "unit tests" irrelevant. Here are some examples:

- Tests that simply "run" code to make sure there are no syntax errors or typos,
- Tests that exercise simple input validation that is handled by the type system, e.g. passing an integer to a function when it expects a string.

But, as many other people have mentioned, that doesn't nearly cover all unit tests (although when I look at some people's unit tests, one might think this was the case). Cheers, Edward

Jonathan Geddes wrote:
I know that much of my code could benefit from a property test or two on the more complex parts, but other than that I can't think that unit testing will improve my Haskell code/programming practice.
One other thing I should mention is that since a lot of Haskell code is purely functional, it's actually easier to test than imperative code and particularly OO code. The difficulty in unit testing OO code is coaxing objects into the correct state to test a particular property. Usually this means a whole bunch of extra code to implement mock objects to feed the right data to the object under test. By contrast, much Haskell code is purely functional. With pure functions there is no state that needs to be set up. For testing pure functions, it's just a matter of collecting a set of representative inputs and making sure the correct output is generated for each input. For example, Don Stewart reported that the XMonad developers consciously made as much of the XMonad code pure as possible, so it was more easily testable. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/
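A minimal sketch of that "representative inputs" style (clamp is a made-up example, not from the thread): no mocks and no state setup, just inputs paired with expected outputs.

```haskell
-- A pure function under test: clamp a value into [lo, hi].
clamp :: Ord a => a -> a -> a -> a
clamp lo hi = max lo . min hi

-- Representative inputs with expected outputs, covering both sides
-- of the range and both boundaries.
cases :: [((Int, Int, Int), Int)]
cases =
  [ ((0, 10, -5),  0)  -- below the range
  , ((0, 10,  5),  5)  -- inside the range
  , ((0, 10, 15), 10)  -- above the range
  , ((0, 10,  0),  0)  -- lower boundary
  , ((0, 10, 10), 10)  -- upper boundary
  ]

main :: IO ()
main = mapM_ check cases
  where
    check ((lo, hi, x), expected)
      | clamp lo hi x == expected = pure ()
      | otherwise = error ("clamp failed on " ++ show (lo, hi, x))
```

Compare this with the object-state juggling a mock-heavy OO test needs: the whole test fits in a table.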

You need both. A good static type system will tell you whether or not the
code is type-correct. It will not tell you whether or not it does what it's
supposed to do.
Consider:
sort :: [a] -> [a]
If you change sort to be:
sort = id
It will still type check, but it obviously doesn't do what it's supposed to
do anymore. You need tests to verify that.
If you then change sort to be:
sort _ = 5
Now it's also type-incorrect. Static typing will catch it at compile
time (e.g. Haskell will now infer the type as "Num b => a -> b" which will
not unify with "[a] -> [a]"), and dynamic typing will likely throw some sort
of type error at run time in the places it was previously used. (Any error
thrown by the language itself, like PHP's "Cannot call method on non-object"
or Python's "TypeError" or even Java's "NullPointerException" or C++'s
"Segmentation Fault" can be considered a type error.)
So with static typing, the machine will verify type-correctness, but you
still need tests to verify the program meets its specification. With dynamic
typing, you need tests to verify that the program meets both its
specification *and* doesn't throw any type errors - so you need to test
more.
The fact that most errors in programming are type errors and that Haskell
programs therefore tend to "just work" once you can get them past the type
checker may lead you to believe you don't need to test at all. But you still
do for the reasons above, you just need to test a hell of a lot less.
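To make this concrete, here is a sketch of the tests being asked for, with Data.List.sort standing in for the sort under discussion. The ordering property is exactly what rejects the "sort = id" impostor, which type checks but fails on any unsorted input:

```haskell
import Data.List (sort)
import Test.QuickCheck (quickCheck)

-- The output must be in nondecreasing order.  "sort = id" passes the
-- type checker but fails this property on inputs like [2,1].
prop_ordered :: [Int] -> Bool
prop_ordered xs = and (zipWith (<=) ys (drop 1 ys))
  where ys = sort xs

-- The output must be a permutation of the input: every value occurs
-- the same number of times on both sides.
prop_permutation :: [Int] -> Bool
prop_permutation xs = all sameCount (xs ++ ys)
  where
    ys = sort xs
    sameCount x = count x xs == count x ys
    count x = length . filter (== x)

main :: IO ()
main = quickCheck prop_ordered >> quickCheck prop_permutation
```

Note that each property alone is too weak — const [] is ordered, id is a permutation — which is why both are needed to pin down the specification.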
On Wed, Jan 5, 2011 at 8:44 PM, Jonathan Geddes
Cafe,
In every language I program in, I try to be as disciplined as possible and use Test-Driven Development. That is, every language except Haskell.
There are a few great benefits that come from having a comprehensive test suite with your application:
1. Refactoring is safer/easier 2. You have higher confidence in your code 3. You have a sort of 'beacon' to show where code breakage occurs
Admittedly, I don't believe there is any magical benefit that comes from writing your tests before your code. But I find that when I don't write tests first, it is incredibly hard to go back and write them for 'completed' code.
But as mentioned, I don't write unit tests in Haskell. Here's why not.
When I write Haskell code, I write functions (and monadic actions) that are either a) so trivial that writing any kind of unit/property test seems silly, or are b) composed of other trivial functions using equally-trivial combinators.
So, am I missing the benefits of TDD in my Haskell code?
Is the refactoring I do in Haskell less safe? I don't think so. I would assert that there is no such thing as refactoring with the style of Haskell I described: the code is already super-factored, so any code reorganization would be better described as "recomposition." When "recomposing" a program, it's incredibly rare for the type system to miss an introduced error, in my experience.
Am I less confident in my Haskell code? On the contrary. In general, I feel more confident in Haskell code WITHOUT unit tests than code in other languages WITH unit tests!
Finally, am I missing the "error beacon" when things break? Again I feel like the type system has got me covered here. One of the things that immediately appealed to me about Haskell is that the strong type system gives the feeling of writing code against a solid test base.
The irony is that the type system (specifically the IO monad) forces you to structure code in a way that is very easy to test, because logic code is generally separated from IO code.
I explained these thoughts to a fellow programmer who is not familiar with Haskell and his response was essentially that any language that discourages you from writing unit tests is a very poor language. He (mis)quoted: "compilation [is] the weakest form of unit testing" [0]. I vehemently disagreed, stating that invariants embedded in the type system are stronger than any other form of assuring correctness I know of.
I know that much of my code could benefit from a property test or two on the more complex parts, but other than that I can't think that unit testing will improve my Haskell code/programming practice. Am I putting too much faith in the type system?
[0] http://blog.jayfields.com/2008/02/static-typing-considered-harmful.html
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

* Jonathan Geddes:
When I write Haskell code, I write functions (and monadic actions) that are either a) so trivial that writing any kind of unit/property test seems silly, or are b) composed of other trivial functions using equally-trivial combinators.
You can write in this style in any language which has good support for functional composition (which means some sort of garbage collection and perhaps closures, but strong support for higher-order functions is probably not so important). But this doesn't mean that you don't have bugs. There are a few error patterns I've seen when following this style (albeit not in Haskell):

- While traversing a data structure (or parsing some input), you fail to make progress and end up in an infinite loop.
- You confuse right and left (or swap two parameters of the same type), leading to wrong results.
- All the usual stuff about border conditions still applies.
- Input and output is more often untyped than typed. Typos in magic string constants (such as SQL statements) happen frequently.

Therefore, I think that you cannot really avoid extensive testing for a large class of programming tasks.
I vehemently disagreed, stating that invariants embedded in the type system are stronger than any other form of assuring correctness I know of.
But there are very interesting invariants you cannot easily express in the type system, such as "this list is of finite length". It also seems to me that most Haskell programmers do not bother to turn the typechecker into some sort of proof checker. (Just pick a few standard data structures on hackage and see if they perform such encoding. 8-)

Florian Weimer wrote:
* Jonathan Geddes:
When I write Haskell code, I write functions (and monadic actions) that are either a) so trivial that writing any kind of unit/property test seems silly, or are b) composed of other trivial functions using equally-trivial combinators.
You can write in this style in any language which has good support for functional composition (which means some sort of garbage collection and perhaps closures, but strong support for higher-order functions is probably not so important). But this doesn't mean that you don't have bugs. There are a few error patterns I've seen when following this style (albeit not in Haskell):
As you mention, the bugs below can all be avoided in Haskell by using the type system and the right abstractions and combinators. You can't put everything in the type system - at some point, you do have to write actual code - but you can isolate potential bugs to the point that their correctness becomes obvious.
While traversing a data structure (or parsing some input), you fail to make progress and end up in an infinite loop.
Remedy: favor higher-order combinators like fold and map over primitive recursion.
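As a sketch of that remedy (sumSquares is a made-up example), the combinator version leaves no place to write the classic "forgot to shrink the argument" loop:

```haskell
-- Primitive recursion: progress is the programmer's responsibility.
-- This version is one typo away from "x * x + sumSquaresExplicit (x:xs)",
-- which loops forever.
sumSquaresExplicit :: [Int] -> Int
sumSquaresExplicit []     = 0
sumSquaresExplicit (x:xs) = x * x + sumSquaresExplicit xs

-- With foldr the traversal is handled by the combinator, so the
-- non-progress bug simply cannot be expressed.
sumSquares :: [Int] -> Int
sumSquares = foldr (\x acc -> x * x + acc) 0

main :: IO ()
main = print (sumSquares [1, 2, 3])
```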
You confuse right and left (or swap two parameters of the same type), leading to wrong results.
Remedy: use more descriptive types, for instance by putting Int into an appropriate newtype. Use infix notation source `link` target.
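A minimal sketch of that remedy; Source, Target, and link are hypothetical names:

```haskell
-- Wrapping bare Ints in newtypes makes swapped arguments a type error
-- at zero runtime cost.
newtype Source = Source Int
newtype Target = Target Int

link :: Source -> Target -> String
link (Source s) (Target t) = show s ++ " -> " ++ show t

ok :: String
ok = Source 1 `link` Target 2
-- bad = Target 2 `link` Source 1   -- rejected by the type checker

main :: IO ()
main = putStrLn ok
```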
All the usual stuff about border conditions still applies.
Partial remedy: choose natural boundary conditions, for example "and [] = True" and "or [] = False". But I would agree that this is one of the main use cases for QuickCheck.
Input and output is more often untyped than typed. Typos in magic string constants (such as SQL statements) happen frequently.
Remedy: write magic strings only once. Put them in a type-safe combinator library.
Therefore, I think that you cannot really avoid extensive testing for a large class of programming tasks.
Hopefully, the class is not so large anymore. ;)
I vehemently disagreed, stating that invariants embedded in the type system are stronger than any other form of assuring correctness I know of.
But there are very interesting invariants you cannot easily express in the type system, such as "this list is of finite length".
Sure, you can:

    data FiniteList a = Nil | Cons a !(FiniteList a)

I never needed to know whether a list is finite, though. It is more interesting to know whether a list is infinite:

    data InfiniteList a = a :> InfiniteList a
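A small illustration of why the strict tail does the job (fromList and len are made-up helpers, not from the thread): forcing any constructor forces the whole spine, so an attempted infinite FiniteList diverges at the first pattern match instead of lurking.

```haskell
-- The strict (!) spine means a FiniteList in WHNF is fully built.
data FiniteList a = Nil | Cons a !(FiniteList a)

len :: FiniteList a -> Int
len Nil         = 0
len (Cons _ xs) = 1 + len xs

fromList :: [a] -> FiniteList a
fromList = foldr Cons Nil

-- ones = Cons (1 :: Int) ones would type check, but evaluating it to
-- WHNF forces the entire (infinite) spine, so the nontermination
-- surfaces immediately rather than hiding behind laziness.

main :: IO ()
main = print (len (fromList [1 .. 10 :: Int]))
```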
It also seems to me that most Haskell programmers do not bother to turn the typechecker into some sort of proof checker. (Just pick a few standard data structures on hackage and see if they perform such encoding. 8-)
I at least regularly encode properties in the types, even if it's only a type synonym. I also try to avoid classes of bugs by "making them obvious", i.e. organizing my code in such a way that correctness becomes obvious. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com
participants (17)
- Anthony Cowley
- Arnaud Bailly
- Chung-chieh Shan
- Edward Z. Yang
- Erik de Castro Lopo
- Evan Laforge
- Felipe Almeida Lessa
- Florian Weimer
- Gregory Collins
- Heinrich Apfelmus
- Iustin Pop
- Jake McArthur
- Jesse Schalken
- John Zabroski
- Jonathan Geddes
- Serguey Zefirov
- Sönke Hahn