
I would think that with 100% laziness, nothing would happen until the Haskell program needed to output data to, e.g., the console.
In many cases, that's exactly what it's like.
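As a minimal sketch of that idea (my own example, not from the thread): an infinite list is harmless, because only the portion that actually reaches the output ever gets evaluated.

```haskell
-- `naturals` is conceptually infinite, but defining it performs no work.
naturals :: [Integer]
naturals = [0 ..]

main :: IO ()
main = print (take 5 naturals)  -- prints [0,1,2,3,4]; the rest of the
                                -- list is never evaluated
```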
Quite obviously that's not it. So how is laziness defined in Haskell?
In fact, Haskell is not defined as lazy; it is defined as non-strict.
From what I understand, non-strict semantics are (very informally) about choosing an order of evaluation that, where possible, avoids non-termination or error. Laziness is about evaluating only what's needed. In any case, I think all of the mainstream Haskell compilers and interpreters implement the non-strict semantics by means of lazy evaluation, so unless you're working on a radical new Haskell implementation, you probably don't need to worry about the difference. There's a wiki stub about this, though it doesn't say much more than what I've just said:
http://haskell.org/haskellwiki/Lazy_vs._non-strict
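A small sketch of the non-strict guarantee (my own example): an argument that is never used is never evaluated, so the call below terminates even though `undefined` would crash if forced.

```haskell
-- const x _ = x, so the second argument is simply discarded.
ignoreSecond :: Int
ignoreSecond = const 1 undefined

-- A strict language would evaluate the `undefined` argument before the
-- call and crash; non-strict semantics returns 1.
```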
I remember vaguely someone saying that pattern matching on a value forces it to be evaluated. Is that right? What else?
Yes, but you need to be more precise about when this pattern matching occurs, what you mean by "it", and the extent to which "it" is evaluated. See below about "weak head normal form".
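To make the "extent of evaluation" point concrete, here is a sketch (my own example): a pattern match forces the scrutinee only to its outermost constructor, not all the way down.

```haskell
-- Matching forces the list to its outermost constructor ([] or (:)).
headOrZero :: [Int] -> Int
headOrZero []      = 0
headOrZero (x : _) = x

-- headOrZero (1 : undefined) returns 1: the match demands the (:) cell
-- and the head, but the tail thunk is never evaluated.
```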
This is one of the things that just boggles my mind every time I try to wrap it around this thing called Haskell ;)
If it's any comfort, you're not alone. It takes some discipline to forget your previous notions of computation, and to start thinking lazily.

Get familiar with the notion of the "thunk": a record which the implementation stores on the heap in place of the result of some computation which has not yet been performed. A thunk typically records a reference to a function and some arguments, but remember that the arguments (and indeed, the function) in all likelihood haven't been evaluated yet, so they might well be thunks too. A thunk sits on the heap until either the garbage collector decides it's no longer needed (in which case the computation is never performed), or until the implementation finds it needs the value (or part of it).

Also get familiar with "weak head normal form" (WHNF). I'm not 100% on this, so hopefully someone will correct me if I'm wrong. Let's say the implementation gets to a point where it needs to perform a pattern match on a value. So far, the value is just an unevaluated thunk on the heap. To perform the pattern match, the implementation needs to evaluate that thunk to its top-level data constructor. It doesn't need to go any further than that just yet (unless the pattern also includes a sub-pattern inside the top-level data constructor). So a value that is evaluated to its top-level data constructor is said to be in "weak head normal form".

The result of the pattern match (aside from allowing the implementation to choose between alternative patterns based on the data constructor) is typically that one or more variables are bound to values inside the data constructor. Of course, those values are most likely unevaluated thunks, at least until further pattern matching is necessary.

Hopefully, that's clear enough to be of some use.
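The WHNF story above can be sketched in a few lines (my own example): matching a pair pattern forces the value only to its `(,)` constructor, while the components stay unevaluated thunks.

```haskell
-- The (_, _) pattern forces the argument to WHNF: the outer (,)
-- constructor must exist, but the components are never demanded.
whnfOnly :: (Int, Int) -> String
whnfOnly (_, _) = "matched"

-- whnfOnly (undefined, undefined) returns "matched": both component
-- thunks survive unevaluated. (whnfOnly undefined would crash, since
-- the match must at least produce the constructor.)
```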