
How about simply changing `sieve` to `trialDiv`? It's not that I don't like the given example, because it gives a very small use case for laziness that is difficult enough to reproduce in an eagerly evaluated language.
Is it really so difficult to reproduce in a strict language? Here is that Haskell example in OCaml
let primes = let rec trialDiv (Cons (p,xs)) = Cons (p, lazy (trialDiv @@ filter (fun x -> x mod p <> 0) @@ Lazy.force xs)) in trialDiv @@ iota 2
[...]
OCaml, given `lazy`, is not eagerly evaluated. You are using lazy evaluation. In many modern languages laziness can be reproduced by abusing first class functions. However, you are using another important feature unconsciously: sharing. That one is not as easy to reproduce in a language that doesn't give you built-in laziness like Haskell or (apparently) OCaml. You would need to write a wrapper type with destructive update first.
That's really all there is to it. I should stress that the typechecker won't lets us forget about lazy and force!
I have very little experience with OCaml, but that's probably because the language does not consider lazily evaluated values to be semantically equivalent to their eager counterparts. If you believe that `lazy x` and `x` represent the same value semantically, then it's fine not to be explicit about forcing. This gives you the advantage that non-strict functions are lazily evaluated by default and you can "strictify" them. The other direction is not possible.
The stress on laziness in Haskell is difficult to understand given how easy it is to use laziness in essentially any language if needed.
It depends on your paradigms and idioms. After almost 7 years of Haskell I'm clearly in the lazy-by-default mindset so much that I find it difficult to program in an eager-by-default language. If I'd write OCaml code it would probably be cluttered with lazy wrappers.
Incidentally, given below is a real sieve of Eratosthenes, written as a *very* concurrent program, where all the concurrency primitives (including Haskell-like mvars) are implemented with delimcc. (The example is an elaboration of the code kindly sent by Christophe Deleuze, July 18, 2012). The full code is part of the delimcc distribution.
[...]
Interesting program. Something (semantically) similar can be implemented using laziness, but the result is very slow compared to a sieve implemented by bit operations. I can't judge the efficiency of your program without trying it, but I expect it to be similar. If you want the stream to be infinite, partial sieves are almost as fast as regular sieve (and probably faster due to better cache behaviour) and run in constant memory. Greets, Ertugrul