Re: [Haskell-cafe] Question on optimization

15 Dec 2020

      On Tue, Dec 15, 2020 at 02:05:13PM -0800, Todd Wilson wrote:
...
In code of the form
    case as ++ bs of
      [] -> ...
      x:xs -> ...
under what circumstances is the overhead of cons-cell creation for the
elements of `as` that would happen in an eager language avoided?
In Haskell, even without optimisation, I'd expect the above to
run in constant space and time regardless of the length of `as`.
...
Always? Is the answer the same if `f` is defined by simple list
recursion and we call `f (as ++ bs)`? What if `f` is defined in a
different module (or comes from the Prelude)?
For example, ghci with no optimisation:

    λ> let f n = [n..n*n]
    λ> case f 1000000 ++ f 10 of { [] -> "empty"; x:xs -> show x }
    "1000000"
    (0.01 secs, 69,680 bytes)
    λ> case f 10000000000000 ++ f 10 of { [] -> "empty"; x:xs -> show x }
    "10000000000000"
    (0.01 secs, 75,384 bytes)

    λ> let x = (1 : undefined) ++ [2]
    (0.01 secs, 63,200 bytes)
    λ> case x of { [] -> "empty"; x:xs -> show x }
    "1"
    (0.01 secs, 64,808 bytes)

So at least with "++" the result is expected to be lazy in the tail of
`as`.  If you interpose some list-generating function:

    f :: [a] -> [b]

called as `f (as ++ bs)`, the result rather depends on how strict `f` is
in the tail of its input.  If `f` can return an initial segment of its
output from just an initial segment of its input, then the same sort
of constant space/time behaviour would be expected from `f`.
...
by looking at the (intermediate) code generated by the compiler, but
more generally, are there some succinct principles that everyday
programmers can hang on their cubicle walls that summarize a lot of
what they can expect from the compiler in terms of optimizations?
This feels more like strict/lazy question to me, than optimised vs.
unoptimised.  Haskell promises lazy evaluation even when unoptimised.

    λ> head $ map (+1) $ (1 : undefined) ++ [2]
    2

Here, `map` is but one example of an `f` that is lazy in the tail of its
input:

    map _ [] = []
    map f (x:xs) = f x : map f xs

Generally, any list-to-list function that returns incrementally
consumable output given an infinite list, or a usable initial segment
from a list with a divergent term is one that necessarily only
evaluates some finite initial segment of its input, and is not
sensitive to what lies beyond.

A naive mental model for (++) is then:

    [] ++ bs = bs
    (a:as) ++ bs = a : (as ++ bs)

which is easily seen to be lazy in `as`.

-- 
    Viktor.

Re: [Haskell-cafe] Question on optimization

Viktor Dukhovni