
On Tue, Dec 15, 2020 at 02:05:13PM -0800, Todd Wilson wrote:
In code of the form case as ++ bs of [] -> ... x:xs -> ...
under what circumstances is the overhead of cons-cell creation for the elements of `as` that would happen in an eager language avoided?
In Haskell, even without optimisation, I'd expect the above to run in constant space and time regardless of the length of `as`.
Always? Is the answer the same if `f` is defined by simple list recursion and we call `f (as ++ bs)`? What if `f` is defined in a different module (or comes from the Prelude)?
For example, ghci with no optimisation: λ> let f n = [n..n*n] λ> case f 1000000 ++ f 10 of { [] -> "empty"; x:xs -> show x } "1000000" (0.01 secs, 69,680 bytes) λ> case f 10000000000000 ++ f 10 of { [] -> "empty"; x:xs -> show x } "10000000000000" (0.01 secs, 75,384 bytes) λ> let x = (1 : undefined) ++ [2] (0.01 secs, 63,200 bytes) λ> case x of { [] -> "empty"; x:xs -> show x } "1" (0.01 secs, 64,808 bytes) So at least with "++" the result is expected to be lazy in the tail of `as`. If you interpose some list-generating function: f :: [a] -> [b] called as `f (as ++ bs)`, the result rather depends on how strict `f` is in the tail of its input. If `f` can return an initial segment of its output from just an initial segment of its input, then the same sort of constant space/time behaviour would be expected from `f`.
by looking at the (intermediate) code generated by the compiler, but more generally, are there some succinct principles that everyday programmers can hang on their cubicle walls that summarize a lot of what they can expect from the compiler in terms of optimizations?
This feels more like strict/lazy question to me, than optimised vs. unoptimised. Haskell promises lazy evaluation even when unoptimised. λ> head $ map (+1) $ (1 : undefined) ++ [2] 2 Here, `map` is but one example of an `f` that is lazy in the tail of its input: map _ [] = [] map f (x:xs) = f x : map f xs Generally, any list-to-list function that returns incrementally consumable output given an infinite list, or a usable initial segment from a list with a divergent term is one that necessarily only evaluates some finite initial segment of its input, and is not sensitive to what lies beyond. A naive mental model for (++) is then: [] ++ bs = bs (a:as) ++ bs = a : (as ++ bs) which is easily seen to be lazy in `as`. -- Viktor.