
On Sat, Apr 04, 2020 at 11:31:41AM +0300, Georgi Lyubenov wrote:
For example, consider this idList: ``` idList :: [a] -> [a] idList [] = [] idList (x:xs) = x : idList xs ``` Semantically this is the same as (a stricter) id, but we're deconstructing values and then constructing other ones (putting them in new boxes), so in effect, we're "copying" the input list.
But the function is not in fact strict, and so if one uses the value of "idList xs" no more than once to traverse the list (print it, fold it, ...) the list *as a whole* is never copied, and total memory used remains small even for very long lists (that are lazily generated). That said, interposing idList betweent a fold and the input list increates the amount of GC work that ultimately takes place to cleanup the per-element thunks. The elements themselves are not copied, but the enclosing lazy thunks are somewhat larger as a result of hiding the real list behind idList.
I'm not sure if ghc will do some magic optimisation to transform this particular function into an operation with no copying, but you can imagine something more complicated like map/foldr/reverse etc where it isn't possible (although other stuff like fusion is, to eliminate intermediate results from composed maps for example).
Construction of copies of lists typically happens when you traverse them more than once. Otherwise you just force each element in turn (possibly via a more complex thunk that still produces that element). -- Viktor. --- Folding 100 million Ints without idList: 8,000,045,248 bytes allocated in the heap 771,488 bytes copied during GC 44,504 bytes maximum residency (2 sample(s)) 29,224 bytes maximum slop 0 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 7658 colls, 0 par 0.022s 0.022s 0.0000s 0.0018s Gen 1 2 colls, 0 par 0.001s 0.001s 0.0004s 0.0007s INIT time 0.001s ( 0.001s elapsed) MUT time 0.981s ( 0.982s elapsed) GC time 0.023s ( 0.023s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 1.005s ( 1.005s elapsed) %GC time 0.0% (0.0% elapsed) Alloc rate 8,153,859,656 bytes per MUT second Productivity 97.6% of total user, 97.7% of total elapsed --- Folding 100 million Ints with idList: 12,800,039,208 bytes allocated in the heap 1,701,952 bytes copied during GC 44,504 bytes maximum residency (2 sample(s)) 29,224 bytes maximum slop 0 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 12206 colls, 0 par 0.030s 0.029s 0.0000s 0.0007s Gen 1 2 colls, 0 par 0.001s 0.001s 0.0004s 0.0007s INIT time 0.000s ( 0.000s elapsed) MUT time 1.484s ( 1.485s elapsed) GC time 0.031s ( 0.030s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 1.516s ( 1.516s elapsed) %GC time 0.0% (0.0% elapsed) Alloc rate 8,622,777,676 bytes per MUT second Productivity 97.9% of total user, 98.0% of total elapsed