Musings on lists

Dear Cafe,

I'm an educator using Haskell, and I've been telling my students certain stories about Haskell evaluation that I'm now starting to question, so I'm writing for some clarification. I've written before about similar things (e.g., "Basic list exercise" from Mar 16), so it seems my own education on this topic is not yet complete! My apologies if this is too trivial for the Cafe.

One story I tell students is that list concatenation, xs ++ ys, ends up "copying" the elements of xs but not the elements of ys, and in particular that an expression like xs ++ [a], putting a new element at the end of a list, is particularly expensive, at least compared to x : xs, putting one at the beginning. That seems obvious, right?

But here is my question: isn't xs ++ ys just a thunk that, as elements are "requested" from it, produces them first from xs and then, when those run out, defers entirely to ys from that point on? If so, then the elements of xs aren't "copied" any more than the elements of ys are; they are just produced from different places. Even the expression xs ++ [a] isn't problematic in this light: it just acts like xs, but with one more "card up the sleeve" for when you are done. Have I gotten that right?

Second, if the above is correct, then if there is a nesting, (xs ++ ys) ++ zs, and we want an element from it, there will be two steps before we can get that element from xs. And this will continue until xs is exhausted, at which point we get new elements in one step from ys, and finally in zero steps from zs. But ++ is associative, and so this list is also xs ++ (ys ++ zs), which requires less work to get the elements from xs. Is this kind of optimization something that is or can be done (automatically)?

Thanks for your indulgence,

Todd Wilson

xs ++ ys is just a thunk, but only the first time you "request" it. Afterwards it is a real list. More correctly, due to the lazy nature of Haskell, it's actually part list, part "thunk". Basically, if you do [1,2,3] ++ [4,5,6], and then "request" some elements from it, it's going to be like this:

thunk (++) {arguments: 1:2:3:[] and 4:5:6:[]}
  |
  | Requesting first element
  v
1 : thunk (++) {arguments: 2:3:[] and 4:5:6:[]}
  |
  | Requesting second element
  v
1 : 2 : thunk (++) {arguments: 3:[] and 4:5:6:[]}
  |
  | Requesting first element again
  v
1 : 2 : thunk (++) {arguments: 3:[] and 4:5:6:[]}  -- no change
  |
  | Requesting fifth element
  v
1 : 2 : 3 : 4 : 5 : 6 : []

At this point the trailing "4 : 5 : 6 : []" is exactly the same "4:5:6:[]" as the second argument, not a copy of it.
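To watch this on-demand production happen, here is a small self-contained sketch (my own illustration of the point above, not code from this thread) that uses Debug.Trace to report when each element of the first list is actually forced; the second argument of (++) is an error that is never reached, because only two elements are demanded:

module Main (main) where

import Debug.Trace (trace)

main :: IO ()
main = do
  let xs = [trace ("producing " ++ show i) i | i <- [1, 2, 3 :: Int]]
      -- (++) never inspects its second argument until the first runs out,
      -- so this error is never raised here.
      zs = xs ++ error "second list never demanded"
  -- Demanding only two elements fires only two trace messages.
  print (take 2 zs)

Running it prints the two trace messages and [1,2]; asking for a fourth element instead (e.g. take 4 zs) would finally force the second argument and raise the error, which shows exactly when ys starts being consulted.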

You are correct that (++) is associative, but this is only a statement about the equality of the lists, not about the complexity of the algorithm used to calculate the lists in order to compare them.

If you expect to be concatenating many lists in a left-associative way, it is more efficient to opt for an alternative representation, the difference list: https://wiki.haskell.org/Difference_list

Instead of storing the list per se, one stores a function that would prepend all of its elements, in the proper order, to an existing list. The isomorphism back to a regular list is application to []. But this representation turns concatenation into function composition: instead of a chain of (++) we obtain a chain of (:), which, as you pointed out, is much faster than repeatedly traversing the same list to get to the ever-receding end.

Isomorphisms like these are useful for changing the algorithmic complexity of operations without affecting their correctness. In the same vein, it's more effective to combine several folds of the same list into a single accumulator and traverse it only once. There are tools (such as the lens library) for converting between isomorphic representations, but I am not aware of any automagical solutions to this kind of problem (except, perhaps, rewrite rules, but that's compiler wizardry).

Cheers,
Melanie Brown
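To make that concrete, here is a minimal difference-list sketch (the type and function names are mine, not necessarily those of the wiki page or of the dlist package):

module Main (main) where

-- A list is represented by the function that prepends its elements.
newtype DList a = DList ([a] -> [a])

fromList :: [a] -> DList a
fromList xs = DList (xs ++)

toList :: DList a -> [a]
toList (DList f) = f []                       -- the isomorphism back: apply to []

append :: DList a -> DList a -> DList a
append (DList f) (DList g) = DList (f . g)    -- concatenation is composition, O(1)

snoc :: DList a -> a -> DList a
snoc (DList f) x = DList (f . (x :))          -- add to the end, also O(1)

main :: IO ()
main = print (toList (foldl snoc (fromList []) [1 .. 10 :: Int]))

Building the list one element at a time from the left is now linear overall: each snoc only composes one more function, and the single pass that actually allocates the cons cells happens in toList, whereas the same loop written with xs ++ [x] is quadratic.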

Lazy evaluation does make it more subtle to actually define things like algorithmic complexity. So regarding this:
isn't xs ++ ys just a thunk that, as elements are "requested" from it, produces them first from xs and then, when those run out, defers entirely to ys from that point on? If so, then the elements of xs aren't "copied" any more than the elements of ys are; they are just produced from different places.
It helps to start with the definition of ++:

(++) :: [a] -> [a] -> [a]
(++) []     ys = ys
(++) (x:xs) ys = x : xs ++ ys

But to be even more explicit, let's make our own list type so that the constructors have normal names:

data List a = Cons a (List a) | Empty

and now this definition directly translates to:

concat :: List a -> List a -> List a
concat Empty       ys = ys
concat (Cons z zs) ys = Cons z (concat zs ys)

But let's be even more explicit and simplify (desugar) this definition further:

concat :: List a -> List a -> List a
concat xs ys =
  case xs of
    Empty     -> ys
    Cons z zs -> let temp = concat zs ys
                 in  Cons z temp

The point of all of this is to emphasize that when you evaluate `concat xs ys`, in the non-empty xs case you will allocate a new data structure: the Cons holding the head element of xs, and the `temp` thunk representing the tail of the result.

So that's the additional overhead--you have to allocate a new cell, because although the head is an existing element, the tail is not--it has to capture the parameters necessary for the recursive call, which is new information. If (and only if) you evaluate the tail, you will force evaluation of the `temp` thunk, which will allocate another cell if zs is not empty (via the recursive call). So if you traverse the entire list, you will allocate a new cell for each element of xs--once these run out, you will recurse into the Empty case, and return ys, and you won't allocate new cells for the elements of ys. But you only perform this allocation as you traverse the list (for the first time), so if you never actually traverse it then indeed you will only ever allocate one new cell in total.

So you are right that saying that "xs ++ ys copies xs" is misleading--that "copy" only happens during subsequent traversal, and it would be better described as "allocates a new cell for each element of xs, as it traverses".

Also, in this notation, (:) becomes:

prepend :: a -> List a -> List a
prepend x xs = Cons x xs

Here, you allocate one new cell, but that's it--there is no recursive call, just a constructor application to existing values.

That was a bit long-winded but I hope it was clear.

Important note: This is different from (for example) the Java case, where List is an interface, and you can create an implementation with a custom iterator which just serves out elements from two existing lists without having to do any O(n) allocation. In the Haskell case, a list is a concrete data type, and it's expressed directly as a linked list of cells, and retrieving an element requires creation (at some point) of a cell holding that element (as well as the tail representing the rest of the list).

Jeff
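For completeness, the definitions above drop into a runnable file like this (the deriving Show, the Prelude hiding, and the fromHs helper are additions of mine for the demo, not part of the message above):

module Main (main) where

import Prelude hiding (concat)

data List a = Cons a (List a) | Empty
  deriving Show

concat :: List a -> List a -> List a
concat Empty       ys = ys
concat (Cons z zs) ys = Cons z (concat zs ys)

prepend :: a -> List a -> List a
prepend x xs = Cons x xs

-- Hypothetical helper, only for the demo: build a List from a regular list.
fromHs :: [a] -> List a
fromHs = foldr Cons Empty

main :: IO ()
main = do
  print (concat (fromHs [1, 2, 3 :: Int]) (fromHs [4, 5, 6]))
  print (prepend (0 :: Int) (fromHs [1, 2, 3]))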

Three points.

1. xs ++ ys never copies the *elements* of xs at all. What it copies is the SPINE of xs (think of each cons node as a vertebra), the cons cells. It's important to be clear about this, because students are only too ready to believe that the elements themselves are copied.

2. xs ++ ys does copy the spine of xs. In a strict language like ML or F#, it does so right away. In a non-strict language like Haskell, it does so, but later. If you ever explore the result of xs ++ ys far enough to reach ys, you will by then have copied the spine of xs. This means that xs ++ [a] is still more expensive than a : xs; it's just a matter of when the payment is made.

3. If you actually need to build a sequence from both ends for some reason, Data.Sequence exists. Or you can build a tree and then flatten it.
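As a small illustration of that last suggestion, here is a sketch of my own (not code from this thread) using Data.Sequence from the containers package, which supports cheap appends at both ends:

module Main (main) where

import Data.Sequence (Seq, (<|), (|>), fromList)
import Data.Foldable (toList)

main :: IO ()
main = do
  let s = fromList [2, 3, 4] :: Seq Int
      -- Both of these are O(1), unlike xs ++ [a] on ordinary lists.
      s' = 1 <| (s |> 5)
  print (toList s')   -- [1,2,3,4,5]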

On Thu, Jul 13, 2023 at 03:39:08PM +1200, Richard O'Keefe wrote:
2. xs ++ ys does copy the spine of xs. In a strict language like ML or F#, it does so right away. In a non-strict language like Haskell, it does so, but later. If you ever explore the result of xs ++ ys far enough to reach ys, you will by then have copied the spine of xs. This means that xs ++ [a] is still more expensive than a : xs, it's just a matter of when the payment is made.
Yes, if the list head is used more than once. When "xs ++ ys" is used in a fold and never seen again, no copying happens, but a thunk is allocated. In particular, a single list append that's folded away is free, and the below runs in constant space, regardless of the specified length.

module Main (main) where

import System.Environment
import Data.Foldable

main :: IO ()
main = do
    fmap (read . head) getArgs >>= print . sum . xs
  where
    xs :: Int -> [Int]
    xs n = [1..n] ++ [1..n]

The real problem is with, e.g., iterated construction of a list by appending one element at a time, depth first:

module Main (main) where

import System.Environment
import Data.Foldable

main :: IO ()
main = do
    fmap (read . head) getArgs >>= print . sum . xs
  where
    xs :: Int -> [Int]
    xs n = go [] [1..n]
      where
        go ys []       = ys
        go ys (x : xs) = go (ys ++ [x]) xs

The space cost here is at least quadratic:

2 x list length ->  5.1 x allocations
3 x list length -> 12.5 x allocations
4 x list length -> 23   x allocations
...

-- Viktor.
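For contrast, here is a sketch of my own (not part of Viktor's message) of the usual way to build such a list with only linear total allocation: cons each element onto the front of an accumulator and reverse it once at the end.

module Main (main) where

import System.Environment (getArgs)

main :: IO ()
main = do
    fmap (read . head) getArgs >>= print . sum . xs
  where
    -- Same shape as the quadratic example above, but each step is O(1):
    -- build the result back to front with (:), then reverse it once.
    xs :: Int -> [Int]
    xs n = go [] [1 .. n]
      where
        go acc []       = reverse acc
        go acc (i : is) = go (i : acc) is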

Thanks to those who have contributed answers to my questions. I can see how, in a context like let zs = xs ++ ys in .... with a persistent reference to the concatenation, the spine (not the elements) of xs would have to be copied, because zs might need to be traversed multiple times. But what about something like f (xs ++ ys) for a function f that makes a single pass through its argument list (like length or sum)? --Todd

Yes, the allocation still happens. In the "single-pass" case, all of the newly allocated data structures don't have to persist in memory simultaneously (that is, some become available for GC while others are still being created), but the allocations still happen. Consider:

xs = 1 : (2 : (3 : []))
ys = 4 : (5 : (6 : []))
zs = xs ++ ys

xs is a data structure with two fields: a head and a tail. The head of xs contains 1 and the tail contains the list [2, 3]. On the other hand, the head of zs also contains 1, but its tail contains [2, 3, 4, 5, 6]. It doesn't matter that the tail may initially be represented as a thunk; the important part is that since xs and zs contain different tails, they have to be different physical data structures. So zs has to have been a new allocation.

Continuing with the tails of xs and zs:

(tail xs) has a head containing 2, and a tail containing [3].
(tail zs) has a head containing 2, and a tail containing [3, 4, 5, 6].

So again, these have to be two physically different data structures.

(tail (tail xs)) has a head containing 3 and a tail containing [].
(tail (tail zs)) has a head containing 3 and a tail containing [4, 5, 6].

So again, these have to be two physically different data structures. But now:

(tail (tail (tail zs))) has a head containing 4, and a tail containing [5, 6].
ys has a head containing 4 and a tail containing [5, 6].

Those are the same, so they can be physically the same data structure.

So zs will require 3 additional allocations, one for each element of xs.

Jeff
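If you want to poke at the sharing claim experimentally, here is one possible (and somewhat fragile) sketch of mine using System.Mem.StableName; it is not from Jeff's message. A True result confirms that the part of zs beyond xs is literally the same heap object as ys; a False result would be inconclusive, because stable-name comparison is conservative and can miss sharing.

module Main (main) where

import Control.Exception (evaluate)
import System.Mem.StableName (makeStableName)

main :: IO ()
main = do
  let xs = [1, 2, 3 :: Int]
      ys = [4, 5, 6 :: Int]
      zs = xs ++ ys
  ysCell <- evaluate ys            -- force ys to its first cons cell
  _ <- evaluate (length zs)        -- force the whole spine of zs
  suffix <- evaluate (drop 3 zs)   -- the part of zs that lies beyond xs
  n1 <- makeStableName suffix
  n2 <- makeStableName ysCell
  -- True: suffix and ys are the same object, i.e. ys was not copied.
  -- False would be inconclusive; StableName equality can under-report sharing.
  print (n1 == n2)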
participants (6)

- Jeff Clites
- Melanie Brown
- MigMit
- Richard O'Keefe
- Todd Wilson
- Viktor Dukhovni