Computing lazy and strict list operations at the same time

Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function? There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list. init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = [] last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once? Calling reverse traverses the list again. I couldn't think of a way to do it, but I couldn't figure out why it would be impossible.

On Jun 19, 2006, at 11:24 AM, C Rodrigues wrote:
Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function?
There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list.
init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = []
last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x
Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once? Calling reverse traverses the list again. I couldn't think of a way to do it, but I couldn't figure out why it would be impossible.
initlast :: [a] -> ([a],a) initlast (x:xs) = f x xs id where f x (y:ys) g = f y ys (g . (x:)) f x [] g = (g [],x) Its within the letter, if maybe not the spirit of the rules. The accumulated function could arguably be considered to be traversing the list again. FYI, the technique is a fairly well known one for overcoming the quadratic behavior of repeated (++). Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

On Jun 19, 2006, at 11:24 AM, C Rodrigues wrote:
Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function?
There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list.
init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = []
last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x
Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once?
After Robert Dockins, Jon Fairbarn and Duncan Coutts, one more, nothing original...: initlast (x:xs) = inl x xs where inl x (a:as) = (x:q,y) where (q,y) = inl a as inl x [] = ([],x) Such tricks become your second nature, when you take the solution (lazy) of the "repmin" problem by Richard Bird, you put it under your pillow, and sleep for one week with your head close to it. Jerzy Karczmarczuk

Where's the solution and what is the repmin problem? On Jun 19, 2006, at 5:21 PM, Jerzy Karczmarczuk wrote:
Such tricks become your second nature, when you take the solution (lazy) of the "repmin" problem by Richard Bird, you put it under your pillow, and sleep for one week with your head close to it.

Joel Reymont writes:
Where's the solution and what is the repmin problem?
On Jun 19, 2006, at 5:21 PM, Jerzy Karczmarczuk wrote:
Such tricks become your second nature, when you take the solution (lazy) of the "repmin" problem by Richard Bird, you put it under your pillow, and sleep for one week with your head close to it.
Well, the Functionalist True Lazy Church considers this to be a part of the Holy Scriptures... R.S. Bird. Using circular programs to eliminate multiple traversals of data. Acta Informatica, 21, pp. 239--250, 1984. Traverse a binary tree ONCE, and replace all the elements by the minimum of all leaves (i.e., construct a new tree, topologically equivalent, but with all leaf nodes being the minimum value within the original source. A one pass algorithm postpones the binding of an argument until the minimum is found... data Tree a = L a | B (Tree a) (Tree a) rpMin :: (Tree Int, Int) -> (Tree Int, Int) rpMin (L a, m) = (L m, a) rpMin (B l r, m) = let (l', ml) = rpMin (l, m) (r', mr) = rpMin (r, m) in (B l' r', ml `min` mr) replaceMin :: Tree Int -> Tree Int replaceMin t = let (t', m) = rpMin (t, m) in t' Google, your not-so-humble friend will find you some dozen references... For example, Levent Erkök: http://www.cse.ogi.edu/PacSoft/projects/rmb/repMin.html Jerzy Karczmarczuk

On 2006-06-19 at 15:24-0000 "C Rodrigues" wrote:
Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function?
There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list.
init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = []
last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x
Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once? Calling reverse traverses the list again. I couldn't think of a way to do it, but I couldn't figure out why it would be impossible.
il [] = error "foo" il [x] = ([], x) il (x:xs) = cof x (il xs) where cof x ~(a,b) = (x:a, b) -- ! Should do it, I think. -- Jón Fairbairn Jon.Fairbairn at cl.cam.ac.uk

On Mon, 2006-06-19 at 17:03 +0100, Jon Fairbairn wrote:
On 2006-06-19 at 15:24-0000 "C Rodrigues" wrote:
Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function?
There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list.
init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = []
last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x
Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once? Calling reverse traverses the list again. I couldn't think of a way to do it, but I couldn't figure out why it would be impossible.
il [] = error "foo" il [x] = ([], x) il (x:xs) = cof x (il xs) where cof x ~(a,b) = (x:a, b) -- !
From a quick test, it looks like none of our suggested solutions actually work in constant space.
main = interact $ \s -> case il s of (xs, x) -> let l = length xs in l `seq` show (l,x) using ghc: ghc -O foo.hs -o foo ./foo +RTS -M10m -RTS < 50mb.data using runhugs: runhugs foo.hs < 50mb.data in both cases and for each of the three solutions we've suggested the prog runs out of heap space where the spec asked for constant heap use. So what's wrong? In my test I was trying to follow my advice that we should consume the init before consuming the last element. Was that wrong? Is there another way of consuming the result of initlast that will work in constant space? Note that by changing discarding the x we do get constant space use: main = interact $ \s -> case il s of (xs, x) -> let l = length xs in l `seq` show l -- rather than 'show (l,x)' Why does holding onto 'x' retain 'xs' (or the initial input or some other structure with linear space use)? Duncan

On Jun 19, 2006, at 12:50 PM, Duncan Coutts wrote:
On Mon, 2006-06-19 at 17:03 +0100, Jon Fairbairn wrote:
On 2006-06-19 at 15:24-0000 "C Rodrigues" wrote:
Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function?
There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list.
init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = []
last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x
Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once? Calling reverse traverses the list again. I couldn't think of a way to do it, but I couldn't figure out why it would be impossible.
il [] = error "foo" il [x] = ([], x) il (x:xs) = cof x (il xs) where cof x ~(a,b) = (x:a, b) -- !
From a quick test, it looks like none of our suggested solutions actually work in constant space.
main = interact $ \s -> case il s of (xs, x) -> let l = length xs in l `seq` show (l,x)
using ghc: ghc -O foo.hs -o foo ./foo +RTS -M10m -RTS < 50mb.data
using runhugs: runhugs foo.hs < 50mb.data
in both cases and for each of the three solutions we've suggested the prog runs out of heap space where the spec asked for constant heap use.
Actually, the OP asked for constant stack space, which is quite different and much easier to achieve.
So what's wrong? In my test I was trying to follow my advice that we should consume the init before consuming the last element. Was that wrong? Is there another way of consuming the result of initlast that will work in constant space?
That is, nonetheless, an interesting question.
Note that by changing discarding the x we do get constant space use: main = interact $ \s -> case il s of (xs, x) -> let l = length xs in l `seq` show l -- rather than 'show (l,x)'
Why does holding onto 'x' retain 'xs' (or the initial input or some other structure with linear space use)?
Duncan
Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

On Mon, Jun 19, 2006 at 05:50:13PM +0100, Duncan Coutts wrote:
On Mon, 2006-06-19 at 17:03 +0100, Jon Fairbairn wrote:
il [] = error "foo" il [x] = ([], x) il (x:xs) = cof x (il xs) where cof x ~(a,b) = (x:a, b) -- !
From a quick test, it looks like none of our suggested solutions actually work in constant space.
main = interact $ \s -> case il s of (xs, x) -> let l = length xs in l `seq` show (l,x)
I was hoping to have enlightenment served to me, but since nobody has
explained this, I took a crack at it. I still can't explain it, but I
got some data that maybe somebody else will understand. My code:
initlast :: [a] -> ([a], a)
initlast [x] = ([], x)
initlast (x:xs) = let (init, last) = initlast xs
in (x:init, {-# SCC "last" #-} last)
lenshow n (_:xs) last = let n1 = n+1 in n1 `seq` lenshow n1 xs last
lenshow n [] last = show (n,last)
main = interact $ \s -> case initlast s of
(xs, x) -> lenshow 0 xs x
lenshow is just "show (length xs, x)", written out so I can tweak it
later. This exhibits the runaway space usage with a large input that
Duncan described. If you throw away "last" in lenshow and just "show
n", it runs in constant space.
It seems that the reference to "last" that I annotated as a cost center
is holding a chain of trivial thunks--trivial because "last" is just
being copied from the result of the recursive call to initlast. I
thought maybe I could get rid of them by returning an unboxed tuple from
initlast, but this turned out to make no difference.
Profiling gave a couple hints. Retainer set profiling (-hr) showed the
retainer set holding all the memory was
{

This is a follow-up to a thread from June-July[1]. The question was how to write the function initlast :: [a] -> ([a], a) initlast xs = (init xs, last xs) so that it can be consumed in fixed space: main = print $ case initlast [0..1000000000] of (init, last) -> (length init, last) Attempts were along the lines of initlast :: [a] -> ([a], a) initlast [x] = ([], x) initlast (x:xs) = let (init, last) = initlast xs in (x:init, last) I seemed obvious to me at first (and for a long while) that ghc should force both computations in parallel; but finally at the hackathon (thanks to Simon Marlow) I realized I was expecting magic: The elements of the pair are simply independent thunks, and there's no way to "partly force" the second (ie, last) without forcing it all the way. Simon Peyton Jones graciously offered that it is "embarrassing" that we can't write this in Haskell, so to make him less embarrassed (and despite my adamance on the mailing list that the implementation be functional), I wrote an imperative version with the desired space behavior. Borrowing the insight that unsafePerform and unsafeInterleave can be thought of as hooks into the evaluator, this shows more or less what I would wish for ghc to do automatically. initlastST :: [a] -> ([a], a) initlastST xs = runST (m xs) where m xs = do r <- newSTRef undefined init <- init' r xs last <- unsafeInterleaveST (readSTRef r) return (init, last) init' r [x] = do writeSTRef r x return [] init' r (x:xs) = do writeSTRef r (last xs) liftM (x:) (unsafeInterleaveST (init' r xs)) Andrew [1] http://haskell.org/pipermail/haskell-cafe/2006-June/016171.html http://haskell.org/pipermail/haskell-cafe/2006-July/016709.html

Andrew Pimlott wrote:
This is a follow-up to a thread from June-July[1]. The question was how to write the function
initlast :: [a] -> ([a], a) initlast xs = (init xs, last xs)
so that it can be consumed in fixed space:
main = print $ case initlast [0..1000000000] of (init, last) -> (length init, last)
Attempts were along the lines of
initlast :: [a] -> ([a], a) initlast [x] = ([], x) initlast (x:xs) = let (init, last) = initlast xs in (x:init, last)
I seemed obvious to me at first (and for a long while) that ghc should force both computations in parallel; but finally at the hackathon (thanks to Simon Marlow) I realized I was expecting magic: The elements of the pair are simply independent thunks, and there's no way to "partly force" the second (ie, last) without forcing it all the way. According to the stuff about "selector thunks", it seems this should work
initlast [x] = ([],[x]) initlast (x:xs) = let ~(init,last) = initlast xs in (x:init, last) It does, at least when I build with -ddump-simpl. Other builds, I get a program that overflows. Attached is a heap profile for a run with the main (10M rather than 100M as above - that just takes too long) main = print $ case initlast [0..100000000] of (init, last) -> (length init, last) Brandon

This is a follow-up to a thread from June-July[1]. The question was how to write the function
initlast :: [a] -> ([a], a) initlast xs = (init xs, last xs)
so that it can be consumed in fixed space:
main = print $ case initlast [0..1000000000] of (init, last) -> (length init, last)
if space is the main issue, you could try to avoid the sharing (so that each part of the computation unfolds and throws away its own copy of the input list): initlast :: (()->[a]) -> ([a], a) initlast xs = (init (xs ()), last (xs ())) main = print $ case initlast (\()->[0..1000000000]) of (init, last) -> (length init, last) hth, claus ps. it would occasionally be nice to be able to undo sharing instead of having to avoid it, by asking for an independent, equally unevaluated, copy of some expression. that would make it easy to write initlast with your original type. pps. compiling with -O recreates the space problem? (ghc-6.5)

Andrew Pimlott wrote:
This is a follow-up to a thread from June-July[1]. The question was how to write the function
initlast :: [a] -> ([a], a) initlast xs = (init xs, last xs)
so that it can be consumed in fixed space:
main = print $ case initlast [0..1000000000] of (init, last) -> (length init, last)
Attempts were along the lines of
initlast :: [a] -> ([a], a) initlast [x] = ([], x) initlast (x:xs) = let (init, last) = initlast xs in (x:init, last)
I seemed obvious to me at first (and for a long while) that ghc should force both computations in parallel; but finally at the hackathon (thanks to Simon Marlow) I realized I was expecting magic: The elements of the pair are simply independent thunks, and there's no way to "partly force" the second (ie, last) without forcing it all the way. According to the stuff about "selector thunks", it seems this should work
initlast [x] = ([],[x]) initlast (x:xs) = let ~(init,last) = initlast xs in (x:init, last) Sometimes it does compile to a program that runs in constant space, sometimes it doesn't! I've sent a message to the list with a heap profile of a run on 10M numbers, but it's being held for moderation because it's too big. Brandon

On Mon, 2006-06-19 at 15:24 +0000, C Rodrigues wrote:
Here's a puzzle I haven't been able to solve. Is it possible to write the initlast function?
There are functions "init" and "last" that take constant stack space and traverse the list at most once. You can think of traversing the list as deconstructing all the (:) [] constructors in list.
init (x:xs) = init' x xs where init' x (y:ys) = x:init' y ys init' _ [] = []
last (x:xs) = last' x xs where last' _ (y:ys) = last' y ys last' x [] = x
Now, is there a way to write initlast :: [a] -> ([a], a) that returns the result of init and the result of last, takes constant stack space, and traverses the list only once? Calling reverse traverses the list again. I couldn't think of a way to do it, but I couldn't figure out why it would be impossible.
initlast :: [a] -> ([a],a) initlast [x] = ([], x) initlast (x:xs) = (x : xs', x') where (xs', x') = initlast xs It depends how you use it I think. If you look at the last element immediately then you'll get a linear collection of thunks for the init. However if you consume the init and then look at the last then I think that will use constant space. Duncan

Ah, thanks for the replies. I like the approach that uses lazy tuples of intermediate values because it has a recognizable similarity to the original two functions.
participants (10)
-
Andrew Pimlott
-
Brandon Moore
-
C Rodrigues
-
Claus Reinke
-
Duncan Coutts
-
Jerzy Karczmarczuk
-
jerzy.karczmarczuk@info.unicaen.fr
-
Joel Reymont
-
Jon Fairbairn
-
Robert Dockins