
Neil Mitchell wrote:
Hi
If you mean an example of coding style and choice of stack vs. heap, I already have..
http://www.haskell.org/pipermail/haskell-cafe/2008-January/038832.html
I'm at a loss as to why you want a strict version of take. It's clearly not for performance, as it performs slower. I'd say both the gobbler programs have a bug, namely that they are not sufficiently lazy.
I have already partially scanned the list looking for the first element that satisfies some condition, using a tail-recursive search. If such an element is found I want to split the list at that point. If no such element is found, the entire list has been scanned without using any "extra" stack or heap (other than that used by the list itself and the condition test). I could build the reversed list accumulator on the heap as I search, but I don't, because that effort is completely wasted in the case where no matching element is found. So instead I just use an unboxed Int to count how far I get, and have the search return this together with the unsearched suffix (in the case where a matching element is found). But from this point on the lifetimes of the list prefix and suffix are completely unrelated, so I don't want the prefix thunk hanging on to the unknown-sized suffix. As I already know the list has been evaluated at least up to the point where it gets "chopped off", I choose to use a strict (eager) take.
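For concreteness, here's a sketch of the sort of eager take I mean, along the lines of the stackGobbler from the post linked above (reconstructed from memory, so treat the details as mine rather than a quotation):

-- Eager take: forces the whole prefix spine before consing, so the
-- result retains no thunk pointing at the unsearched suffix. This
-- deliberately uses O(n) stack and no extra heap.
stackGobbler :: Int -> [x] -> [x]
stackGobbler 0 _      = []
stackGobbler _ []     = []
stackGobbler n (x:xs) = let xs' = stackGobbler (n-1) xs
                        in  xs' `seq` (x:xs')

The `seq` before the cons is what makes each recursive call finish before its result is wrapped, and that is exactly where the O(n) stack use comes from.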
As an aside, my version of this function would be:
neilGobbler :: Int -> [x] -> [x]
neilGobbler n xs = length res `seq` res
  where res = take n xs
I have no idea if it takes the heap or the stack, or if it performs faster or slower. If you still have whatever test you used on the gobbler, perhaps you could tell us.
My guess is it will use O(1) stack and burn O(n) heap (in addition to that actually used by the result), so asymptotic-complexity-wise the same as heapGobbler, but probably with higher constant factors with ghc due to the lazy building of take thunks and the subsequent reduction and indirection costs.
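For comparison, heapGobbler (again reconstructed from memory of the linked post, so the details are my assumption) trades that stack for heap by accumulating a reversed prefix and reversing it at the end:

-- Tail recursive: O(1) stack, but builds an O(n) reversed
-- accumulator on the heap which then has to be reversed again.
heapGobbler :: Int -> [x] -> [x]
heapGobbler = go []
  where
    go acc 0 _      = reverse acc
    go acc _ []     = reverse acc
    go acc n (x:xs) = go (x:acc) (n-1) xs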
If you mean an example of it biting with lazy code, this is discussed so often you'd be spoiled for choice if you search the mailing list archives. Here's a good one..
http://www.haskell.org/pipermail/haskell-cafe/2005-December/013521.html
This example actually shows the problem twice. In one case it's solvable by using foldl' instead of foldl.
Which reduces the memory from O(n) to O(1).
Are you sure about that? Using foldl' here eliminates one of the two possible sources of stack overflow, but it doesn't eliminate a space leak. It's O(n) either way. Using strict Map insertion will eliminate a space leak (in this case) and also a possible source of stack overflow.
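To make the distinction concrete, the fully-fixed version usually has this shape (the word-counting task is my own stand-in for the example in that thread, and today's Data.Map.Strict plays the role that Data.Map.insertWith' played back then):

import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- foldl' keeps the accumulating map in WHNF at each step, and the
-- strict insertWith forces each count as it is inserted, so no
-- chain of (+1) thunks can build up inside the map.
countWords :: [String] -> Map.Map String Int
countWords = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty

With plain foldl you accumulate a chain of map thunks; with foldl' but lazy insertion you still accumulate thunks inside the map, which is the residual O(n) behaviour I'm pointing at.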
Surely that's a good thing,
Would be if it were true :-)
and the code before had a space leak. A space leak is bad, therefore telling people about it is good.
There are plenty of space leaks that won't cause stack overflows, and plenty of stack overflows that aren't caused by space leaks (see above for one example). Again I have to say that, even if true, I think this is a pretty lame justification for the current implementation. The *default* behaviour of any useful program should surely be to make a best effort to carry on working (and in due course deliver an answer or whatever), even if there is unexpectedly high stack use for some reason (which may or may not be a "bug").
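As a toy illustration of the latter (my example, not from the thread): plain structural recursion that retains nothing, yet needs a stack frame per list element, so it hits the default stack bound long before real memory runs out:

-- No space leak: nothing is retained beyond the live suffix of the
-- list. It simply needs O(n) stack, and so overflows on a long
-- list under GHC's default stack limit.
sumR :: [Int] -> Int
sumR []     = 0
sumR (x:xs) = x + sumR xs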
I think it's sensible to let people set their own stack bound (which is possible),
I have no objection to people bounding their stack if that's their choice. I can't imagine why anybody who stopped to think about this would actually want this feature, but it's a free world. What I object to is it being bounded by default to something other than the overall program memory limit. I know that I could probably achieve this effect myself with +RTS options, but I also want to be able to write libraries that other people are going to use safely, without having to add an appropriate warning in the documentation to the effect that some parts use O(n) stack space deliberately, by design. But of course this all assumes that the underlying implementation is sufficiently robust to make "unbounded" stacks safe (at least as safe as any other "unbounded" data structure). Unfortunately it seems this isn't the case at present, if what various folk have told me is true.
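For reference, anyone who does want a particular bound can set the maximum stack size via the GHC runtime, e.g. (the size here is just illustrative):

./myprog +RTS -K64m -RTS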
but clearly, just from taking an informal poll of respondents to this thread, the current default should indeed be the default. You seem to be the only person clamouring for an unlimited stack,
Yes, this is strange. Same thing happened in the "global variables" debate, despite it being obvious to any thinking person that I was correct. Denial of the reality of some very simple examples of the problem was typical of that debate too. :-)

Regards
--
Adrian Hey