As I was getting "Control stack overflow" while attempting to generate large (>1MB) data files using Hugs, I recompiled Hugs with 10-fold increased value of the corresponding parameter (was 16K). Then I was able to do the necessary computations. However, I wonder if it was safe to so, i.e. that this change didn't break functionality of the Hugs system. Hugs documentation sounds as if one gets "Control stack overflow" then it surely must be an infinite recursion; while in my case it's just concatenation of long (>20K) strings. I guess it would've been nice if documentation said something more about that, and explained ways of increasing stack size. IMHO, some internal Hugs part uses the stack (thus, something that is not controllable by the command line options) for data that ought to be held in the heap (whose size is controllable by command line options)... Best regards, Dmitrii Pasechnik http://ssor.twi.tudelft.nl/~dima/
As I was getting "Control stack overflow" while attempting to generate large (>1MB) data files using Hugs, I recompiled Hugs with 10-fold increased value of the corresponding parameter (was 16K). Then I was able to do the necessary computations. However, I wonder if it was safe to so, i.e. that this change didn't break functionality of the Hugs system.
It should be fine. The only risk is that, having eliminated the control-stack limit, you might run into the C-stack limit. This tends to show itself as some kind of segmentation fault.
Hugs documentation sounds as if one gets "Control stack overflow" then it surely must be an infinite recursion; while in my case it's just concatenation of long (>20K) strings.
Well, that is the usual case. When it's not infinite recursion, it's usually one of: 1) Strange operational behaviour due to laziness. For example, this expression: foldl (+) 0 [1..1000000] would require constant space if strictly evaluated. (It corresponds to a simple for loop in C.) When evaluated lazily though, it first builds this thunk: ((((0+1)+2)+3)+...+1000000 (which takes a lot of heap space) and then evaluates it which requires about 1000000 stack frames. 2) The quadratic concat problem. Evaluating this: [1] ++ ([2] ++ ([3] ++ [4])) takes linear time in the length of the list. (As you'd expect.) And requires constant stack space. Evaluating this: ((([1] ++ [2]) ++ [3]) ++ [4] takes time quadratic in the length of the list and requires stack space which is linear in the length of the list (I think - I haven't checked the stack space claim and laziness makes it hard to just guess.) The reason for the difference is that ++ takes time proportional to the length of its first argument. The first version appends a singleton list onto a bigger list N times. The second version appends an increasingly long list onto a singleton list N times. Of course, Haskell defines that ++ associates to the right so the problem doesn't show up in single expressions. But it appears very rapidly if you write a recursive function which constructs a list by appending the result of recursive calls to itself. For example: data Tree a = Leaf a | Branch (Tree a) (Tree a) showTree :: Show a => Tree a -> String showTree (Leaf x) = show x showTree (Branch l r) = "(" ++ showTree l ++ "," ++ showTree r ++ ")" run that on big enough a tree and you'll be doing an awful lot of work. The usual fix is to avoid calling ++ with unbounded size arguments using an extra argument which you want the result to be appended to. That is, we define a new function "showTree'" which satisfies this equation: showTree' t s = showTree t ++ s but whose definition is: showTree' :: Show a => Tree a -> String -> String showTree' (Leaf x) s = show x ++ r showTree' (Branch l r) s = "(" ++ showTree' l ("," ++ showTree' r (")" ++ s) This still calls ++ but the result of showTree' is never used as the first argument of ++. The result runs in time proportional to the size of the output and (probably) stack space proportional to the height of the tree. Simon Peyton Jones' book "Implementing Functional Languages" has a longer discussion of this in chapter 1. (Many other books discuss it too and might be more useful to a Haskell user - but that's the one I know.) Hope this helps, Alastair Reid
I guess it would've been nice if documentation said something more about that, and explained ways of increasing stack size.
IMHO, some internal Hugs part uses the stack (thus, something that is not controllable by the command line options) for data that ought to be held in the heap (whose size is controllable by command line options)...
Best regards, Dmitrii Pasechnik http://ssor.twi.tudelft.nl/~dima/
_______________________________________________ Hugs-Bugs mailing list Hugs-Bugs@haskell.org http://www.haskell.org/mailman/listinfo/hugs-bugs
participants (2)
-
Alastair Reid -
Dima Pasechnik