Why does this blow the stack?

Given this function:

dropTest n = head . drop n $ [1..]

I get a stack overflow when n is greater than ~550,000. Is that inevitable behavior for large n? Is there a better way to do it?

Justin

On Fri, 2007-12-21 at 09:13 -0800, Justin Bailey wrote:
Given this function:
dropTest n = head . drop n $ [1..]
I get a stack overflow when n is greater than ~550,000. Is that inevitable behavior for large n? Is there a better way to do it?
A similar example is discussed on http://www.haskell.org/haskellwiki/Stack_overflow at the bottom.

On Fri, 21 Dec 2007 12:13:04 -0500, Justin Bailey
Given this function:
dropTest n = head . drop n $ [1..]
I get a stack overflow when n is greater than ~550,000. Is that inevitable behavior for large n? Is there a better way to do it?
Justin
I'm curious as well. My first thought was to try the (!!) operator. Typing

Prelude> [1..] !! 550000

overflows the stack on my computer, as does dropTest 550000. The code for (!!) is apparently as follows:

xs !! n | n < 0 = error "Prelude.!!: negative index"
[] !! _ = error "Prelude.!!: index too large"
(x:_) !! 0 = x
(_:xs) !! n = xs !! (n-1)

Isn't this tail recursive? What is eating up the stack?

Brad Larsen
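As later replies in the thread explain, (!!) itself is tail recursive; what eats the stack is forcing the element, which [1..] has left as a deeply nested (...+1)+1 thunk. A sketch of this idea (the name nthStrict is ours, not the Prelude's) forces each element as it walks, so each pending thunk stays one (+1) deep:

```haskell
-- Hypothetical strict indexing: forcing every element before recursing
-- keeps the (+1) thunks built by [1..] shallow, so the stack stays flat.
nthStrict :: [a] -> Int -> a
nthStrict []     _ = error "nthStrict: index too large"
nthStrict (x:xs) n
  | n <= 0    = x
  | otherwise = x `seq` nthStrict xs (n - 1)

main :: IO ()
main = print (nthStrict [(1 :: Integer) ..] 1000000)  -- prints 1000001
```

Without the seq this is the same loop as (!!) and overflows just as the messages above describe.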

On Dec 21, 2007 9:48 AM, Brad Larsen
I'm curious as well. My first thought was to try the (!!) operator. Typing
Prelude> [1..] !! 550000
overflows the stack on my computer, as does dropTest 550000.
I think it's [1..] which is building up the unevaluated thunk. Using this definition of dropTest does not blow the stack:

dropTest n = head . drop n $ repeat 1

Justin

On Dec 21, 2007 9:51 AM, Justin Bailey
I think it's [1..] which is building up the unevaluated thunk. Using this definition of dropTest does not blow the stack:
It also works if you do [(1::Int) ..] !! n, but not with [(1::Integer) ..] !! n. Sounds like GHC is being smart about strictness for Ints, but doesn't know that Integer is equally strict. If that's right, it's a bug in GHC.

On Fri, 2007-12-21 at 09:56 -0800, David Benbennick wrote:
On Dec 21, 2007 9:51 AM, Justin Bailey
wrote: I think it's [1..] which is building up the unevaluated thunk. Using this definition of dropTest does not blow the stack:
It also works if you do [(1::Int) ..] !! n, but not with [(1::Integer) ..] !! n
Sounds like GHC is being smart about strictness for Ints, but doesn't know that Integer is equally strict. If that's right, it's a bug in GHC.
It is a bug in GHC. From http://darcs.haskell.org/packages/base/GHC/Enum.lhs

enumFrom (I# x) = eftInt x maxInt#
    where I# maxInt# = maxInt
    -- Blarg: technically I guess enumFrom isn't strict!
...
eftInt x y | x ># y    = []
           | otherwise = go x
    where go x = I# x : if x ==# y then [] else go (x +# 1#)

derek.a.elkins:
On Fri, 2007-12-21 at 09:56 -0800, David Benbennick wrote:
On Dec 21, 2007 9:51 AM, Justin Bailey
wrote: I think it's [1..] which is building up the unevaluated thunk. Using this definition of dropTest does not blow the stack:
It also works if you do [(1::Int) ..] !! n, but not with [(1::Integer) ..] !! n
Sounds like GHC is being smart about strictness for Ints, but doesn't know that Integer is equally strict. If that's right, it's a bug in GHC.
It is a bug in GHC. From http://darcs.haskell.org/packages/base/GHC/Enum.lhs
enumFrom (I# x) = eftInt x maxInt#
    where I# maxInt# = maxInt
    -- Blarg: technically I guess enumFrom isn't strict!
...
eftInt x y | x ># y    = []
           | otherwise = go x
    where go x = I# x : if x ==# y then [] else go (x +# 1#)
As usual, this is an issue with the irregular numeric-operation strictness applied to Integer. Consider this Integer enumFrom:

main = print x
    where x = head (drop 1000000 (enumFrom' 1))

enumFrom' :: Integer -> [Integer]
enumFrom' x = enumDeltaInteger x 1

enumDeltaInteger :: Integer -> Integer -> [Integer]
enumDeltaInteger x d = x `seq` x : enumDeltaInteger (x+d) d

This is fine. The Int instance is already strict like this. I'll file a patch.

I hate these slightly-too-lazy issues with Integer that aren't present with Int. The atomic strictness of Integer is only irregularly applied through the base libraries. For example, in Data.List, this was considered acceptable:

maximum :: (Ord a) => [a] -> a
maximum [] = errorEmptyList "maximum"
maximum xs = foldl1 max xs

{-# RULES
  "maximumInt"     maximum = (strictMaximum :: [Int] -> Int);
  "maximumInteger" maximum = (strictMaximum :: [Integer] -> Integer)
 #-}

Really, if we let Int be strict on (+) and (*)-style operations, and Integer sometimes strict on those, we should just declare that these atomic numeric types (and Word, Word8, ...) should be strict on (+) and so on.

-- Don
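The message quotes the RULES pragma but not strictMaximum itself. A plausible reconstruction (ours, not the actual base source) is just a max fold that forces its accumulator at every step:

```haskell
-- Reconstruction of strictMaximum (assumed, not quoted from base):
-- a left fold over max whose accumulator is forced before each step,
-- so no chain of deferred max thunks can build up on long lists.
strictMaximum :: Ord a => [a] -> a
strictMaximum []     = error "strictMaximum: empty list"
strictMaximum (x:xs) = go x xs
  where
    go m []     = m
    go m (y:ys) = m `seq` go (max m y) ys
```

The RULES above then rewrite maximum to this strict version at the monomorphic Int and Integer types only, which is exactly the "irregularly applied" strictness Don is complaining about.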

Am Freitag, 21. Dezember 2007 schrieb Justin Bailey:
Given this function:
dropTest n = head . drop n $ [1..]
I get a stack overflow when n is greater than ~550,000. Is that inevitable behavior for large n? Is there a better way to do it?
Justin
[1..] equals [1, (1)+1, (1+1)+1, (1+1+1)+1, (1+1+1+1)+1, ...] where the brackets are a reference to the previous entry in the list. So, if you want to reach the nth element of [1..], the garbage collector automagically collects the unused list entries... but there are none unused.

[1..]!!10 = ((((((((((1)+1)+1)+1)+1)+1)+1)+1)+1)+1)+1

Now, that one long formula needs to be completely built on the stack prior to evaluation. To prevent this, each value has to be evaluated before stepping deeper into the list, i.e. with this bang pattern:

let ((!x):xs)!!!i = case i of {0->x;_->xs!!!pred i} in [1..]!!!10

or simply with this trick: [1..maxBound]!!10. This causes every single entry to be checked against the list-end condition just before the calculation of the next list entry.

- marc

coeus:
Am Freitag, 21. Dezember 2007 schrieb Justin Bailey:
Given this function:
dropTest n = head . drop n $ [1..]
I get a stack overflow when n is greater than ~550,000. Is that inevitable behavior for large n? Is there a better way to do it?
Justin
[1..] equals [1, (1)+1, (1+1)+1, (1+1+1)+1, (1+1+1+1)+1, ...] where the brackets are a reference to the previous entry in the list. So, if you want to reach the nth element of [1..], the garbage collector automagically collects the unused list entries... but there are none unused.

[1..]!!10 = ((((((((((1)+1)+1)+1)+1)+1)+1)+1)+1)+1)+1

Now, that one long formula needs to be completely built on the stack prior to evaluation. To prevent this, each value has to be evaluated before stepping deeper into the list, i.e. with this bang pattern:

let ((!x):xs)!!!i = case i of {0->x;_->xs!!!pred i} in [1..]!!!10

or simply with this trick: [1..maxBound]!!10. This causes every single entry to be checked against the list-end condition just before the calculation of the next list entry.
There's no good reason for the accumulator for Integer to be lazy, especially when you see that adding an upper bound (enumFromTo) or using Int uses a strict accumulator. I've filed a bug report and fix for this:

http://hackage.haskell.org/trac/ghc/ticket/1997

There's an ad hoc sprinkling of strict and lazy Num ops for Integer through the base library, while Int is fairly consistently strict.

-- Don

On Dec 21, 2007 12:02 PM, Don Stewart
There's no good reason for the accumulator for Integer to be lazy, especially when you see that adding an upper bound (enumFromTo) or using Int uses a strict accumulator.
I've filed a bug report and fix for this.
http://hackage.haskell.org/trac/ghc/ticket/1997
There's an ad hoc sprinkling of strict and lazy Num ops for Integer through the base library, while Int is fairly consistently strict.
Thanks for fixing this. But doesn't GHC have strictness analysis? Even if there were consistent strictness for Integer in the base library, that wouldn't help for code not in the base library. In other words, I want to be able to define

mysum :: (Num a) => [a] -> a
mysum = foldl (+) 0

in my own programs, and have GHC automatically make it strict if "a" happens to be Int or Integer. Is there any chance of GHC getting that ability?
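A user-land workaround available without any compiler help (not part of the proposal above) is Data.List's foldl', which forces the accumulator at every step regardless of the element type:

```haskell
import Data.List (foldl')

-- Like mysum above, but strict in the accumulator for any Num instance,
-- so no chain of (+) thunks builds up even for Integer.
mysum' :: Num a => [a] -> a
mysum' = foldl' (+) 0

main :: IO ()
main = print (mysum' [1 .. 10000000 :: Integer])  -- prints 50000005000000
```

This sidesteps the question of whether strictness analysis fires, at the cost of writing the strictness by hand.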

dbenbenn:
On Dec 21, 2007 12:02 PM, Don Stewart
wrote: There's no good reason for the accumulator for Integer to be lazy, especially when you see that adding an upper bound (enumFromTo) or using Int uses a strict accumulator.
I've filed a bug report and fix for this.
http://hackage.haskell.org/trac/ghc/ticket/1997
there's an ad hoc sprinkling of strict and lazy Num ops for Integer through the base library, while Int is fairly consistently strict.
Thanks for fixing this. But doesn't GHC have strictness analysis?
Sure does!
Even if there was consistent strictness for Integer in the base library, that wouldn't help for code not in the base library. In other words, I want to be able to define
mysum :: (Num a) => [a] -> a
mysum = foldl (+) 0
in my own programs, and have GHC automatically make it strict if "a" happens to be Int or Integer. Is there any chance of GHC getting that ability?
Thankfully, GHC does that already :)

mysum :: (Num a) => [a] -> a
mysum = foldl (+) 0

main = print (mysum [1..10000000])

In GHC 6.6:

$ time ./A +RTS -M20M
Heap exhausted;
Current maximum heap size is 19996672 bytes (19 Mb);
use `+RTS -M<size>' to increase it.

And in GHC 6.8, GHC can see through to the strictness of (+):

$ time ./A +RTS -M20M
50000005000000
./A +RTS -M20M  0.95s user 0.00s system 99% cpu 0.959 total

The problem here was an explicit recursive loop, though, with just not enough for the strictness analyser to get going.

-- Don

On Dec 21, 2007 2:30 PM, Don Stewart
dbenbenn:
Thanks for fixing this. But doesn't GHC have strictness analysis?
Sure does!
The problem here was an explicit recursive loop, though, with just not enough for the strictness analyser to get going.
The explicit loop you're talking about is:

enumDeltaInteger :: Integer -> Integer -> [Integer]
enumDeltaInteger x d = x : enumDeltaInteger (x+d) d

That code isn't very complicated, and I would hope to be able to write code like that in my own programs without having to worry about strictness. Given that the compiler even has an explicit signature, why can't it transform that code to

enumDeltaInteger x d = let s = x + d in x : (seq s $ enumDeltaInteger s d)

since it knows that (Integer+Integer) is strict? Of course, improving the strictness analysis is harder, but it pays off more, too.

On Fri, Dec 21, 2007 at 03:16:17PM -0800, David Benbennick wrote:
On Dec 21, 2007 2:30 PM, Don Stewart
wrote: dbenbenn:
Thanks for fixing this. But doesn't GHC have strictness analysis?
Sure does!
The problem here was an explicit recursive loop, though, with just not enough for the strictness analyser to get going.
The explicit loop you're talking about is:

enumDeltaInteger :: Integer -> Integer -> [Integer]
enumDeltaInteger x d = x : enumDeltaInteger (x+d) d

That code isn't very complicated, and I would hope to be able to write code like that in my own programs without having to worry about strictness. Given that the compiler even has an explicit signature, why can't it transform that code to

enumDeltaInteger x d = let s = x + d in x : (seq s $ enumDeltaInteger s d)

since it knows that (Integer+Integer) is strict? Of course, improving the strictness analysis is harder, but it pays off more, too.
Because they simply aren't the same. Try applying your functions to undefined undefined.

Stefan

On Dec 22, 2007 12:06 AM, Stefan O'Rear
enumDeltaInteger :: Integer -> Integer -> [Integer]
enumDeltaInteger x d = x : enumDeltaInteger (x+d) d

That code isn't very complicated, and I would hope to be able to write code like that in my own programs without having to worry about strictness. Given that the compiler even has an explicit signature, why can't it transform that code to

enumDeltaInteger x d = let s = x + d in x : (seq s $ enumDeltaInteger s d)

since it knows that (Integer+Integer) is strict? Of course, improving the strictness analysis is harder, but it pays off more, too.
Because they simply aren't the same.
Try applying your functions to undefined undefined.
This took a little work for me to see. Here it is for the interested:

Prelude> let edi :: Integer -> Integer -> [Integer]; edi x d = x : edi (x+d) d
Prelude> let edi' :: Integer -> Integer -> [Integer]; edi' x d = let s = x + d in x : (seq s $ edi s d)
Prelude> _:_:_ <- return $ edi undefined undefined
Prelude> _:_:_ <- return $ edi' undefined undefined
*** Exception: Prelude.undefined

Luke

On 12/21/07, Stefan O'Rear
Because they simply aren't the same.
Good point; thanks. That means that Don's patch could theoretically break some existing Haskell program:

Prelude> length $ take 10 ([undefined ..] :: [Integer])
10

With his patch, that will throw. Some other types have the same issue (lazy Enum when it could be strict, leading to stack overflow):

Prelude> length $ take 10 ([undefined ..] :: [Double])
10
Prelude> length $ take 10 ([undefined ..] :: [Float])
10
Prelude Foreign.C.Types> length $ take 10 ([undefined ..] :: [CFloat])
10
Prelude Foreign.C.Types> length $ take 10 ([undefined ..] :: [CDouble])
10
Prelude Foreign.C.Types> length $ take 10 ([undefined ..] :: [CLDouble])
10
Prelude Data.Ratio> length $ take 10 ([undefined ..] :: [Ratio Int])
10

On 12/22/07, David Benbennick
On 12/21/07, Stefan O'Rear
wrote: Because they simply aren't the same.
Good point; thanks. That means that Don's patch could theoretically break some existing Haskell program:
In fact, it's possible to get strictness to avoid stack overflow, and still have laziness to allow undefined:

myEnumFrom :: Integer -> [Integer]
myEnumFrom a = map (a+) $ enumDeltaIntegerStrict 0 1
  where enumDeltaIntegerStrict x d = x `seq` x : enumDeltaIntegerStrict (x+d) d

then

*Main> (myEnumFrom 42) !! (10^6)
1000042
*Main> length $ take 10 $ myEnumFrom undefined
10

dbenbenn:
On 12/21/07, Stefan O'Rear
wrote: Because they simply aren't the same.
Good point; thanks. That means that Don's patch could theoretically break some existing Haskell program:
Prelude> length $ take 10 ([undefined ..] :: [Integer])
10
That's right. It makes Integer behave like the Int instance. -- Don

dons:
dbenbenn:
On 12/21/07, Stefan O'Rear
wrote: Because they simply aren't the same.
Good point; thanks. That means that Don's patch could theoretically break some existing Haskell program:
Prelude> length $ take 10 ([undefined ..] :: [Integer])
10
That's right. It makes Integer behave like the Int instance.
And we see again that strictness properties are very ill-defined. Here, the Enum types in base:

Prelude> length $ take 10 ([undefined ..] :: [Int])
*** Exception: Prelude.undefined
Prelude> length $ take 10 ([undefined ..] :: [()])
*** Exception: Prelude.undefined
Prelude> length $ take 10 ([undefined ..] :: [Ordering])
*** Exception: Prelude.undefined
Prelude> length $ take 10 ([undefined ..] :: [Bool])
*** Exception: Prelude.undefined

But,

Prelude> length $ take 10 ([undefined ..] :: [Float])
10
Prelude> length $ take 10 ([undefined ..] :: [Double])
10

And, before the patch,

Prelude> length $ take 10 ([undefined ..] :: [Integer])
10

Now,

Prelude> length $ take 10 ([undefined ..] :: [Integer])
*** Exception: Prelude.undefined

So we see that Float and Double also have this problem:

Prelude> head (drop 10000000 [1 .. ]) :: Float
*** Exception: stack overflow
Prelude> head (drop 10000000 [1 .. ]) :: Double
*** Exception: stack overflow

People shouldn't be writing code that depends on this!

-- Don

Since it's possible to support laziness for Integer (while still
avoiding any stack overflow), I think it makes sense to do so. What
if you have some big complicated program like the following:
x = some big slow computation
y = [x..]
lots of code
z = length $ take 10 y
Why bother computing x if you don't have to? And as an added benefit,
if you keep laziness, you don't have to worry about possibly breaking
any programs that depend on the current lazy behavior.
Laziness would NOT make sense for Int, since Int is bounded. You
can't tell how long [(x::Int) ..] is without evaluating x.
On 12/22/07, Don Stewart
People shouldn't be writing code that depends on this!
One of the main features of Haskell is that it's lazy. I don't think it's wrong to write code that depends on laziness.
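David's point above about Int being bounded can be seen directly: enumFrom for Int must compare against maxBound, so the starting value has to be forced before even the list's length is known. A small illustration:

```haskell
-- [x ..] :: [Int] stops at maxBound, so its length depends on x;
-- the enumeration cannot be lazy in its starting value.
main :: IO ()
main = print (length ([maxBound - 9 ..] :: [Int]))  -- prints 10
```

For Integer there is no upper bound, which is why a lazy-in-the-head enumeration (as in Albert's myEnumFrom above) is even possible.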

dbenbenn:
Since it's possible to support laziness for Integer (while still avoiding any stack overflow), I think it makes sense to do so. What if you have some big complicated program like the following:
x = some big slow computation y = [x..]
lots of code
z = length $ take 10 y
Why bother computing x if you don't have to? And as an added benefit, if you keep laziness, you don't have to worry about possibly breaking any programs that depend on the current lazy behavior.
Laziness would NOT make sense for Int, since Int is bounded. You can't tell how long [(x::Int) ..] is without evaluating x.
On 12/22/07, Don Stewart
wrote: People shouldn't be writing code that depends on this!
One of the main features of Haskell is that it's lazy. I don't think it's wrong to write code that depends on laziness.
Depending on laziness is fine, but depending on undefined strictness semantics for particular types is more fragile. Whether Int or Bool or whatever type has a strict or lazy accumulator in enumFromTo is entirely unspecified -- you can't look it up in the report to check.

-- Don

On 12/26/07, Don Stewart
Depending on laziness is fine, but depending on undefined strictness semantics for particular types is more fragile. Whether Int or Bool or whatever type has a strict or lazy accumulator in enumFromTo is entirely unspecified -- you can't look it up in the report to check.
Right, I understand that. But is there any reason not to keep the Integer version of [a..] lazy, given that we can fix the stack overflow problem without introducing strictness?

The (extremely enlightening) discussion so far has focused on the
inconsistent (arguably buggy) behavior of [a,b..c] enumeration sugar.
I think it's worth pointing out that the code could also be made to
run by making the drop function strict. I got to thinking, in a
"strictness" debugging scenario like this, it seems like you can get
the behavior you want by making things strict at various levels. You
can make the inner level (enumeration sugar) stricter, or you can
leave the enumeration stuff lazy and make the drop stricter. Maybe I
didn't express that very well, so here's code:
(I would argue that this discussion could be the basis of a
"strictness kata" exercise, following up from a cafe post I made a
while ago.)
{-# LANGUAGE BangPatterns #-}
import Data.List
import Testing
import Test.QuickCheck
import Test.QuickCheck.Batch
tDrop = ( head . drop n ) ( edi1 1 2 )
tStrictDrop = ( head . strictDrop n ) ( edi1 1 2 )
n = 10^6
--edi: enum delta integer
-- stack overflow on 10^6 el list if use drop
-- ok with strictDrop
edi1 start next = start : edi1 next (next + delta)
where delta = next - start
-- ok (no overflow) for drop, of course also okay for strictDrop
edi2 !start next = start : edi2 next (next + delta)
where delta = next - start
ediWithMax start next bound = takeWhile p $ edi start next
where p i | delta > 0 = i <= bound
| delta < 0 = i >= bound
| otherwise = i <= bound
delta = next - start
edi = edi1
strictDrop _ [] = []
strictDrop n l | n <= 0 = l
strictDrop n (!x:xs) | n>0 = strictDrop (n-1) xs
pStrictDropSameAsDrop n xs = drop n xs == strictDrop n xs
where types = (n::Int,xs::[Integer])
pEdi1 start next max =
abs(next-start) > 0 ==> -- otherwise hits bottom because of eg [0,0..0]
ediWithMax start next max == [start,next..max]
where types = (start :: Integer, next :: Integer, max :: Integer)
pEdi2 start next max = ( take 1000 $ ediWithMax start next max ) == (
take 1000 $ [start,next..max] )
where types = (start :: Integer, next :: Integer, max :: Integer)
t2 = runTests "edi" testOptions
[run pEdi1,
run pEdi2,
run pStrictDropSameAsDrop]
where testOptions = TestOptions
{ no_of_tests = 100 -- number of tests to run
, length_of_tests = 1 -- 1 second max per check
-- where a check == n tests
, debug_tests = False -- True => debugging info
}
2007/12/21, Justin Bailey
Given this function:
dropTest n = head . drop n $ [1..]
I get a stack overflow when n is greater than ~550,000. Is that inevitable behavior for large n? Is there a better way to do it?
Justin
participants (10)
- Albert Y. C. Lai
- Brad Larsen
- David Benbennick
- Derek Elkins
- Don Stewart
- Justin Bailey
- Luke Palmer
- Marc A. Ziegert
- Stefan O'Rear
- Thomas Hartman