Type systems preventing laziness-related memory leaks?

older
Re: [Haskell-cafe] Very large data...

Eugene Kirpichov

18 Feb 2015 18 Feb '15

5:04 a.m.

Hello haskell-cafe, Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here. http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la... Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell. Are there any type systems which statically enforce lack of such thunks in a program using a lazy language? Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

Attachments:

attachment.html (text/html — 1.1 KB)

Show replies by date

Atze van der Ploeg

18 Feb 18 Feb

6:24 a.m.

See the paper "copatterns" by andreas abel. Cheers! Atze On Feb 18, 2015 6:04 AM, "Eugene Kirpichov" wrote:

...

Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Atze van der Ploeg

6:32 a.m.

I'm not awake yet. That paper is not what you asked at all. I'm not aware of any paper that presents a type system statically ruling out thunks of unbounded depth, but it seems to me like this indeed should be possible. On Feb 18, 2015 7:24 AM, "Atze van der Ploeg" wrote:

...

See the paper "copatterns" by andreas abel.

Cheers!

Atze On Feb 18, 2015 6:04 AM, "Eugene Kirpichov" wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Roman Cheplyaka

7:50 a.m.

New subject: Type systems preventing laziness-related memory leaks?

Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive. foldl (+) 0 would be an example of a laziness issue. If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice. If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf On 18/02/15 07:04, Eugene Kirpichov wrote:

...

Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

Joe Hillenbrand

3:40 p.m.

Is foldl = foldl' ever going to happen? On Tue, Feb 17, 2015 at 11:50 PM, Roman Cheplyaka wrote:

...

Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

foldl (+) 0 would be an example of a laziness issue.

If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice.

If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf

On 18/02/15 07:04, Eugene Kirpichov wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

...
Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

David Feuer

20 Feb 20 Feb

7:05 a.m.

Probably not. There's real code that depends on the current foldl semantics. On Wed, Feb 18, 2015 at 10:40 AM, Joe Hillenbrand wrote:

...

Is foldl = foldl' ever going to happen?

On Tue, Feb 17, 2015 at 11:50 PM, Roman Cheplyaka wrote:

...
Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

foldl (+) 0 would be an example of a laziness issue.

If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice.

If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf

On 18/02/15 07:04, Eugene Kirpichov wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Roman Cheplyaka

9:13 a.m.

New subject: Type systems preventing laziness-related memory leaks?

I'd be curious to see a (non-contrived) example. On 20/02/15 09:05, David Feuer wrote:

...

Probably not. There's real code that depends on the current foldl semantics.

On Wed, Feb 18, 2015 at 10:40 AM, Joe Hillenbrand wrote:

...
Is foldl = foldl' ever going to happen?

On Tue, Feb 17, 2015 at 11:50 PM, Roman Cheplyaka wrote:

...
Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

foldl (+) 0 would be an example of a laziness issue.

If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice.

If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf

On 18/02/15 07:04, Eugene Kirpichov wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Joe Hillenbrand

7:58 p.m.

Me too. Last I heard, there wasn't any code that depended on foldl being lazy, and it doesn't really seem possible. http://www.well-typed.com/blog/90/ On Fri, Feb 20, 2015 at 1:13 AM, Roman Cheplyaka wrote:

...

I'd be curious to see a (non-contrived) example.

On 20/02/15 09:05, David Feuer wrote:

...
Probably not. There's real code that depends on the current foldl semantics.

On Wed, Feb 18, 2015 at 10:40 AM, Joe Hillenbrand wrote:

...
Is foldl = foldl' ever going to happen?

Joachim Breitner

8:41 p.m.

New subject: Type systems preventing laziness-related memory leaks?

Hi, last = foldl (\_ x -> x) (errorEmptyList "last") (see http://git.haskell.org/ghc.git/blob/f44bbc83bab62f9a2d25e69d87c2b4af25318d52...) not sure how contrived that is, but at least it’s real code in GHC-7.10. Greetings, Joachim Am Freitag, den 20.02.2015, 11:13 +0200 schrieb Roman Cheplyaka:

...

I'd be curious to see a (non-contrived) example.

On 20/02/15 09:05, David Feuer wrote:

...
Probably not. There's real code that depends on the current foldl semantics.

On Wed, Feb 18, 2015 at 10:40 AM, Joe Hillenbrand wrote:

...
Is foldl = foldl' ever going to happen?

On Tue, Feb 17, 2015 at 11:50 PM, Roman Cheplyaka wrote:

...
Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

foldl (+) 0 would be an example of a laziness issue.

If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice.

If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf

On 18/02/15 07:04, Eugene Kirpichov wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type-systems-preventing-la...

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org

Eugene Kirpichov

5:26 a.m.

On Tue Feb 17 2015 at 11:50:20 PM Roman Cheplyaka wrote:

...

Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

The problem is that when you say foldr (+) 0 [1, 2, 3, 4, 5] you build a thunk "1 + (2 + (3 + (4 + (5 + 0))))", which has - let's call it "whnf evaluation depth" of 4 (you need a stack of size 4 to evaluate it to whnf). I would like a type system that would disallow building thunks with *unbounded* whnf evaluation depth. I'm thinking we could annotate every type in every type signature with this depth. Let us use a special kind "@" for these annotations, allow polymorphism on them, and use "t@d" to denote "a value of type t requiring whnf evaluation depth d". (+) :: forall (a:@) (b:@) . Int@a -> Int@b -> Int@(1+max(a,b)); ($) :: forall (a:@) (b:@->@) . (forall (a:@) (b:@->@) . x@a -> y@(b a)) -> x@a -> y@(b a) The type signature for (.) quickly gets unwieldy and I wasn't able to even write down the signature for foldr :) but perhaps you get the idea. Some informal properties this would have: * whnf depth of calling a function in a tailcall position is max(whnf depth of the function itself, whnf depth of its argument). * whnf depth of "case x of { ... }" is 1. * In the context of "case x of { ...(primitive pattern)... }", whnf depth of x is 0. Does this make any sense?

...

foldl (+) 0 would be an example of a laziness issue.

If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice.

If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf

On 18/02/15 07:04, Eugene Kirpichov wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type- systems-preventing-laziness-related-memory-leaks

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

Roman Cheplyaka

9:23 a.m.

New subject: Type systems preventing laziness-related memory leaks?

On 20/02/15 07:26, Eugene Kirpichov wrote:

...

On Tue Feb 17 2015 at 11:50:20 PM Roman Cheplyaka mailto:roma@ro-che.info> wrote:

Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

The problem is that when you say foldr (+) 0 [1, 2, 3, 4, 5] you build a thunk "1 + (2 + (3 + (4 + (5 + 0))))", which has - let's call it "whnf evaluation depth" of 4 (you need a stack of size 4 to evaluate it to whnf).

This is not a thunk. These values would be pushed to the stack, but no thunks will be created. Just so that you have no doubts left, here's the STG code for foldr (+) 0: $wgo = \r srt:SRT:[] [w] case w of _ { [] -> 0; : y ys -> case y of _ { I# x -> case $wgo ys of ww { __DEFAULT -> +# [x ww]; }; }; }; As you can see, there are no lets here, only cases. Roman

Eugene Kirpichov

5:40 p.m.

On Fri Feb 20 2015 at 1:23:25 AM Roman Cheplyaka wrote:

...

On 20/02/15 07:26, Eugene Kirpichov wrote:

...
On Tue Feb 17 2015 at 11:50:20 PM Roman Cheplyaka mailto:roma@ro-che.info> wrote:

Note that foldr (+) 0 has nothing to do with laziness — it's eager.

Its

...
problem is that it's not tail-recursive.

The problem is that when you say foldr (+) 0 [1, 2, 3, 4, 5] you build a thunk "1 + (2 + (3 + (4 + (5 + 0))))", which has - let's call it "whnf evaluation depth" of 4 (you need a stack of size 4 to evaluate it to

whnf).

This is not a thunk. These values would be pushed to the stack, but no thunks will be created.

Just so that you have no doubts left, here's the STG code for foldr (+) 0:

$wgo = \r srt:SRT:[] [w] case w of _ { [] -> 0; : y ys -> case y of _ { I# x -> case $wgo ys of ww { __DEFAULT -> +# [x ww]; }; }; };

As you can see, there are no lets here, only cases.

Indeed - I guess the strictness analyzer did its job. Nevertheless, the expression foldr (+) 0 [1..n] requires an evaluation stack depth of n and I would like it to be uncompilable - I would like the type of foldr (+) 0 to be "[Int] -> Int@\infty"

...

Roman

Konstantine Rybnikov

12:34 p.m.

Eugene, I find your idea quite interesting. One thing that I'd say is that since you don't know how many recursive calls will foldr do at runtime, you can't say at compile-time it's thunk depth. Thus, I think, if you would be able to express something that is similar to type-level `Nat`, which could also be `Infinity` (e.g., either it's a known nat or an infinity), you could do something like that. Here's my code (which I suggest as something that's more concrete than notation you ovided): ``` {-# LANGUAGE DataKinds #-} {-# LANGUAGE KindSignatures #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE TypeOperators #-} module Main where import GHC.TypeLits newtype Nf (n::Nat) a = Nf a -- TODO: add deepseq to ensure? toNf :: a -> Nf 0 a toNf = Nf unNf :: Nf n a -> a unNf (Nf x) = x type family Max (n :: Nat) (m :: Nat) :: Nat -- Implement properly later somehow type instance Max 0 0 = 0 type instance Max 0 m = m type instance Max n 0 = n type instance Max 1 1 = 1 -- addition creates a thunk of `(max n m) + 1` depth nfAdd :: (Num a) => Nf n a -> Nf m a -> Nf ((Max n m) + 1) a nfAdd x y = Nf ((unNf x) + (unNf y)) -- TODO: think on this. Need infinity? nfFold :: (Nf n1 a -> Nf n2 b -> Nf n3 b) -> Nf n4 b -> Nf n5 [a] -> Nf n6 b nfFold _ acc (Nf []) = Nf (unNf acc) nfFold f acc xs = undefined main :: IO () main = do let nfZero = (toNf 0) nfTwo = (toNf 2) printNf (nfAdd nfZero nfTwo :: Nf 1 Int) putStrLn "Hi!" where printNf = print . unNf ``` Is this is kind of thing you were talking about? Thanks! On Fri, Feb 20, 2015 at 7:26 AM, Eugene Kirpichov wrote:

...

On Tue Feb 17 2015 at 11:50:20 PM Roman Cheplyaka wrote:

...
Note that foldr (+) 0 has nothing to do with laziness — it's eager. Its problem is that it's not tail-recursive.

The problem is that when you say foldr (+) 0 [1, 2, 3, 4, 5] you build a thunk "1 + (2 + (3 + (4 + (5 + 0))))", which has - let's call it "whnf evaluation depth" of 4 (you need a stack of size 4 to evaluate it to whnf).

I would like a type system that would disallow building thunks with *unbounded* whnf evaluation depth.

I'm thinking we could annotate every type in every type signature with this depth. Let us use a special kind "@" for these annotations, allow polymorphism on them, and use "t@d" to denote "a value of type t requiring whnf evaluation depth d".

(+) :: forall (a:@) (b:@) . Int@a -> Int@b -> Int@(1+max(a,b));

($) :: forall (a:@) (b:@->@) . (forall (a:@) (b:@->@) . x@a -> y@(b a)) -> x@a -> y@(b a)

The type signature for (.) quickly gets unwieldy and I wasn't able to even write down the signature for foldr :) but perhaps you get the idea.

Some informal properties this would have: * whnf depth of calling a function in a tailcall position is max(whnf depth of the function itself, whnf depth of its argument). * whnf depth of "case x of { ... }" is 1. * In the context of "case x of { ...(primitive pattern)... }", whnf depth of x is 0.

Does this make any sense?

...
foldl (+) 0 would be an example of a laziness issue.

If you want to specify which functions should or shouldn't be used in a lazy context, take a look at polarity (see Harper's Practical Foundations for Programming Languages). So you could say, for instance, that it doesn't make sense to use + in a lazy context; that'd eliminate half the cases of laziness-induced memory leaks in practice.

If instead you want to impose some upper bound on the memory consumption without caring about specific cases, then see this ICFP'12 paper: http://www.dcc.fc.up.pt/~pbv/AALazyExtended.pdf

On 18/02/15 07:04, Eugene Kirpichov wrote:

...
Hello haskell-cafe,

Let me repost here a question I posted to cstheory stackexchange - in hopes that there are more type theory experts here.

http://cstheory.stackexchange.com/questions/29518/type- systems-preventing-laziness-related-memory-leaks

Perhaps the main source of performance problems in Haskell is when a program inadvertently builds up a thunk of unbounded depth - this causes both a memory leak and a potential stack overflow when evaluating. The classic example is defining sum = foldr (+) 0 in Haskell.

Are there any type systems which statically enforce lack of such thunks in a program using a lazy language?

Seems like this should be on the same order of difficulty as proving other static program properties using type system extensions, e.g. some flavors of thread safety or memory safety.

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

3802

Age (days ago)

3804

Last active (days ago)

List overview

Download

12 comments

7 participants

participants (7)

Atze van der Ploeg
David Feuer
Eugene Kirpichov
Joachim Breitner
Joe Hillenbrand
Konstantine Rybnikov
Roman Cheplyaka