Re: Looking for the fastest Haskell primes

newer
understanding generics

older
Data.ByteString woes

Niemeijer, R.A.

16 Apr 2009 16 Apr '09

8:41 a.m.

Heinrich Apfelmus wrote:

...

+1 except that exporting the potentially infinite list of primes is problematic in that it may become a memory leak.

I'd suggest to export two versions

primes :: [Integer] primes' :: () -> [Integer]

for casual (i.e. throwaway program to solve a Project Euler problem) and for memory aware use respectively.

I'm afraid I don't quite follow. Would you mind explaining what the first parameter is for and how it would solve the memory leak?

Show replies by date

Eugene Kirpichov

16 Apr 16 Apr

8:57 a.m.

New subject: Looking for the fastest Haskell primes

The parameterless version is a top-level definition and won't get garbage-collected, IIRC. So, if you evaluate primes!!10000000, you'll end up with a 10000000-element list hanging in memory forever. If you evaluate (primes' ()) !! 10000000, you won't. 2009/4/16 Niemeijer, R.A. :

...

Heinrich Apfelmus wrote:

...
+1 except that exporting the potentially infinite list of primes is problematic in that it may become a memory leak.

I'd suggest to export two versions

primes :: [Integer] primes' :: () -> [Integer]

for casual (i.e. throwaway program to solve a Project Euler problem) and for memory aware use respectively.

I'm afraid I don't quite follow. Would you mind explaining what the first parameter is for and how it would solve the memory leak? _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- Eugene Kirpichov Web IR developer, market.yandex.ru

Achim Schneider

9:07 a.m.

New subject: Looking for the fastest Haskell primes

Eugene Kirpichov wrote:

...

The parameterless version is a top-level definition and won't get garbage-collected, IIRC. So, if you evaluate primes!!10000000, you'll end up with a 10000000-element list hanging in memory forever. If you evaluate (primes' ()) !! 10000000, you won't.

{-# CAF foo #-} {-# NOCAF foo #-} -- (c) this sig last receiving data processing entity. Inspect headers for copyright history. All rights reserved. Copying, hiring, renting, performance and/or quoting of this signature prohibited.

Jason Dusek

9:29 a.m.

New subject: Looking for the fastest Haskell primes

2009/04/16 Achim Schneider :

...

{-# CAF foo #-} {-# NOCAF foo #-}

Where do I find docs for these pragmas? -- Jason Dusek

Achim Schneider

10:04 a.m.

New subject: Looking for the fastest Haskell primes

Jason Dusek wrote:

...

2009/04/16 Achim Schneider :

...
{-# CAF foo #-} {-# NOCAF foo #-}

Where do I find docs for these pragmas?

...in your friendly bug tracker, under the label "missing feature" -- (c) this sig last receiving data processing entity. Inspect headers for copyright history. All rights reserved. Copying, hiring, renting, performance and/or quoting of this signature prohibited.

Daniel Peebles

11:11 a.m.

New subject: Looking for the fastest Haskell primes

I vaguely remember someone (maybe Duncan Coutts?) saying that this was a commonly held misconception, and that GHC did indeed GC CAFs when optimization is enabled. If I am remembering incorrectly, does anyone have a reference to a ticket outlining this non-GC'ed CAF behavior? Thanks, Dan On Thu, Apr 16, 2009 at 4:57 AM, Eugene Kirpichov wrote:

...

The parameterless version is a top-level definition and won't get garbage-collected, IIRC. So, if you evaluate primes!!10000000, you'll end up with a 10000000-element list hanging in memory forever. If you evaluate (primes' ()) !! 10000000, you won't.

2009/4/16 Niemeijer, R.A. :

...
Heinrich Apfelmus wrote:

...
+1 except that exporting the potentially infinite list of primes is problematic in that it may become a memory leak.

I'd suggest to export two versions

primes :: [Integer] primes' :: () -> [Integer]

for casual (i.e. throwaway program to solve a Project Euler problem) and for memory aware use respectively.

I'm afraid I don't quite follow. Would you mind explaining what the first parameter is for and how it would solve the memory leak? _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- Eugene Kirpichov Web IR developer, market.yandex.ru _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Duncan Coutts

12:06 p.m.

New subject: Looking for the fastest Haskell primes

On Thu, 2009-04-16 at 07:11 -0400, Daniel Peebles wrote:

...

I vaguely remember someone (maybe Duncan Coutts?) saying that this was a commonly held misconception, and that GHC did indeed GC CAFs when optimization is enabled. If I am remembering incorrectly, does anyone have a reference to a ticket outlining this non-GC'ed CAF behavior?

It was Simon Marlow. I quote: "FUD! CAFs are definitely garbage collected" http://haskell.org/pipermail/haskell-cafe/2009-February/055525.html Duncan

Jules Bean

11:23 a.m.

New subject: Looking for the fastest Haskell primes

Eugene Kirpichov wrote:

...

The parameterless version is a top-level definition and won't get garbage-collected, IIRC.

This has not-much to do with CAFs and is really just about scope + values + liveness. live values (those which a program still refers to, e.g. from a function which might get called in the future) don't get GCed. CAFs are just in the top-most scope and particularly likely to get held live in this fashion. As Lennart points out, optimisations occasionally increase sharing, although GHC tries fairly hard not to do this. Jules

Daniel Fischer

9:05 a.m.

New subject: Looking for the fastest Haskell primes

Am Donnerstag 16 April 2009 10:41:43 schrieb Niemeijer, R.A.:

...

Heinrich Apfelmus wrote:

...
+1 except that exporting the potentially infinite list of primes is problematic in that it may become a memory leak.

I'd suggest to export two versions

primes :: [Integer] primes' :: () -> [Integer]

for casual (i.e. throwaway program to solve a Project Euler problem) and for memory aware use respectively.

I'm afraid I don't quite follow. Would you mind explaining what the first parameter is for and how it would solve the memory leak?

If your programme uses primes some time early and again at the end, the list of primes (calculated so far) is kept in memory, because it's a top level constant (CAF). If you don't need the primes in between, calculate many of them at the start and use only a few at the end, that's a terrible memory-waste. If you don't use primes later, it *may* be that they're garbage-collected, but I wouldn't count on it. primes' is a function, so the list of primes is calculated anew every time you use primes' () in your programme and garbage collected when the programme leaves that use site. Which behaviour is desired depends of course on the programme.

Heinrich Apfelmus

9:21 a.m.

New subject: Looking for the fastest Haskell primes

Niemeijer, R.A. wrote:

...

Heinrich Apfelmus wrote:

...
+1 except that exporting the potentially infinite list of primes is problematic in that it may become a memory leak.

I'd suggest to export two versions

primes :: [Integer] primes' :: () -> [Integer]

for casual (i.e. throwaway program to solve a Project Euler problem) and for memory aware use respectively.

I'm afraid I don't quite follow. Would you mind explaining what the first parameter is for and how it would solve the memory leak?

Sure. Lazy evaluation means that values, once evaluated, are stored in memory until garbage collections reclaims them because it is clear that they won't be used anymore. A memory leak happens when the programmer knows that values won't be used anymore while this knowledge is not available to the garbage collector. More specifically, consider the following example main = do print (primes !! 100) print (primes !! 20) This program will calculate the first 100 primes, and then print the 100th and the 1st prime. Now, after having printed the 100th prime, the 21th to 100th prime won't be used anymore and could thus be safely reclaimed by garbage collection. But since they are part of the primes list and this list is still in use for printing the 20th prime, this won't happen; we have a memory leak. The memory leak above can be avoided at the expense of recalculating the list. Using a function primes' with a dummy argument ensures^1 that the list of primes will be recalculated and garbage collected each time it is used: main = do print (primes' () !! 100) print (primes' () !! 20) Here, the first 100 primes are calculated, garbage collected and then the first 20 primes are (re-)calculated and garbage collected. We can fine control memory usage with the dummy argument version by using let statements or where clauses main = do let ps = primes' () in do print (ps !! 101) print (ps !! 100) -- no recalculation of ps print (primes' () !! 20) -- recalculates first 20 primes ^1: Compiler optimizations may interfere with that behavior. Generally, consensus is that the compiler should preserve sharing meticulously, but this is not guaranteed. See also: "full laziness transform". The above is folklore and should be written down properly at someplace prominent. The wiki page http://www.haskell.org/haskellwiki/Memory_leak is a start, but I think the explanation there is currently a bit murky. Feel free to incorporate my writings above. Regards, apfelmus -- http://apfelmus.nfshost.com

Lennart Augustsson

9:28 a.m.

New subject: Looking for the fastest Haskell primes

But things are a little more complicated than that. If the () argument to primes' is not used in calculating the primes then ghc might decide to lift the computed list of primes into a top level CAF, and so you will have a memory leak anyway. If ghc does so or not depends, e.g., on the optimization level. -- Lennart On Thu, Apr 16, 2009 at 11:21 AM, Heinrich Apfelmus wrote:

...

Niemeijer, R.A. wrote:

...
Heinrich Apfelmus wrote:

...
+1 except that exporting the potentially infinite list of primes is problematic in that it may become a memory leak.

I'd suggest to export two versions

primes :: [Integer] primes' :: () -> [Integer]

for casual (i.e. throwaway program to solve a Project Euler problem) and for memory aware use respectively.

I'm afraid I don't quite follow. Would you mind explaining what the first parameter is for and how it would solve the memory leak?

Sure.

Lazy evaluation means that values, once evaluated, are stored in memory until garbage collections reclaims them because it is clear that they won't be used anymore. A memory leak happens when the programmer knows that values won't be used anymore while this knowledge is not available to the garbage collector.

More specifically, consider the following example

main = do print (primes !! 100) print (primes !! 20)

This program will calculate the first 100 primes, and then print the 100th and the 1st prime. Now, after having printed the 100th prime, the 21th to 100th prime won't be used anymore and could thus be safely reclaimed by garbage collection. But since they are part of the primes list and this list is still in use for printing the 20th prime, this won't happen; we have a memory leak.

The memory leak above can be avoided at the expense of recalculating the list. Using a function primes' with a dummy argument ensures^1 that the list of primes will be recalculated and garbage collected each time it is used:

main = do print (primes' () !! 100) print (primes' () !! 20)

Here, the first 100 primes are calculated, garbage collected and then the first 20 primes are (re-)calculated and garbage collected.

We can fine control memory usage with the dummy argument version by using let statements or where clauses

main = do let ps = primes' () in do print (ps !! 101) print (ps !! 100) -- no recalculation of ps print (primes' () !! 20) -- recalculates first 20 primes

^1: Compiler optimizations may interfere with that behavior. Generally, consensus is that the compiler should preserve sharing meticulously, but this is not guaranteed. See also: "full laziness transform".

The above is folklore and should be written down properly at someplace prominent. The wiki page

http://www.haskell.org/haskellwiki/Memory_leak

is a start, but I think the explanation there is currently a bit murky. Feel free to incorporate my writings above.

Regards, apfelmus

-- http://apfelmus.nfshost.com

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

5919

Age (days ago)

5919

Last active (days ago)

List overview

Download

10 comments

10 participants

participants (10)

Achim Schneider
Daniel Fischer
Daniel Peebles
Duncan Coutts
Eugene Kirpichov
Heinrich Apfelmus
Jason Dusek
Jules Bean
Lennart Augustsson
Niemeijer, R.A.