
On 1/29/12 5:48 AM, Roman Cheplyaka wrote:
* wren ng thornton
[2012-01-28 23:06:08-0500] * Math.Combinatorics.Primes: provides the prime numbers via Runciman's lazy wheel sieve algorithm. Provided here since efficient algorithms for combinatorial functions often require prime numbers. The current version memoizes the primes as an infinite list CAF, which could lead to memory leaks in long-running programs with irregular access to large primes. I'm looking into a GHC patch to allow resetting individual CAFs from within compiled programs so that you can explicitly decide when to un-memoize the primes. (In GHCi when you reload a module all the CAFs are reset. However, there's no way to access this feature from within compiled programs as yet.)
Why not to make it more pure? That is, return a lazy list of Ints (but not a CAF), which user can throw away by the usual GC means?
The functions from the other modules that use primes would have to be put in a Reader monad. That would make it a little bit more awkward to use, but personally I'd prefer that over memory leaks.
I'd also prefer a more pure solution, but I don't think that the Reader monad is the right tool here. I played around with that approach, but it requires extremely invasive changes to client code, obfuscating what should be simple mathematical formulae. And, it's too fragile, exposing the implementation in a way that breaks client code should I change a non-prime-using algorithm to a prime-using one, or vice versa. The fragility could be partially avoided by providing both prime-using and non-prime-using algorithms, but then that forces users to decide which one they want--- and if their only concern is performance, then they'd have to go through the code-breaking refactoring anyways, just to determine which is faster for their application. One alternative I'm exploring is, rather than (only) making the primes not a CAF, instead make the prime-using functions non-CAFs as well. That is, provide a makePrimeUsingFunctions function which returns a record/tuple of all the functions, sharing a stream of primes. This way, after allocating the functions, they can be used purely just as in the current model; and when the client wants the primes to be GCed, they can drop their references to the allocated functions which use those primes (allocating new functions later, if necessary). I've used that pattern before for similar resource issues, and it worked nicely. But how well it works depends a lot on the structure of the program needing those resources. In particular it can lead to needing to use the Reader monad (or similar) in order to pass around the functions, even though the functions themselves can be used purely. Unfortunately I don't have any large combinatorial programs on hand to assess whether this would be problematic in practice or not. -- Live well, ~wren