
So, I have an optimization/internals question. Does the GHC API have any hooks for being able to revert a CAF to the original expression, thus discarding the previously computed result? The reason I'm wanting this is that I have a particular CAF which is an infinite list. Unfolding that list takes a fair deal of work, so we want to share it whenever possible; however it doesn't take an overwhelming amount of work, so if we know we've evaluated more of the list than necessary (for a long while), it'd be nice to be able to revert the evaluation in order to save on memory overhead (e.g., by calling relax :: IO()). I could hack something together based on unsafePerformIO and top-level IORefs, and it's clear that this is in fact a safe thing to do, but I'm worried about the semantic issues inherent in unsafePerformIOed top-level IORefs (e.g., the fact that their scope isn't particularly well defined: is it per library instance? per runtime?...). Unfortunately, for what I'm doing, it isn't really feasible to just leave the IO type in there nor to pass around the infinite list so we can use scoping rules to decide when to free it. (Feel free to offer alternative suggestions to handling this situation too.) -- Live well, ~wren

GHCi does this somehow, so it's definitely possible; Simon M will know. | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of wren ng thornton | Sent: 06 December 2011 17:49 | To: GHC-users List | Subject: Revert a CAF? | | So, I have an optimization/internals question. Does the GHC API have any | hooks for being able to revert a CAF to the original expression, thus | discarding the previously computed result? | | The reason I'm wanting this is that I have a particular CAF which is an | infinite list. Unfolding that list takes a fair deal of work, so we want | to share it whenever possible; however it doesn't take an overwhelming | amount of work, so if we know we've evaluated more of the list than | necessary (for a long while), it'd be nice to be able to revert the | evaluation in order to save on memory overhead (e.g., by calling relax | :: IO()). | | I could hack something together based on unsafePerformIO and top-level | IORefs, and it's clear that this is in fact a safe thing to do, but I'm | worried about the semantic issues inherent in unsafePerformIOed | top-level IORefs (e.g., the fact that their scope isn't particularly | well defined: is it per library instance? per runtime?...). | Unfortunately, for what I'm doing, it isn't really feasible to just | leave the IO type in there nor to pass around the infinite list so we | can use scoping rules to decide when to free it. | | (Feel free to offer alternative suggestions to handling this situation too.) | | -- | Live well, | ~wren | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Can you use a weak pointer to do what you want? If you keep a weak pointer to the head of your expensive list then itwill be reclaimed at the next major GC I believe. I have used weakpointers for vaugely similar purposes before. I guess a downside is that they will always be reclaimed on GC even ifthere isn't memory pressure, I think. John (resent, accidentally sent to wrong address first)

On 06/12/2011 17:48, wren ng thornton wrote:
So, I have an optimization/internals question. Does the GHC API have any hooks for being able to revert a CAF to the original expression, thus discarding the previously computed result?
The reason I'm wanting this is that I have a particular CAF which is an infinite list. Unfolding that list takes a fair deal of work, so we want to share it whenever possible; however it doesn't take an overwhelming amount of work, so if we know we've evaluated more of the list than necessary (for a long while), it'd be nice to be able to revert the evaluation in order to save on memory overhead (e.g., by calling relax :: IO()).
I could hack something together based on unsafePerformIO and top-level IORefs, and it's clear that this is in fact a safe thing to do, but I'm worried about the semantic issues inherent in unsafePerformIOed top-level IORefs (e.g., the fact that their scope isn't particularly well defined: is it per library instance? per runtime?...). Unfortunately, for what I'm doing, it isn't really feasible to just leave the IO type in there nor to pass around the infinite list so we can use scoping rules to decide when to free it.
(Feel free to offer alternative suggestions to handling this situation too.)
It would be possible, but it's not quite as straightforward as you might think. Suppose you have a program like this: xs = [1..100000] evens = filter ((==0) . (`mod` 2)) xs and you fully evaluate "evens". Now, GHC will garbage collect "xs", because it isn't required any more. However, if you revert "evens" to a CAF, now we require "xs" again, so we have to either revert that to a CAF or arrange to retain it in the first place on the grounds that we might need it again if some other CAF is reverted. Reverting xs to a CAF is not hard - we could have the GC revert CAFs as soon as they become unreachable. Arranging to retain it is harder. GHCi gets around this by reverting *all* CAFs at the same time when you say :load. There's one other thing: GHC doesn't support reverting CAFs in interpreted code at the moment, you have to reload the module. So you need the following things: - modify the GC to revert CAFs when they become garbage - add a primop to revert a single CAF not too hard, I would think... Cheers, Simon

On 12/7/11 5:03 AM, Simon Marlow wrote:
It would be possible, but it's not quite as straightforward as you might think. Suppose you have a program like this:
xs = [1..100000] evens = filter ((==0) . (`mod` 2)) xs
and you fully evaluate "evens". Now, GHC will garbage collect "xs", because it isn't required any more. However, if you revert "evens" to a CAF, now we require "xs" again, so we have to either revert that to a CAF or arrange to retain it in the first place on the grounds that we might need it again if some other CAF is reverted.
Reverting xs to a CAF is not hard - we could have the GC revert CAFs as soon as they become unreachable. Arranging to retain it is harder.
Right. I know it isn't easy to do in the general case, which is why I was looking for an API hook rather than a Haskell function. Luckily my case is like xs. The only free variables are common functions ---functions used by list comprehension notation, methods of Num Int, Enum Int, Ord Int, (||), and (&&)--- which are almost surely inlined away and would never be reverted anyways.
GHCi gets around this by reverting *all* CAFs at the same time when you say :load.
There's one other thing: GHC doesn't support reverting CAFs in interpreted code at the moment, you have to reload the module.
So you need the following things:
- modify the GC to revert CAFs when they become garbage
- add a primop to revert a single CAF
not too hard, I would think...
Good to know. I'll take a look at it over the break and see if I can come up with something. -- Live well, ~wren

On 06/12/11 18:48, wren ng thornton wrote:
So, I have an optimization/internals question. Does the GHC API have any hooks for being able to revert a CAF to the original expression, thus discarding the previously computed result?
...
I could hack something together based on unsafePerformIO and top-level IORefs, and it's clear that this is in fact a safe thing to do, but I'm worried about the semantic issues inherent in unsafePerformIOed top-level IORefs (e.g., the fact that their scope isn't particularly well defined: is it per library instance? per runtime?...). Unfortunately, for what I'm doing, it isn't really feasible to just leave the IO type in there nor to pass around the infinite list so we can use scoping rules to decide when to free it.
How bad is the IORef solution really? I.e. can someone more well versed in ghc internals tell me why this wouldn't work? type CAF a = IORef (() -> a, a) mkCAF :: (() -> a) -> a mkCAF f = unsafePerformIO $ newIORef (f, f ()) getCAF :: CAF a -> a getCAF = snd . unsafeDupablePerformIO . readIORef resetCAF :: CAF a -> IO () resetCAF = modifyIORef $ \(f,_) -> (f, f ()) myCAF :: CAF [Int] myCAF = mkCAF $ \_ -> [1..1000000] {-# NOINLINE myCAF #-} Twan

On 07/12/11 15:16, Twan van Laarhoven wrote:
On 06/12/11 18:48, wren ng thornton wrote:
So, I have an optimization/internals question. Does the GHC API have any hooks for being able to revert a CAF to the original expression, thus discarding the previously computed result?
...
I could hack something together based on unsafePerformIO and top-level IORefs, and it's clear that this is in fact a safe thing to do, but I'm worried about the semantic issues inherent in unsafePerformIOed top-level IORefs (e.g., the fact that their scope isn't particularly well defined: is it per library instance? per runtime?...). Unfortunately, for what I'm doing, it isn't really feasible to just leave the IO type in there nor to pass around the infinite list so we can use scoping rules to decide when to free it.
How bad is the IORef solution really? I.e. can someone more well versed in ghc internals tell me why this wouldn't work?
type CAF a = IORef (() -> a, a) mkCAF :: (() -> a) -> a mkCAF f = unsafePerformIO $ newIORef (f, f ()) getCAF :: CAF a -> a getCAF = snd . unsafeDupablePerformIO . readIORef resetCAF :: CAF a -> IO () resetCAF = modifyIORef $ \(f,_) -> (f, f ())
myCAF :: CAF [Int] myCAF = mkCAF $ \_ -> [1..1000000] {-# NOINLINE myCAF #-}
What will happen here is that GHC will transform it to: x = [1..1000000] myCAF = mkCAF $ \_ -> x and you're no better off. This will always happen with a function of type () -> a, because by definition the return value does not depend on the argument, so the contents will always be lifted out. You could use -fno-full-laziness I suppose... Cheers, Simon
participants (5)
-
John Meacham
-
Simon Marlow
-
Simon Peyton-Jones
-
Twan van Laarhoven
-
wren ng thornton