
I've been thinking and wondering about the possiblility of anti-memoization. Memoization reduces computational time at the expense of additional memory use. I would like to be able to do the opposite. Obviously, antimemoization is something you'd only want to do with "cheap" functions having larger outputs than inputs. In particular, I have (in darcs) a number of "parsing" functions, which use up a lot of memory, and *also* hold onto their argument. The simplest example is linesPS, which breaks a PackedString (from my FastPackedString library) into lines. The contents of the argument are not copied, so the original string is held in memory as long as its parsed version is also held. linesPS :: PackedString -> [PackedString] I'd like to be able to tell the GC that it can "throw away" the parsed lines, and replace the thunk of the [PackedString] with the original function if necesary. I think this can be approximated using weak pointers, but to do it "right" would require support in the compiler. Using weak pointers I imagine something like: antimemoize :: (a -> b) -> a -> AntiMemo a b readAntiMemo :: AntiMemo a b -> b with data AntiMemo a b = AntiMemo (a -> b) a (IORef (Weak b)) where readAntiMemo (AntiMemo f a iowb) = case unsafePerformIO (readIORef iowb >>= deRefWeak) of Nothing -> case f a of newb -> unsafePerformIO $ do wb <- mkWeakPtr newb Nothing writeIORef iowb wb return newb Just b -> b antimemoize f a = unsafePerformIO $ do wb <- mkWeakPtr (f a) Nothing iowb <- newIORef wb return (AntiMemo f a iowb) I haven't actually tested it, but to the extent that I understand weak pointers, it ought to work. Comments or suggestions are welcome. It would be really nice if this could be done at the "thunk" level, so it could be transparent to the caller. I'd really like to be able to create the function antimemoize :: (a -> b) -> (a -> b) which like "memo" would be completely transparent in the sense that it would be a pure function and wouldn't modify the output of the function it modifies, but would just make the output of that function a thunk that when evaluated stores not only the output b, but also the function itself and the input a, so that the b output could be garbage-collected. On the other hand, perhaps there's a reason why all this wouldn't work. I certainly don't understand space usage in haskell as well as I wish. But I also *really* would like to reduce the space usage of darcs. Some of this I can acheive by strategically introducing laziness, but that's a rather fragile way to reduce the memory footprint, since relatively small modifications change what gets held in memory. So it'd be nice to be able to throw away the results of a parse that I can always re-perform. -- David Roundy http://www.abridgegame.org/darcs
participants (1)
-
David Roundy