
Hi all, I was just teaching a class on minimization methods, with a focus on conjugate gradient minimization in particular, and one of my main points was that with conjugate gradient minimization we only need three or four arrays of size N (depending on whether we use the Fletcher-Reeves or Polak-Ribiere variant), that this was a huge win, and I was explaining that in my code we use Fletcher-Reeves even though it's generally got less stable convergence behavior, because avoiding one extra copy is well worth it, when each copy is on the order of a gigabyte (which is not unusual in my line of work). This got me thinking about one of the largest problems with writing serious numerical code in Haskell, which is that of memory consumption and avoiding temporary variables. The count of three arrays requires assumes that we do in-place updates, which presupposes an imperative model. But the main reason I'd like to switch to Haskell is so that I can write pure functional code, which is easier to read, easier to reason with, and easier to get right. So I started wondering whether there's a solution that would allow us to write pretty high-level pure functional code, while the RTS can realize at run-time that we have the only reference to the input argument and that it is therefore safe to consume it destructively. My thought is to move some of the amazing rewriting streams stuff that's going on with Data.ByteStream and Manuel's array fusion work onto the memory management level. If we can rewrite high-level array manipulations into a form in which the compiler can fuse the loops and write good low-level code, why not also allow automatic in-place updates (when possible)? My thought is to create a function like the following: copy :: ForeignPtr a -> IO (ForeignPtr a) This function ordinarily copies its input into a new chunk of memory, and then calls the initialization function to update that memory. So we could call it with something like a `plus` b = do a' <- copy a a' `plusEqual` b return a' Of course, in practice I imagine this whole business being encapsulated in a module where it's called with unsafePerformIO (but it's really perfectly safe, a la Data.ByteString). The key is that copy will ask the GC to check whether there is more than one reference to its input, and if there isn't, then it'll return its input without making a copy. Obviously, we'd want to be careful in making calls to copy (since the GC overhead could overwhelm the potential (transient) memory savings. And I've got no idea how hard it would be to find out if there are no other references, or how effective this would be... that is, whether the compiler might generate "false" references to the input that would prevent the savings from ever being realized. There are also other questions. My copy above is simple, but probably we'd be better off with a more general: copy :: a -> (a -> IO a) -> IO a which could be used in the base libraries for other sorts of objects, such as Array, which might not have a ForeignPtr at their back end. We might also want a more general copy that would handle the case where we would like to copy either one variable or another, and we don't care which. i.e. we in my `plus` example above, if "a" can't be discarded, but "b" can, we'll be out of luck. Perhaps we want something like reuseOrNot :: a -> (a -> IO b) -> (a -> IO b) -> IO b which would call one function or the other, depending on whether the input is deemed expendible by the GC, so we could write a `plus` b = reuseOrNot a (\a' -> do { a' `plusEqual` b; return a' }) (\a' -> reuseOrNot b (\b' -> do { b' `plusEqual` a'; return b' }) (\b' -> a' `boringAdd` b')) where boringAdd is an addition that doesn't reuse either array. Anyhow, all this would be predicated on the possibility of asking the GC whether anyone else has a reference to a given object, and I've no idea how tough (or slow) this would be. It'd also be effectively predicated on someone like dons having the time and interest to write good fusion code which would allow one to make use of this in referentially transparent high-level pure functional code. I think it'd be pretty cool, and would certainly (not immediately, but as my research group ramps up) be interested in being a guinea pig user of this stuff. Any thoughts? Am I crazy? Is this infeasible for some reason? Should I have patented this idea before mentioning it on the list? Has someone else already patented it? -- David Roundy Department of Physics Oregon State University