hs-plugins and memory leaks

I was happy to see the recent announcement about hs-plugins being updated to work with newer ghc. I have a project and had always been planning to use it.
However, there are some questions I've had about it for a long time. The 'yi' paper mentions both 'yi' and 'lambdabot' as users of hs-plugins. However, both those projects have long since abandoned it. I can't find any documentation on why, or even any documentation at all for Yi wrt its dynamic code execution system, but from looking at the source it looks like it uses hint for dynamic code execution and dyre for configuration. Dyre in turn uses serialization to pass the old state to the reconfigured app. So we have retreated from the idea of hotswapping the application state.
It seems to me that the advantages as put forth in the 'yi' paper still hold. Changing the configuration in yi is rather heavyweight. Relinking the entire editor takes a long time, and yi is still a relatively small program. Editors can keep most of their state on disk and can have very simple GUI state, so perhaps the serialization and deserialization isn't such a problem, but this doesn't hold for other programs. It seems to me the loss is significant: there's a big difference between being able to experiment with a command by editing and rerunning it immediately, and having to wait 10s or more for the app to recompile, relink, shut down the ui, serialize all state, and restart. And if you add hint, you are linking in large parts of ghc, with an even slower link time.
So, yi is no longer a dynamically reconfigurable application, and is now merely a configurable application. The apparent loss of such a useful feature (you might even say a defining feature) would presumably only happen if keeping it was untenable. And of course that makes me reluctant to make any kind of design that relies on it without first knowing why all existing users jumped ship.
I can think of one possible reason, and that's a memory leak. In ghc/rts/Linker.c:unloadObj there's a commented out line '// stgFree(oc->image);'. In a test program I wrote that behaves like 'plugs', every executed line increases the size of the program by 12-16k. I have to remove the resolveObjs call from plugs for it to work, but once I do it displays the same leak.
So my questions are:
Why did lambdabot and yi abandon plugins?
Is unloadObj a guaranteed memory leak? As far as I can tell, it's never called within ghc itself. If the choices are between a memory leak no matter how you use it and dangerous but correct if you use it right, shouldn't we at least have the latter available as an option? E.g. a reallyUnloadObj function that also frees the image.
If I uncomment that line will it fix the problem? Is it safe to do so if I first force all thunks that might contain unloaded code?
Long shot, but are there any more principled ways to guarantee no pointers to a chunk of code exist? The only thing I can think of is to have the state be totally strict and consist only of types from the static core. Would it be possible to hand responsibility for the memory off to the garbage collector?
GHC now supports dynamic libraries. Given that plugins may need to link large portions of the static core "library", can it be loaded as a dynamic library so both the core and the plugins can share the same code? I haven't been able to find many references to ghc's support for dynamic linking.

qdunkan:
> However, there are some questions I've had about it for a long time. The 'yi' paper mentions both 'yi' and 'lambdabot' as users of hs-plugins. However, both those projects have long since abandoned it. I can't find any documentation on why, or even any documentation at all for Yi wrt its dynamic code execution system, but from looking at the source it looks like it uses hint for dynamic code execution and dyre for configuration. Dyre in turn uses serialization to pass the old state to the reconfigured app. So we have retreated from the idea of hotswapping the application state.
Once active development of hs-plugins stopped, along with the portability issues, it behooved projects such as xmonad or yi to aim for reconfiguration strategies simpler than native code hot loading.
> I can think of one possible reason, and that's a memory leak. In ghc/rts/Linker.c:unloadObj there's a commented out line '// stgFree(oc->image);'. In a test program I wrote that behaves like 'plugs', every executed line increases the size of the program by 12-16k. I have to remove the resolveObjs call from plugs for it to work, but once I do it displays the same leak.
> So my questions are:
> Why did lambdabot and yi abandon plugins?
Because it was unmaintained for around 5 years, and was fundamentally less portable than simpler state serialization solutions that offered some of the same benefits as full code hot swapping.
> Is unloadObj a guaranteed memory leak? As far as I can tell, it's never called within ghc itself. If the choices are between a memory leak no matter how you use it and dangerous but correct if you use it right, shouldn't we at least have the latter available as an option? E.g. a reallyUnloadObj function that also frees the image.
GHC never unloads object code, so yes, it will "leak" old code.
> Long shot, but are there any more principled ways to guarantee no pointers to a chunk of code exist? The only thing I can think of is to have the state be totally strict and consist only of types from the static core. Would it be possible to hand responsibility for the memory off to the garbage collector?
It's really hard.
-- Don

>> So my questions are:
>> Why did lambdabot and yi abandon plugins?
> Because it was unmaintained for around 5 years, and was fundamentally less portable than simpler state serialization solutions that offered some of the same benefits as full code hot swapping.
Fair enough. The idea of being able to make changes and see them quickly enough for it to have an interactive feel is very appealing, but maybe there are other ways to get there, such as improving link time with dynamic linking (my current link time is around 24 seconds). State serialization + restart is definitely simpler and more robust. But if it's impossible to get it fast enough otherwise, and there aren't any other show-stopping problems (I think even a known memory leak may be dwarfed by the amount of data the app keeps in memory anyway), then it might be worth it to me to maintain hs-plugins.
>> Is unloadObj a guaranteed memory leak? As far as I can tell, it's never called within ghc itself. If the choices are between a memory leak no matter how you use it and dangerous but correct if you use it right, shouldn't we at least have the latter available as an option? E.g. a reallyUnloadObj function that also frees the image.
> GHC never unloads object code, so yes, it will "leak" old code.
So would freeing oc->image fix the leak? In my case, it's not too hard to force all data structures that might reference it.
>> Long shot, but are there any more principled ways to guarantee no pointers to a chunk of code exist? The only thing I can think of is to have the state be totally strict and consist only of types from the static core. Would it be possible to hand responsibility for the memory off to the garbage collector?
> It's really hard.
It happens in Python for Python bytecode, since it exists as a plain data structure in the language. E.g. 'code = compile('xyz')'. Couldn't a Haskell solution be along the same lines? 'code <- load "X.o"; makeFunction code', and then makeFunction holds a ForeignPtr to the actual code and there's some kind of primitive to call a chunk of code as a function.
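The ForeignPtr idea can be pictured with the sketch below. It is an illustration only, under loud assumptions: `CodeImage`, `loadImage`, `firstByte`, `releaseImage`, and `demoImage` are hypothetical names, the "image" is an ordinary malloc'd buffer rather than real object code, and nothing here touches the GHC linker. It shows only that a finalizer attached with `Foreign.Concurrent.newForeignPtr` can release the memory once the handle is finalized. It does not solve the hard part: closures and thunks on the heap that point into the code without going through the handle.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- | Hypothetical handle to a loaded code image. Here the "image" is just a
-- malloc'd buffer standing in for the memory a real loader would own.
newtype CodeImage = CodeImage (ForeignPtr Word8)

-- | Stand-in for "load X.o": allocate a buffer and attach a finalizer that
-- frees it and sets a flag, so the release can be observed.
loadImage :: IORef Bool -> Int -> IO CodeImage
loadImage released size = do
  buf <- mallocBytes size :: IO (Ptr Word8)
  pokeByteOff buf 0 (42 :: Word8)
  fp <- newForeignPtr buf (free buf >> writeIORef released True)
  return (CodeImage fp)

-- | Stand-in for "call the code": read the first byte while the handle
-- keeps the buffer alive.
firstByte :: CodeImage -> IO Word8
firstByte (CodeImage fp) = withForeignPtr fp (\p -> peekByteOff p 0)

-- | Run the finalizer now instead of waiting for the garbage collector.
releaseImage :: CodeImage -> IO ()
releaseImage (CodeImage fp) = finalizeForeignPtr fp

-- | Returns the byte read, and the released flag before and after release.
demoImage :: IO (Word8, Bool, Bool)
demoImage = do
  released <- newIORef False
  img <- loadImage released 16
  b <- firstByte img
  before <- readIORef released
  releaseImage img
  after <- readIORef released
  return (b, before, after)
```

If the handle were simply dropped instead of released explicitly, the same finalizer would run at some later garbage collection, which is the "hand responsibility to the GC" behaviour asked about.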

Hi Evan,
Evan Laforge wrote:
>>> So my questions are:
>>> Why did lambdabot and yi abandon plugins?
>> Because it was unmaintained for around 5 years, and was fundamentally less portable than simpler state serialization solutions that offered some of the same benefits as full code hot swapping.
> Fair enough. The idea of being able to make changes and see them quickly enough for it to have an interactive feel is very appealing, but maybe there are other ways to get there, such as improving link time with dynamic linking (my current link time is around 24 seconds). State serialization + restart is definitely simpler and more robust. But if it's impossible to get it fast enough otherwise, and there aren't any other show stopping problems (I think even a known memory leak may be dwarfed by the amount of data the app keeps in memory anyway), then it might be worth it to me to maintain hs-plugins.
I have a project designed to use dynamic linking; I even built 'pdynload' (http://hackage.haskell.org/package/pdynload-0.0.3) based on Don's PhD thesis.
In the end I temporarily removed the pdynload code from my project, for the reasons below:
1) Holding running state is difficult, like network state in a browser or running state in a terminal emulator.
2) Linking time is too long. My Haskell OS project (http://www.flickr.com/photos/48809572@N02/) has many sub-modules, every sub-module is very big, and linking takes too long.
3) The memory leak, as you said.
>>> Is unloadObj a guaranteed memory leak? As far as I can tell, it's never called within ghc itself. If the choices are between a memory leak no matter how you use it and dangerous but correct if you use it right, shouldn't we at least have the latter available as an option? E.g. a reallyUnloadObj function that also frees the image.
>> GHC never unloads object code, so yes, it will "leak" old code.
> So would freeing oc->image fix the leak? In my case, it's not too hard to force all data structures that might reference it.
It's not safe for the GHC runtime system, since you don't know when it is safe to unload old code.
Don's idea is to keep the old state in memory even after you load the new state, so that hot-swapping stays safe.
>>> Long shot, but are there any more principled ways to guarantee no pointers to a chunk of code exist? The only thing I can think of is to have the state be totally strict and consist only of types from the static core. Would it be possible to hand responsibility for the memory off to the garbage collector?
>> It's really hard.
> It happens in python for python bytecode, since it exists as a plain data structure in the language. E.g. 'code = compile('xyz')'. Couldn't a haskell solution be along the same lines? 'code <- load "X.o"; makeFunction code', and then makeFunction holds a ForeignPtr to the actual code and there's some kind of primitive to call a chunk of code as a function.
Anyway, I have been rethinking hot-swapping in Haskell for some time. My idea is:
multi-process framework + hot-swapping core entry + mixing old/new sub-modules at runtime
The core and the sub-modules all run in separate processes.
In my project (http://www.flickr.com/photos/48809572@N02/), the editor and the browser (and many other sub-modules ...) are sub-modules. The core doesn't do anything itself; it just controls how sub-modules are loaded.
The core has 'entry code', like 'pageBufferNewFun' in https://patch-tag.com/r/AndyStewart/manatee/snapshot/current/content/pretty/...
'sourceBufferNew' and 'browserBufferNew' are 'entry functions' that load a sub-module in a *new* process.
The core process is always running, so we only need to hot-swap the 'entry code' after we update a sub-module library with cabal. Then we can use the new 'entry code' to load the sub-module in a new process, while the old sub-module code keeps running in the old process.
Discussion is welcome. :)
Cheers,
-- Andy

> In the end I temporarily removed the pdynload code from my project, for the reasons below:
> 1) Holding running state is difficult, like network state in a browser or running state in a terminal emulator.
This doesn't seem too hard to me. Provided you are not swapping the module that defines the state in the first place, simply reload the module, and replace the old symbol in the state with the reloaded one.
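A minimal sketch of that swap, assuming the state type lives in the static core and only a function slot is replaced. All names (`AppState`, `swapCommand`, `runCommand`, `demoSwap`) are hypothetical, and ordinary functions stand in for what a plugin loader would return:

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | Hypothetical application state: plain data defined in the static core,
-- plus one slot holding a command that a plugin may replace.
data AppState = AppState
  { counter :: !Int          -- data that must survive a reload
  , command :: Int -> Int    -- symbol supplied by the plugin
  }

-- | Install a freshly loaded command and leave the rest of the state alone.
swapCommand :: IORef AppState -> (Int -> Int) -> IO ()
swapCommand ref new = modifyIORef' ref (\s -> s { command = new })

-- | Apply whichever command is currently installed to the counter.
runCommand :: IORef AppState -> IO Int
runCommand ref = do
  s <- readIORef ref
  let c = command s (counter s)
  modifyIORef' ref (\st -> st { counter = c })
  return c

-- | Run the "old" command, swap in a "reloaded" one, and run again.
demoSwap :: IO (Int, Int)
demoSwap = do
  ref <- newIORef (AppState 1 (+ 1))   -- stand-in for the old plugin code
  a <- runCommand ref
  swapCommand ref (* 10)               -- stand-in for the reloaded code
  b <- runCommand ref
  return (a, b)
```

The counter survives the swap because its type is not defined by the reloaded module, which is the proviso stated above.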
> 2) Linking time is too long. My Haskell OS project (http://www.flickr.com/photos/48809572@N02/) has many sub-modules, every sub-module is very big, and linking takes too long.
This is discouraging, since one of the main reasons to use dynamically loaded code instead of recompiling the whole app is to avoid long link times. Presumably you would compile the majority of the app (the API that the plugins use, and the internal code also uses) as a dynamic library:
main.o -> tiny stub that just calls app.so
app.so -> large library containing all app logic
plugin.so -> links against app.so when loaded
So the plugin needs to read a lot of hi files when recompiling, but the dynamic link time should be proportional to the number of unresolved symbols in plugin.so that point into app.so, not proportional to the overall size of the app, right?
>> So would freeing oc->image fix the leak? In my case, it's not too hard to force all data structures that might reference it.
> It's not safe for the GHC runtime system, since you don't know when it is safe to unload old code.
But that's just my question, I *do* (think I) know when it's safe, which is after the data that has passed through plugged-in code has been fully forced. Can't I just call unloadObj then? E.g., loading and unloading plugins for audio processing is totally standard. Since the data is strict arrays of primitive types, there's no risk of stray pointers to unloaded code.
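The ordering argued for here (run the plugin, fully force its output, and only then unload) can be sketched as follows. This is a sketch under assumptions, not a statement about what the GHC runtime guarantees: `Plugin`, `runAndUnload`, and `demoUnload` are hypothetical names, the unload step is an arbitrary `IO ()` supplied by the caller, and a list of `Double` stands in for the strict arrays of primitives mentioned above.

```haskell
import Control.Exception (evaluate)
import Data.IORef (newIORef, readIORef, writeIORef)

-- | Hypothetical shape of a plugin: a pure transformation over samples.
type Plugin = [Double] -> [Double]

-- | Apply the plugin, force the entire result, and only then run the
-- caller-supplied unload action.
runAndUnload :: IO () -> Plugin -> [Double] -> IO [Double]
runAndUnload unload plugin input = do
  let output = plugin input
  mapM_ evaluate output   -- walks the whole spine and forces every element
  unload                  -- no unevaluated plugin thunks remain in output
  return output

-- | Demo with a stand-in plugin and an unload action that sets a flag.
demoUnload :: IO ([Double], Bool)
demoUnload = do
  unloaded <- newIORef False
  out <- runAndUnload (writeIORef unloaded True) (map (* 2)) [1, 2, 3]
  flag <- readIORef unloaded
  return (out, flag)
```

For result types with more structure than a flat list of numbers, a deep-forcing function such as `force` from `Control.DeepSeq` would play the role of `mapM_ evaluate`.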
> Anyway, I have been rethinking hot-swapping in Haskell for some time. My idea is:
> multi-process framework + hot-swapping core entry + mixing old/new sub-modules at runtime
> The core and the sub-modules all run in separate processes.
How would you pass state between processes?

Evan Laforge wrote:
>> In the end I temporarily removed the pdynload code from my project, for the reasons below:
>> 1) Holding running state is difficult, like network state in a browser or running state in a terminal emulator.
> This doesn't seem too hard to me. Provided you are not swapping the module that defines the state in the first place, simply reload the module, and replace the old symbol in the state with the reloaded one.
>> 2) Linking time is too long. My Haskell OS project (http://www.flickr.com/photos/48809572@N02/) has many sub-modules, every sub-module is very big, and linking takes too long.
> This is discouraging, since one of the main reasons to use dynamically loaded code instead of recompiling the whole app is to avoid long link times. Presumably you would compile the majority of the app (the API that the plugins use, and the internal code also uses) as a dynamic library:
> main.o -> tiny stub that just calls app.so
> app.so -> large library containing all app logic
> plugin.so -> links against app.so when loaded
> So the plugin needs to read a lot of hi files when recompiling, but the dynamic link time should be proportional to the number of unresolved symbols in plugin.so that point into app.so, not proportional to the overall size of the app, right?
Yes, it is not proportional to the size of the application, but link time depends on the dependent packages that haven't been linked yet.
For example, the 'pdynload' package uses the GHC API: it searches for the symbol's definition in the GHC database to find which packageId needs re-linking, then links it with the code below:
Linker.linkPackages flags [packageId]
The function 'linkPackages' links the specified package and its dependent packages; the bigger the dependent packages are, the longer the link time. So a long link time is unavoidable for a *big* package.
>>> So would freeing oc->image fix the leak? In my case, it's not too hard to force all data structures that might reference it.
>> It's not safe for the GHC runtime system, since you don't know when it is safe to unload old code.
> But that's just my question, I *do* (think I) know when it's safe, which is after the data that has passed through plugged-in code has been fully forced. Can't I just call unloadObj then?
Yes, unloadObj can work if you design carefully, but it's also easy to crash your program if you miss something.
> E.g., loading and unloading plugins for audio processing is totally standard. Since the data is strict arrays of primitive types, there's no risk of stray pointers to unloaded code.
>> Anyway, I have been rethinking hot-swapping in Haskell for some time. My idea is:
>> multi-process framework + hot-swapping core entry + mixing old/new sub-modules at runtime
>> The core and the sub-modules all run in separate processes.
> How would you pass state between processes?
In fact, I won't pass any state between processes. My framework looks like this: http://www.flickr.com/photos/48809572@N02/5031811365/lightbox/
Every sub-module runs in a render process, and to the daemon process a render process is just a *Tab*.
When you need to update the current sub-module, just recompile the new code into the Cabal/GHC database, then start a *new* process to load the new code; we can use the dyre technique to restore state in the new process.
It's not as powerful as hs-plugins, but it is perfectly safe and has no *memory leak*.
-- Andy
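The "restore state in a new process" step can be illustrated with a plain serialize-and-parse round trip. This sketch does not use dyre's API, and it assumes the state is simple enough for derived `Show` and `Read` instances. `EditorState`, `saveState`, and `restoreState` are hypothetical names; in practice the string would travel through a file, pipe, or command-line argument.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- | Hypothetical state that should survive a restart.
data EditorState = EditorState
  { openFile   :: FilePath
  , cursorLine :: Int
  } deriving (Eq, Read, Show)

-- | Serialize the state before the old process exits.
saveState :: EditorState -> String
saveState = show

-- | Parse the saved state in the new process, falling back to a default
-- when the text does not parse (for example after the type has changed).
restoreState :: EditorState -> String -> EditorState
restoreState fallback text = fromMaybe fallback (readMaybe text)
```

Only data crosses the process boundary, so no pointer into old code can survive the restart. That is where the safety comes from, at the cost of re-establishing anything that cannot be serialized.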
Participants (3): Andy Stewart, Don Stewart, Evan Laforge