
On Monday 28 January 2008, Rene de Visser wrote:
It would be nice if HXT was incremental even when you are processing the whole tree.
If I remember correctly, the data type of the tree in HXT is something like
data Tree = Tree NodeData [Tree]
which means that already processed parts of the tree can't be garbage collected because the parent node is holding onto them.
If instead it was
data Tree = Tree NodeData (IORef [Tree])
We could remove each subtree as it was processed (well, just before would probably be necessary, and we would need to rely on blackholing to remove the reference on the stack). This would perhaps allow already-processed subtrees to be garbage collected. Together with lazy evaluation, this could lead to quite good memory usage.
Rene.
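
For concreteness, the proposal above might look roughly like the sketch below. It is not HXT code: NodeData is a stand-in String, and mkTree and consume are hypothetical helpers invented for illustration. Whether the detached subtrees are actually collected still depends on nothing else holding a reference to them.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Stand-in for whatever the real library stores per node (assumption).
type NodeData = String

-- Children sit behind a mutable reference so a parent can let go of them.
data Tree = Tree NodeData (IORef [Tree])

-- Hypothetical helper: build a node from a label and already-built children.
mkTree :: NodeData -> [Tree] -> IO Tree
mkTree label kids = Tree label <$> newIORef kids

-- Hypothetical helper: visit nodes in document order. The child list is
-- detached from the parent just before the children are visited, so the
-- parent no longer references them afterwards.
consume :: (NodeData -> IO ()) -> Tree -> IO ()
consume visit (Tree label ref) = do
  visit label
  kids <- readIORef ref
  writeIORef ref []
  mapM_ (consume visit) kids
```

After consume returns, every node it visited has an empty child list, so code that later wants to back-track into a subtree finds nothing there.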
Not so sure about this. For streaming processing, it would be nicer to have something like StAX, with a stack of already-entered elements kept as book-keeping (the tags and attribute sets back to the root). Let's face it: if you sign up to a document model, you are signing up to a document, and you shouldn't be surprised when it sits in memory. I think the 'right' solution depends, at least in part, on the problem to be solved. I'd be upset if we moved to something more complex where my code breaks because something accidentally garbage collected data that I need to back-track to.
Matthew
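
A minimal sketch of the StAX-style alternative, assuming a hypothetical Event type loosely modelled on StAX pull events. None of these names come from HXT or any existing library. The only state carried along is the stack of open elements, and the result list is produced lazily as events are consumed.

```haskell
-- Hypothetical event type, loosely modelled on StAX pull events.
type Name  = String
type Attrs = [(String, String)]

data Event
  = StartElement Name Attrs
  | EndElement Name
  | Characters String
  deriving (Eq, Show)

-- Open elements, innermost first: the tags and attribute sets back to the root.
type Path = [(Name, Attrs)]

-- Pair every text event with the stack of elements that enclose it.
-- Only the stack is kept as state; closed elements are popped and dropped.
textWithPath :: [Event] -> [(Path, String)]
textWithPath = go []
  where
    go _     []                       = []
    go stack (StartElement n as : es) = go ((n, as) : stack) es
    go stack (EndElement _ : es)      = go (drop 1 stack) es
    go stack (Characters t : es)      = (stack, t) : go stack es
```

A consumer that needs ancestor context reads it from the Path rather than from a tree, so there is no parent node holding on to siblings that have already been processed.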