
Hello skaller, Saturday, August 12, 2006, 7:06:15 AM, you wrote:
My point here is that actually this is a disastrous optimisation in a multi-processing environment, because in general, the assignment of a pointer means the store isn't write once.
:) all the variables rewritten is local to the function. _without_ tail call optimization, we create new stack frame for each recursive call. _with_ optimization, we just update vars in the first stack frame created because we know that these vars will not be reused after return from call
I haven't figured all the details. I am imagining a generational copying collector with a separate young heap for every thread, whilst the great grand mother heap is collected in the background.
it's close to the GHC scheme - 2-generational GC with 1st generation local to thread and having default size of 256kb (tied to the size of L2 caches of modern CPUs) while 2nd generation is global for all threads and collected at moments when memory currently allocated from OS exhausted. but currently 2nd-generation GCs don't runs in background and can be as long as several seconds. it's definitely hard to write games (or even web servers) with such architecture! :) so memory allocation is very fast. 1st-level GC is also fast because it runs inside L2 cache. access to "local variables" (i.e. data structures allocated in near past) is fast because they are all inside L2 cache. although Haskell don't supports "auto" variables which don't need GC and freed automatically, this scheme allows to significantly lower cost of managing short-lived data structures -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com