RE: Optimizations for mutable structures?

On 07 December 2005 16:38, Malcolm Wallace wrote:
"Simon Marlow"
writes: I should have said that if 'acts' blocks, then the transformation is invalid.
Well that is exactly what I was assuming when I said that the transformation is invalid. In the general case, for some arbitrary actions between the write and the read (excluding another write of course), there is no guarantee that the IORef remains unmodified.
This is an analysis that's performed all the time in C compilers, it's quite straightforward to do a good approximation. One simple algorithm is: a store can be forwarded to a matching read as long as there are no intervening writes that may alias, or function calls. C does this and C has threads, so what's the difference?
If you want to restrict the intermediate actions to be non-blocking, such that another thread cannot run, then that is an extra (and significant) proof obligation.
And AFAICS your non-blocking argument only applies to co-operative scheduling. If pre-emption is permitted (e.g. interrupt-handling), then all bets are off, because an arbitrary thread could write to the IORef at any moment.
No, that is exactly what I disagree with. Faced with non-derministic semantics, the compiler does *not* have to preserve all possible outcomes. In other words, it does not have to assume that an IORef can be modified between any two arbitrary instructions. If we had to assume this, then indeed all bets would be off, and C compilers would be very much more restricted in what optimisations they could perform. I don't know how I can explain this a different way - it seems pretty obvious to me, but I'm probably not explaining it well :-/ You cut the example from my message - did you agree with the conclusion? Or the conclusion I made from Ian's example? Cheers, Simon

On Dec 7, 2005, at 12:05 PM, Simon Marlow wrote:
On 07 December 2005 16:38, Malcolm Wallace wrote:
"Simon Marlow"
writes: I should have said that if 'acts' blocks, then the transformation is invalid.
Well that is exactly what I was assuming when I said that the transformation is invalid. In the general case, for some arbitrary actions between the write and the read (excluding another write of course), there is no guarantee that the IORef remains unmodified.
This is an analysis that's performed all the time in C compilers, it's quite straightforward to do a good approximation. One simple algorithm is: a store can be forwarded to a matching read as long as there are no intervening writes that may alias, or function calls.
C does this and C has threads, so what's the difference?
I would personally be very uncomfortable justifying a semantic transformation based on common practice in C compilers. What exactly are the semantics of C programs and why do we believe that C compilers are correct? I'd much rather see some argument in terms of an appropriate definition of observational equivalence. [snip] Rob Dockins

On Wed, 7 Dec 2005, Robert Dockins wrote:
What exactly are the semantics of C programs and why do we believe that C compilers are correct?
With regards to threading, the semantics are undefined and the compilers
are subtly broken :-)
Tony.
--
f.a.n.finch

Tony Finch
On Wed, 7 Dec 2005, Robert Dockins wrote:
What exactly are the semantics of C programs and why do we believe that C compilers are correct?
With regards to threading, the semantics are undefined and the compilers are subtly broken :-)
Just have a look at Hans Boehm's page for more details on that ... http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/ -Matthias -- Matthias Neubauer | Universität Freiburg, Institut für Informatik | tel +49 761 203 8060 Georges-Köhler-Allee 79, 79110 Freiburg i. Br., Germany | fax +49 761 203 8052

On Wed, Dec 07, 2005 at 05:05:58PM -0000, Simon Marlow wrote:
No, that is exactly what I disagree with. Faced with non-derministic semantics, the compiler does *not* have to preserve all possible outcomes. In other words, it does not have to assume that an IORef can be modified between any two arbitrary instructions. If we had to assume this, then indeed all bets would be off, and C compilers would be very much more restricted in what optimisations they could perform.
Yup. this is exactly why C has the 'volatile' keyword. variables that are declared volatile will always be read and written to memory and never stored in a register because they could be modified by external interrupts or threads. If every varibale were considered volatile everything would be a whole whole lot slower. I have only had need to use it 3 times and all were in an operating system kernel. In any case, ghc IORefs are not 'volatile' in the C sense and that is a good thing. John -- John Meacham - ⑆repetae.net⑆john⑈

Yup. this is exactly why C has the 'volatile' keyword. variables that are declared volatile will always be read and written to memory and never stored in a register because they could be modified by external interrupts or threads. If every varibale were considered volatile everything would be a whole whole lot slower. I have only had need to use it 3 times and all were in an operating system kernel. In any case, ghc IORefs are not 'volatile' in the C sense and that is a good thing.
but concurrent haskell (ch) is not c! in ch, you can adjust the scopes of your variables so that scopes do not include other threads, so no need for any slowdown - local variables are available for local optimisations. but when the scope of a variable does include other threads, i for one do not want to go back to c/java-style efficiency annotations as a means for code-obfuscation: if a variable is shared between threads, those threads should all see the same variable contents, without me having to insert special synchronisation calls (or worse, wonder which operations might involve synchronisation and which won't); if some variable's contents do not need to be seen by other threads, don't make that variable available to them. scope extrusion/passing variables to other threads does not seem to be problematic, either: if thread A updates a variable passed to it by thread B, then those updates should be visible outside A. but perhaps i am missing something? do you have any example in mind where you'd actually need those extra annotations for efficiency? cheers, claus ps does your later email imply that you suggest IORef/MVar as the non-volatile/volative separation? i'd rather see non-thread- local IORefs phased out of ch..

On Thu, Dec 08, 2005 at 12:17:36AM -0000, Claus Reinke wrote:
ps does your later email imply that you suggest IORef/MVar as the non-volatile/volative separation? i'd rather see non-thread- local IORefs phased out of ch..
yeah. exactly. thread local storage is also quite expensive though so making all IORefs them would be a waste 90% of the time (as would making them threadsafe). I propose IORefs make no guarentees when it comes to concurrency, it is the users burden to make sure they use them in a single threaded manner. MVars should make guarentees when it comes to concurrency, and you should use those whenever they might be accesed by different threads.. The reason all implementations should provide MVars (even if they are just wrappers around IORefs) is so people can write portable threadsafe libraries. mostly they can use IORefs, but should use MVars for things like the unsafePerformIO global variable trick. a thread-local version of MVars might also be interesting... John -- John Meacham - ⑆repetae.net⑆john⑈

"Simon Marlow"
In the general case, for some arbitrary actions between the write and the read (excluding another write of course), there is no guarantee that the IORef remains unmodified.
This is an analysis that's performed all the time in C compilers, it's quite straightforward to do a good approximation. One simple algorithm is: a store can be forwarded to a matching read as long as there are no intervening writes that may alias, or function calls.
C does this and C has threads, so what's the difference?
There is a big difference between C variables and IORefs. For one thing, a C variable can have scope local to a procedure. If so, then the suggested transformation is entirely valid even in the presence of threads, because no other thread is permitted access to the local variable. M[fp+offset(x)] <- v ... w <- M[fp+offset(x)] ===> M[fp+offset(x)] <- v ... w <- v Global variables are more tricky, but I guess it is probably common to permit the elimination of the 'read' there as well, even though technically this is unsafe in the presence of threads. My guess is that this behaviour is the cause of a lot of thread unsafeness in common C libraries however. M[address(x)] <- v ... w <- M[address(x)] ===> M[address(x)] <- v ... w <- v Haskell has neither local nor global variables in the same sense as C. IORefs more closely resemble C pointers. A pointer has indeterminate lifetime and scope. It can be passed around from procedure to procedure, and from thread to thread. There can be aliases to the same memory location in multiple other contexts. If I'm writing a hardware driver, the pointer address might be memory-mapped to a device register, in which case writing to that address and immediately reading back from it may not be an identity. I've certainly dealt with devices where the same hardware port when written to, expects control data, but when read from delivers status data. So here: M[M[x]] <- v ... w <- M[M[x]] it would be totally incorrect for the compiler to eliminate the read. Regards, Malcolm
participants (7)
-
Claus Reinke
-
John Meacham
-
Malcolm Wallace
-
Matthias Neubauer
-
Robert Dockins
-
Simon Marlow
-
Tony Finch