
If you are in ST, you cannot modify anything externally visible without using unsafe functions. If an exception occurs at any point, your changes remain in some broken state, but there is no longer any reference to them, so they are simply garbage collected and nothing bad happens. If you need externally visible changes, you have to use IO, but then you also have the full arsenal of exception handling functions at your disposal. If you write code which is polymorphic and can work in either IO or ST, it cannot have any visible side effects, and thus you can ignore any exceptions in it (because you could runST it in completely pure code).

Thinking of your array list, let's look at possible signatures for adding an element:

    addPure :: ArrayList a -> a -> ArrayList a

Clearly this just copies the whole array every time; there is nothing mutable here.

    addST :: ArrayList s a -> a -> ST s ()

This one is mutable, but you can never get out of ST with this ArrayList. While you are in ST, it doesn't matter if an async exception interrupts you, because you will throw away the result of the ST action anyway (and thus your broken ArrayList).

    addST' :: ArrayList a -> a -> ST s (ArrayList a)

This has to copy the whole array, because you can implement addPure with it and runST.

    addIO :: ArrayList a -> a -> IO ()

This can modify the list, but it can (and has to) also handle exceptions. This is the only one which Java provides.

Regarding the monad-polymorphic functions (you are talking about PrimMonad, right?): they cannot handle exceptions, because they might be used in an ST context. But as stated earlier, if you compose them into another action in some PrimMonad, it will be exception safe, because you can just apply runST to it, constraining the PrimMonad to ST and getting a pure value out of it (and such a value never needs to handle exceptions).

Of course, all this changes as soon as you use unsafeThaw in ST without proving that you have the ONLY reference to that buffer/array/...
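To make the addST case concrete, here is a minimal sketch using only base. The MList type and the helper names are hypothetical stand-ins for a real array list (a reference to a reversed list instead of a buffer); the point is only that the mutable value is created, mutated and dropped inside runST, so only an immutable result can escape.

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, modifySTRef', newSTRef, readSTRef)

-- Hypothetical stand-in for a mutable list: a reference to a reversed list.
newtype MList s a = MList (STRef s [a])

newMList :: ST s (MList s a)
newMList = MList <$> newSTRef []

-- Same shape as the addST signature discussed above.
addST :: MList s a -> a -> ST s ()
addST (MList ref) x = modifySTRef' ref (x :)

-- Read the contents back in insertion order.
toListST :: MList s a -> ST s [a]
toListST (MList ref) = reverse <$> readSTRef ref

-- The MList never leaves runST; the type variable s prevents that.
-- If evaluation is interrupted, the half-built MList is unreachable.
fromListViaST :: [a] -> [a]
fromListViaST xs = runST $ do
  ml <- newMList
  mapM_ (addST ml) xs
  toListST ml
```

In GHCi, fromListViaST "abc" should give back "abc"; the interesting part is that nothing mutable is visible to the caller.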
On 09/28/2017 05:51 PM, Станислав Черничкин wrote:
Thank you for the reply. I think I should clarify what exactly I'd like to discuss.
The “data structures” I'm talking about are in general single-threaded mutable containers (like the hashtables mentioned, or like ArrayList in Java). Such structures are not thread-safe, yet it would be nice to have async exception safety. I used the word “atomicity” in the sense described here: https://en.wikipedia.org/wiki/Atomicity_(database_systems) : an operation either occurs, or it fails and the data structure remains in its previous state. In many cases such behavior can be achieved without complex exception clean-up routines.
Let me give an example. Consider something like ArrayList from Java (a vector which can grow as elements are added). I want to implement the 'add' action. The contract is straightforward: the action may either add an element to the structure (possibly reallocating the underlying memory buffer, writing the element at the last position, and incrementing the element counter), or it may throw an OutOfMemory exception. But in the latter case the structure should stay “undamaged”. This could be implemented as follows:
if count_equals_capacity then
    allocate_new_buffer (let's suppose it is garbage-collected)
    copy_elements
    update_buffer_pointer
    update_capacity_variable
write_new_element_to_buffer
update_count_variable
This code does not contain any explicit exception handling, but it satisfies the contract. The only place where an exception can occur is allocate_new_buffer. In this case the action will be interrupted before any state modifications. All other operations are basically memory writes and completely safe (assuming the code is correct and will not segfault).
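The steps above might look roughly like this in Haskell. This is only a sketch under assumptions: the ArrayList record and its field names are hypothetical, it uses STArray from the array package rather than any of the libraries discussed here, and it doubles the capacity on growth, which the pseudocode does not specify.

```haskell
import Control.Monad (forM_, when)
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STArray, newArray_, readArray, writeArray)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- Hypothetical growable array: buffer pointer, capacity and element count.
data ArrayList s a = ArrayList
  { alBuffer   :: STRef s (STArray s Int a)
  , alCapacity :: STRef s Int
  , alCount    :: STRef s Int
  }

newArrayList :: Int -> ST s (ArrayList s a)
newArrayList cap0 = do
  let cap = max 1 cap0
  buf <- newArray_ (0, cap - 1)
  ArrayList <$> newSTRef buf <*> newSTRef cap <*> newSTRef 0

add :: ArrayList s a -> a -> ST s ()
add al x = do
  count <- readSTRef (alCount al)
  cap   <- readSTRef (alCapacity al)
  when (count == cap) $ do
    old <- readSTRef (alBuffer al)
    let cap' = cap * 2                 -- growth factor is an assumption
    new <- newArray_ (0, cap' - 1)     -- allocate_new_buffer: may fail, nothing modified yet
    forM_ [0 .. count - 1] $ \i ->     -- copy_elements
      readArray old i >>= writeArray new i
    writeSTRef (alBuffer al) new       -- update_buffer_pointer
    writeSTRef (alCapacity al) cap'    -- update_capacity_variable
  buf <- readSTRef (alBuffer al)
  writeArray buf count x               -- write_new_element_to_buffer
  writeSTRef (alCount al) (count + 1)  -- update_count_variable

-- Read back the first count elements.
toList :: ArrayList s a -> ST s [a]
toList al = do
  count <- readSTRef (alCount al)
  buf   <- readSTRef (alBuffer al)
  mapM (readArray buf) [0 .. count - 1]

-- Starts with capacity 2, so adding five elements forces two reallocations.
demoAdd :: [Int]
demoAdd = runST $ do
  al <- newArrayList 2
  mapM_ (add al) [1 .. 5]
  toList al
```

As in the pseudocode, all visible state changes come after the allocation, so a synchronous allocation failure leaves the structure as it was.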
Things become complicated in the presence of async exceptions. Suppose an async exception is raised between write_new_element_to_buffer and update_count_variable. At first glance nothing wrong has happened, but if the buffer holds references, it will now contain a reference to some object, preventing it from being GC-ed, and this reference will be beyond the buffer's count value, because the exception occurred before the count variable was updated, so the programmer will be completely unaware of it. But this can still be fixed by masking exceptions in critical blocks. And we can definitely implement all of this in the IO monad.
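For the IO-only case, masking the critical block could be sketched as follows. The Slots type and function names are hypothetical, the buffer is an IOArray from the array package, and the sketch assumes the buffer already has spare capacity so that only the write and the count update need protecting.

```haskell
import Control.Exception (mask_)
import Data.Array.IO (IOArray, newArray_, readArray, writeArray)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical fixed-capacity container: a buffer plus an element count.
data Slots a = Slots
  { slBuffer :: IOArray Int a
  , slCount  :: IORef Int
  }

newSlots :: Int -> IO (Slots a)
newSlots cap = Slots <$> newArray_ (0, cap - 1) <*> newIORef 0

-- Async exceptions are deferred for the duration of mask_, so the element
-- write and the count update are either both performed or both skipped
-- (assuming the index is in bounds, i.e. there is spare capacity).
pushMasked :: Slots a -> a -> IO ()
pushMasked s x = mask_ $ do
  n <- readIORef (slCount s)
  writeArray (slBuffer s) n x
  writeIORef (slCount s) (n + 1)

-- Push two elements, then report the count and the element at index 1.
demoMasked :: IO (Int, Char)
demoMasked = do
  s <- newSlots 4
  pushMasked s 'a'
  pushMasked s 'b'
  n <- readIORef (slCount s)
  x <- readArray (slBuffer s) 1
  pure (n, x)
```

None of the operations inside the masked block wait on anything, so they should not be interruptible points; a block that does blocking I/O would need more care.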
The question is how to write “monad-polymorphic” code, i.e. code which can run both in IO and in ST. Mutable data structures benefit from being “monad-polymorphic”. Most Haskell mutable containers (vectors, hashtables, impure-containers) are built on the PrimState monad, allowing them to run both in IO and in ST. But they seem to just ignore the fact that an async exception may corrupt state. Some of them (e.g. https://hackage.haskell.org/package/impure-containers-0.4.0/docs/src/Data-Ar... ) even seem to ignore that unsafeGrow may throw OutOfMemory (though attempting to recover from OutOfMemory may itself be a bad idea).
2017-09-28 15:45 GMT+03:00 Michael Snoyman <michael@snoyman.com>:
> Since an exception can arise at any point, it is not possible to guarantee atomicity of an operation, hence a mutable data structure may remain in an incorrect state in case of interruption.
Even if async exceptions didn't exist, we couldn't guarantee atomicity in general without specifically atomic functions (like atomicModifyIORef or STM), since another thread may access the data concurrently and create a data race.
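As a small illustration of what a “specifically atomic function” buys you, here is a sketch with atomicModifyIORef' from base. The nextIndex helper is hypothetical; it hands out consecutive indices from a shared counter.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Atomically return the current value and store its successor.
-- The read and the write happen as one step, so two threads calling
-- this on the same IORef cannot receive the same index.
nextIndex :: IORef Int -> IO Int
nextIndex ref = atomicModifyIORef' ref (\n -> (n + 1, n))

-- Single-threaded demonstration: three indices, then the final counter.
demoNextIndex :: IO ([Int], Int)
demoNextIndex = do
  ref <- newIORef 0
  is  <- mapM (const (nextIndex ref)) [1 .. 3 :: Int]
  end <- readIORef ref
  pure (is, end)
```

A separate readIORef followed by writeIORef would not give this guarantee, with or without async exceptions.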
If you're only talking about single-threaded cases—of which ST is _basically_ a subset[1]—I don't think you're really worried about _atomicity_, but about exception safety. Exception safety goes beyond async exceptions, since almost all IO actions can throw some form of synchronous exception. For those cases, you can use one of the many exception-cleanup functions, like finally, onException, bracket, or bracketOnError.
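For instance, a rollback with onException might be sketched like this. The withRollback helper is hypothetical and only covers a single IORef Int; a real container would have more state to restore.

```haskell
import Control.Exception (IOException, onException, throwIO, try)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Run an update against a reference; if the update throws, restore the
-- value saved beforehand. onException rethrows after running the handler.
withRollback :: IORef Int -> (IORef Int -> IO ()) -> IO ()
withRollback ref update = do
  saved <- readIORef ref
  update ref `onException` writeIORef ref saved

-- An update that modifies the reference and then fails.
failingUpdate :: IORef Int -> IO ()
failingUpdate r = do
  writeIORef r 99
  throwIO (userError "boom")

-- The exception still propagates (caught here with try), but the
-- reference holds its original value afterwards.
demoRollback :: IO Int
demoRollback = do
  ref <- newIORef 10
  _ <- try (withRollback ref failingUpdate) :: IO (Either IOException ())
  readIORef ref
```

Here demoRollback should yield 10 rather than 99, since the handler restores the saved value before the exception is rethrown.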
It's true that those functions don't work inside ST, but I'd argue you don't need them to. The expected behavior of code that receives an async exception is to (1) clean up after itself and (2) rethrow the exception. But as ST blocks are supposed to be free of externally-visible side effects, worrying about putting its variables back into some safe state is unnecessary[2].
To summarize:
* If you need true atomicity, you're in IO and dealing with multiple threads. I'd recommend sticking with STM unless you have a strong reason to do otherwise.
* If you are single threaded and in IO, you can get away with non-STM stuff more easily, and need to make sure you're using exception-aware functions.
* If you're inside ST, make sure any resources you acquire are cleaned up correctly, but otherwise you needn't worry about exceptions.
Also, you may be interested in reading the documentation for safe-exceptions[3], which talks more about async exception safety.
[1] I say basically since you'd have to pull out unsafe functions to fork a thread that has access to an STRef or similar, though it could be done.
[2] If you're doing something like binding to a C library inside ST, you may have some memory cleanup to perform, but the STRefs and other data structures should never be visible again.
[3] https://haskell-lang.org/library/safe-exceptions
On Thu, Sep 28, 2017 at 2:00 PM, Станислав Черничкин <schernichkin@gmail.com> wrote:
It's quite hard to implement mutable data structures in the presence of asynchronous exceptions. Since an exception can arise at any point, it is not possible to guarantee atomicity of an operation, hence a mutable data structure may remain in an incorrect state in case of interruption. One can certainly use maskAsyncExceptions# and friends to protect critical regions, but the masking functions live in IO, whereas mutable data structures tend to be state-polymorphic (to allow their use in ST).
This leads to conflicting requirements:
- One should not care about asynchronous exceptions inside ST (it is not possible to catch an exception in ST, hence not possible to use something in an invalid state). Moreover, it is not even possible to write “exception-safe” code, because the masking functions are not available.
- One should provide accurate masking when using the same data structures in IO.
So I want to discuss several questions on this topic.
1. Impact. Are async exceptions really common? Would it not be easier to say: “OK, things can go bad if you combine async exceptions with mutable data structures, just don't do it”?
2. Documentation. Should library authors explicitly mention async exception safety? For example, is https://hackage.haskell.org/package/hashtables async-exception safe when used in IO? Or, even worse, consider https://hackage.haskell.org/package/ghc-prim-0.5.1.0/docs/GHC-Prim.html#v:re... What will happen in case of an async exception? This function is state-polymorphic; will it implicitly mask exceptions if used from IO?
3. Best practices. How should we deal with the problem? Is creating separate versions of the code for ST and IO the only way? Perhaps it is possible to add “mask” to something like https://hackage.haskell.org/package/primitive-0.6.2.0/docs/Control-Monad-Pri... and emit a mask in the IO instance and a no-op in the ST version? Or maybe somebody knows better patterns for async-exception-safe code?
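The idea in question 3 could be sketched as a small type class. Everything here is hypothetical: MaskPrim and maskPrim are made-up names and not part of primitive, and the sketch uses only base, so the two demo functions use STRef and IORef directly instead of a monad-polymorphic reference type.

```haskell
import Control.Exception (mask_)
import Control.Monad.ST (ST, runST)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Hypothetical class: run an action as a critical section.
class Monad m => MaskPrim m where
  maskPrim :: m a -> m a

-- In IO, defer async exceptions for the duration of the action.
instance MaskPrim IO where
  maskPrim = mask_

-- In ST there is nothing to protect, so masking is a no-op.
instance MaskPrim (ST s) where
  maskPrim = id

-- The same call shape works in both monads.
demoMaskST :: Int
demoMaskST = runST $ do
  ref <- newSTRef 0
  maskPrim (modifySTRef' ref (+ 1))
  readSTRef ref

demoMaskIO :: IO Int
demoMaskIO = do
  ref <- newIORef 0
  maskPrim (modifyIORef' ref (+ 1))
  readIORef ref
```

Whether a no-op is really the right ST instance depends on the argument made elsewhere in this thread that broken ST state can never be observed.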
-- Sincerely, Stanislav Chernichkin.
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.