RE: [Haskell] TArray?

On 13 December 2005 14:52, Jan-Willem Maessen wrote:
On Dec 13, 2005, at 8:46 AM, Simon Marlow wrote:
[In response to another plea for TArrays]
In the past I have used arrays of TVars, as Thomasz suggested. It would indeed be better to have a primitive STM array, the only problem with this is the extra complexity. One simplifying assumption is that it should consider changes at the level of the whole array, rather than per-element (otherwise you'd use an array of TVars).
Actually, in that case it might be more useful to have a TMVar containing an array. But I suspect the need for this use case is small. I know a ton of uses for transactionally-updated arrays for which the goal is to permit concurrent access to independent array elements (concurrent hash tables come to mind as an obvious use case where transactions make life vastly simpler).
You might ask Tim Harris whether there's a reasonably simple, clever way to do this using arrays + CAS. I believe such a trick exists---you might end up waking too many threads on a write, but you'd get read/write concurrency at least.
One simple plan is to log writes at the element level during a transaction (i.e. don't copy the whole array), but don't track *waiting* at the element level, so a thread waiting on a TArray will get woken up whenever the array is modified. You get read/write concurrency this way. Committing a transaction still has to lock the whole array during the update. Tim, do you know of any better schemes? Cheers, Simon

On 12/14/05, Simon Marlow
On 13 December 2005 14:52, Jan-Willem Maessen wrote:
On Dec 13, 2005, at 8:46 AM, Simon Marlow wrote:
[In response to another plea for TArrays]
In the past I have used arrays of TVars, as Thomasz suggested. It would indeed be better to have a primitive STM array, the only problem with this is the extra complexity. One simplifying assumption is that it should consider changes at the level of the whole array, rather than per-element (otherwise you'd use an array of TVars).
Actually, in that case it might be more useful to have a TMVar containing an array. But I suspect the need for this use case is small. I know a ton of uses for transactionally-updated arrays for which the goal is to permit concurrent access to independent array elements (concurrent hash tables come to mind as an obvious use case where transactions make life vastly simpler).
You might ask Tim Harris whether there's a reasonably simple, clever way to do this using arrays + CAS. I believe such a trick exists---you might end up waking too many threads on a write, but you'd get read/write concurrency at least.
One simple plan is to log writes at the element level during a transaction (i.e. don't copy the whole array), but don't track *waiting* at the element level, so a thread waiting on a TArray will get woken up whenever the array is modified. You get read/write concurrency this way. Committing a transaction still has to lock the whole array during the update.
Wouldn't there be a speedup to do both writes and waiting at the array level, BUT annotated with an index? So a write to an array would only wake threads with the appropriate index. I'm not up to snuff on the implementation details here, so maybe this is what the Array-of-Tvars would reduce to when compiled... Anyway, the main gist of my original post was that TArrays should be in the libraries, so that I can safely use it without having to send along my own implementation each time (and potentially colliding with someone else's implementation down the line). When/if a primitve TArray is implemented, the Array-of-Tvars-approach could just be replaced, and all programs which use the TArray would get an automatic speed-boost. /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

On Thu, Dec 15, 2005 at 02:17:18PM +0100, Sebastian Sylvan wrote:
Wouldn't there be a speedup to do both writes and waiting at the array level, BUT annotated with an index?
I strongly vote to leave STM as it is, and implement TArray as a library on top of it. STM implementation is probably already quite complex, and complicating it with some fancy treating of TArray would do no good for correctness. There's also some possibility that TArrays' special-case implementation would have some negative impact on efficiency of ordinary TVars. If could be implemented without any negative impact, like bugs, GHC team having less time for other things, then I am all for it. Anyway, I am not the one to decide.
Anyway, the main gist of my original post was that TArrays should be in the libraries, so that I can safely use it without having to send along my own implementation each time (and potentially colliding with someone else's implementation down the line).
Agreed.
When/if a primitve TArray is implemented, the Array-of-Tvars-approach could just be replaced, and all programs which use the TArray would get an automatic speed-boost.
I think that TArray efficiency is not as important as efficiency of STArrays, UArrays and IOArrays. After all, TArrays are bound to have greater overhead than ordinary arrays and you are not supposed to do heavy computations in STM. On the other hand, the more efficient STM is, the more scalable are our concurrent programs, so I may be wrong here. Best regards Tomasz -- I am searching for a programmer who is good at least in some of [Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland
participants (3)
-
Sebastian Sylvan
-
Simon Marlow
-
Tomasz Zielonka