In particular I think you need:
casMutArray# :: MutableArray# s a -> Int# -> a -> a -> State# s -> (# State# s, Int#, a #)
casWord16MutByteArray# :: MutableByteArray# s -> Int# -> Word# -> Word# -> State# s -> (# State# s, Int#, Word# #)
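To make the intended use of the first signature concrete, here is a rough sketch of a CAS retry loop over a boxed array. It assumes a GHC recent enough that GHC.Exts exports a primop of this shape under the name casArray#, where a returned 0# means the swap happened. The MArr wrapper and the helper names (newMArr, readMArr, atomicModifyMArr) are made up for illustration and are not part of any patch. Since the comparison is on pointers, the loop always passes back a value it actually read from the array rather than a rebuilt copy.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import GHC.Exts (Int (I#), MutableArray#, RealWorld, casArray#, newArray#, readArray#)
import GHC.IO (IO (..))

-- Hypothetical boxed wrapper around the unlifted array (illustration only).
data MArr a = MArr (MutableArray# RealWorld a)

newMArr :: Int -> a -> IO (MArr a)
newMArr (I# n) x = IO $ \s -> case newArray# n x s of
  (# s', arr #) -> (# s', MArr arr #)

readMArr :: MArr a -> Int -> IO a
readMArr (MArr arr) (I# i) = IO (readArray# arr i)

-- Apply f to slot i, retrying until the CAS succeeds.
-- Assumption: casArray# returns 0# on success, plus the element now in the slot.
-- Note that the new value is stored lazily, as an unevaluated (f old).
atomicModifyMArr :: MArr a -> Int -> (a -> a) -> IO ()
atomicModifyMArr (MArr arr) (I# i) f = IO $ \s0 ->
  case readArray# arr i s0 of
    (# s1, old0 #) -> loop old0 s1
  where
    loop old s = case casArray# arr i old (f old) s of
      (# s', 0#, _ #) -> (# s', () #)   -- swap took place
      (# s', _, cur #) -> loop cur s'   -- lost the race: retry with the value seen

-- Four threads each bump slot 0 a thousand times.
main :: IO ()
main = do
  arr <- newMArr 1 (0 :: Int)
  dones <- mapM (const newEmptyMVar) [1 .. 4 :: Int]
  forM_ dones $ \done -> forkIO $ do
    replicateM_ 1000 (atomicModifyMArr arr 0 (+ 1))
    putMVar done ()
  mapM_ takeMVar dones
  readMArr arr 0 >>= print
```

If no update is lost, the final count should equal the total number of increments. The pointer-equality caveat is the fragile part: a sketch like this relies on the compiler not reboxing the old value between the read and the CAS.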
FYI, I have started working on adding these, and I'm hoping to have them working in GHC HEAD for any students who need them. To my knowledge, the only patches required to implement casMutVar# were these two (plus the preexisting cas() definition in SMP.h):
The latter is a bugfix to the former.
I just read in your proposal that you started looking into the casMutArray# issue as well. How far have you gotten with that? Do you want to work on this together a bit?
I've got an implementation of a casArray# primop that passes a basic test, but I'm not sure if the GC write barrier is correct:
The ByteArray versions will be more tedious, since they need more variants, but they are also less essential: the user can always fall back on ForeignPtr and bits-atomic in that case, and I believe our concurrent data structures need to store arbitrary pointers (hence casArray#).
Cheers,
-Ryan