Why write out-of-line PrimOps in Cmm?

I was looking at the eventlog code, and I wanted to move processing of a full eventlog buffer into Haskell, instead of the current fixed behavior of writing the data to a file.

To do this, I thought that having a Haskell process blocked on an MVar or Chan would be nice, and then some way to signal the process.

But the low-level PrimOps for MVars (takeMVar, tryTakeMVar, etc.) are in PrimOps.cmm, and I don't know how to call them from the RTS.

This led me to question what the point of out-of-line PrimOps in PrimOps.cmm is. I don't think the commentary covers this.

http://hackage.haskell.org/trac/ghc/wiki/Commentary/PrimOps

So why isn't all the stuff in PrimOps.cmm just "ccall" wrappers around C implementations? Wouldn't that in general be more flexible for the RTS?

Alexander

On 14/02/13 11:55, Alexander Kjeldaas wrote:
> I was looking at the eventlog code, and I wanted to move processing of a full eventlog buffer into Haskell, instead of the current fixed behavior of writing the data to a file.
> To do this, I thought that having a Haskell process blocked on an MVar or Chan would be nice, and then some way to signal the process.
> But the low-level PrimOps for MVars (takeMVar, tryTakeMVar, etc.) are in PrimOps.cmm, and I don't know how to call them from the RTS.
You can't call them directly. The problem is that to call Haskell code you need a Haskell thread to run it in. There are a couple of ways that we call Haskell code from the RTS:

 - rts_lock()/rts_evalIO()/rts_unlock(). This is what calling a foreign export does, and it's quite heavyweight.

 - create a new Haskell thread and add it to the run queue, like we do for finalizers.
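
To make the foreign-export route concrete, here is a minimal Haskell-side sketch. It is not taken from the RTS or the eventlog code: the names hsBufferFull and consume, and the idea of handing over a buffer id as a CInt, are hypothetical and chosen only for illustration. A Haskell thread blocks in takeMVar, and a foreign-exported function fills the MVar. C code calling that export would go through the heavyweight path described above; in this sketch main simply calls it directly to stand in for the C caller.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM)
import Foreign.C.Types (CInt (..))
import Foreign.StablePtr (StablePtr, deRefStablePtr, freeStablePtr, newStablePtr)

-- Hypothetical entry point for C code: hand one buffer id to the consumer.
-- The StablePtr lets C hold a reference to the MVar without the GC moving it.
-- Note that putMVar blocks the caller while the slot is still full.
hsBufferFull :: StablePtr (MVar CInt) -> CInt -> IO ()
hsBufferFull sp bufId = do
  slot <- deRefStablePtr sp
  putMVar slot bufId

foreign export ccall hsBufferFull :: StablePtr (MVar CInt) -> CInt -> IO ()

-- Hypothetical consumer: take n buffer ids, blocking in takeMVar each time.
consume :: MVar CInt -> Int -> IO [CInt]
consume slot n = replicateM n (takeMVar slot)

main :: IO ()
main = do
  slot <- newEmptyMVar
  done <- newEmptyMVar
  -- The consumer runs in its own Haskell thread and reports back when done.
  _ <- forkIO (consume slot 3 >>= putMVar done)
  sp <- newStablePtr slot
  -- Stand-in for the C side calling the exported function three times.
  mapM_ (hsBufferFull sp) [1, 2, 3]
  freeStablePtr sp
  ids <- takeMVar done
  print ids
```

Because of the foreign export declaration this should be compiled with ghc rather than loaded in the interpreter. A real signalling path would probably prefer tryPutMVar so that the caller never blocks; that choice is left open here.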
> This led me to question what the point of out-of-line PrimOps in PrimOps.cmm is. I don't think the commentary covers this.
> http://hackage.haskell.org/trac/ghc/wiki/Commentary/PrimOps
> So why isn't all the stuff in PrimOps.cmm just "ccall" wrappers around C implementations? Wouldn't that in general be more flexible for the RTS?
They need to do things like block, which you can't do from a ccall. There's also the overhead: a primop can allocate memory directly from the nursery, whereas a ccall would have to call allocate() (and couldn't GC).

Cheers,
Simon
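
For readers who have not seen the primop layer from the Haskell side, the sketch below calls the MVar primops exported by GHC.Exts directly. The Slot type and the newSlot/takeSlot/putSlot wrappers are hypothetical names, modelled on the way a boxed MVar wraps MVar#. The point is only that the wrappers are thin and that the blocking happens inside takeMVar# itself, which is the behaviour described above as unavailable to a plain ccall.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Control.Concurrent (forkIO)
import GHC.Exts (MVar#, RealWorld, newMVar#, putMVar#, takeMVar#)
import GHC.IO (IO (..))

-- Hypothetical boxed wrapper around the primitive MVar# type.
data Slot a = Slot (MVar# RealWorld a)

-- newMVar# creates an empty primitive MVar.
newSlot :: IO (Slot a)
newSlot = IO $ \s -> case newMVar# s of
  (# s', m #) -> (# s', Slot m #)

-- takeMVar# suspends the calling Haskell thread while the MVar is empty.
takeSlot :: Slot a -> IO a
takeSlot (Slot m) = IO $ \s -> takeMVar# m s

-- putMVar# suspends the calling Haskell thread while the MVar is full.
putSlot :: Slot a -> a -> IO ()
putSlot (Slot m) x = IO $ \s -> case putMVar# m x s of
  s' -> (# s', () #)

main :: IO ()
main = do
  slot <- newSlot
  _ <- forkIO (putSlot slot "buffer ready")
  msg <- takeSlot slot -- blocks this Haskell thread until the put happens
  putStrLn msg
```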
Participants (2): Alexander Kjeldaas, Simon Marlow