
On Sun, 2008-12-28 at 08:47 -0800, Bryan O'Sullivan wrote:
On Sun, Dec 28, 2008 at 1:38 AM, Thomas DuBuisson wrote:

    getNthWord n bs@(PS ptr off len) =
        inlinePerformIO $ withForeignPtr ptr $ \ptr' -> do
            let p = castPtr $ plusPtr ptr' off
            peekElemOff p n

The overhead here is very probably caused by withForeignPtr. In similar cases, I've seen much better performance from hoisting this to the outside of a loop.

Since ghc version 6.6, withForeignPtr has no overhead at all, so there is no benefit to hoisting it out of loops. You can verify this by inspecting the core output.

In ghc-6.6 the representation of ForeignPtr changed to put the Addr# directly in the ForeignPtr constructor (rather than an indirection further away).

Duncan
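
For anyone who wants to check this on their own GHC, here is a minimal sketch of the two shapes under discussion: one calls withForeignPtr once per element inside the loop, the other hoists it outside. The names sumInner, sumHoisted and fillSequential are hypothetical and are not from the original code. The sketch uses a plain ForeignPtr Word64 rather than a ByteString, on the assumption that this keeps it independent of the bytestring internals. Compiling with optimisation and a Core dump flag such as -ddump-simpl should let you compare the two loops; no claim is made here about what a given compiler version will show.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Word (Word64)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, withForeignPtr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Hypothetical helper: allocate n words and fill them with 0, 1, ..., n-1.
fillSequential :: Int -> IO (ForeignPtr Word64)
fillSequential n = do
  fp <- mallocForeignPtrArray n
  withForeignPtr fp $ \p ->
    mapM_ (\i -> pokeElemOff p i (fromIntegral i)) [0 .. n - 1]
  return fp

-- Variant 1: withForeignPtr is called inside the loop, once per element.
sumInner :: ForeignPtr Word64 -> Int -> IO Word64
sumInner fp n = go 0 0
  where
    go !acc i
      | i >= n    = return acc
      | otherwise = do
          w <- withForeignPtr fp $ \p -> peekElemOff p i
          go (acc + w) (i + 1)

-- Variant 2: withForeignPtr is hoisted so the whole loop runs inside it.
sumHoisted :: ForeignPtr Word64 -> Int -> IO Word64
sumHoisted fp n = withForeignPtr fp $ \p ->
  let go !acc i
        | i >= n    = return acc
        | otherwise = do
            w <- peekElemOff p i
            go (acc + w) (i + 1)
  in go 0 0
```

Both functions should return the same sum for the same buffer; the only intended difference is where withForeignPtr sits relative to the loop, which is the thing to look for in the Core output.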