
On Mon, 2007-06-04 at 13:12 +0100, Alistair Bayley wrote:
BTW, what's the difference between the indexXxxxOffAddr# and readXxxxOffAddr# functions in GHC.Prim?
Right. So it'd only be safe to use the index ones on immutable arrays because there's no way to enforce sequencing with respect to array writes when using the index version.
In this case I'm reading from a CString buffer, which is (hopefully) not changing during the function invocation, and never written to by my code. So presumably it'd be pretty safe to use the index- functions.
Yes.
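
For illustration only, here is a minimal sketch of the two flavours side by side. It is not the code under discussion: the names indexByte, readByte and firstTwoBytes are made up for this example, and it assumes the buffer is not modified while it is being read. The index- version is a pure function, so nothing orders it relative to writes; the read- version threads the State# token and is therefore sequenced like any other IO action.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Foreign.C.String (withCString)
import Foreign.Ptr (castPtr)
import GHC.Exts (Int (I#), Ptr (Ptr), indexWord8OffAddr#, readWord8OffAddr#)
import GHC.IO (IO (IO))
import GHC.Word (Word8 (W8#))

-- Pure read: no State# token, so GHC is free to float or reorder it.
-- Only sensible when the memory behind the pointer never changes.
indexByte :: Ptr Word8 -> Int -> Word8
indexByte (Ptr a) (I# i) = W8# (indexWord8OffAddr# a i)

-- Sequenced read: the State# token orders it with respect to writes.
readByte :: Ptr Word8 -> Int -> IO Word8
readByte (Ptr a) (I# i) = IO (\s ->
  case readWord8OffAddr# a i s of
    (# s', w #) -> (# s', W8# w #))

-- Hypothetical helper: fetch bytes 0 and 1 of a marshalled string.
-- Assumes the string is non-empty, so that offset 1 is still inside
-- the buffer (at worst it is the NUL terminator).
firstTwoBytes :: String -> IO (Word8, Word8)
firstTwoBytes str = withCString str $ \cs -> do
  let p = castPtr cs :: Ptr Word8
      a = indexByte p 0
  b <- readByte p 1
  -- Force the pure read before withCString frees the buffer.
  a `seq` return (a, b)
```

Note the seq at the end: because indexByte is pure and lazy, the read has to be forced while the buffer is still alive, which is exactly the kind of ordering the read- variant gives you for free.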
- Ptrs don't get unboxed. Why is this? Some IO monad thing?
Got any more detail?
OK. readUTF8Char's transformation starts with this:
$wreadUTF8Char_r3de = \ (ww_s33v :: GHC.Prim.Int#) (w_s33x :: GHC.Ptr.Ptr GHC.Word.Word8) ->
If we expect it to unbox, I'd expect the Ptr to become Addr#. Later, this (w_s33x) gets unboxed just before it's used:
case w_s33x of wild6_a2JM { GHC.Ptr.Ptr a_a2JO -> case GHC.Prim.readWord8OffAddr# @ GHC.Prim.RealWorld a_a2JO 1 s_a2Jf
readUTF8Char is called by fromUTF8Ptr, where there's a little Ptr arithmetic. The Ptr argument to fromUTF8Ptr is unboxed, offset is added, and the result is reboxed so that it can be consumed by readUTF8Char. All a bit unnecessary, I think e.g.
Are you sure fromUTF8Ptr is strict in its ptr arg? Try with a ! pattern on that arg. You'll need -fbang-patterns. That translates into the seq False trick that you're already using elsewhere. Experimenting by adding ! patterns is much quicker and easier, however. Once you've got the right set of strictness annotations you can go back to using the more portable, but ugly, seq False trick.

You can also get ghc to tell you what strictness it inferred for your functions. It's shown in the .hi file. Use ghc --show-iface UTF8.hi. I think the "UL" syntax for describing the strictness is described in the GHC manual somewhere (or perhaps it's on the GHC wiki).

Duncan
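
P.S. A small sketch of the two spellings, in case it helps. The functions pickBang and pickSeq are invented for this example and have nothing to do with the UTF8 module. Both are strict in their first argument even though one branch never uses it. Later GHC releases spell the flag as a LANGUAGE BangPatterns pragma, which is what the sketch uses.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Bang-pattern version: the ! forces p on entry, even in the
-- branch that never looks at it.
pickBang :: Int -> Int -> Int
pickBang !p n = if n == 0 then 0 else p + n

-- Portable version: the guard evaluates p, always fails, and
-- falls through to the real equation below.
pickSeq :: Int -> Int -> Int
pickSeq p _ | p `seq` False = undefined
pickSeq p n = if n == 0 then 0 else p + n
```

Without either annotation the function would be lazy in p (the n == 0 branch ignores it), so the strictness analyser could not pass p unboxed to the worker.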