
On Mon, 2007-06-04 at 09:43 +0100, Alistair Bayley wrote:
After some expriments with the simplifier, I think I have a portable version of a direct-from-buffer decoder which seems to perform nearly as well as one written directly against GHC primitive unboxed functions. I'm wondering if there's anything further I can do to improve performance. The "portable" unboxed version is within about 15% of the unboxed version in terms of time and allocation.
Well done.
Changes I made: - added strictness "annotations" in the form of a strictness guards that are always False - unrolled loops (they were always short loops anyway, with a maximum of 3 or 4 iterations) - replaced shiftL with multiplication, because multiplication unboxes, while shiftL doesn't.
Some things I've noticed in the simplifier output: - the shiftL call hasn't unboxed or inlined into a call to uncheckedShiftL#, which I would prefer. Would this be possible if we added unchecked versions of the shiftL/R functions to Data.Bits?
I think this was discussed before. Check the archives. I don't think there was any resolution however. As I recall there was some attempt to get the ordinary shiftL/R functions to inline and eliminate the bounds checks when the shifts were constants less than the bounds.
- Ptrs don't get unboxed. Why is this? Some IO monad thing?
Got any more detail?
- the chr function tests that its Int argument is less than 1114111, before constructing the Char. It'd be nice to avoid this test.
Yeah. In Data.ByteString.Char8 we invent this w2c & c2w functions to avoid the test. There should probably be a standard version of this unchecked conversion.
- why does this code:
| x <= 0xF7 = remaining 3 (bAND x 0x07) xs | otherwise = err x
turn into this i.e. the <= turns into two identical case-branches, using eqword# and ltword#, rather than one case-branch using leword# ?
I thought it might be to do with the instance declaration not giving a method for <= but letting it default to being defined in terms of < and ==, however it's hard to say what's really going on, since the instance declaration is just: data Word = W# Word# deriving (Eq, Ord) I have no idea how ghc derives Eq and Ord instances for a type when it contains unboxed components, which cannot themselves be members of a type class.
BTW, what's the difference between the indexXxxxOffAddr# and readXxxxOffAddr# functions in GHC.Prim? AFAICT they are equivalent, except that the read* functions take an extra State# s parameter. Presumably this is to thread the IO monad's RealWorld value through, to create some sort of data dependency between the functions (and so to ensure ordered evaluation?)
Right. So it'd only be safe to use the index ones on immutable arrays because there's no way to enforce sequencing with respect to array writes when using the index version. Duncan