
Hello everyone, I've got a function that I can't think how it can be optimized any more, and perhaps it can't, but just in case, I figured I'd run it by you all, since it's currently taking 70% of my CPU time. The main problem is an unrelated scaling issue (i.e. it's getting called too many times... this is a separate issue I'm working on), but even in "reasonable" cases, it takes 20% of the processor time. The function is a simple packed string compare, and I basically just call the C standard library function memcmp for this. Without further ado: data PackedString = PS !(ForeignPtr Word8) !Int !Int instance Eq PackedString where (==) = psniceq foreign import ccall unsafe "static string.h memcmp" c_memcmp :: Ptr Word8 -> Ptr Word8 -> Int -> IO Int {-# INLINE psniceq #-} psniceq :: PackedString -> PackedString -> Bool psniceq a b | nullPS a && nullPS b = True psniceq (PS x1 s1 l1) (PS x2 s2 l2) = (l1 == l2 &&) $ unsafePerformIO $ withForeignPtr x1 $ \p1-> withForeignPtr x2 $ \p2 -> liftM (==0) $ c_memcmp (p1 `plusPtr` s1) (p2 `plusPtr` s2) l1 Is there anything obvious that I might have missed? Or is there anything incredibly subtle that I missed? :) BTW, I compile this with -O -funbox-strict-fields, so the packed string contents are unboxed, if that makes any difference. -- David Roundy http://www.abridgegame.org