
On Thu, Mar 25, 2004 at 11:39:26AM -0800, Carl Witty wrote:
On Thu, 2004-03-25 at 10:09, David Roundy wrote:
The function is a simple packed string compare, and I basically just call the C standard library function memcmp for this. Without further ado:
I don't know how to make your particular function faster, but I do have some ideas for different approaches you may not have thought of.
If you compare strings more often than you create strings, and the comparisons usually fail, you could add a hash of the string to your PackedString datatype; then psniceq could compare hash values first and only confirm equality with the memcmp if the hash values match.
In many cases, *very* large strings will be created and never compared, so hashing them doesn't sound like a good plan. On the other hand, it has occurred to me that although comparisons usually fail, perhaps those that succeed are the ones that are taking all the time, since those ones require checking the entire string (rather than just up to the first difference). So I've added a check whether the two strings have the same pointer and offset prior to calling memcmp. Due to other aspects of how my PackedString's work, identical strings should "usually" be represented by pointers to the same memory. psniceq :: PackedString -> PackedString -> Bool psniceq a@(PS x1 s1 l1) b@(PS x2 s2 l2) | nullPS a && nullPS b = True psniceq (PS x1 s1 l1) (PS x2 s2 l2) = (l1 == l2 &&) $ unsafePerformIO $ withForeignPtr x1 $ \p1-> withForeignPtr x2 $ \p2 -> if p1 == p2 && s1 == s2 then return True else liftM (==0) $ c_memcmp (p1 `plusPtr` s1) (p2 `plusPtr` s2) l1 (Haven't yet tested the above, since I've been making other changes, working on the *real* problem, which is why I'm comparing so many darn strings...) -- David Roundy http://www.abridgegame.org