
Hello all:

I want to generate some Hamming distance statistics about a set of strings. As explained in another e-mail to this list, I used the following code to call the functions (exampl holds the list of strings of size w):

filter (\x -> x /= 0) $ map (uncurry hammingX) [(xs, ys) | xs <- exampl, ys <- exampl]

I have two hamming functions:

-- hamming distance for variable length strings
hamming :: String -> String -> Int
hamming x y = hamming' x y 0
  where
    hamming' [] _ !c = c
    hamming' _ [] !c = c
    hamming' (x:xs) (y:ys) !c
      | x == y    = hamming' xs ys c
      | otherwise = hamming' xs ys (c + 1)

-- function posted in this mailing list
hamming2 :: String -> String -> Int
hamming2 xs ys = length (filter not (zipWith (==) xs ys))

I am executing these functions millions of times, and the bottleneck of my program is in them, as shown by running in profiling mode with +RTS -K400M -p -RTS.

The costliest function is the hamming distance:

COST CENTRE  MODULE     %time  %alloc
hamming      Distances   66.6    41.9

It says that it is performing 41% of the allocations. In the case of hamming2 the allocations go as high as 52%. I could understand that there are allocations in "hamming2" because we are creating pairs, but in the case of "hamming" there should be no allocation. How can I execute my hamming functions without allocating memory?

Best regards,

Arnoldo Muller

On Monday 19 April 2010 01:03:14, Arnoldo Muller wrote:
Hello all:
I want to generate some Hamming distance statistics about a set of strings. As explained in another e-mail to this list, I used the following code to call the functions (exampl holds the list of strings of size w):

filter (\x -> x /= 0) $ map (uncurry hammingX) [(xs, ys) | xs <- exampl, ys <- exampl]

I have two hamming functions:

-- hamming distance for variable length strings
hamming :: String -> String -> Int
hamming x y = hamming' x y 0
  where
    hamming' [] _ !c = c
    hamming' _ [] !c = c
    hamming' (x:xs) (y:ys) !c
      | x == y    = hamming' xs ys c
      | otherwise = hamming' xs ys (c + 1)

-- function posted in this mailing list
hamming2 :: String -> String -> Int
hamming2 xs ys = length (filter not (zipWith (==) xs ys))
I am executing these functions millions of times, and the bottleneck of my program is in them, as shown by running in profiling mode with +RTS -K400M -p -RTS.

The costliest function is the hamming distance:

COST CENTRE  MODULE     %time  %alloc
hamming      Distances   66.6    41.9

It says that it is performing 41% of the allocations. In the case of hamming2 the allocations go as high as 52%.
Allocations are cheap, so that's not necessarily a problem. More important is: what's the maximum residency, and how much is copied during GC? Are you compiling with -O2?
I could understand that there are allocations in "hamming2" because we are creating pairs, but in the case of "hamming" there should be no allocation.
Why not? I don't know exactly how GHC counts allocations, but every time you go from (x:xs) to xs, you need a new pointer to the tail. If that counts as allocation, hamming must allocate a lot, too.
How can I execute my hamming functions without allocating memory?
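One way to get most of the way there (a sketch, not from the thread: it swaps String for strict ByteStrings from the bytestring package, so the inner loop allocates no list cells):

{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as B

-- Index-based loop over strict ByteStrings: no cons cells are
-- allocated while counting, only the final Int result.
hammingBS :: B.ByteString -> B.ByteString -> Int
hammingBS x y = go 0 0
  where
    n = min (B.length x) (B.length y)
    go !i !c
      | i >= n                     = c
      | B.index x i == B.index y i = go (i + 1) c
      | otherwise                  = go (i + 1) (c + 1)

Existing Strings can be converted with B.pack, or the input read directly with B.readFile.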
Best regards,
Arnoldo Muller

Hello Daniel:
My %GC time is 75.0% (81.4% elapsed), and I am compiling with -O2.
Thank you for clarifying about the pointers.
My memory use slowly grows and eventually explodes. I would expect the list comprehension to be evaluated lazily, so that at any given time I am only executing one hamming distance. The result of each hamming distance is stored in a small statistics datatype I built (it only stores the sums, the sums of squares, and the counts). This datatype is updated using a foldr.
I have no idea where the leak is. What do you look for in a .prof file to find a leak (hamming distance has the largest %time and %alloc)? Where in my .prof file would you start looking?
Best Regards,
Arnoldo Muller

Hello all:
I found my leak after adding some bang patterns in a different part of the program. The program was retaining all the combinations generated by the list comprehension, and therefore performance dropped very badly.
BTW, hamming is 2 times faster than hamming2.
Thank you as always!
Arnoldo

On Monday 19 April 2010 17:53:27, Arnoldo Muller wrote:
Hello Daniel:
My %GC time is 75.0% (81.4% elapsed), and I am compiling with -O2.
Very bad. Can I see the code?
Thank you for clarifying about the pointers.
Not to forget the Ints for counting.
My memory use slowly grows and eventually explodes. I would expect the list comprehension to be evaluated lazily, so that at any given time I am only executing one hamming distance. The result of each hamming distance is stored in a small statistics datatype I built (it only stores the sums, the sums of squares, and the counts). This datatype is updated using a foldr.
That might very well be the problem: if you update it with a foldr, you must construct the entire list of 2000^2 hamming thunks before the work can begin. It's probably better to use foldl' (and make the type strict) so the work can start immediately.
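Spelled out, that suggestion might look like the following (a sketch, not Arnoldo's actual code: the Stats fields follow his description of sums, sums of squares, and counts, and hamming is the function from the start of the thread):

import Data.List (foldl')

-- Strict fields: forcing the constructor forces the sums, so no
-- chain of (+) thunks can build up in the accumulator.
data Stats = Stats { sSum :: !Int, sSumSq :: !Int, sCount :: !Int }

addSample :: Stats -> Int -> Stats
addSample (Stats s sq n) d = Stats (s + d) (sq + d * d) (n + 1)

-- foldl' forces the accumulator at every step and consumes the
-- lazily produced distances one at a time, in constant space.
statsOf :: [String] -> Stats
statsOf exampl =
    foldl' addSample (Stats 0 0 0)
        [d | xs <- exampl, ys <- exampl, let d = hamming xs ys, d /= 0]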
I have no idea where the leak is. What do you look for in a .prof file to find a leak (hamming distance has the largest %time and %alloc)
For finding leaks, heap profiling (-h*) gives more info than -p. The .prof file says more about where you spend your time than about what hangs on to memory.
? Where in my .prof file would you start looking?
- use hamming instead of hamming2
- convertIntToDouble looks suspicious
- calculating a few million Hamming distances takes some time, but what about getMyStats, should that really take 25%?
filter (\x -> x /= 0) $ map (uncurry hammingX) [(xs, ys) | xs <- exampl, ys <- exampl]
filter (/= 0) [hamming xs ys | xs <- exampl, ys <- exampl]

And of course, you can trivially avoid half of the work.
Best Regards,
Arnoldo Muller

Daniel, thank you for all your advice.
An additional bang pattern (!) in convertIntToDouble fixed the issue! Using a foldl' also did the trick.
Now the program runs as it should with a constant amount of memory and in a
very small amount of time.
I believe these problems are one of the major sources of frustration for
Haskell newbies. Things that work easily in <X> language suddenly
become problems in Haskell. When you overcome these issues, you feel
happy again that you chose Haskell as the main programming language of your
research project.
Is there any guide that explains more about this "bad consumption pattern"?
Are there any general rules for avoiding these issues? It helped me to
re-read the chapter on profiling in the Real World Haskell book to sort of
understand the problem. Is there a more detailed treatment of the problem
than in RWH?
Regards,
Arnoldo

Arnoldo Muller wrote:
I believe these problems are one of the major sources of frustration for Haskell newbies. Things that work easily in <X> language suddenly become problems in Haskell. When you overcome these issues, you feel happy again that you chose Haskell as the main programming language of your research project.
Well, the difference between <X> and Haskell is pretty much unavoidable. If you care about space and time usage, then there is no way around learning about lazy evaluation and Haskell's execution model.
Is there any guide that explains more about this "bad consumption pattern"? Are there any general rules for avoiding these issues? It helped me to re-read the chapter on profiling in the Real World Haskell book to sort of understand the problem. Is there a more detailed treatment of the problem than in RWH?
Two of the most commonly occurring patterns are:

1) foldl' vs foldl

2) average xs = sum xs / length xs

   vs

   average = uncurry (/) . foldl' (\(!s,!n) x -> (s+x,n+1)) (0,0)

Other than that, most Haskell books offer a clear exposition of the reduction model. For instance, there is

Graham Hutton. Programming in Haskell, chapter 12.
Richard Bird. Introduction to Functional Programming using Haskell, 2nd edition, chapter 7.

The wikibook contains some preliminary material, too:

http://en.wikibooks.org/wiki/Haskell/Graph_reduction

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com
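For concreteness, the second pattern expanded into runnable form (a sketch; the names averageLazy and averageStrict are not from the text):

{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- Naive version: traverses the list twice, and without optimization
-- the lazy left fold inside sum piles up unevaluated (+) thunks.
averageLazy :: [Double] -> Double
averageLazy xs = sum xs / fromIntegral (length xs)

-- One traversal; the bangs force both accumulators at every step,
-- so the fold runs in constant space.
averageStrict :: [Double] -> Double
averageStrict = uncurry (/) . foldl' (\(!s, !n) x -> (s + x, n + 1)) (0, 0)

main :: IO ()
main = print (averageStrict [1 .. 1e7])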

Arnoldo Muller wrote:
I want to generate some hamming distance statistics about a set of strings.
filter (\x -> x /= 0) $ map (uncurry hammingX) [(xs, ys) | xs <- exampl, ys <- exampl]
Another way to look at it is that you shouldn't optimize hamming itself, but rather make sure that it's called less often! For instance, your expression can be replaced by

filter (/=0) [hammingX x y | (x:xs) <- tails exampl, y <- xs]

which cuts the total running time in half. It's still quadratic in the length of exampl. I'm sure there are faster algorithms out there that can bring it down to O(n log n) if you want.

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com

On Monday 19 April 2010 14:13:53, Heinrich Apfelmus wrote:
Another way to look at it is that you shouldn't optimize hamming itself, but rather make sure that it's called less often!
For instance, your expression can be replaced by
filter (/=0) [hammingX x y | (x:xs) <- tails exampl, y <- xs]
which cuts the total running time in half. It's still quadratic in the length of exampl. I'm sure there are faster algorithms out there that
If it's likely that there are many repetitions, collect the Strings in a Set/Map (depending on whether you're interested in the count) and call hamming only on the distinct pairs.
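A sketch of that idea (an illustration, not Daniel's code; it assumes Data.Map from containers and the hamming function from earlier in the thread): tally duplicates first, then compute each distance once per distinct unordered pair and weight it by the multiplicities.

import Data.List (tails)
import qualified Data.Map as M

-- One (distance, multiplicity) entry per distinct unordered pair;
-- repeated strings only cost a counter increment, not a hamming call.
weightedDistances :: [String] -> [(Int, Int)]
weightedDistances exampl =
    [ (hamming x y, cx * cy)
    | (x, cx) : rest <- tails (M.toList counts)
    , (y, cy) <- rest
    ]
  where
    counts = M.fromListWith (+) [(s, 1) | s <- exampl]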
can bring it down to O(n log n) if you want.
I don't think so. You can't calculate the Hamming distance of x and z from the distances between x and y and between y and z, so you'd have to consider all pairs of distinct strings, wouldn't you? (For instance, with x = 00 and y = 01, both z = 11 and z = 00 give d(x,y) = d(y,z) = 1, yet d(x,z) is 2 in one case and 0 in the other.)
Regards, Heinrich Apfelmus

Daniel Fischer wrote:
Heinrich Apfelmus:
For instance, your expression can be replaced by
filter (/=0) [hammingX x y | (x:xs) <- tails exampl, y <- xs]
which cuts the total running time in half. It's still quadratic in the length of exampl. I'm sure there are faster algorithms out there that can bring it down to O(n log n) if you want.
I don't think so. You can't calculate the Hamming distance of x and z from the distances between x and y and y and z, so you'd have to consider all pairs of distinct strings, wouldn't you?
And there I was sure about something once, only to see that it's actually really doubtful... ;)

The thing about the Hamming distance is that it's not a black box, so you can't get a lower bound by counting the minimum number of calls to hamming that have to be made. You are essentially arguing that the different Hamming distances are independent, which they are not. Not to mention that there are also "black-box" restrictions like the triangle inequality

  d(x,z) <= d(x,y) + d(y,z)

but that one is probably harmless. In short, the situation is similar to how the O(n*log n) sorting bound does not apply to radix sort.

Still, you are right to question my O(n*log n) claim; so far, my attempts at finding such an algorithm myself have failed.

More precisely, the goal is to make a histogram of the possible Hamming distances. We need at least O(n*w) time to do that, where n is the number of strings and w their maximum length; after all, we have to "touch" every character. For simplicity, assume that the characters are just one bit each. Furthermore, we can assume that w <= log n; otherwise there are lots of duplicate strings, which can be grouped together. In this notation, the simple algorithm takes O(n^2*w) time.

I did find a straightforward divide-and-conquer algorithm to tally small Hamming distances, but it's not good enough for the whole histogram. Here's the specification:

  countHamming :: Int -> [[Bool]] -> [[Bool]] -> Int
  countHamming d xs ys = length [() | x <- xs, y <- ys, hamming x y == d]

In other words, countHamming d xs ys counts the number of pairings (x,y) whose Hamming distance is exactly d.

Now, the idea is that this can be made faster for small d. For instance, for d == 0, we are essentially just counting the pairs of equal elements of xs and ys. By requiring that xs and ys be sorted, this can be done in linear time:

  countHamming 0 xs ys = ... a variation on merge xs ys

And for slightly larger d, we can partition the lists by their first bits and continue recursively:

  countHamming _ [] [] = 0
  countHamming d xs ys =
        countHamming (d-1) x0 y1 + countHamming (d-1) x1 y0
      + countHamming d x0 y0 + countHamming d x1 y1
    where
      (x0,x1) = partitionByHead xs
      (y0,y1) = partitionByHead ys

      partitionByHead xs = (behead True xs, behead False xs)
      behead b xs = [bs | (b':bs) <- xs, b == b']

To estimate the running time, we set n = length (xs ++ ys) and let

  T(d,n) = running time of countHamming d xs ys

We started with T(0,n) = O(n) and want to discuss the recursive case. The idea is that each list is divided in half, so we get

  T(d,n) = 2*T(d-1,n/2) + 2*T(d,n/2)

From this, we can calculate

  T(1,n) = 2*T(0,n/2) + 2*T(1,n/2)
         = O(n) + 2*T(1,n/2)        -- remember quicksort!
         = O(n*log n)

  T(2,n) = O(n*log n) + 2*T(2,n/2)
         = O(n*(log n)^2)

and so on, yielding T(d,n) = O(n*(log n)^d).

Alas, this can be used to search a dictionary while accounting for spelling errors, but it's no good for calculating a full histogram, because it becomes prohibitively expensive when d ~ w/2 ~ (log n)/2.

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com
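For the curious, one self-contained way to complete the sketch (the merge-based d == 0 case, the extra empty-list base cases, and the test data are additions not present in Heinrich's message; both input lists must be sorted, strings are assumed to have equal length, and hammingB is the naive reference in the style of hamming2):

import Data.List (sort)

-- Naive Hamming distance on bit strings, used as the reference.
hammingB :: [Bool] -> [Bool] -> Int
hammingB xs ys = length (filter not (zipWith (==) xs ys))

-- Count pairs (x, y) with x == y, both lists sorted: the merge-like
-- walk Heinrich alludes to for the d == 0 base case.
countEqual :: Ord a => [a] -> [a] -> Int
countEqual xs@(x:_) ys@(y:_) =
    case compare x y of
      LT -> countEqual (dropWhile (== x) xs) ys
      GT -> countEqual xs (dropWhile (== y) ys)
      EQ -> let (ex, xs') = span (== x) xs
                (ey, ys') = span (== y) ys
            in  length ex * length ey + countEqual xs' ys'
countEqual _ _ = 0

countHamming :: Int -> [[Bool]] -> [[Bool]] -> Int
countHamming _ [] _  = 0
countHamming _ _  [] = 0
countHamming 0 xs ys = countEqual xs ys
countHamming d xs ys =
      countHamming (d-1) x0 y1 + countHamming (d-1) x1 y0
    + countHamming d x0 y0 + countHamming d x1 y1
  where
    (x0, x1) = partitionByHead xs
    (y0, y1) = partitionByHead ys
    partitionByHead zs = (behead True zs, behead False zs)
    behead b zs = [bs | (b':bs) <- zs, b == b']

main :: IO ()
main = do
    let strs = sort [ [False,True,True], [True,True,False]
                    , [False,True,False], [False,True,True] ]
    -- Compare against the naive all-pairs count for each distance.
    print [ ( d
            , countHamming d strs strs
            , length [() | x <- strs, y <- strs, hammingB x y == d] )
          | d <- [0..3] ]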
participants (3):
- Arnoldo Muller
- Daniel Fischer
- Heinrich Apfelmus