
Hello, I need to calculate the frequency of each character in a String. And if I can do this really well in C, I dont find a nice (and fast) answer in haskell. I tried several functions, listed below, and even the fastest do a lot of unnecessary things : calc :: String -> [ (Char, Int) ] -- 3.0s normally (without profiling) -- time 10-12% alloc 59% (info from profiling) -- so it's the fastest when I profile but not when I compile normally -- mutable array may be better but it's to complicated for me calc = filter (\p -> snd p > 0) . assocs . foldl (\k c -> unsafeReplace k [(fromEnum c, (unsafeAt k (fromEnum c))+1)] ) k where k = array (toEnum 0, toEnum 255) [(toEnum i, 0) | i <- [0 .. 255]] :: UArray Char Int -- 2.1s normally -- time 15-19% alloc 40% (info from profiling) -- so for true, it's the best but the sort and group probably do unnecessary things calc s = map (\l -> (head l, length l)) $ group $ sort s -- 3.4s normally -- time 58% alloc 0% (info from profiling) -- this one dont do unnecessary things but has to read the file again for each character -- calc s = map (\c -> (c, foldl (\a b -> if b==c then a+1 else a) 0 s)) $ nub s -- 22s normally -- time 85% alloc 92% (info from profiling) -- this one read the file only one time but is really slow calc = foldl (addfreq) [] where addfreq f c = let xs1 = takeWhile (\f -> fst f /= c) f xs2 = dropWhile (\f -> fst f /= c) f xs = if null xs2 then [(c,1)] else ((fst . head) xs2, (snd . head) xs2 + 1) : tail xs2 in xs1 ++ xs -- I have a lot of even slower version but I wont include them -- each compilation was done with GHC 6.4.1 with the -O flag and with -O -prof -auto-all for profiling Thanks for your answer, Charles PS : Yes, english is not my mother language... :-(