How to improve zipWith's performance

Dear all,

I wrote some clustering code using Data.Clustering.Hierarchical, but it is slow. I profiled it and changed some of the code, but I don't understand why zipWith takes so much time, even after switching from lists to vectors. My code is below. Any advice would be appreciated.

======================
main = do
    ....
    let cluster = dendrogram SingleLinkage vectorList getVectorDistance
    ....

getExp2 v1 v2 = d*d
  where
    d = v1 - v2

getExp v1 v2
  | v1 == v2  = 0
  | otherwise = getExp2 v1 v2

tfoldl d = DV.foldl1' (+) d

changeDataType :: Int -> Double
changeDataType d = fromIntegral d

getVectorDistance :: (a, DV.Vector Int) -> (a, DV.Vector Int) -> Double
getVectorDistance v1 v2 = fromIntegral $ tfoldl dat
  where
    l1  = snd v1
    l2  = snd v2
    dat = DV.zipWith getExp l1 l2
=======================================

Built with:

ghc -prof -fprof-auto -rtsopts -O2 log_cluster.hs

Run with:

log_cluster.exe +RTS -p

The profiling result is:

log_cluster.exe +RTS -p -RTS

total time  = 8.43 secs (8433 ticks @ 1000 us, 1 processor)
total alloc = 1,614,252,224 bytes (excludes profiling overheads)

COST CENTRE            MODULE  %time  %alloc
getVectorDistance.dat  Main     49.4    37.8
tfoldl                 Main      5.7     0.0
getExp                 Main      4.5     0.0
getExp2                Main      0.5     1.5

Hey, Jun Zhang! It would be nice if you provided a full runnable example so that someone may tinker with it testing different approaches. As it stands, I don't have any suggestions of how you could extract more performance. Cheers! -- Felipe.

Dear Felipe,

The runnable code is below.

import Data.Clustering.Hierarchical
import qualified Data.Vector.Primitive as DV
import System.Random
import Control.Monad

main = do
    vectorList <- genTestdata
    let cluster = dendrogram SingleLinkage vectorList getVectorDistance
    putStrLn $ show cluster

-- Keep values below 5; replace everything else with 0.
genZero x
  | x < 5     = x
  | otherwise = 0

-- A vector of 20 random values drawn from 1..30, passed through genZero.
genVector :: IO (DV.Vector Int)
genVector = do
    listRandom <- mapM (\x -> randomRIO (1,30)) [1..20]
    let intOut = DV.fromList $ map genZero listRandom
    return intOut

-- 1000 random vectors, each paired with its index as a label.
genTestdata = do
    r <- sequence $ map (\x -> liftM (\y -> (x,y)) genVector) [1..1000]
    return r

getExp2 v1 v2 = d*d
  where
    d = v1 - v2

getExp v1 v2
  | v1 == v2  = 0
  | otherwise = getExp2 v1 v2

tfoldl d = DV.foldl1' (+) d

changeDataType :: Int -> Double
changeDataType d = fromIntegral d

-- Sum of squared element-wise differences between the two vectors.
getVectorDistance :: (a, DV.Vector Int) -> (a, DV.Vector Int) -> Double
getVectorDistance v1 v2 = fromIntegral $ tfoldl dat
  where
    l1  = snd v1
    l2  = snd v2
    dat = DV.zipWith getExp l1 l2

Sent from my iPhone
On 15 August 2014, at 4:25 AM, Felipe Lessa wrote:
Hey, Jun Zhang!
It would be nice if you provided a full runnable example so that someone may tinker with it testing different approaches.
As it stands, I don't have any suggestions of how you could extract more performance.
Cheers!
-- Felipe.
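
For anyone who wants to tinker with the distance function from the runnable example, here is a minimal sketch of one variant to compare against, not a confirmed fix. It computes the same sum of squared differences in a single indexed loop, so no intermediate vector is built between zipWith and the fold. The names sqDistance and getVectorDistance' are hypothetical and do not appear in the original code. The sketch also assumes both vectors have the same length, as they do in the test data, and it returns 0 for empty vectors where foldl1' would raise an error. One likely factor, stated here as an assumption rather than a measurement: -fprof-auto puts a cost centre on the local binding dat, which can keep GHC from fusing zipWith with the fold, so the profiled build may allocate a vector that the plain -O2 build does not.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import qualified Data.Vector.Primitive as DV

-- Hypothetical helper: sum of squared element-wise differences,
-- computed in one pass without building an intermediate vector.
-- The v1 == v2 guard in the original getExp is dropped because
-- d*d is already 0 when the two elements are equal.
sqDistance :: DV.Vector Int -> DV.Vector Int -> Int
sqDistance xs ys = go 0 0
  where
    -- Stay within the shorter vector so unsafeIndex is always in bounds.
    n = min (DV.length xs) (DV.length ys)
    go !acc !i
      | i >= n    = acc
      | otherwise =
          let d = DV.unsafeIndex xs i - DV.unsafeIndex ys i
          in  go (acc + d * d) (i + 1)

-- Hypothetical drop-in with the same type as getVectorDistance.
getVectorDistance' :: (a, DV.Vector Int) -> (a, DV.Vector Int) -> Double
getVectorDistance' (_, l1) (_, l2) = fromIntegral (sqDistance l1 l2)

main :: IO ()
main = do
    let a = DV.fromList [1, 2, 3, 0]
        b = DV.fromList [1, 0, 0, 4]
    -- Differences are 0, 2, 3, -4, so the squares sum to 29.
    print (getVectorDistance' ((), a) ((), b))  -- prints 29.0
```

To compare, pass getVectorDistance' to dendrogram in place of getVectorDistance and time both versions in a build without -prof as well, since the profiled numbers may not reflect what the optimized program does.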
Participants (3): Felipe Lessa, julian, jun zhang