
Dear all,

I wrote some clustering code using Data.Clustering.Hierarchical, but it is slow. I profiled it and changed some of the code, but I don't understand why zipWith takes so much time, even after I changed from lists to vectors. My code is below. Could anyone kindly give me some advice?

======================
main = do
    ....
    let cluster = dendrogram SingleLinkage vectorList getVectorDistance
    ....

getExp2 v1 v2 = d*d
    where d = v1 - v2

getExp v1 v2
    | v1 == v2  = 0
    | otherwise = getExp2 v1 v2

tfoldl d = DV.foldl1' (+) d

changeDataType :: Int -> Double
changeDataType d = fromIntegral d

getVectorDistance :: (a, DV.Vector Int) -> (a, DV.Vector Int) -> Double
getVectorDistance v1 v2 = fromIntegral $ tfoldl dat
    where l1  = snd v1
          l2  = snd v2
          dat = DV.zipWith getExp l1 l2
=======================================

Build with:

    ghc -prof -fprof-auto -rtsopts -O2 log_cluster.hs

Run with:

    log_cluster.exe +RTS -p

The profiling result is:

    log_cluster.exe +RTS -p -RTS

    total time  = 8.43 secs (8433 ticks @ 1000 us, 1 processor)
    total alloc = 1,614,252,224 bytes (excludes profiling overheads)

    COST CENTRE            MODULE  %time %alloc

    getVectorDistance.dat  Main     49.4   37.8
    tfoldl                 Main      5.7    0.0
    getExp                 Main      4.5    0.0
    getExp2                Main      0.5    1.5
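
For comparison, here is a minimal sketch of the same squared-distance computation written against Data.Vector.Unboxed. It rests on assumptions not stated in the post: that DV refers to the boxed Data.Vector module, and that the input vectors can be built as unboxed vectors. The helper name sqDist is hypothetical. Unboxed vectors store raw Int values rather than pointers to heap objects, which may reduce allocation, but whether it helps in this program would have to be measured. Note also that -fprof-auto places a cost centre on every binding, which can prevent some optimisations, so the profiled numbers may not reflect an unprofiled -O2 build.

```haskell
import qualified Data.Vector.Unboxed as U

-- Hypothetical helper: sum of squared element-wise differences.
-- Unlike foldl1' in the original, U.sum returns 0 for empty vectors
-- instead of raising an error.
sqDist :: U.Vector Int -> U.Vector Int -> Int
sqDist xs ys = U.sum (U.zipWith sq xs ys)
  where
    sq a b = let d = a - b in d * d

-- Same shape as the original getVectorDistance, but on unboxed vectors
-- (an assumption; the original vector type is not shown in the post).
getVectorDistance :: (a, U.Vector Int) -> (a, U.Vector Int) -> Double
getVectorDistance (_, l1) (_, l2) = fromIntegral (sqDist l1 l2)
```

The v1 == v2 guard from getExp is dropped here because d * d is already 0 when the two elements are equal, so the result is unchanged.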