
Tim Chevalier wrote:
2008/9/8 Vlad Skvortsov:
Posting to cafe since I got just one reply on beginner@. It was suggested that I include more SCC annotations, but that didn't help. The 'serialize' function is still reported to consume about 32% of running time, 29% inherited. However, functions called from it only account for about 3% of time.
If "serialize" calls standard library functions, this is probably because the profiling libraries weren't built with -auto-all -- so the profiling report won't tell you how much time standard library functions consume.
Hmm, that's a good point! I didn't think about it. Though how do I make GHC link in profiling versions of standard libraries? My own libraries are built with profiling support and I run Setup.hs with --enable-library-profiling and --enable-executable-profiling options.
You can rebuild the libraries with -auto-all, but probably much easier would be to add SCC annotations to each call site. For example, you could annotate your locally defined dumpWith function like so:
    dumpWith f = {-# SCC "foldWithKey" #-} Data.Map.foldWithKey f []
    docToStr k (Doc { docName=n, docVectorLength=vl}) = (:) ("d " ++ show k ++ " " ++ n ++ " " ++ (show vl))
Here is what my current version of the function looks like:
    serialize :: Database -> [[String]]
    serialize db = {-# SCC "XXXCons" #-}
      [ [dbFormatTag],
        ({-# SCC "dwDoc" #-} dumpWith docToStr dmap),
        ({-# SCC "dwTerm" #-} dumpWith termToStr tmap) ]
      where
        (dmap, tmap) = {-# SCC "XXX" #-} db
        dumpWith f = {-# SCC "dumpWith" #-} Data.Map.foldWithKey f []
        docToStr :: DocId -> Doc -> [String] -> [String]
        docToStr k (Doc { docName=n, docVectorLength=vl}) = {-# SCC "docToStr" #-} ((:) ("d " ++ show k ++ " " ++ n ++ " " ++ (show vl)))
        termToStr t il = {-# SCC "termToStr" #-} ((:) ("t " ++ t ++ " " ++ (foldl ilItemToStr "" il)))
        ilItemToStr acc (docid, weight) = {-# SCC "ilItemToStr" #-} (show docid ++ ":" ++ show weight ++ " " ++ acc)
...and still I don't see these cost centers taking a lot of time (they add up to about 3%, as I said before).
Then your profiling report will tell you how much time/memory that particular call to foldWithKey uses.
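To make the call-site suggestion concrete, here is a minimal, self-contained sketch. It is not the code from the thread: the DocMap type and the dumpDocs name are hypothetical stand-ins for the Database pieces, and it uses Data.Map.foldrWithKey because foldWithKey may not be available in newer releases of the containers package. Built with profiling enabled (the thread's -auto-all flag; newer GHC releases spell it -fprof-auto) and run with +RTS -p, the report should list a separate "foldrWithKey" cost centre for this one call.

```haskell
module Main (main) where

import qualified Data.Map as Map

-- Hypothetical stand-in for the thread's document map:
-- document id -> (document name, vector length).
type DocMap = Map.Map Int (String, Int)

-- The SCC pragma wraps only the library call, so the profiling report
-- attributes the cost of this particular call site to "foldrWithKey".
-- The parentheses make the annotated expression explicit.
dumpDocs :: DocMap -> [String]
dumpDocs m = {-# SCC "foldrWithKey" #-} (Map.foldrWithKey docToStr [] m)

-- Same output shape as docToStr in the thread: prepend one formatted line.
docToStr :: Int -> (String, Int) -> [String] -> [String]
docToStr k (n, vl) = (:) ("d " ++ show k ++ " " ++ n ++ " " ++ show vl)

main :: IO ()
main = mapM_ putStrLn (dumpDocs (Map.fromList [(1, ("a.txt", 3)), (2, ("b.txt", 5))]))
```

Without profiling enabled the pragma is ignored, so the annotation can stay in the source permanently.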
By the way, using foldl rather than foldl' or foldr is almost always a performance bug.
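A rough sketch of the difference between the three folds, using only Prelude and Data.List functions. The names sumLazy, sumStrict and labels are made up for illustration, and the thunk build-up described in the comments is what one would typically expect without optimisation; GHC's strictness analysis may remove it in simple cases like this one.

```haskell
module Main (main) where

import Data.List (foldl')

-- Lazy left fold: nothing forces the accumulator until the whole list has
-- been consumed, so it can grow into a long chain of unevaluated (+) thunks.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step, so the fold
-- runs in constant space.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- Right fold with a lazy constructor: each (:) cell can be handed to the
-- consumer before the rest of the input is touched, so this even works on
-- an infinite list.
labels :: [Int] -> [String]
labels = foldr (\x acc -> ("d " ++ show x) : acc) []

main :: IO ()
main = do
  print (sumLazy [1 .. 100000])
  print (sumStrict [1 .. 100000])
  print (take 2 (labels [1 ..]))
```

The rule of thumb this illustrates: foldl' suits accumulators that reduce to a single strict value, foldr suits results built from lazy constructors such as (:), and plain foldl offers neither benefit.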
Data.Map.foldWithKey is implemented with foldr[1]; however, I'm not sure I understand how foldr is superior to foldl here (foldl' I understand). Could you shed some light on that for me, please?
Thanks!
[1]: http://www.haskell.org/ghc/docs/latest/html/libraries/containers/src/Data-Ma...
--
Vlad Skvortsov, vss@73rus.com, http://vss.73rus.com