
Posting to cafe since I got just one reply on beginner@. It was suggested that I include more SCC annotations, but that didn't help. The 'serialize' function is still reported to consume about 32% of the running time, 29% of it inherited. However, the functions called from it only account for about 3% of the time. How do I get more insight into this? Thanks. -- Vlad Skvortsov, vss@73rus.com, http://vss.73rus.com

2008/9/8 Vlad Skvortsov
Posting to cafe since I got just one reply on beginner@. It was suggested that I include more SCC annotations, but that didn't help. The 'serialize' function is still reported to consume about 32% of the running time, 29% of it inherited. However, the functions called from it only account for about 3% of the time.
If "serialize" calls standard library functions, this is probably because the profiling libraries weren't built with -auto-all -- so the profiling report won't tell you how much time standard library functions consume. You can rebuild the libraries with -auto-all, but it would probably be much easier to add SCC annotations to each call site. For example, you could annotate your locally defined dumpWith function like so:

    dumpWith f = {-# SCC "foldWithKey" #-} Data.Map.foldWithKey f []

    docToStr k (Doc { docName=n, docVectorLength=vl }) =
      (:) ("d " ++ show k ++ " " ++ n ++ " " ++ show vl)

Then your profiling report will tell you how much time/memory that particular call to foldWithKey uses. By the way, using foldl rather than foldl' or foldr is almost always a performance bug. Cheers, Tim -- Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt Just enough: Obama/Biden '08.
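The call-site annotation technique Tim describes can be sketched as a minimal compilable program. This is only an illustration: the map contents and names below are invented, and it uses foldrWithKey, the current name in containers for the fold discussed in the thread. Compiled with -prof -auto-all (or -fprof-auto in modern GHC) and run with +RTS -p, the named SCC shows up as its own cost centre in the .prof report.

```haskell
import qualified Data.Map as M

-- The SCC pragma labels exactly this call site, so the profiling
-- report attributes the fold's time/allocation to "foldWithKey_call"
-- instead of lumping it into the enclosing function's costs.
dumpKeys :: M.Map Int String -> [String]
dumpKeys m = {-# SCC "foldWithKey_call" #-}
  M.foldrWithKey (\k v acc -> (show k ++ " " ++ v) : acc) [] m

main :: IO ()
main = mapM_ putStrLn (dumpKeys (M.fromList [(1, "a"), (2, "b")]))
```

Without profiling enabled, the SCC pragma is simply ignored, so the same source builds either way.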

Tim Chevalier wrote:
2008/9/8 Vlad Skvortsov
Posting to cafe since I got just one reply on beginner@. It was suggested that I include more SCC annotations, but that didn't help. The 'serialize' function is still reported to consume about 32% of the running time, 29% of it inherited. However, the functions called from it only account for about 3% of the time.
If "serialize" calls standard library functions, this is probably because the profiling libraries weren't built with -auto-all -- so the profiling report won't tell you how much time standard library functions consume.
Hmm, that's a good point! I hadn't thought about it. But how do I make GHC link in the profiling versions of the standard libraries? My own libraries are built with profiling support, and I run Setup.hs with the --enable-library-profiling and --enable-executable-profiling options.
You can rebuild the libraries with -auto-all, but probably much easier would be to add SCC annotations to each call site. For example, you could annotate your locally defined dumpWith function like so:
    dumpWith f = {-# SCC "foldWithKey" #-} Data.Map.foldWithKey f []

    docToStr k (Doc { docName=n, docVectorLength=vl }) =
      (:) ("d " ++ show k ++ " " ++ n ++ " " ++ show vl)
Here is what my current version of the function looks like:

    serialize :: Database -> [[String]]
    serialize db = {-# SCC "XXXCons" #-}
      [ [dbFormatTag],
        ({-# SCC "dwDoc" #-} dumpWith docToStr dmap),
        ({-# SCC "dwTerm" #-} dumpWith termToStr tmap) ]
      where
        (dmap, tmap) = {-# SCC "XXX" #-} db

        dumpWith f = {-# SCC "dumpWith" #-} Data.Map.foldWithKey f []

        docToStr :: DocId -> Doc -> [String] -> [String]
        docToStr k (Doc { docName=n, docVectorLength=vl }) = {-# SCC "docToStr" #-}
          (:) ("d " ++ show k ++ " " ++ n ++ " " ++ show vl)

        termToStr t il = {-# SCC "termToStr" #-}
          (:) ("t " ++ t ++ " " ++ foldl ilItemToStr "" il)

        ilItemToStr acc (docid, weight) = {-# SCC "ilItemToStr" #-}
          show docid ++ ":" ++ show weight ++ " " ++ acc

...and still I don't see these cost centres taking much time (they add up to about 3%, as I said before).
Then your profiling report will tell you how much time/memory that particular call to foldWithKey uses.
By the way, using foldl rather than foldl' or foldr is almost always a performance bug.
Data.Map.foldWithKey is implemented with foldr [1]; however, I'm not sure I understand how foldr is superior to foldl here (foldl' I understand). Could you shed some light on that for me, please? Thanks! [1]: http://www.haskell.org/ghc/docs/latest/html/libraries/containers/src/Data-Ma... -- Vlad Skvortsov, vss@73rus.com, http://vss.73rus.com

On Wed, Sep 10, 2008 at 12:31 PM, Vlad Skvortsov
Hmm, that's a good point! I didn't think about it. Though how do I make GHC link in profiling versions of standard libraries? My own libraries are built with profiling support and I run Setup.hs with --enable-library-profiling and --enable-executable-profiling options.
When you build your own code with -prof, GHC automatically links in the profiling versions of the standard libraries. However, those profiling libraries were not built with -auto-all (the reason is that adding cost centres interferes with optimization). To build the libraries with -auto-all, you would need to build GHC from source, which is not for the faint of heart. However, the results of doing so aren't usually very enlightening anyway -- for example, foldr might be called from many different places, but you might only care about a single call site (and then you can annotate that call site). Just from looking, I would guess this is the culprit:
    termToStr t il = {-# SCC "termToStr" #-}
      (:) ("t " ++ t ++ " " ++ foldl ilItemToStr "" il)
If you want to be really sure, you can rewrite this as:

    termToStr t il = {-# SCC "termToStr" #-}
      (:) ("t " ++ t ++ " " ++
           ({-# SCC "termToStr_foldl" #-} foldl ilItemToStr "" il))

and that will give you a cost centre measuring the specific cost of that invocation of foldl.
Data.Map.foldWith key is implemented with foldr[1], however I'm not sure I'm getting how foldr is superior to foldl here (foldl' I understand). Could you shed some light on that for me please?
I meant the call to foldl in termToStr. There's a good explanation of this at: http://en.wikibooks.org/wiki/Haskell/Performance_Introduction (look for "foldl"). Cheers, Tim -- Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt Just enough: Obama/Biden '08.
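The distinction Tim points to can be seen in a small sketch: foldl builds up its accumulator as one big chain of unevaluated thunks, while foldl' forces each intermediate result as it goes. (The function names here are illustrative.)

```haskell
import Data.List (foldl')

-- foldl delays every addition: it builds ((((0+1)+2)+3)+...) as a
-- single nested thunk, which is only forced at the very end and can
-- exhaust the stack on long lists.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- foldl' forces the accumulator at each step, so the sum runs in
-- constant space.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

-- foldr, by contrast, pairs well with lazy constructors such as (:),
-- where the result can be consumed incrementally before the whole
-- fold finishes.

main :: IO ()
main = print (strictSum [1 .. 1000000])
```

For a strict accumulator like a running sum, foldl' is the right choice; foldr wins when the combining function is lazy in its second argument and the result is consumed lazily.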

Tim Chevalier wrote:
When you build your own code with -prof, GHC automatically links in the profiling versions of the standard libraries. However, those profiling libraries were not built with -auto-all (the reason is that adding cost centres interferes with optimization). To build the libraries with -auto-all, you would need to build GHC from source, which is not for the faint of heart. However, the results of doing so aren't usually very enlightening anyway -- for example, foldr might be called from many different places, but you might only care about a single call site (and then you can annotate that call site).
Hmm, okay -- that makes some sense.
Just from looking, I would guess this is the culprit:
    termToStr t il = {-# SCC "termToStr" #-}
      (:) ("t " ++ t ++ " " ++ foldl ilItemToStr "" il)
If you want to be really sure, you can rewrite this as:
    termToStr t il = {-# SCC "termToStr" #-}
      (:) ("t " ++ t ++ " " ++
           ({-# SCC "termToStr_foldl" #-} foldl ilItemToStr "" il))
and that will give you a cost centre measuring the specific cost of the invocation of foldl.
I did that and found out that it accounts for only about 0.6 percent of the running time. Changing foldl to foldl' does improve it, though the overall gain is not significant (again, since it's not the bottleneck). I just realized that most of the time is spent inside 'serialize' itself and is not inherited, as I originally claimed. Here is what my current code and profiling output look like: http://hpaste.org/10329 How do I figure out what exactly in 'serialize' takes so much time? -- Vlad Skvortsov, vss@73rus.com, http://vss.73rus.com
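One way to get finer-grained attribution, sketched here with invented, simplified types (the thread never shows the real Database/Doc definitions), is to lift where-bound helpers to top-level definitions: -auto-all only adds cost centres to top-level bindings, so work done by local helpers is otherwise charged to the enclosing function.

```haskell
-- Hypothetical simplified stand-ins for the thread's types.
-- At top level, -auto-all gives docLine and serializeDocs their own
-- cost centres, instead of folding their cost into one function.
docLine :: Int -> String -> Double -> String
docLine k n vl = "d " ++ show k ++ " " ++ n ++ " " ++ show vl

serializeDocs :: [(Int, (String, Double))] -> [String]
serializeDocs docs = [ docLine k n vl | (k, (n, vl)) <- docs ]

main :: IO ()
main = mapM_ putStrLn (serializeDocs [(1, ("foo", 0.5)), (2, ("bar", 1.25))])
```

This is only a restructuring sketch, not the thread's resolution; the payoff is that the .prof report then splits the time between the two bindings.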

On Fri, Sep 12, 2008 at 2:30 PM, Vlad Skvortsov
How do I figure out what exactly in 'serialize' takes so much time?
At this point, I don't know any more -- I can't see what inherited costs serialize could have that don't come from one of the cost centres you've inserted, either. Perhaps you could try the glasgow-haskell-users mailing list, where people interested in this are more likely to be reading carefully... Cheers, Tim -- Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt Just enough: Obama/Biden '08.
participants (2)
- Tim Chevalier
- Vlad Skvortsov