
The code for 'allocate' in rts/sm/Storage.c doesn't seem that
expensive. An extra branch compared to inline allocation and
allocation is done in the next nursery block (risking fragmentation?).
-- Johan
On Mon, Sep 30, 2013 at 9:50 PM, Johan Tibell
Hi,
When I benchmark Data.HashMap.insert from unordered-containers (inserting the keys [0..10000]) the runtime is dominated by GC:
$ cat Test.hs module Main where
import Control.DeepSeq import Control.Exception import Control.Monad import qualified Data.HashMap.Strict as HM import Data.List (foldl')
main = do let ks = [0..10000] :: [Int] evaluate (rnf ks) forM_ ([0..1000] :: [Int]) $ \ x -> do evaluate $ HM.null $ foldl' (\ m k -> HM.insert k x m) HM.empty ks
$ perf record -g ./Test +RTS -s 6,187,678,112 bytes allocated in the heap 3,309,887,128 bytes copied during GC 1,299,200 bytes maximum residency (1002 sample(s)) 118,816 bytes maximum slop 5 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause Gen 0 11089 colls, 0 par 1.31s 1.30s 0.0001s 0.0005s Gen 1 1002 colls, 0 par 0.49s 0.51s 0.0005s 0.0022s
INIT time 0.00s ( 0.00s elapsed) MUT time 1.02s ( 1.03s elapsed) GC time 1.80s ( 1.80s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 2.82s ( 2.84s elapsed)
%GC time 63.7% (63.5% elapsed)
Alloc rate 6,042,264,963 bytes per MUT second
Productivity 36.3% of total user, 36.1% of total elapsed
$ perf report 41.46% Test Test [.] evacuate 15.47% Test Test [.] scavenge_block 11.04% Test Test [.] s3cN_info 8.74% Test Test [.] s3aZ_info 3.59% Test Test [.] 0x7ff5 2.83% Test Test [.] scavenge_mut_arr_ptrs 2.69% Test libc-2.15.so [.] 0x147fd9 2.51% Test Test [.] allocate 2.00% Test Test [.] s3oo_info 0.91% Test Test [.] todo_block_full 0.87% Test Test [.] hs_popcnt64 0.80% Test Test [.] s3en_info 0.62% Test Test [.] s3el_info
Is GC:ing StgMutArrPtrs and StgArrPtrs, which I create a lot of, more expensive than GC:ing normal heap objects (i.e. for standard data types)?
-- Johan