
Am Mittwoch 02 Dezember 2009 18:54:51 schrieb Christian Maeder:
Daniel Fischer schrieb:
However, according to a couple of tests, the "funkyName" version is somewhat faster and allocates less.
My timing tests showed that your fpairs version is fastest. (first argument "True" selects filteredPairs, "False" funkyName)
I can confirm that for your test; still funkyName allocates less: ./FilteredPairs True EQ 5000 +RTS -sstderr 5000 1,810,136 bytes allocated in the heap 1,160,412 bytes copied during GC 517,964 bytes maximum residency (1 sample(s)) 16,932 bytes maximum slop 2 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 2 collections, 0 parallel, 0.01s, 0.01s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.44s ( 0.44s elapsed) GC time 0.01s ( 0.01s elapsed) ./FilteredPairs False EQ 5000 +RTS -sstderr 5000 1,432,328 bytes allocated in the heap 974,252 bytes copied during GC 441,064 bytes maximum residency (1 sample(s)) 27,608 bytes maximum slop 2 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 2 collections, 0 parallel, 0.00s, 0.00s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.84s ( 0.84s elapsed) GC time 0.01s ( 0.01s elapsed) ./FilteredPairs True GT 5000 +RTS -sstderr 5000 10,961,984 bytes allocated in the heap 12,164,420 bytes copied during GC 3,046,920 bytes maximum residency (4 sample(s)) 25,836 bytes maximum slop 7 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 16 collections, 0 parallel, 0.04s, 0.04s elapsed Generation 1: 4 collections, 0 parallel, 0.03s, 0.04s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.23s ( 0.24s elapsed) GC time 0.08s ( 0.09s elapsed) ./FilteredPairs False GT 5000 +RTS -sstderr 5000 5,246,036 bytes allocated in the heap 5,185,808 bytes copied during GC 1,699,744 bytes maximum residency (2 sample(s)) 27,612 bytes maximum slop 4 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 8 collections, 0 parallel, 0.02s, 0.02s elapsed Generation 1: 2 collections, 0 parallel, 0.02s, 0.01s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.44s ( 0.45s elapsed) GC time 0.04s ( 0.03s elapsed)
My initial version "myf" is almost unusable.
C.
(code attached)
maeder@leibniz:~/haskell/examples> ghc --make -O2 FilteredPairs.hs [1 of 1] Compiling Main ( FilteredPairs.hs, FilteredPairs.o ) Linking FilteredPairs ... maeder@leibniz:~/haskell/examples> time ./FilteredPairs True EQ 5000 5000
real 0m0.567s user 0m0.536s sys 0m0.020s maeder@leibniz:~/haskell/examples> time ./FilteredPairs False EQ 5000 5000
real 0m0.819s user 0m0.796s sys 0m0.012s
But with a different test, funkyName is considerably faster: ./pairs 1 8 20 +RTS -sstderr -A150M 5529600 899,189,488 bytes allocated in the heap 72,912,040 bytes copied during GC 28,074,964 bytes maximum residency (2 sample(s)) 465,800 bytes maximum slop 200 MB total memory in use (2 MB lost due to fragmentation) Generation 0: 4 collections, 0 parallel, 0.17s, 0.21s elapsed Generation 1: 2 collections, 0 parallel, 0.36s, 0.39s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.72s ( 0.95s elapsed) GC time 0.52s ( 0.60s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 1.25s ( 1.56s elapsed) %GC time 41.9% (38.8% elapsed) Alloc rate 1,235,074,051 bytes per MUT second Productivity 57.8% of total user, 46.5% of total elapsed ./pairs 2 8 20 +RTS -sstderr -A150M 5529600 651,866,696 bytes allocated in the heap 76,108,204 bytes copied during GC 28,075,464 bytes maximum residency (2 sample(s)) 465,248 bytes maximum slop 200 MB total memory in use (2 MB lost due to fragmentation) Generation 0: 3 collections, 0 parallel, 0.18s, 0.20s elapsed Generation 1: 2 collections, 0 parallel, 0.38s, 0.41s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.61s ( 0.77s elapsed) GC time 0.56s ( 0.61s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 1.17s ( 1.38s elapsed) %GC time 47.6% (44.0% elapsed) Alloc rate 1,065,075,527 bytes per MUT second Productivity 52.1% of total user, 44.0% of total elapsed ./pairs 3 8 20 +RTS -sstderr -A150M 5529600 516,244,640 bytes allocated in the heap 84,175,532 bytes copied during GC 28,074,916 bytes maximum residency (2 sample(s)) 465,940 bytes maximum slop 200 MB total memory in use (2 MB lost due to fragmentation) Generation 0: 2 collections, 0 parallel, 0.16s, 0.17s elapsed Generation 1: 2 collections, 0 parallel, 0.38s, 0.41s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.42s ( 0.60s elapsed) GC time 0.53s ( 0.58s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 0.95s ( 1.18s elapsed) %GC time 56.1% (49.0% elapsed) Alloc rate 1,240,892,153 bytes per MUT second Productivity 43.9% of total user, 35.1% of total elapsed ./pairs 4 8 20 +RTS -sstderr -A150M 5529600 271,711,724 bytes allocated in the heap 28,059,740 bytes copied during GC 28,075,488 bytes maximum residency (1 sample(s)) 461,344 bytes maximum slop 172 MB total memory in use (1 MB lost due to fragmentation) Generation 0: 1 collections, 0 parallel, 0.00s, 0.00s elapsed Generation 1: 1 collections, 0 parallel, 0.18s, 0.21s elapsed INIT time 0.00s ( 0.00s elapsed) MUT time 0.24s ( 0.41s elapsed) GC time 0.18s ( 0.21s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 0.43s ( 0.62s elapsed) %GC time 43.0% (33.3% elapsed) Alloc rate 1,113,504,186 bytes per MUT second Productivity 57.0% of total user, 39.1% of total elapsed Which one is faster depends on what you do, it seems.