[15/16] SBM: Predictions compared to the measurements

Don Stewart and Duncan Coutts were nice enough to give me some predictions regarding the performance of the programs in my teaser email. When I answered them, I hadn't yet written the code to merge reports, so I hadn't yet noticed the general pattern of much worse memory performance on 6.9. Also, Don Stewart hadn't yet released bytestring 0.9.0.2 (he did so in order to fix performance problems pointed out by my teaser benchmarks, actually).
From Duncan Coutts: I'll try guessing at some ratios:
  1.0 space-bslc8-foldlx-1
  1.1 space-bslc8-acc-1
  2.0 space-bs-c8-foldlx-1
  2.1 space-bs-c8-acc-1
  4.0 space-xxxxx-acc-1
  15  space-xxxxx-foldl
I would put them in two groups (I disregard the lenfil benchmark because it was for "extra credit"):
  space-bslc8-foldlx-1 around 1.4/1.5s (1.2/1.1s with bytestring 0.9.0.2)
  space-bs-c8-foldlx-1 around 1.2s
  space-bs-c8-acc-1 around 1.1s (1.3s on 6.6.1)
  space-bslc8-acc-1 around 3.4s (6.4s on 6.6.1!)
  space-xxxxx-acc-1 around 5.0s (5.4s on 6.8.2)
  space-xxxxx-foldl around 5.0s (5.3/5.4s on 6.8.2)
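One rough way to set Duncan's ratios next to the measurements is to divide each approximate time above by the time for space-bslc8-foldlx-1, which Duncan used as his 1.0. The sketch below does that in Haskell. It assumes the 1.1s figure (bytestring 0.9.0.2) as the baseline, and the names rows, baseline and measuredRatio are made up for illustration; they are not part of the benchmark code.

```haskell
import Text.Printf (printf)

-- (program, Duncan's predicted ratio, approximate measured seconds).
-- The times are the rounded figures quoted above, not the raw chart data.
rows :: [(String, Double, Double)]
rows =
  [ ("space-bslc8-foldlx-1", 1.0, 1.1)
  , ("space-bslc8-acc-1",    1.1, 3.4)
  , ("space-bs-c8-foldlx-1", 2.0, 1.2)
  , ("space-bs-c8-acc-1",    2.1, 1.1)
  , ("space-xxxxx-acc-1",    4.0, 5.0)
  , ("space-xxxxx-foldl",   15.0, 5.0)
  ]

-- Assumption: use the bytestring 0.9.0.2 time for space-bslc8-foldlx-1
-- as the baseline, since that program is 1.0 in Duncan's list.
baseline :: Double
baseline = 1.1

-- Express a measured time as a multiple of the baseline.
measuredRatio :: Double -> Double
measuredRatio t = t / baseline

main :: IO ()
main = mapM_ line rows
  where
    line :: (String, Double, Double) -> IO ()
    line (name, predicted, secs) =
      printf "%-22s predicted %4.1f  measured %4.1f\n"
             name predicted (measuredRatio secs)
```

Picking a different baseline (for example the 1.4s time with the older bytestring) shifts every measured ratio by the same factor, so the ordering of the programs is unaffected.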
From Duncan Coutts (on the "extra credit" program hs/space-bslc8-lenfil-1): Hmm. So that should work in constant memory, a few 64 chunks at once. I'd expect this to be pretty fast.
Ok, this was mean of me. I knew there was a memory bug here :)
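For readers without the teaser email at hand, here is a minimal sketch of the kind of program being discussed: a lazy bytestring piped through filter and then length. It is not the actual source of hs/space-bslc8-lenfil-1; the predicate (counting spaces) and the name countMatching are assumptions for illustration only.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Count the characters that satisfy a predicate.
-- A lazy ByteString is a list of strict chunks, and filter works
-- chunk by chunk, so ideally only a few chunks are live at a time.
countMatching :: (Char -> Bool) -> BL.ByteString -> Int
countMatching p = fromIntegral . BL.length . BL.filter p

main :: IO ()
main = do
  input <- BL.getContents          -- read stdin lazily
  print (countMatching (== ' ') input)
```

Run as something like `./A +RTS -sstderr < bigfile`, the expectation voiced above is that residency stays constant regardless of input size, because consumed chunks can be garbage collected as soon as length has passed them.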
From Don Stewart: Summary: the suspicious lazy bytestring program works now. (constant space, and fastest overall, as expected originally)
Program 1, lazy bytestring length . filter
Yesterday: ./A +RTS -sstderr < 150M 1.01s user 0.10s system 98% cpu 1.123 total 40M allocated
* Today (fixed!): ./A +RTS -sstderr < 150M 0.26s user 0.06s system 96% cpu 0.332 total 2M allocated
Reason, deprecated array fusion mucking up the optimiser.
I think we can close this regression.
Nope. Look at the memory graphs:

hs/space-bslc8-lenfil-1:
 38632 ██████████▌ |
 38644 ██████████▌ |
  1940 ▌ |
109404 █████████████████████████████▋ |
 82324 ██████████████████████▍ |
109388 █████████████████████████████▋ |
 82304 ██████████████████████▎ |

It is fixed for ghc 6.8.2 running bytestring 0.9.0.2 but not for ghc 6.9.20071119 and head (as of noon 2007-12-19), no matter the bytestring version. There are lots of memory performance bugs in ghc 6.9.

I've reordered the time bar charts to use the same order as my teaser email.

hs/space-bslc8-lenfil-1:
1.521 7‰ 0.2 ███████▊ |
1.295 1‰ 0.2 ██████▌ |
0.429 0‰ 0.2 ██▏ |
1.323 1‰ 0.1 ██████▊ |
0.483 2‰ 0.4 ██▌ |
1.327 1‰ 0.2 ██████▊ |
0.482 2‰ 0.4 ██▌ |
(this was the "extra credit" program)

hs/space-bs-c8-acc-1:
1.287 20‰ 0.2 ██████▌ |
1.131 2‰ 0.1 █████▊ |
1.132 2‰ 0.3 █████▊ |
1.145 1‰ 0.2 █████▊ |
1.150 1‰ 0.2 █████▉ |
1.147 2‰ 0.1 █████▊ |
1.147 2‰ 0.2 █████▊ |

hs/space-bslc8-acc-1:
6.367 17‰ 0.2 ████████████████████████████████▎ |
3.396 7‰ 0.1 █████████████████▎ |
3.508 4‰ 0.1 █████████████████▊ |
3.388 10‰ 0.1 █████████████████▏ |
3.441 10‰ 0.1 █████████████████▍ |
3.438 3‰ 0.0 █████████████████▍ |
3.434 13‰ 0.1 █████████████████▍ |

hs/space-xxxxx-acc-1:
5.036 7‰ 0.1 █████████████████████████▌ |
5.381 6‰ 0.0 ███████████████████████████▎ |
5.358 8‰ 0.1 ███████████████████████████▏ |
4.991 1‰ 0.1 █████████████████████████▎ |
5.047 7‰ 0.1 █████████████████████████▌ |
5.051 3‰ 0.1 █████████████████████████▋ |
5.049 4‰ 0.1 █████████████████████████▋ |

hs/space-bs-c8-foldlx-1:
1.272 1‰ 0.2 ██████▌ |
1.205 2‰ 0.2 ██████▏ |
1.164 3‰ 0.0 █████▉ |
1.221 1‰ 0.1 ██████▏ |
1.176 1‰ 0.3 ██████ |
1.259 1‰ 0.1 ██████▍ |
1.176 1‰ 0.2 ██████ |

hs/space-bslc8-foldlx-1:
1.430 1‰ 0.1 ███████▎ |
1.400 1‰ 0.1 ███████▏ |
1.197 1‰ 0.0 ██████ |
1.458 0‰ 0.1 ███████▍ |
1.060 1‰ 0.0 █████▍ |
1.457 1‰ 0.1 ███████▍ |
1.060 1‰ 0.2 █████▍ |

hs/space-xxxxx-foldl:
5.024 5‰ 0.0 █████████████████████████▌ |
5.371 9‰ 0.0 ███████████████████████████▎ |
5.301 9‰ 0.1 ██████████████████████████▉ |
5.035 2‰ 0.1 █████████████████████████▌ |
5.037 3‰ 0.0 █████████████████████████▌ |
5.046 4‰ 0.0 █████████████████████████▌ |
5.042 4‰ 0.0 █████████████████████████▌ |

-Peter

firefly:
From Don Stewart: Summary: the suspicious lazy bytestring program works now. (constant space, and fastest overall, as expected originally)
Program 1, lazy bytestring length . filter
Yesterday: ./A +RTS -sstderr < 150M 1.01s user 0.10s system 98% cpu 1.123 total 40M allocated
* Today (fixed!): ./A +RTS -sstderr < 150M 0.26s user 0.06s system 96% cpu 0.332 total 2M allocated
Reason, deprecated array fusion mucking up the optimiser.
I think we can close this regression.
Nope. Look at the memory graphs:
hs/space-bslc8-lenfil-1:
 38632 ██████████▌ |
 38644 ██████████▌ |
  1940 ▌ |
109404 █████████████████████████████▋ |
 82324 ██████████████████████▍ |
109388 █████████████████████████████▋ |
 82304 ██████████████████████▎ |

It is fixed for ghc 6.8.2 running bytestring 0.9.0.2 but not for ghc 6.9.20071119 and head (as of noon 2007-12-19), no matter the bytestring version. There are lots of memory performance bugs in ghc 6.9.
Please package this up as a bug report against GHC head. Any regression wrt. 6.8.2, using the same bytestring version, is going to be a ghc issue (not a bytestring library issue). -- Don
participants (2): Don Stewart, Peter Firefly Brodersen Lund