Re: [commit: ghc] master: comments only (9894f6a)

Hi Simon,
JFTR, you seem to be after the trailing zeros in the code you commented.
If the bitmap is *really* that sparse then it might be profitable to
rewrite it in terms of
__builtin_ctz (when present). Some CPUs even have instructions for this.
http://hardwarebug.org/2010/01/14/beware-the-builtins/
Possibly one could even switch to checking *leading* zeros by
reformulating the algorithm and eliminate a few more instructions.
http://www.hackersdelight.org/ might be another source for inspiration.
Cheers,
Gabor
On 1/20/15, git@git.haskell.org
Repository : ssh://git@git.haskell.org/ghc
On branch : master Link : http://ghc.haskell.org/trac/ghc/changeset/9894f6a5b4883ea87fd5f280a2eb4a8abf...
---------------------------------------------------------------
commit 9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b Author: Simon Marlow
Date: Wed Jan 14 08:45:07 2015 +0000 comments only
---------------------------------------------------------------
9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b rts/sm/Scav.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/rts/sm/Scav.c b/rts/sm/Scav.c index 2ecb23b..781840c 100644 --- a/rts/sm/Scav.c +++ b/rts/sm/Scav.c @@ -285,6 +285,8 @@ scavenge_large_srt_bitmap( StgLargeSRT *large_srt )
for (i = 0; i < size / BITS_IN(W_); i++) { bitmap = large_srt->l.bitmap[i]; + // skip zero words: bitmaps can be very sparse, and this helps + // performance a lot in some cases. if (bitmap != 0) { for (j = 0; j < BITS_IN(W_); j++) { if ((bitmap & 1) != 0) {
_______________________________________________ ghc-commits mailing list ghc-commits@haskell.org http://www.haskell.org/mailman/listinfo/ghc-commits

No, it checks for and skips over complete zero words, not just trailing zeroes. The bitmaps are really that sparse sometimes. Like hundreds of zero words with a few non-zero bits at either end. The change I made improved perf enough that this isn't a blocking issue for me any more, and I suspect that any further optimisation won't result in significant wins. The real problem is that the code generator generates huge SRT bitmaps, and I bet there are bigger wins to be had there. Cheers, Simon On 20/01/2015 13:45, Gabor Greif wrote:
Hi Simon,
JFTR, you seem to be after the trailing zeros in the code you commented.
If the bitmap is *really* that sparse then it might be profitable to rewrite it in terms of __builtin_ctz (when present). Some CPUs even have instructions for this.
http://hardwarebug.org/2010/01/14/beware-the-builtins/
Possibly one could even switch to checking *leading* zeros by reformulating the algorithm and eliminate a few more instructions.
http://www.hackersdelight.org/ might be another source for inspiration.
Cheers,
Gabor
On 1/20/15, git@git.haskell.org
wrote: Repository : ssh://git@git.haskell.org/ghc
On branch : master Link : http://ghc.haskell.org/trac/ghc/changeset/9894f6a5b4883ea87fd5f280a2eb4a8abf...
---------------------------------------------------------------
commit 9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b Author: Simon Marlow
Date: Wed Jan 14 08:45:07 2015 +0000 comments only
---------------------------------------------------------------
9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b rts/sm/Scav.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/rts/sm/Scav.c b/rts/sm/Scav.c index 2ecb23b..781840c 100644 --- a/rts/sm/Scav.c +++ b/rts/sm/Scav.c @@ -285,6 +285,8 @@ scavenge_large_srt_bitmap( StgLargeSRT *large_srt )
for (i = 0; i < size / BITS_IN(W_); i++) { bitmap = large_srt->l.bitmap[i]; + // skip zero words: bitmaps can be very sparse, and this helps + // performance a lot in some cases. if (bitmap != 0) { for (j = 0; j < BITS_IN(W_); j++) { if ((bitmap & 1) != 0) {
_______________________________________________ ghc-commits mailing list ghc-commits@haskell.org http://www.haskell.org/mailman/listinfo/ghc-commits

On 1/20/15, Simon Marlow
No, it checks for and skips over complete zero words, not just trailing zeroes.
Sure, I mean the for-loops after your if-condition, in lines 291 and 304 https://git.haskell.org/ghc.git/blob/9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6... Cheers, Gabor
The bitmaps are really that sparse sometimes. Like hundreds of zero words with a few non-zero bits at either end. The change I made improved perf enough that this isn't a blocking issue for me any more, and I suspect that any further optimisation won't result in significant wins. The real problem is that the code generator generates huge SRT bitmaps, and I bet there are bigger wins to be had there.
Cheers, Simon
On 20/01/2015 13:45, Gabor Greif wrote:
Hi Simon,
JFTR, you seem to be after the trailing zeros in the code you commented.
If the bitmap is *really* that sparse then it might be profitable to rewrite it in terms of __builtin_ctz (when present). Some CPUs even have instructions for this.
http://hardwarebug.org/2010/01/14/beware-the-builtins/
Possibly one could even switch to checking *leading* zeros by reformulating the algorithm and eliminate a few more instructions.
http://www.hackersdelight.org/ might be another source for inspiration.
Cheers,
Gabor
On 1/20/15, git@git.haskell.org
wrote: Repository : ssh://git@git.haskell.org/ghc
On branch : master Link : http://ghc.haskell.org/trac/ghc/changeset/9894f6a5b4883ea87fd5f280a2eb4a8abf...
---------------------------------------------------------------
commit 9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b Author: Simon Marlow
Date: Wed Jan 14 08:45:07 2015 +0000 comments only
---------------------------------------------------------------
9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b rts/sm/Scav.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/rts/sm/Scav.c b/rts/sm/Scav.c index 2ecb23b..781840c 100644 --- a/rts/sm/Scav.c +++ b/rts/sm/Scav.c @@ -285,6 +285,8 @@ scavenge_large_srt_bitmap( StgLargeSRT *large_srt )
for (i = 0; i < size / BITS_IN(W_); i++) { bitmap = large_srt->l.bitmap[i]; + // skip zero words: bitmaps can be very sparse, and this helps + // performance a lot in some cases. if (bitmap != 0) { for (j = 0; j < BITS_IN(W_); j++) { if ((bitmap & 1) != 0) {
_______________________________________________ ghc-commits mailing list ghc-commits@haskell.org http://www.haskell.org/mailman/listinfo/ghc-commits

Hello Gabor, Fyi, Alex (cc'ed) already spent some brain-cycles on that, but it's not clear yet if it's worth optimising: https://gist.github.com/axman6/46edae58cc4e8242bdac Cheers, hvr On 2015-01-20 at 14:45:19 +0100, Gabor Greif wrote:
Hi Simon,
JFTR, you seem to be after the trailing zeros in the code you commented.
If the bitmap is *really* that sparse then it might be profitable to rewrite it in terms of __builtin_ctz (when present). Some CPUs even have instructions for this.
http://hardwarebug.org/2010/01/14/beware-the-builtins/
Possibly one could even switch to checking *leading* zeros by reformulating the algorithm and eliminate a few more instructions.
http://www.hackersdelight.org/ might be another source for inspiration.
Cheers,
Gabor
On 1/20/15, git@git.haskell.org
wrote: Repository : ssh://git@git.haskell.org/ghc
On branch : master Link : http://ghc.haskell.org/trac/ghc/changeset/9894f6a5b4883ea87fd5f280a2eb4a8abf...
---------------------------------------------------------------
commit 9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b Author: Simon Marlow
Date: Wed Jan 14 08:45:07 2015 +0000 comments only
---------------------------------------------------------------
9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b rts/sm/Scav.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/rts/sm/Scav.c b/rts/sm/Scav.c index 2ecb23b..781840c 100644 --- a/rts/sm/Scav.c +++ b/rts/sm/Scav.c @@ -285,6 +285,8 @@ scavenge_large_srt_bitmap( StgLargeSRT *large_srt )
for (i = 0; i < size / BITS_IN(W_); i++) { bitmap = large_srt->l.bitmap[i]; + // skip zero words: bitmaps can be very sparse, and this helps + // performance a lot in some cases. if (bitmap != 0) { for (j = 0; j < BITS_IN(W_); j++) { if ((bitmap & 1) != 0) {
_______________________________________________ ghc-commits mailing list ghc-commits@haskell.org http://www.haskell.org/mailman/listinfo/ghc-commits
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
-- "Elegance is not optional" -- Richard O'Keefe
participants (3)
-
Gabor Greif
-
Herbert Valerio Riedel
-
Simon Marlow