bit sets in IntSet vs. Integer

Henning Thielemann

18 Jun 2016 18 Jun '16

1:59 p.m.

In my set-cover package I make extensive use of bit vectors. I tested both IntSet and Integer for this purpose. In GHC-7.4.2 Integer was visibly faster than IntSet, but in GHC-7.8.4 and later, IntSet clearly wins over Integer. I guess this can be attributed to IntSet using BitMaps at the leaves since containers-0.5. However, on a 64-bit machine a BitMap is a Word64. I wondered whether we could get further speedup by using a vector type. E.g. AVX processors could perform bit manipulations on 256 bit words. Do the bit operations on Integer make use of vector instructions?

Show replies by date

Carter Schonwald

19 Jun 19 Jun

1:45 a.m.

Henning: I suggest you read the gmp source and tell us :) On Saturday, June 18, 2016, Henning Thielemann < lemming@henning-thielemann.de> wrote:

...

In my set-cover package I make extensive use of bit vectors. I tested both IntSet and Integer for this purpose. In GHC-7.4.2 Integer was visibly faster than IntSet, but in GHC-7.8.4 and later, IntSet clearly wins over Integer. I guess this can be attributed to IntSet using BitMaps at the leaves since containers-0.5. However, on a 64-bit machine a BitMap is a Word64. I wondered whether we could get further speedup by using a vector type. E.g. AVX processors could perform bit manipulations on 256 bit words. Do the bit operations on Integer make use of vector instructions? _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Henning Thielemann

6:52 a.m.

On Sat, 18 Jun 2016, Carter Schonwald wrote:

...

Henning: I suggest you read the gmp source and tell us :)

David Feuer

1:35 p.m.

I know that Sergei Lebedev did quite a bit of work (for the bitset package) optimizing certain Integer bit operations. That code doesn't work with the new Integer, but it can probably be adapted. On Sat, 18 Jun 2016, Carter Schonwald wrote:

...

Henning: I suggest you read the gmp source and tell us :)

I hoped you have already done so. :-) Does GHC call GMP for bit operations on Integer? Aha, yes, it calls integer_gmp_mpn_and_n for (.&.) : https://hackage.haskell.org/package/integer-gmp-1.0.0.1/docs/src/GHC.Integer... _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Henning Thielemann

2:45 p.m.

On Sun, 19 Jun 2016, David Feuer wrote:

...

I know that Sergei Lebedev did quite a bit of work (for the bitset package) optimizing certain Integer bit operations. That code doesn't work with the new Integer, but it can probably be adapted.

...

From a quick look, it simply gives names of set operation to existing Data.Bits methods on Integer. I cannot see much custom optimizations.

David Feuer

3:20 p.m.

Your quick look is insufficient. There are custom implementations of bit setting and clearing operations that use GMP primitives instead of the Data.Bits defaults. On Jun 19, 2016 10:45 AM, "Henning Thielemann" < lemming@henning-thielemann.de> wrote:

...

On Sun, 19 Jun 2016, David Feuer wrote:

I know that Sergei Lebedev did quite a bit of work (for the bitset

...
package) optimizing certain Integer bit operations. That code doesn't work with the new Integer, but it can probably be adapted.

From a quick look, it simply gives names of set operation to existing Data.Bits methods on Integer. I cannot see much custom optimizations.

Henning Thielemann

3:25 p.m.

On Sun, 19 Jun 2016, David Feuer wrote:

...

Your quick look is insufficient. There are custom implementations of bit setting and clearing operations that use GMP primitives instead of the Data.Bits defaults.

I have seen them, but the expensive operations like 'union' and 'difference' are not optimized, right? They would benefit from CPU vector instructions.

Carter Schonwald

10:58 p.m.

Did you read the code for the underlying gmp c code? And associated c APIs? I'm told that part of the value of the new style gmp binding is that it should be safer to link to any missing gmp functionality you may wish to use. On Sunday, June 19, 2016, Henning Thielemann wrote:

...

On Sun, 19 Jun 2016, David Feuer wrote:

Your quick look is insufficient. There are custom implementations of bit

...
setting and clearing operations that use GMP primitives instead of the Data.Bits defaults.

I have seen them, but the expensive operations like 'union' and 'difference' are not optimized, right? They would benefit from CPU vector instructions.

David Feuer

11 p.m.

How does that excuse an inefficient implementation of Bits Integer? On Jun 19, 2016 6:58 PM, "Carter Schonwald" wrote:

...

Did you read the code for the underlying gmp c code? And associated c APIs?

I'm told that part of the value of the new style gmp binding is that it should be safer to link to any missing gmp functionality you may wish to use.

On Sunday, June 19, 2016, Henning Thielemann < lemming@henning-thielemann.de> wrote:

...
On Sun, 19 Jun 2016, David Feuer wrote:

Your quick look is insufficient. There are custom implementations of bit

...
setting and clearing operations that use GMP primitives instead of the Data.Bits defaults.

I have seen them, but the expensive operations like 'union' and 'difference' are not optimized, right? They would benefit from CPU vector instructions.

Carter Schonwald

20 Jun 20 Jun

12:54 a.m.

Show me the efficient simd code in the gmp c code base first :) On Sunday, June 19, 2016, David Feuer wrote:

...

How does that excuse an inefficient implementation of Bits Integer? On Jun 19, 2016 6:58 PM, "Carter Schonwald" javascript:_e(%7B%7D,'cvml','carter.schonwald@gmail.com');> wrote:

...
Did you read the code for the underlying gmp c code? And associated c APIs?

I'm told that part of the value of the new style gmp binding is that it should be safer to link to any missing gmp functionality you may wish to use.

On Sunday, June 19, 2016, Henning Thielemann < lemming@henning-thielemann.de javascript:_e(%7B%7D,'cvml','lemming@henning-thielemann.de');> wrote:

...
On Sun, 19 Jun 2016, David Feuer wrote:

Your quick look is insufficient. There are custom implementations of bit

...
setting and clearing operations that use GMP primitives instead of the Data.Bits defaults.

I have seen them, but the expensive operations like 'union' and 'difference' are not optimized, right? They would benefit from CPU vector instructions.

3316

Age (days ago)

3318

Last active (days ago)

List overview

Download

9 comments

3 participants

participants (3)

Carter Schonwald
David Feuer
Henning Thielemann

bit sets in IntSet vs. Integer

tags

participants (3)