64-bit Bloom filters?

Hi,

I've previously used Bloom filters on 32-bit Linux with some success. However, after upgrading to 64 bit, my Bloom filter applications crash or misbehave in random ways.

So: is anybody successfully using Bloom filters on 64 bit computers?

Although I'm not clear on why it would cause crashes (SEGV, infinite looping, etc), my prime suspect is the hashing function used. This is from C and returns a uint32, but it is imported to return a CInt, which I suspect is 64 bits.

I'll look further into it, but I thought I'd check if anybody else has the same problem, and especially, a solution :-)

-k
--
If I haven't seen further, it is by standing in the footprints of giants

On Tue, 2010-01-05 at 16:19 +0100, Ketil Malde wrote:
> Hi,
> I've previously used Bloom filters on 32-bit Linux with some success. However, after upgrading to 64 bit, my Bloom filter applications crash or misbehave in random ways.
> So: is anybody successfully using Bloom filters on 64 bit computers?
> Although I'm not clear on why it would cause crashes (SEGV, infinite looping, etc), my prime suspect is the hashing function used. This is from C and returns a uint32, but it is imported to return a CInt, which I suspect is 64 bits.

On 64-bit platforms an int can be either 32 or 64 bits wide. In theory it can be anything. The import should use Data.Word.Word32 instead.

Regards
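
To make the Word32 suggestion concrete, here is a minimal sketch of an FFI import whose result type matches a C uint32_t. The thread does not show the actual hash binding, so libc's htonl (which returns uint32_t) is used purely as a stand-in. The names c_htonl and asSigned32 are made up for illustration, and the example assumes a Linux-like libc where htonl is a linkable symbol.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.Int (Int32)
import Data.Word (Word32)

-- Stand-in for the real hash function (an assumption: the thread never
-- shows it). In C, htonl is declared as: uint32_t htonl(uint32_t).
-- Word32 is the Haskell type that matches uint32_t exactly.
foreign import ccall unsafe "htonl"
  c_htonl :: Word32 -> Word32

-- Reinterpret the same 32 bits as a signed value, which is what a
-- 32-bit signed import type would hand back to Haskell code.
asSigned32 :: Word32 -> Int32
asSigned32 = fromIntegral

main :: IO ()
main = do
  -- 0xFF0000FF reads the same in either byte order, so htonl returns
  -- it unchanged on both little- and big-endian machines.
  let h = c_htonl 0xFF0000FF
  putStrLn ("as Word32: " ++ show h)
  putStrLn ("as Int32:  " ++ show (asSigned32 h))
```

Any result with the top bit set prints as a negative number through the signed view, so code that uses the hash as an index without converting it first would see a negative value. Whether that is what caused the crashes reported here is not established in the thread.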

On Tue, Jan 5, 2010 at 7:19 AM, Ketil Malde wrote:
> I've previously used Bloom filters on 32-bit Linux with some success. However, after upgrading to 64 bit, my Bloom filter applications crash or misbehave in random ways.

I'll look into it. Do you have a simple repro?

> So: is anybody successfully using Bloom filters on 64 bit computers?

I developed all that code on a 64-bit box, but I haven't had occasion to use it recently.

> Although I'm not clear on why it would cause crashes (SEGV, infinite looping, etc), my prime suspect is the hashing function used. This is from C and returns a uint32, but it is imported to return a CInt, which I suspect is 64 bits.

A CInt is 32 bits on the only 64-bit architecture that anyone really uses (x86_64) :-)

On Tue, 2010-01-05 at 10:02 -0800, Bryan O'Sullivan wrote:
>> Although I'm not clear on why it would cause crashes (SEGV, infinite looping, etc), my prime suspect is the hashing function used. This is from C and returns a uint32, but it is imported to return a CInt, which I suspect is 64 bits.
> A CInt is 32 bits on the only 64-bit architecture that anyone really uses (x86_64) :-)

It does not depend on the architecture but on the compiler and C library: http://en.wikipedia.org/wiki/64-bit#Specific_data_models

In the most popular models (LLP64/LP64) int is 32 bits. However, even SILP64 with a 64-bit short is correct. A compiler with a 16-bit int on x86_64 is technically correct AFAIK (although not on POSIX).

Regards

PS. Of course this is not the problem here, but it should not be done. Nowhere is it written that ints are 32 bits; in fact, one of the reasons they were left at 32 bits (apart from having something between short and long) was the incorrect assumption that they are.
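
The widths discussed above can be checked on a given machine rather than assumed. The following is a small sketch using sizeOf from Foreign.Storable to print the widths that GHC sees for a few C and Haskell types. bitsOf is a helper name introduced here, and it assumes 8-bit bytes.

```haskell
import Data.Word (Word32)
import Foreign.C.Types (CInt, CLong, CShort)
import Foreign.Storable (Storable, sizeOf)

-- Width of a Storable type in bits, as seen by this compiler on this
-- platform (assumes 8-bit bytes). Only the type of the argument
-- matters; its value is ignored by sizeOf.
bitsOf :: Storable a => a -> Int
bitsOf x = 8 * sizeOf x

main :: IO ()
main = do
  putStrLn ("CShort: " ++ show (bitsOf (0 :: CShort)) ++ " bits")
  putStrLn ("CInt:   " ++ show (bitsOf (0 :: CInt)) ++ " bits")
  putStrLn ("CLong:  " ++ show (bitsOf (0 :: CLong)) ++ " bits")
  putStrLn ("Int:    " ++ show (bitsOf (0 :: Int)) ++ " bits")
  -- Word32 is the only one of these whose width is fixed by its definition.
  putStrLn ("Word32: " ++ show (bitsOf (0 :: Word32)) ++ " bits")
```

On an LP64 system such as x86_64 Linux one would expect CInt to report 32 bits and CLong 64, but the point of the check is that only Word32 is guaranteed to be the same everywhere.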
participants (3)
- Bryan O'Sullivan
- Ketil Malde
- Maciej Piechotka