
Remember that the memory-allocation mechanism is crucial. How does BN do that?
BN uses a structure called "CTX" (OpenSSL calls all such structures "CTX") to hold what would otherwise be local static variables, which keeps the library reentrant. CTX structures do not affect memory allocation, though they *do* require malloc'd memory. For my purposes, the BN_CTX structure gives me an easy way to handle thread-local storage. Otherwise, BN uses standard malloc'd memory. To create a BN multi-precision integer (called a BIGNUM; really a struct), you do either:

    BIGNUM x;
    BN_init(&x);    // defined in bn_lib.c; uses memset

or:

    // uses OpenSSL-named checked malloc:
    // OPENSSL_malloc == CRYPTO_malloc
    BIGNUM* y = BN_new();

It would be easy to change the definition of OPENSSL_malloc to call RTS memory as necessary. It would be more efficient for BN to be garbage collected, since these bignum libraries allocate and free a lot of small memory blocks (~1-2KB for large integers). Since ForeignPtr is simply too slow, I am bringing BN value allocations into the RTS, as close as possible to how you did it with GMP.
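To make the "change the definition of OPENSSL_malloc" idea concrete, here is a minimal sketch of routing a bignum's allocations through a replaceable hook. None of these names come from OpenSSL or the GHC RTS: toy_bignum, bn_set_allocator, rts_alloc and rts_free are all hypothetical, and rts_alloc is only a counting wrapper around malloc standing in for whatever the RTS would really provide. (OpenSSL itself has CRYPTO_set_mem_functions for swapping allocators at run time, but its signature differs between releases, so I have not used it here.)

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hooks; by default they point at the C library. */
typedef void *(*bn_alloc_fn)(size_t);
typedef void  (*bn_free_fn)(void *);

bn_alloc_fn bn_alloc_hook = malloc;
bn_free_fn  bn_free_hook  = free;

/* Stand-in for an RTS allocator: plain malloc plus a live-block counter. */
size_t rts_live_blocks = 0;

void *rts_alloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL) rts_live_blocks++;
    return p;
}

void rts_free(void *p) {
    if (p != NULL) {
        rts_live_blocks--;
        free(p);
    }
}

/* Redirect every later bignum allocation to the given pair of functions. */
void bn_set_allocator(bn_alloc_fn a, bn_free_fn f) {
    bn_alloc_hook = a;
    bn_free_hook  = f;
}

/* Toy bignum: a header plus a separately allocated array of words. */
typedef struct {
    unsigned long *d;   /* the words of the number */
    size_t top;         /* words currently in use */
    size_t dmax;        /* words allocated */
} toy_bignum;

toy_bignum *toy_bn_new(size_t words) {
    toy_bignum *b = bn_alloc_hook(sizeof *b);
    if (b == NULL) return NULL;
    b->d = bn_alloc_hook(words * sizeof *b->d);
    if (b->d == NULL) {
        bn_free_hook(b);
        return NULL;
    }
    memset(b->d, 0, words * sizeof *b->d);
    b->top = 0;
    b->dmax = words;
    return b;
}

void toy_bn_free(toy_bignum *b) {
    if (b == NULL) return;
    bn_free_hook(b->d);
    bn_free_hook(b);
}
```

After calling bn_set_allocator(rts_alloc, rts_free), each toy_bn_new makes two allocations through the hook (header and word array), which is the pattern of many small blocks that makes garbage collection attractive.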
Making a single word contain either a pointer or a non-pointer (depending on the setting of some bit) would have some quite global implications, as would losing full 32-bit precision from Int#. I suggest that you do not couple these two projects together! Do the GMP thing first, then investigate this later. We have quite a few bit-twiddling ideas that we have never followed up.
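For readers unfamiliar with the trick under discussion, the following is a rough sketch of a word that holds either a pointer or a small integer, distinguished by its low bits. It is not how GHC represents anything: the two-bit tag, the tagged_word type and all function names are assumptions made for illustration. It relies on pointers being at least 4-byte aligned and on the usual two's-complement conversion between uintptr_t and intptr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low two bits 00 = pointer, 01 = small integer. */
#define TAG_BITS      2
#define TAG_MASK      ((uintptr_t)3)
#define TAG_SMALL_INT ((uintptr_t)1)

typedef uintptr_t tagged_word;

/* True when the word carries an inline integer rather than a pointer. */
int is_small_int(tagged_word w) {
    return (w & TAG_MASK) == TAG_SMALL_INT;
}

/* Pack an integer; the top TAG_BITS bits of v are lost, so on a 32-bit
   machine only 30 bits of the value survive. */
tagged_word from_small_int(intptr_t v) {
    return ((uintptr_t)v << TAG_BITS) | TAG_SMALL_INT;
}

/* Unpack an integer: clear the tag, then divide exactly by 4 so the
   sign is preserved without relying on a signed right shift. */
intptr_t to_small_int(tagged_word w) {
    return (intptr_t)(w - TAG_SMALL_INT) / 4;
}

/* Pack a pointer; it must be 4-byte aligned so its low bits are 00. */
tagged_word from_pointer(void *p) {
    assert(((uintptr_t)p & TAG_MASK) == 0);
    return (uintptr_t)p;
}

void *to_pointer(tagged_word w) {
    return (void *)w;
}
```

Every consumer of such a word has to test the tag before using it, which is where the "quite global implications" come from.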
The original idea was to create a specialized Int30# for that purpose. In any case, implementing it would certainly make my intended job (getting this done in time for the next release) a bit more difficult.

Best regards,
Peter