
There's one thing I don't entirely understand about the GMP problem. I understand that there are some limitations on GHC's ability to generate relocatable (and therefore dynamically linkable) code on x86 (a register allocation problem related to the mangler if I recall the comments in the code correctly). But this shouldn't prohibit linking GMP in dynamically, should it? It's just a C library and GCC should happily generate relocatable code. As a dynamically linked library, there should be no tainting issues to worry about even if the dynamically linked code is shipped with the executable. Am I missing something? On Aug 10, 2006, at 5:00 PM, Peter Tanski wrote:
Thorkil,
I would like to mention a few things that I have not seen discussed: Clearly, using an existing library unmodified would be preferable: New developments, error corrections, documentation, wide exposure, all of these things would be available.
Agreed. Unfortunately for us it seems that, apart from GMP, none of the fully developed and fast libraries available have all the functionality Haskell requires: ARPREC (and its predecessor, MPFUN) lack all binary operators (AND, OR, XOR, left-shift, right-shift, bit-flip, word-flip); OpenSSL's BN library lacks AND, OR, XOR and the least common multiple operation (lcm), as do the libraries in Botan and Crypto++. I did not mention this before, but they also lack conversion functions to and from floating point representations. So it would not be possible to use these in GHC and still maintain the same functionality without writing those transformation functions either separately (slow) or in the libraries themselves (modification).
There are libraries that have the functionality we require: MIRACL (at http://indigo.ie/~mscott/) is free for educational use but requires a premium for commercial use; LibTomMath is essentially beta (version 0.38), slow (less than half as fast as OpenSSL, which is equivalent in speed to GMP) and from my experience when putting it together, a little hacked (after all, it is a beta version); MPI (from http://www.cs.dartmouth.edu/~sting/mpi/) is a good library and has all the math functions but lacks the binary operations and has not been updated since 2003; maybe it has not required updating (it was originally written in 1998). I have read that MPI is not as fast as LibTomMath (I will do a real comparison if anyone is interested). I do know from timing tests I did run in the MPI distribution that MPI tends to operate better on small integers (128-256 bits). LibTomMath and MPI are both essentially public domain. I have heard that LibTomMath is used as the Tcl bignum library; I do not know that library supports Perl's bignum (Math::BigInt) but I could look into it.
Bignum libraries are actually fairly widely available but few are portable across a wide variety of domains: Microsoft's .NET bignum library and Apple's vecLib BigNum library, for example. (Apple's vecLib vBigNum is also limited since it offers big integers only at 256, 512 and 1024 bits and it does not provide all the mathematical operations we require.)
You may have noticed from reading the discussions so far that an internal library *is* a possibility.
I have looked briefly at the OpenSSL Bignum library and in the areas of memory management, but also error handling, it seems clearly intertwined to some extent with OpenSSL in ways which would appear to rule out the direct use of this library for GHC Integers.
OpenSSL's BN library is primarily tuned to support cryptography, particularly the generation of very large primes for public key cryptosystems. It is possible to separate the BN library out (I am having some success there already). It is also possible to use the library separately from Haskell using ForeignPtr; essentially doing everything through Haskell's FFI. I have honestly not benchmarked a FFI-ForeignPtr interface against the current internal-GMP implementation, partly because the overhead required to use ForeignPtr and the availability of garbage-collected memory for GMP indicate that an internal GHC Bignum library would clearly be faster. Many people have given good arguments for using an FFI- interface to such a library (for one, GMP supports dll's and posix dynamic libraries, which would eliminate the licensing issue), so I think before I go on I will set up a benchmark to try the two out. The one thing I cannot do without taking GMP out of the GHC compiler is a side-to-side comparison of GMP-FFI and GMP-internal because GMP can use only one memory allocator at a time: either GHC's runtime system Garbage Collector or malloc.
However, considering the advantages of using an existing library unchanged, we might consider another possibility: Working with the OpenSSL people to modify their library to allow the sort of interfacing that is needed for its direct and efficient use in GHC. While, of course, retaining its value as part of OpenSSL.
I haven't looked at that for reasons of time--perhaps this is my personal target and perhaps it is for the benefit of the next GHC release on 26 August: it may take a long time to work with OpenSSL to modify their library. It might be worth looking into, just to cover all bases. The problem for me is that I seem to be doing all the work, albeit rather slowly; I simply do not have the time to follow every possibility.
(And way further back: Have we tried to discuss the LGPL licence of GMP with the authors? I am not really into all these matters, sorry if this doesn't make sense.)
That is actually a good idea; it has been suggested before for an individual developer. I doubt very much the Free Software Foundation would want to give an exception for one of their flagship products but I admit, that kind of thinking is, well, cowardly. As for whether I am the person to ask them, I could float the question but I could not represent the GHC Team.
Failing that, I would suggest considering the development of the modified library to a form that would allow independent use, apart from its use in GHC. This would add valuable possibilities to your options when choosing the precise mixture of Haskell and, perhaps, raw C code that best balances your performance desires and needs for convenience.
Before I actually bind any library with GHC, I will thoroughly test it as a stand-alone library and I will develop some good benchmarks for it as an FFI-based library (using ForeignPtr). The Creators of the GHC, however, have given good arguments why it is Very Good to use GHC's runtime system Garbage Collector to handle the Integer- memory for bignum libraries: it is almost certainly faster, it allows the runtime system's Scheduler to operate on Integers without resorting to the creation or management of Operating System (OS) threads and in general the evaluation of Integers is more a "native" than a "foreign" operation in Haskell programs.
I hope I have answered a few of your points; some of them have given me more work but that is all right, I guess. You did say that you frequently use Integers in Haskell. If you have the time, would you be able to find any programs you might have that displayed poor Integer performance or possibly bottlenecks in the current system? It might be helpful for benchmarking the replacement--this is really supposed to be an "upgrade," after all.
Best regards, Peter
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Reilly Hayes rfh@reillyhayes.com