
Stefan O'Rear wrote:
On Mon, Aug 13, 2007 at 03:21:54PM -0700, John Meacham wrote:
On Sat, Aug 11, 2007 at 12:09:44PM -0300, Isaac Dupree wrote:
BTW: I estimate that it took me about two solid weeks of time, more than I had originally intended to spend :)
I think using CPP will prove important, but makes it harder to test in Hugs... should I use Cabal if I want to use Hugs with haskell+cpp code? I think if you were willing to use what is provided by the FFI extension, you could make some signifigant improvements without resorting to CPP. mainly, you could use unsigned arithmetic and have access to bit operations.
I was thinking a representation like
data Integer = Integer !Bool Rest data Rest = Digit !Word Rest | End
where the Bool indicates the sign, and the rest are the base-wordsize digits. It would also be possible to just use a sign bit in the number of course. that unboxed strict Word should make a big difference.
You might also try
data Integer = Pos !Word | Neg !Word | Big !Word Integer
since the incremental penalty of 3 constructors is quite small, and it eliminates a few indirections.
(Note: You have to actually specify -funbox-strict-fields for the ! to do much good.)
I use the {-# UNPACK #-} pragma where I want that effect. Simpler representation = simpler for a compiler to output. (although I could easily provide a Haskell module containing (Integer -> (that thing's shape) ) for compilers to include.) No-FFI is particularly important (this is NOT GMP), but I don't think I need FFI - the compiler just has to support it somehow - to use Data.Word. I do suspect that storing in two's complement style / with separate sign, might be more efficient for some operations. (I'm particularly concerned that small numbers should be fast, so that Int doesn't have to be used all over the place just for speed reasons. Since my format requires two "case"s to determine that an Integer is a small number, _if_ that is the most costly operation, it might be improved) And of course any of you can try the variant implementations you propose, if you want to put in the work on it! I'm not interested for a long time. The more interesting-looking part to me now is seeing what a good mechanism is for using these implementations in compilers is (which is not entirely obvious to me). I'm BTW. base-wordsize digits are not very good for addition speed, depending... Even GMP uses a somewhat smaller base than that, I believe. Look: * The compiler only has to output lists (it already must do string literals), and Ints, in compiled programs. * No types outside Haskell98 need to be implemented. * The Int does not need to be a certain large size, does not need to be two's-complement bits (except for the extension class Bits which seems to assume two's complement implicitly for signed numbers), and does not need to have any particular overflow behavior. I'm not the one with the energy to make compromises between performance and portability. Seeing something like this in Jhc should be interesting and there is where decent-good performance will be wanted. I know my implementation is quite inferior performance to GMP, in _GHC_ :) I still think GMP is the place to go for the highest, most well-rounded performance. (Although "We will release 4.2.2 in spring 2007." did not come to pass, and still appears on gmplib.org) I wonder if there's any way (Jhc) compiler optimizations could transform a strict inductive data into a UArray sort of representation, since that representation is probably more efficient for Integers... Isaac