ANN: HIntegerByInt

10 Aug 2007

      LICENSE: BSD3 (or also similar, like GHC, etc.)
INITIAL AUTHOR: Isaac Dupree 

   What is this? It is a reimplementation, in Haskell, of the
   Integer type that Haskell provides in its Prelude.  It is designed
   in mind of being actually usable as the implementation of that type,
   for compilers.  It is also a module that exports a working
   Integer type.  It is in terms of only basic Prelude functions, [],
   and Int. It is NOT a purely inductive definition, because Int is
   much faster than a purely inductive definition would allow, and
   nevertheless often easier to come by (more portable, license-wise,
   size-wise, nuisance-wise...) than GMP or other C bignum libraries.

/=========
darcs repo: http://isaac.cedarswampstudios.org/2007/HIntegerByInt

I'm rather proud of it in a silly way: I have run into bugs in *all* 
major Haskell implementations I tried it with, while testing it (see 
COMPILERS section). Reminds me somewhat of GMP triggering compiler bugs 
actually:)

All the implementation and documentation is currently in one file 
"IntegerInTermsOfInt.hs", except for three auxiliary modules for 
testing, where CheckIntegerInTermsOfInt is a Main module that uses 
QuickCheck.  I have copied some of that documentation here.

 From here, I offer it to anyone who is interested it using it and will 
try to help as needed, but I'm not likely to search out compiler 
internals myself just to try it in them.  I didn't give it a version 
number or Cabal file because it is completely useless on its own, as a 
library, in view of Prelude already providing a very good Integer.
\==========

SPEED:
   It is not too slow on small numbers (smallish constant - much
   larger than for Int of course), not too slow on medium-size
   numbers (which I've been testing it with), and not too slow on
   large numbers (asymptotically; karatsuba multiplication,
   O(n^1.585) is used to split up large numbers, and division by
   large numbers uses multiplication and Newton's method).
   Also see BUGS for the speed of 'show'.

CORRECTNESS:
   It seems to be correct, after a fairly thorough million-iteration
   QuickCheck in GHC plus a lot of quickcheck testing using
   debugging-"Int"s that tell you when they overflow and have
   (minBound,maxBound)=(-31,31).  Each of these caught an incredible
   number of bugs, which is why I am inclined to trust them.
   Unfortunately, most Haskell implementations are somewhat incorrect
    - see COMPILERS. Also see CAVEATS(c) for *very* large numbers.

CAVEATS:
  a- It is obviously much slower than GMP. (although I don't know if
       the penalties for calling primitive/foreign functions
       counterbalance that for small numbers.)
  b- It assumes that Int operations are fairly fast, although it
       doesn't tend to waste them that much (e.g. it uses `quotRem`
       when it needs both, which is almost always).
  c- It is expected to break when handling values with magnitude
       greater than around ( (maxBound::Int)^(maxBound::Int) ).
       (just like GMP.  Probably the assumption is that you'll run
       out of memory at the same time, or that the operations will
       take SO LONG, so no-one cares.
       Prelude.length will (relatedly) also break at this point.)

USAGE AS NATIVE INTEGER:
   Completely untried so far (search file for INTERESTING to
   customize).

   The algorithms are quite separable from the newtype Integer and
   its instances, and Bits, Ix... parts can be separated out too
   (in which case a proper export list would have to be made for the
   algorithm functions, which might hinder optimization a little,
   if assuming separate compilation...).
   See <http://www.haskell.org/ghc/docs/
                 latest/html/libraries/base/Prelude.html#t%3AInteger>
   for all things GHC's Prelude.Integer be an instance of, including
      Typeable? Data? NFData? PrintfArg? Typeable? Random?
   This is also a purpose CPP could be useful for, to define (#define?)
   quotRemDInt and such type-specific things, conditionally on how
   this Integer-implementation is being used.

   The internal format is (currently) a list([]) of Ints
   in base "intLargeBase", least-significant "digit" first
   (negative x is represented by negating all elements of
     the list that represents positive x)
   (No most-significant zeroes are allowed
     (so zero is represented by the empty list, for example))
   ("intLargeBase" is customizable, although there is an optimal
     value for any particular size of Int, and a limit based on the
     Int's size)

TODO/BUGS:
   COMPILERS.
   a- There is one WORKAROUND so Hugs can compile it and two for
        nhc/yhc.  Even so, I refused to keep a third workaround for
        N/Yhc, but that is for a nyhc bug that Neil thinks should be
        reasonably fixable.
   b- though extensively QuickChecked in GHC, Hugs occasionally fails
        QuickCheck, but when the particular example is run in Hugs,
        it gives the correct answer!  I think Hugs is buggy.
        (Hugs Version September 2006)
   c- ghc -O -fvia-C may miscompile "quotRem x (some power of two)".
        (ghc trac#1603)
        If QuickCheck fails badly on you, try adding -fasm.
   d- Beware testing in interpreters - some (at least GHCi 6.6.1)
        will default to Prelude.Integer even if the module has
        "default ()".

   Make show faster than O(n^2), e.g. see
         http://darcs.haskell.org/packages/base/GHC/Num.lhs ?
       ...but converting that, it seemed altogether slower than the
        current
[and there is more in IntegerInTermsOfInt.hs]

Isaac

Isaac Dupree

Bulat Ziganshin

Isaac Dupree

Isaac Dupree

John Meacham

Stefan O'Rear

Isaac Dupree

John Meacham

Isaac Dupree

Ian Lynagh

tags

participants (5)