
Am Samstag, 22. Dezember 2007 22:57 schrieb Joost Behrends:
Daniel Fischer
writes: Of course, one minute after I sent my previous mail, I receive this one :( However, one point, it might be faster to factor out all factors p in found and only then compute the intsqrt, like
found x = x{dividend = xstop, bound = intsqrt xstop, result = result x ++ replicate k p} where p = divisor x (xstop,k) = go (dividend x) 0 go n m
| r == 0 = go q (m+1) | otherwise = (n,m)
where (q,r) = n `divMod` p
True - but be aware, that this will slightly slow down the computation for not multiple factors.
Why? You have to do one more division than the power of the prime dividing anyway. Or are you thinking about 'replicate 1 p' for simple factors? That will indeed be a bit slower than [p], but I dare say factorise a number with at least two prime squares or one prime cube dividing it, and you gain.
And - as you recently noted - the really expensive part are all the tried factors, which do not divide the queried number.
Yes, for anything with sufficiently many small primes not dividing but one or more large, anything you do with the prime factors is peanuts. Okay, make it (xstop,k) = go (dividend x `quot` p) 1, and replace divMod with quotRem. then you still have one superfluous division, but that's your making, not mine: d2 x | dividend x `mod` divisor x > 0 = | otherwise = make it d2 x | r == 0 = x{divisor = divisor x + 2} | otherwise = x{dividend = xstop, bound = intsqrt xstop, result = result x ++ replicate k p} where p = divisor x (q,r) = dividend x `quotRem` p (xstop,k) = go q 1 go n m | r' == 0 = go q' (m+1) | otherwise = (n,m) where (q',r') = n `quotRem` p and you have no superfluous division, exactly one intsqrt per prime dividing your number.
All this is just a first approach to the problem. When i talk of "naively programmed", then i want to say, that number theorists might have much better numerical orders marching through all primes plus some more odd numbers. I didn't search for that on the net.
A simple fast method, although not guaranteed to succeed is Pollard's rho-method, preceded by trial division by small primes (up to 1,2,10 million or so, depending on what you have) and a good pseudo-primality test (google for Rabin-Miller, if you're interested). I haven't but skimmed your code below, but I think the idea is something like factor :: Integer -> [Integer] factor 0 = error "0 has no decomposition" factor 1 = [] factor n | n < 0 = (-1):factor (-n) | n < 4 = [n] | otherwise = go n srn tests where srn = isqrt n go m sr pps@(p:ps) | r == 0 = p:go q (isqrt q) pps | p > sr = if m == 1 then [] else [m] | otherwise = go m sr ps where (q,r) = m `quotRem` p tests = 2:3:5:7:11:scanl (+) 13 (tail $ cycle dlist) isqrt :: Integer -> Integer isqrt n | n < 10^60 = floor (sqrt $ fromInteger n) | otherwise = 10^20*isqrt (n `quot` 10^40) dlist :: [Integer] dlist = zipWith (-) (tail wheel) wheel smallPrimes :: [Integer] smallPrimes = [2,3,5,7,11] wheelMod :: Integer wheelMod = product smallPrimes wheel :: [Integer] wheel = [r | r <- [1,3 .. wheelMod+1] , all ((/= 0) . (mod r)) (tail smallPrimes)] Now, the construction of the wheel and the list of differences, dlist, is relatively time-consuming, but can be sped up by using a good sieving method (my choice would be using an array (STUArray s Int Bool)). Of course, when you go much beyond 13 with smallPrimes, you'll end up with a rather large dlist in your memory, but smallPrimes = [2,3,5,7,11,13,17] should still be okay, no indexing means it's probably faster than Seq. Cheers, Daniel
The last version was some kind of resign from tries like this:
firstPrimes = [3,5,7,11,13,17] start = last firstPrimes pac = product firstPrimes slen = length lsumds
lsumds = drop 1 (fst$getSummands (singleton start, start)) where getSummands :: (Seq Int, Int) -> (Seq Int, Int) getSummands r |snd r < bnd = getSummands ((fst r)|>k, snd r + k)
|otherwise = r
where bnd = 2*pac + start k = getNext (snd r) getNext n |and [(n+2)`mod`x>0 | x<-firstPrimes] = 2
|otherwise = 2 + getNext | (n+2)
smallmod :: Int -> Int -> Int smallmod n m | n
divstep :: (DivIter,Int) -> (DivIter, Int) divstep (x,n) | and [(fromInteger $ divisor x)
0] = (x {divisor = divisor x + 2}, n) | (fromInteger$divisor x) < start =
(x {dividend = xidiv, bound = intsqrt(xidiv), result = result x ++ [divisor x]}, n)
| ximod>0 =
(x {divisor = divisor x + toInteger (index lsumds n)}, smallmod (n+1) slen)
| otherwise = (x {dividend = xidiv,
bound = intsqrt(xidiv), result = result x ++ [divisor x]}, n) where (xidiv, ximod) = divMod (dividend x) (divisor x)
divisions :: (DivIter, Int) -> (DivIter, Int) divisions (y,n) | divisor y <= bound y = divisions (divstep (y,n))
| otherwise = (y,0)
Here the additions to divisor are taken from the sequence lsmnds (List of SuMaNDS) - the type Seq from Data.Sequence is faster with the function index than Data.List with !!. getSummands is a kind of reduced sieve of
Eratosthenes. The main improvement is the longest line: |ximod>0 = (x {divisor = divisor x + toInteger (index lsumds n)},
smallmod (n+1) slen)
I even considered converting lsmnds to ByteString and storing them - the build of lsmnds for firstPrimes = [3,5,7,11,13,17,19,23,29] (which already has some MB footprint) takes several minutes.
But we have to track the number of iteration we are in. And that eats up much more than the reduction of divisions for "failing" factors. The code works (called slightly modificated by primefactors), but needs 5.41 minutes for 2^61+1 :((. Also expensive might be the lookup in lsumds - the code gets even slower with longer lists for firstPrimes.
divisions (d6$d2$d6$d4$d2$d4$d2$d4 y) is derived from
lsmnds [3,5] = [4,2,4,2,4,6,2,6].
For me the whole matter is closed for now - the 1.34 minutes are no bad result. Amd anyway the code might represent a not too bad lower bound for efficiency of decomposing algorithms.
Auf Wiedersehen, Joost
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe