faster factorial function via FFI?

Hello everyone,

I have some code in which the bottleneck is the factorial function. I began by using a naive

    fac n = product [1..n]

but it looks like there are faster ways to do it. I could try to look up those faster algorithms and implement them, but I'm guessing that using libgmp's factorial function is the best way to do it. I'm a poor programmer, so I don't trust myself to implement those algorithms properly. Hence I need to use FFI.

...but ghc already uses libgmp for Integers, so shouldn't I be able to somehow call libgmp's factorial function?

Using regular FFI stuff, it looks like this might get complicated, since the function doesn't return one of the usual foreign types, so I'd need to mess around to get the result into an Integer, which is what I need.

Is there an easy way to do this? It might also be faster to use a lookup table, since most of the time I'll be taking factorials of relatively small numbers.

Thanks, Dan

-- Ceci n'est pas une .signature.

On Mon, Apr 23, 2007 at 01:06:07PM -0500, Dan Drake wrote:
Hello everyone,
Hi!
I have some code in which the bottleneck is the factorial function. I began by using a naive
fac n = product [1..n]
but it looks like there are faster ways to do it. I could try to look up those faster algorithms and implement them, but I'm guessing that using libgmp's factorial function is the best way to do it. I'm a poor programmer, so I don't trust myself to implement those algorithms properly. Hence I need to use FFI.
I'm curious: what is your application? I've never seen one in which factorials actually need be computed. In physics, one factorial is generally divided by another (e.g. for combinatorics), so it's rarely wise to take the actual factorials. And to be honest, we usually take the thermodynamic limit and then use Stirling's approximation. I guess they also show up in Taylor expansions, but then we never go far enough for the factorial to be expensive.
Is there an easy way to do this? It might also be faster to use a lookup table, since most of the time I'll be taking factorials of relatively small numbers.
If you really have small numbers and need speed, I'd switch to using Ints. That'd gain you a lot more than an optimized Integer factorial.

--
David Roundy
Department of Physics
Oregon State University
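A minimal sketch of that suggestion -- the same naive product, monomorphised to Int. The overflow caveat is an addition here, not David's: a 64-bit Int holds factorials exactly only up to 20!.

    facInt :: Int -> Int
    facInt n = product [1..n]  -- silently overflows for n > 20 on 64-bit Int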

On Mon, 23 Apr 2007 at 12:03PM -0700, David Roundy wrote:
I'm curious: what is your application? I've never seen one in which factorials actually need be computed. In physics, one factorial is generally divided by another (e.g. for combinatorics), so it's rarely wise to take the actual factorials. And to be honest, we usually take the thermodynamic limit and then use Stirling's approximation. I guess they also show up in Taylor expansions, but then we never go far enough for the factorial to be expensive.
This is combinatorics, so I can't just say "oh, this is small" and cross it off like physicists do. :)

I'm finding the number of set partitions that correspond to a certain integer partition. If p = p_1, p_2, ..., p_k is an integer partition of n, then there are

                          n!
    ----------------------------------------------
    p_1! * p_2! * ... * p_k! * 1^c_1 * ... * k^c_k

set partitions of [1..n] with "type p", where c_i is the number of i's in p. Knuth, in his new Art of Computer Programming book, asks which integer partition maximizes the number of set partitions. I'm trying to figure it out.

Using Ints everywhere isn't an option, because my final answers are definitely Integers -- although I might try using Ints in some places. I am still very new to Haskell and keep butting up against the type system, so initially I made everything an Integer so that my code would work!

Doing cancellation before computing the factorials is also an option, but in general I'll have a lot of small numbers on the bottom, and would need to do a lot of integer factoring on the top -- and my impression is that integer factoring is quite slow, much slower than integer multiplication -- that's why RSA works, right?!

Dan

-- Ceci n'est pas une .signature.

On Mon, Apr 23, 2007 at 06:25:39PM -0500, Dan Drake wrote:
On Mon, 23 Apr 2007 at 12:03PM -0700, David Roundy wrote:
I'm curious: what is your application? I've never seen one in which factorials actually need be computed. In physics, one factorial is generally divided by another (e.g. for combinatorics), so it's rarely wise to take the actual factorials. And to be honest, we usually take the thermodynamic limit and then use Stirling's approximation. I guess they also show up in Taylor expansions, but then we never go far enough for the factorial to be expensive.
This is combinatorics, so I can't just say "oh, this is small" and cross it off like physicists do. :)
I'm finding the number of set partitions that correspond to a certain integer partition. If p = p_1, p_2,...,p_k is an integer partition of n, then there are
                          n!
    ----------------------------------------------
    p_1! * p_2! * ... * p_k! * 1^c_1 * ... * k^c_k
That formula isn't even correct. Consider p = replicate n 1, that is, n parts each of size 1. There is only one way to do this - put each element in its own partition. However, your formula gives:

                        n!                             n!
    ------------------------------------------   ==   --   ==   n!
    1! * 1! * 1! * ... * 1^n * 2^0 * 3^0 * ...          1

It might go faster if you used the right equation.
set partitions of [1..n] with "type p", where c_i is the number of i's in p. Knuth, in his new Art of Computer Programming book, asks which integer partition maximizes the number of set partitions. I'm trying to figure it out.
Using Ints everywhere isn't an option, because my final answers are definitely Integers -- although I might try using Ints in some places. I am still very new to Haskell and keep butting up against the type system, so initially I made everything an Integer so that my code would work!
Doing cancellation before computing the factorials is also an option, but in general I'll have a lot of small numbers on the bottom, and would need to do a lot of integer factoring on the top -- and my impression is that integer factoring is quite slow, much slower than integer multiplication -- that's why RSA works, right?!
Don't forget that n! is easily partially factored - it is 1 * 2 * 3 * ... * n.

Stefan

Why all the fuss? n! is in fact very easily *completely* factored into prime numbers (and this never changes, so you can just read in a binary lookup table). It took my machine only 3 seconds to find the prime factors of the first 1000 factorials. Then you can do, for example:

    import List

    binomCoeffFactors :: Int -> Int -> [Int]
    binomCoeffFactors n k = let pfs = primeFactorials
                            in  (pfs !! n \\ pfs !! k) \\ pfs !! (n-k)

    binomCoeff :: Int -> Int -> Int
    binomCoeff n k = foldr (*) 1 (binomCoeffFactors n k)

    -- List of primes. This is reasonably fast but doesn't have to be,
    -- since we only need it this once to generate the lookup table
    primes = 2 : [x | x <- [3,5..], isPrime x]

    -- Efficient test presupposes the existence of primes.
    -- This works because to determine whether p is prime you only need
    -- to know the primes strictly less than p (except for 2 of course!)
    isPrime x = null divisors
      where divisors      = [y | y <- onlyUpToSqrtX primes, x `mod` y == 0]
            onlyUpToSqrtX = fst . span (<= sqrtX)
            sqrtX         = floor (sqrt (fromIntegral x))

    primeFactor' :: [Int] -> Int -> [Int]
    primeFactor' p@(h:t) n
      | n < 2     = []
      | otherwise = let (d,m) = n `divMod` h
                    in  if m == 0 then h : primeFactor' p d
                                  else primeFactor' t n

    primeFactors :: Int -> [Int]
    primeFactors = let p = primes in primeFactor' p

    primeFactorial' :: (Int, [Int]) -> (Int, [Int])
    primeFactorial' (n,pfs) = (n1,ret)
      where pfn = primeFactors n1
            ret = merge pfn pfs
            n1  = n + 1

    primeFactorials :: [[Int]]
    primeFactorials = map snd . iterate primeFactorial' $ (0,[])

    -- timingTest 10000 takes only 3 seconds!
    timingTest :: Int -> Int
    timingTest nMax = let pfs = primeFactorials
                      in  sum . map sum . take nMax $ pfs

    -- PRE : Args are sorted
    -- POST: Result is sorted union
    merge as [] = as
    merge [] bs = bs
    merge aas@(a:as) bbs@(b:bs)
      | a < b     = a : merge as bbs
      | otherwise = b : merge aas bs
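For instance, with the definitions above in scope, a hypothetical GHCi session might look like this (the factor lists come out sorted, with multiplicity):

    > binomCoeffFactors 10 3
    [2,2,2,3,5]
    > binomCoeff 10 3    -- 10 `choose` 3 == 2^3 * 3 * 5 == 120
    120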

G'day all.
Quoting Dan Weston
Why all the fuss? n! is in fact very easily *completely* factored into prime numbers [...]
It's even easier than that.

    primePowerOf :: Integer -> Integer -> Integer
    primePowerOf n p = (n - s p n) `div` (p-1)
      where s p 0 = 0
            s p n = let (q,r) = n `divMod` p in s p q + r

    factorisedFactorial :: Integer -> [(Integer,Integer)]
    factorisedFactorial n = [ (p, primePowerOf n p) | p <- primesUpTo n ]

    factorial :: Integer -> Integer
    factorial = product . zipWith (^) . factorisedFactorial

(Implement primesUpTo using your favourite prime sieve.)

Manipulating prime factors like this is sometimes MUCH faster than computing products for this kind of combinatorial work, because Integer division is quite expensive.

Cheers, Andrew Bromage
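A quick worked check of primePowerOf, which is Legendre's formula: the exponent of p in n! is (n - s_p(n)) / (p - 1), where s_p(n) is the digit sum of n written in base p. For n = 10 and p = 2: 10 is 1010 in binary, so s_2(10) = 2 and the exponent is (10 - 2) / 1 = 8; indeed 10! = 3628800 = 2^8 * 3^4 * 5^2 * 7.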

A thing of beauty is a joy forever. Simple, fast, elegant. If I learn any more from this list, someone is going to start charging me tuition! :)

Dan

ajb@spamcop.net wrote:
G'day all.
Quoting Dan Weston
Why all the fuss? n! is in fact very easily *completely* factored into prime numbers [...]
It's even easier than that.
primePowerOf :: Integer -> Integer -> Integer
primePowerOf n p = (n - s p n) `div` (p-1)
  where s p 0 = 0
        s p n = let (q,r) = n `divMod` p in s p q + r
factorisedFactorial :: Integer -> [(Integer,Integer)]
factorisedFactorial n = [ (p, primePowerOf n p) | p <- primesUpTo n ]
factorial :: Integer -> Integer
factorial = product . zipWith (^) . factorisedFactorial
(Implement primesUpTo using your favourite prime sieve.)
Manipulating prime factors like this is sometimes MUCH faster than computing products for this kind of combinatorial work, because Integer division is quite expensive.
Cheers, Andrew Bromage

Dan Weston wrote:
A thing of beauty is a joy forever. Simple, fast, elegant.
factorial :: Integer -> Integer
factorial = product . zipWith (^) . factorisedFactorial
Well... The zipWith (^) should be map (uncurry (^)).

And the performance of this approach is strongly dependent on the efficiency of your prime sieve, so you're moving the complexity around, not eliminating it.

The binary splitting method doesn't need a source of primes, and performs half decently on numbers such as fact 1e6 (5.5 million digits computed in about 5 seconds).

G'day all.
Quoting Bryan O'Sullivan
Well... The zipWith (^) should be map (uncurry (^)).
Err... yes.
And the performance of this approach is strongly dependent on the efficiency of your prime sieve, so you're moving the complexity around, not eliminating it.
Yes and no. Standard algorithms for computing and manipulating combinatorial-sized Integers strongly depend on the properties of your Integer implementation. Manipulating lists of prime factors can also be more efficient, because most of the numbers you deal with are machine-word-sized.
The binary splitting method doesn't need a source of primes, and performs half decently on numbers such as fact 1e6 (5.5 million digits computed in about 5 seconds).
And on the other hand, if you're computing something other than just a factorial (e.g. a complex combinatorial function, like the OP said), prime factoring avoids large Integer divisions, which are often many times more expensive than large Integer multiplications. Cheers, Andrew Bromage
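To illustrate the point, a minimal sketch (divFactored is a hypothetical helper, not from this thread): if both operands are kept as sorted (prime, exponent) lists, as factorisedFactorial produces, division is just exponent subtraction, and no large Integer division ever happens.

    -- Exact division in factored form. Both lists are sorted by prime;
    -- assumes the denominator really does divide the numerator.
    divFactored :: [(Integer,Integer)] -> [(Integer,Integer)]
                -> [(Integer,Integer)]
    divFactored num []  = num
    divFactored []  _   = error "divFactored: does not divide"
    divFactored ((p,e):ps) den@((q,f):qs)
      | p < q     = (p,e) : divFactored ps den
      | p > q     = error "divFactored: does not divide"
      | e > f     = (p, e - f) : divFactored ps qs
      | e == f    = divFactored ps qs
      | otherwise = error "divFactored: does not divide"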

ajb@spamcop.net wrote:
Yes and no. Standard algorithms for computing and manipulating combinatorial-sized Integers strongly depend on the properties of your Integer implementation.
Manipulating lists of prime factors can also be more efficient, because most of the numbers you deal with are machine-word-sized.
Yep. By the way, if approximations are good enough, the OP could use Gosper's formula:

    gosper :: Integral a => a -> a
    gosper n | n < 143 =
        let n' = fromIntegral n :: Double
            g  = sqrt ((n' * 2 + 1/3) * pi) * n' ** n' * exp (-n')
        in  round g

The accuracy of this approximation increases with n, until you hit the ceiling of whatever your Double implementation can manage (142, typically).
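For a sense of the accuracy, assuming the definition above: gosper 10 comes out around 3628560, against the exact 10! = 3628800 - a relative error under 0.01%.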

On Mon, 23 Apr 2007 at 04:36PM -0700, Stefan O'Rear wrote:
I'm finding the number of set partitions that correspond to a certain integer partition. If p = p_1, p_2,...,p_k is an integer partition of n, then there are
                          n!
    ----------------------------------------------
    p_1! * p_2! * ... * p_k! * 1^c_1 * ... * k^c_k
That formula isn't even correct. Consider p = replicate n 1, that is, n parts each of size 1. There is only one way to do this - put each element in its own partition. However, your formula gives:
Correct, the formula is incorrect. :) The "i^c_i" terms need to be c_i!. (I got it right in my code, though.)

Dan

-- Ceci n'est pas une .signature.
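A minimal sketch of the corrected count (setPartitionsOfType is a hypothetical name, not from the thread; the partition is passed as its list of part sizes):

    import Data.List (genericLength, group, sort)

    factorial :: Integer -> Integer
    factorial n = product [1..n]

    -- Set partitions of [1..sum p] with part sizes p:
    --   n! / (p_1! * ... * p_k! * c_1! * ... * c_n!)
    -- where c_i counts how many parts equal i.
    setPartitionsOfType :: [Integer] -> Integer
    setPartitionsOfType p = factorial (sum p) `div` denom
      where denom = product (map factorial p)
                  * product [ factorial (genericLength g)
                            | g <- group (sort p) ]

For example, setPartitionsOfType [2,1] is 3! / (2! * 1! * 1! * 1!) = 3, matching the three ways to split {1,2,3} into a pair and a singleton, and setPartitionsOfType (replicate n 1) is 1, as Stefan's counterexample requires.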

Dan Drake wrote:
This is combinatorics, so I can't just say "oh, this is small" and cross it off like physicists do. :)
Binary splitting is much faster than the naive approach, but still easy to understand. That's fac1 in the attached file. I ran out of time to write an efficient implementation using swing numbers, but my slow, crummy version is present as fac2, just as a data point.
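The attachment isn't reproduced here, but to give a rough idea of the technique, here is a minimal binary-splitting factorial (a sketch, not necessarily the attached fac1): it multiplies [1..n] by recursively halving the range, so each multiplication has operands of balanced size, which is where GMP's fast multiplication pays off.

    facSplit :: Integer -> Integer
    facSplit n = go 1 n
      where
        go lo hi
          | lo >  hi  = 1            -- empty range
          | lo == hi  = lo
          | otherwise = go lo mid * go (mid + 1) hi
          where mid = (lo + hi) `div` 2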

On Mon, Apr 23, 2007 at 06:25:39PM -0500, Dan Drake wrote:
I'm finding the number of set partitions that correspond to a certain integer partition. If p = p_1, p_2,...,p_k is an integer partition of n, then there are
                          n!
    ----------------------------------------------
    p_1! * p_2! * ... * p_k! * 1^c_1 * ... * k^c_k

...

Doing cancellation before computing the factorials is also an option, but in general I'll have a lot of small numbers on the bottom, and would need to do a lot of integer factoring on the top -- and my impression is that integer factoring is quite slow, much slower than integer multiplication -- that's why RSA works, right?!
In many cases, cancelling out the largest factorial on the bottom should be a win:

                 n!
    x = ------------------------
        p_1! * p_2! * ... * p_k!

    -- needs: import Data.List (delete)
    x n ps = product [maximum ps + 1 .. n] `div`
             product (concatMap (\p -> [1..p]) (delete (maximum ps) ps))

Of course, this only helps if you've got a big p. At least there's always a biggest p.

--
David Roundy
Department of Physics
Oregon State University
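As a quick check: x 10 [5,3,2] is product [6..10] `div` (3! * 2!) = 30240 `div` 12 = 2520, which indeed equals 10! / (5! * 3! * 2!).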

Hello Dan, Tuesday, April 24, 2007, 3:25:39 AM, you wrote:
Using Ints everywhere isn't an option, because my final answers are definitely Integers -- although I might try using Ints in some places. I
Maybe memoizing is the right answer?

    import Data.Array

    -- factorials ! n == n!  for 0 <= n <= 100
    factorials :: Array Int Integer
    factorials = listArray (0,100) (scanl (*) 1 [1..100])

-- Best regards, Bulat
mailto:Bulat.Ziganshin@gmail.com
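A hypothetical wrapper over that table (facMemo is not from the thread), falling back to direct computation past the table's bound:

    facMemo :: Int -> Integer
    facMemo n
      | n <= 100  = factorials ! n
      | otherwise = (factorials ! 100) * product [101 .. toInteger n]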

On Mon, Apr 23, 2007 at 01:06:07PM -0500, Dan Drake wrote:
Hello everyone,
I have some code in which the bottleneck is the factorial function. I began by using a naive
fac n = product [1..n]
but it looks like there are faster ways to do it. I could try to look up those faster algorithms and implement them, but I'm guessing that using libgmp's factorial function is the best way to do it. I'm a poor programmer, so I don't trust myself to implement those algorithms properly. Hence I need to use FFI.
...but ghc already uses libgmp for Integers, so shouldn't I be able to somehow call libgmp's factorial function?
Using regular FFI stuff, it looks like this might get complicated, since the function doesn't return one of the usual foreign types, so I'd need to mess around to get the result into an Integer, which is what I need.
Is there an easy way to do this? It might also be faster to use a lookup table, since most of the time I'll be taking factorials of relatively small numbers.
In addition to David's comments, note that there was a huge thread on fast Fibonacci numbers recently, including an FFI Fibonacci. The fastest Haskell Fibonacci was 3 LoC and only 25% slower than the one in libgmp.

http://haskell.org/pipermail/haskell-cafe/2007-February/022647.html

Stefan

Gentlefolk:

Help. I need support for a technical argument: why going to an intermediate form for an existing functional back end like Haskell really, truly is better for implementing a functional language than going to an intermediate form like the Java intermediate form and re-doing all the various specialized mechanisms needed to support a true lazy functional language.

In other words, I need a pithy paper or book chapter that will convince someone unfamiliar with functional languages that Functional Back Ends Really Are Different --- no, Really Truly Different. Something on the order of Why Functional Programming Matters, but for the implementor of the language, not the programmer.

Any pointers to a good article or book chapter that might help? I keep saying that this bookkeeping is a BIG task, but they keep saying they know they can "handle" it. Pithy comments? Staff year estimates? Horror stories?

Thanks, anyone.

Dave Barton

Hi
Help. I need support for a technical argument: why going to an intermediate form for an existing functional back end like Haskell really, truly is better for implementing a functional language than going to an intermediate form like the Java intermediate form and re-doing all the various specialized mechanisms needed to support a true lazy functional language.
To clarify in my mind: you have a functional language, and you are trying to decide whether a core language based on an imperative model or one based on a functional model is more appropriate?

I would suggest that both are appropriate. Typically, a tool will first translate Haskell to a form of "Core" language; see the Yhc.Core paper or the GHC.Core paper floating around for approximate details. Both of these Core languages are essentially subsets of Haskell, perhaps with some additional annotations etc. After this, GHC converts to C--, and Yhc to bytecode - both following a much more imperative model. From here they can be executed/compiled as required.

Usually, performing optimisations/analysis on the Core language is much more pleasant - so if you do need to do a substantial amount of work on the Core language, a functional one would be advantageous.

As for directly translating from Haskell to Java, without going through a Core language, I think that would be an example of premature optimisation - you _will_ have a Core language representing desugared Haskell somewhere; if you never make it explicit, then it will only exist in the minds of the programmers.

Thanks

Neil
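To make "essentially subsets of Haskell" concrete, here is an illustrative, hand-written desugaring of a list comprehension into a Core-like form (this is not actual GHC or Yhc output):

    -- Source:
    f :: [Int] -> [Int]
    f xs = [x * 2 | x <- xs, x > 0]

    -- Roughly what it looks like after desugaring: only lambdas,
    -- applications, constructors and case expressions remain.
    f' :: [Int] -> [Int]
    f' = \xs -> case xs of
                  []     -> []
                  (y:ys) -> case y > 0 of
                              True  -> (y * 2) : f' ys
                              False -> f' ys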
participants (9)

- ajb@spamcop.net
- Bryan O'Sullivan
- Bulat Ziganshin
- Dan Drake
- Dan Weston
- David Barton
- David Roundy
- Neil Mitchell
- Stefan O'Rear